Winning the Citadel Data Open


December 13th, 2022

Introduction

Last week three friends and I competed in the 2022 Citadel Data Open Championship, a premier Machine Learning and Data Science competition with a $100,000 cash grand prize. Yesterday, our team was humbled to be named the 1st place global champions. Today, I decided to share a brief summary of our experience.

Champions with big check
Champions!

After winning Citadel’s West Coast Fall Datathon over a year ago with an NLP- and transfer-learning-powered SVM clickbait classifier, my friends and I earned ourselves a ticket to the global Data Open Championship. Twenty-one other top teams from across ten regional Datathons were invited to compete against us for the $100,000 grand prize. The competition has been described as "the largest and most prestigious university-level data science competition in the world," with previous top teams including PhD students from MIT, Stanford, Berkeley, Yale, and Cambridge.


On November 30th, the competition’s problem statement and dataset were finally released: analyze America’s postsecondary educational institutions using the IPEDS dataset, which includes a wealth of granular information about school demographics, finances, admissions, and completions. We were then given four days to pose a novel research question, answer it using mathematical, statistical, and computational modeling, and write a technical report detailing our methods and findings.


Data Open at the New York Stock Exchange

Conclusion

The Data Open Championship was an incredible opportunity to put our love of applied mathematical analysis to work exploring important questions about our world. We had an amazing time with this series of competitions and are excited for what future adventures lie ahead! I hope you enjoyed hearing a bit about our process. If you'd like to chat, please feel free to connect!


Acknowledgements

Huge thanks to my teammates David “X” Chen, Milo “Garrett” Knell, and Alan “🐿️” Wu for making the grind a fun and unforgettable experience. Special thanks to Professor Linus Yamane, who beautifully taught us many of the techniques we used in our investigation and inspired our appreciation for traditional statistical modeling. We also appreciate my previous research supervisor, Professor Lucas Bang, for kindly letting us work in his lab during the competition. Shout-out to our friend and mentor Sahil Rane, whose team managed to snag the runner-up title. And of course, thank you to Citadel and Correlation One for organizing the events.

Cake Celebration!