A Weekend of Data Crunching

Duke holds third annual DataFest
April 28, 2014

In late March, newly renovated Gross Hall played host to the 2014 DataFest, a forty-eight-hour team competition in big data analytics. The event attracted close to 130 undergraduate and graduate students from a variety of disciplines, including statistics, computer science, and engineering. Duke was the best-represented school, but participants also made the trip from the University of North Carolina at Chapel Hill, North Carolina State University, and even Dartmouth College.

Mine Çetinkaya-Rundel, assistant professor of the practice in Duke’s statistics department, served as the head organizer of the event. “The goal is to get creative with data in a way that students might be scared to do in the classroom if there were a grade or rubric attached,” says Çetinkaya-Rundel. “I want students to realize the complex nature of data, understand that they can glean something useful from it, and leave here wanting to learn a bit more.”

On Friday night, teams learned that they would be analyzing a data set provided by GridPoint. The energy management company advises businesses on how to increase energy savings and reduce carbon emissions by installing sensors that monitor everything from lighting efficiency to the performance of HVAC units to weather conditions. After a briefing by GridPoint senior vice president of marketing Mark Straton and senior product manager Kyle McCarter B.S.E. ’05, teams set to work deconstructing the data, staying late into the night and reconvening early in the morning. Faculty members, Ph.D. students, and industry consultants were on hand all weekend to help troubleshoot computational challenges and to talk teams through their research questions.

On Sunday afternoon, teams presented their insights back to GridPoint, and awards were given in three categories. In the best visualization category, the two winning teams were Cougar Bait (“Prediction of Main Load Usage”) and Spectral (“Mining for Meaning: Exploring Energy Use Data”). In the best use of outside data category, the three winners were Team Clairvoyant (“How Much Energy Have We Saved?”), The Fantastic 5+1 (“Exploring Market Potential: The Fresh Market Inc.”), and DataCruncherz (“Disaster Management: A case study of Hurricane Irene”). And in the best in show category, the two winners were Wolfram Alphas (“Analytic Insight for the Facilities Manager”) and D 4 C (“Energy Efficiency Policy and Effectiveness”). Çetinkaya-Rundel says the competition offers all participants the chance to showcase their ability to work in teams and deliver meaningful results under time constraints. “I’ve had several participants tell me that the DataFest experience was one they used in job and internship interviews.”