Abstract by Heehoon Jung
Track and Field: Data preparation for analysis
Collegiate coaches are interested in how athletic potential changes as young adults grow and develop. They assume potential recruits will improve in college, but there is great uncertainty about how much improvement will occur. The ultimate goal of this research is to build performance curves that predict the future performance of high school athletes. We have roughly 4.5 million marks for high school and college track and field athletes from 1999 to 2014. These data come from the DyeStat (high school) and TFRRS (collegiate) open source websites, which report marks in different forms and units of measure. We demonstrate methods to standardize these data and link athletes across data sets so that performance curves can be analyzed over time. Preparing the data for analysis includes unit conversion, standardization of records, and record linkage across the two sources. We present some of the work done on the data in preparation for performance curve analysis.
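The three preparation steps named above (unit conversion, standardization of records, and record linkage) can be illustrated with a minimal Python sketch. It is not the method used in this work: the mark formats ("4:12.50", "21-03.5", "6.49m"), the record field `name`, and every function name here are assumptions made for illustration, and the linkage shown is a simple exact match on a normalized name. A real linkage would likely also need fields such as graduation year, event, or state to separate athletes who share a name.

```python
import re
import unicodedata

INCH_TO_M = 0.0254  # exact definition of the inch in meters


def time_to_seconds(mark: str) -> float:
    """Convert a time mark such as '4:12.50' or '11.32' to seconds."""
    seconds = 0.0
    for part in mark.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def distance_to_meters(mark: str) -> float:
    """Convert a field mark to meters.

    Assumed formats: feet-inches ('21-03.5') or metric with a trailing 'm'.
    """
    mark = mark.strip().lower()
    if mark.endswith("m"):
        return float(mark[:-1])
    feet, inches = mark.split("-")
    return round((int(feet) * 12 + float(inches)) * INCH_TO_M, 2)


def name_key(name: str) -> str:
    """Standardize a name: reorder 'Last, First', drop accents, case, punctuation."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return " ".join(re.sub(r"[^a-z ]", "", ascii_name.lower()).split())


def link_records(high_school, college):
    """Pair high school and college records whose standardized names match exactly."""
    index = {}
    for rec in high_school:
        index.setdefault(name_key(rec["name"]), []).append(rec)
    return [(hs, col) for col in college for hs in index.get(name_key(col["name"]), [])]
```

For example, `link_records([{"name": "Smith, John"}], [{"name": "John Smith"}])` pairs the two records, because both names standardize to the same key.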