Do Uniforms Affect Scores?
Dr. Stokes and three Samford University Fellows students collected box score and uniform data from two groups of teams over the last six years.
Sample (teams included in the study)
For all the current SEC teams, we collected data on every regular-season game from the 2010 season through the 2015 season. The second group is a composite of the top non-SEC teams over the five-year period from 2010 to 2014. To be included in this group, a team had to appear in the final AP Top 25 at least three times over the five-year period. Notre Dame was also included because of its participation in the 2012 BCS National Championship game, even though it made only two final AP Top 25s. A few notable teams that “just missed the cut” were UCLA, UCF, Michigan, Virginia Tech, and Arizona State. For the composite Top 25 group we collected data only on home games. The list of teams is below:
SEC
Alabama
Arkansas
Auburn
Florida
Georgia
Kentucky
LSU
Mississippi State
Missouri
Ole Miss
South Carolina
Tennessee
Texas A&M
Vanderbilt
Non-SEC Composite Top 25 (2010-2014)
Baylor
Boise State
Clemson
FSU
Kansas State
Louisville
Michigan State
Nebraska
Notre Dame
Ohio State
Oklahoma
Oklahoma State
Oregon
Southern California
Stanford
TCU
Wisconsin
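The inclusion rule for the composite group can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not the procedure actually used: the function name, the input format (a dict of season to final-poll teams), and the placeholder team names are all assumptions.

```python
from collections import Counter

def composite_teams(final_polls, sec_teams, min_appearances=3, exceptions=()):
    """Return non-SEC teams with enough final-poll appearances, plus exceptions.

    final_polls: dict mapping season -> list of teams in that season's
    final AP Top 25 (hypothetical input format).
    """
    # Count each team at most once per season.
    counts = Counter(team for teams in final_polls.values() for team in set(teams))
    qualified = {
        team for team, n in counts.items()
        if n >= min_appearances and team not in sec_teams
    }
    # Teams added by hand (as Notre Dame was in the study) go in `exceptions`.
    return sorted(qualified | set(exceptions))

# Toy illustration with placeholder names; this is NOT real poll data.
polls = {
    2010: ["Team A", "Team B", "Team C"],
    2011: ["Team A", "Team B"],
    2012: ["Team A", "Team C", "Team D"],
    2013: ["Team B", "Team D"],
    2014: ["Team A", "Team B", "Team D"],
}
selected = composite_teams(polls, sec_teams={"Team D"}, exceptions=["Team C"])
print(selected)
```

In the toy data, Team D has three appearances but is excluded as a conference member, while Team C has only two appearances and enters solely through the exception list.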
Data and Measures (Variables)
For each game we collected the following information:
Team name (the team in the study)
Opponent name
Game location (Home/Away)
Season (year)
Date of game
Spread at kickoff
Final score
Score at the end of the first quarter
From the raw data we created the following analytical variables:
Team number (teamnum) – Each team in the study was assigned a unique identifying number. Note that each observation in the dataset includes data from the perspective of a single team (the “focal” team) that is included in the study. Non-study teams also appear in the data, but only as opponents of the study teams.
Game sequence (gameseq) – A number that indicates the order of the game within its season. For instance, the first game of the season for Tennessee in 2010 would be assigned “1” for game sequence, and the first game of the season in 2011 for Tennessee would similarly be assigned a “1.”
Home game (homegame) – An indicator variable coded “1” if the focal team was the home team and “0” if the focal team was the away team.
Final straight up (finalsu) – This variable is the final score differential from the perspective of the focal team. Thus, if Alabama is the focal team and they defeated LSU by 10 points, this variable would be coded “10.” The same game with LSU as the focal team would have the variable coded “-10.”
Final against the spread (finalats) – The variable is the score differential against the spread from the perspective of the focal team. For example, if Baylor was favored to defeat TCU by 5 points (spread at kickoff) but only won by 3, this variable would be coded “-2” for Baylor and “2” for TCU.
The spread at kickoff was obtained from historical data at Covers.com (http://www.covers.com/Sports/ncaaf/Matchups) and, when data were missing from Covers.com, from Sunshine Forecast (http://www.repole.com/sun4cast/data.html).
First quarter straight up (q1score) – This variable is analogous to the “final straight up” variable, except that it represents the score differential at the end of the first quarter.
First quarter against the spread (q1ats) – This variable is analogous to the “final against the spread” variable, except that it represents the score differential against the spread at the end of the first quarter. The expected spread for the first quarter is the final spread divided by four.
Game ID (gameid) – Each game was assigned a unique identifier. This is necessary because some games appear twice in the dataset. For instance, Georgia vs. Auburn 2011 will appear once from the perspective of Auburn and once from the perspective of Georgia. Although most of the other variables will have different values because of the different perspectives (e.g., finalats is -26.5 and homegame is 0 for Auburn, while finalats is 26.5 and homegame is 1 for Georgia), the values are statistically linked, making it necessary either to conduct analyses separately by home and away games or to statistically account for the observations being clustered within games.
Alternate uniform (uniform) – This was the most difficult variable to collect and operationalize. Simply put, the variable is coded “0” if the team wore a “standard” uniform and “1” if it wore an “alternate” uniform. With the proliferation of uniform combinations, however, it was difficult to determine what a standard uniform was for some teams. One possible approach, which we decided against, is to use the traditional uniform palette as the standard. This would define the standard home uniform as colored jersey, white pants, and “normal” helmet, and the standard away uniform as white jersey, white pants, and “normal” helmet. Anything that deviated from the traditional palette would then be an alternate. But we discovered that so many teams have moved away from the traditional palette that this definition would obscure alternate uniforms that are truly unusual for the team. Therefore, using the traditional palette described above only as a reference point, we sought to establish the “standard” home and away uniforms for each team within each season (many teams changed their uniform palettes from season to season). To qualify as the standard, a uniform combo had to be worn clearly more often than any other combo in home or away games that season. For example, if Ole Miss wore all-white road uniforms (with the same helmet each time) in three of their six away games in 2013, and then used different away combos in the other three games, the all-white combo would be their road standard (coded “0”) while all the other combos would be considered alternates (coded “1”). Some teams used so many different combinations that there was no apparent standard uniform; Oregon, not surprisingly, rarely repeated uniform combinations. In these cases, we coded all uniforms as alternates, though a case could be made that variety is the standard for teams such as Oregon and all the different combos should be coded as standard.
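The variable definitions above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's actual processing code. The function names are hypothetical, as is the sign convention for favored_by (points by which the focal team was favored at kickoff, negative for an underdog). The min_share threshold is one possible reading of how often a combo must be worn to count as standard. The scores and uniform labels in the examples are invented to match the margins described in the text.

```python
from collections import Counter

def score_variables(team_pts, opp_pts, team_q1, opp_q1, favored_by):
    """Derive the score-based variables from the focal team's perspective.

    favored_by: points the focal team was favored by at kickoff
    (negative if it was the underdog). Assumed sign convention.
    """
    finalsu = team_pts - opp_pts   # final straight-up margin
    q1score = team_q1 - opp_q1     # first-quarter straight-up margin
    return {
        "finalsu": finalsu,
        "finalats": finalsu - favored_by,    # margin relative to the spread
        "q1score": q1score,
        "q1ats": q1score - favored_by / 4,   # first-quarter spread = spread / 4
    }

def code_uniforms(combos, min_share=0.5):
    """Code one team's home (or away) games in one season: 0 standard, 1 alternate.

    combos: list of uniform-combination labels, one per game.
    min_share: assumed minimum share of games for a combo to count as standard.
    """
    if not combos:
        return []
    ranked = Counter(combos).most_common(2)
    top, top_n = ranked[0]
    runner_up_n = ranked[1][1] if len(ranked) > 1 else 0
    has_standard = top_n / len(combos) >= min_share and top_n > runner_up_n
    # With no qualifying standard, every game is coded as an alternate.
    return [0 if has_standard and c == top else 1 for c in combos]

# Baylor favored by 5 and winning by 3 (hypothetical scores, tied first quarter).
baylor = score_variables(30, 27, 7, 7, favored_by=5)
tcu = score_variables(27, 30, 7, 7, favored_by=-5)
print(baylor["finalats"], tcu["finalats"])

# All-white worn in three of six away games; other labels are placeholders.
ole_miss_away = ["all-white", "combo A", "all-white", "combo B", "all-white", "combo C"]
print(code_uniforms(ole_miss_away))

# A team that never repeats a combo has no standard, so all games are alternates.
print(code_uniforms(["c1", "c2", "c3", "c4"]))
```

Keeping the threshold as a parameter makes it easy to see how sensitive the uniform coding is to the definition of “standard,” which the text identifies as the hardest judgment in the dataset.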
Analytical Considerations
Analyses involving both home and away games will include some games that appear twice in the data, once from the perspective of each focal team. Such paired observations violate the assumption of statistical independence inherent in most analytical methods. Analyzing home and away games separately is the safest approach, though more advanced statistical methods are available to deal with this problem.
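The “analyze home and away games separately” approach can be sketched as follows. This is an illustrative outline only: the row format (a list of dicts keyed by the variable names defined earlier) and the function names are assumptions, and the uniform values in the example rows are placeholders rather than the actual coding for that game.

```python
from collections import Counter, defaultdict
from statistics import mean

def split_home_away(rows):
    """Split observations by homegame and verify each gameid appears once per subset."""
    home = [r for r in rows if r["homegame"] == 1]
    away = [r for r in rows if r["homegame"] == 0]
    for label, subset in (("home", home), ("away", away)):
        counts = Counter(r["gameid"] for r in subset)
        dupes = [g for g, n in counts.items() if n > 1]
        if dupes:
            # Would happen if, say, both teams in a game were coded as away.
            raise ValueError(f"gameid repeated within {label} subset: {dupes}")
    return home, away

def mean_by_uniform(rows, outcome="finalats"):
    """Average an outcome separately for standard (0) and alternate (1) uniforms."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["uniform"]].append(r[outcome])
    return {u: mean(vals) for u, vals in groups.items()}

# Georgia vs. Auburn 2011 as described in the text; uniform values are placeholders.
rows = [
    {"gameid": 1, "team": "Georgia", "homegame": 1, "finalats": 26.5, "uniform": 0},
    {"gameid": 1, "team": "Auburn", "homegame": 0, "finalats": -26.5, "uniform": 0},
]
home, away = split_home_away(rows)
print(mean_by_uniform(home))
print(mean_by_uniform(away))
```

Within each subset every game contributes a single observation, so the two perspectives on the same game never enter the same comparison.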
With these data, it is possible to conduct some analyses in which the entire “population” is available, removing the need for the statistical considerations associated with sampling. For instance, if the researcher wanted to present evidence about how uniform choices have affected the regular-season performance of SEC teams over the last six years, there is no need to address confidence intervals or sampling error, because we have all the regular-season games of SEC teams for those six years. Similarly, a statement about the home-game performance of our Composite Top 25 would need no sample-based qualifications, because we have all the home games for those teams. If, however, the researcher wanted to comment on all of college football, she would need to apply methods that account for sampling bias.