
Performance Measurement & Matching: The Market for Football Coaches

By Brown, Todd; Farrell, Kathleen A.; Zorn, Thomas
Publication: Quarterly Journal of Business and Economics
Date: Monday, January 1 2007
HEADNOTE

The matching hypothesis asserts that it is the matching of the employee and the firm, rather than the qualifications of the employee alone, that matters. Using a unique and large data set of college football coaches, we perform two different tests of the matching hypothesis.

The relative accuracy with which performance in college football is observed allows for a direct test of matching. We find that matching is a significant factor in team performance. A good match accounts, on average, for approximately a 5 percent improvement in performance. We also find that the hazard rate is increasing for the first five years and then subsequently decreasing over a coach's tenure. Our evidence is consistent with a four- or five-year contracting period for college football coaches.

Introduction

College football coaches occupy a unique position in college athletics and athletics in general. College football is the most prominent college sport. Unlike professional football, a successful coach over time is identified more strongly with the team than the athletes who leave after a few years. Their influence in some cases extends beyond athletics. Tom Osborne's successful career as Nebraska's coach, for example, led to a seat in Congress. College football coaches act as the CEOs of sizeable organizations. As such, they provide an example of a labor market where a high level manager's performance is measured with relative ease.

Finding the best person for a particular position can easily be confused with merely finding the best person. The matching hypothesis, first proposed by Jovanovic (1979), suggests that it is not the most able person who should be hired for a particular position, but the person who forms the best person-to-firm match. An individual's productivity is influenced by the way in which he or she fits with the firm. This fit can be value-increasing for the firm by allowing an average employee to achieve above average results; conversely, a bad match for the firm will result in below average results.

For the firm, fit involves factors such as the individual's ability to be productive in a specific work environment. Higher level positions are multidimensional, and a simple one-dimensional metric does not capture the factors that determine a fit. As important as it is for the firm to find an employee that fits the firm, it is just as important for an employee to find employment that is optimal given his or her alternatives. The labor literature stresses the importance of employers attracting and retaining qualified employees. This implies designing competitive compensation packages that include pay, incentives, retirement, and health benefits. There are also non-pecuniary aspects of a position that the firm may or may not control completely such as the location and prestige of the firm. If a firm cannot at least meet the next best alternative available to the employee, the employee is likely to quit.

In the college coaches market, fit involves a complex set of interrelated aspects of the coach's personality, the school, alumni, student body, the athletes, the community, the local press, and the school's football tradition. While turnovers either initiated by the school or the coach are common in college football, some tenures are lengthy; Penn State coach Joe Paterno is only one example. Lengthy tenure is ex post evidence of a good match.

Information costs imply that the matching process in the case of higher level positions requires search and post-hire experience on both sides of the market. When an organization hires or fills a particular position, there is no guarantee, even after an extensive search process, that they have found an individual that fits the organization. Conversely, an employee will not be fully aware of the characteristics of a position prior to on-the-job experience.

Testing the matching hypothesis is difficult in traditional labor markets because the theory requires knowledge of unobservable characteristics of the worker, the firm, and the match (Garen, 1999). We follow a number of studies that attempt to overcome at least some of these difficulties by analyzing an example from the sports market, namely the market for college football coaches. In this market it is easier to observe the match effect. We also are able to make use of a large sample. While this is not the first study that analyzes sports teams to test the matching hypothesis, the football coaches market is particularly suited for examining the matching hypothesis as a determinant of labor market mobility. We find evidence of a coach-team match effect on team performance for a large sample of college football coaches from 1968 to 2003. We also find that the hazard rate increases early in a coach's tenure and subsequently decreases. Our results suggest, on average, that a good match results in a five percent improvement in a team's winning percentage.

The Matching Hypothesis

A good match between an individual and a firm can take many forms. The firm obviously needs an employee with the proper skills to perform the job. Moreover, it needs someone to fit the corporate culture. Conversely, an employee's preferences include factors such as compensation (pay and benefits), as well as job satisfaction. When both sides of this employment equation are satisfied, it can be expected that the tenure and productivity of the employee will be elevated, leading to greater rewards for both worker and organization.

Jovanovic (1979) theorizes that the probability of employee turnover increases at first as information is revealed about the quality of the match between the employee and the firm and then decreases as the likelihood of a good match increases. Another implication of the matching hypothesis is that if an employee is a match for the firm, his or her productivity should increase in proportion to the strength of the match. Those who are well suited for the job will receive increased compensation and will be more likely to stay. Conversely, a poorly matched employee of a firm will fail to reach his or her potential level of productivity, will fail to receive wage increases, and likely will be terminated or quit.

Match theory has been used to explain empirical results relating to the positive relation between wages and tenure and the negative association of turnover and tenure. [See Garen (1999) for a review of the matching hypothesis empirical literature.] Hersch and Reagan (1990) find that a match does have a positive effect on pay. Their study, however, relies on data gathered from 18 firms in Oregon that have few observed firm changes for employees. That study also fails to control for uniform employment characteristics of the sampled employees. To overcome this obstacle, more recent literature has used data from sports teams as a proxy for firms. Sports provide the opportunity to test the matching hypothesis using a relatively homogeneous sample of firms and employees.

Chapman and Southwick (1991) and Ohtake and Ohkusa (1994) use professional baseball data (American and Japanese, respectively) to test the impact of manager-team matches on performance. They attempt to isolate the effect of the match by controlling for player quality with statistics for hitting, fielding, and pitching. Both studies test for a match effect by using an interaction term between the team and the coach. While Chapman and Southwick (1991) find support for a match effect, Ohtake and Ohkusa (1994) do not. Prisinzano (2000) extends the data set of Chapman and Southwick (1991) and analyzes the team's expected wins minus their actual wins to proxy for the managerial effect on the team.1 Borland and Lye (1996) extend Chapman and Southwick (1991) by analyzing the match effect in Australian Rules football. They use player awards to proxy for player quality and coaching experience to control for learning on the job in an effort to improve the isolation of the match variable.

The Labor Market of College Football Coaches

The college football coach market has some distinct advantages in testing the matching hypothesis. In professional sports the division of responsibilities between the coach, the general manager, and the owner can vary greatly. The legendary Chicago Bears coach George Halas ran the entire operation for much of his career; Al Davis, the general manager of the Oakland Raiders, was often credited as the genius behind the success of the team; and Jerry Jones, the charismatic owner of the Dallas Cowboys, was reputed to have a great deal of influence concerning on-the-field decisions. The players that are drafted by a professional team and other important personnel decisions are not entirely in the hands of the head coach or manager. In contrast, the responsibilities of a college football coach tend to be similar at most universities.

The lack of any front office personnel in collegiate football allows the coach to control ongoing operations of the team. Although a head coach in college football serves under a university president and the athletic director, the responsibility for the conduct, success, or failure rests with the head football coach. An athletic director or other university administrator rarely is involved except when a coaching change is made. Thus, a coach has identical responsibilities at different universities. The sample is therefore more homogeneous in college athletics than in professional athletics, a clear methodological advantage.

To test for the match effect between a coach and a team, it is necessary to isolate his or her performance, subject to the abilities of the team's players as well as the inherent capabilities of the program. Previous researchers analyzing professional sports have used player statistics as a proxy for player ability (e.g., Chapman and Southwick, 1991). Coaching decisions can impact the statistics of a player, reducing the significance of the coach-team match. This study uses the college draft to proxy for a player's talent level, which is only marginally influenced by coaching decisions. National Football League (NFL) teams are likely to choose a talented player over a statistically successful player when they decide whom to select in the draft.2 In college athletics, team talent is more under the control of the coach than it is in professional sports. Coach Bill Parcells, expressing his frustration at a lack of control over which players he had to coach, famously said "If you're going to prepare the meal, you should be able to buy the groceries."3 Many college coaches are known as outstanding recruiters rather than as on-field generals.4 We therefore also examine the match hypothesis ignoring player talent, because talent in college athletics may be a primary responsibility of the coach.5

Both Chapman and Southwick (1991) and Ohtake and Ohkusa (1994) concede that their model suffers from a simultaneous equation bias from the mid-season trading of professional baseball players around the league. This can lead to a contamination of the variable capturing the match effect. College football prohibits mid-season student transfers, and only rarely do players change schools even in the off season.

Data

Data on college football coaches and their respective teams were obtained through the College Football Data Warehouse at http://www.cfbdatawarehouse.com/. The entire coaching lineage from all current Division I-A football schools was available. To maintain homogeneity among coaches, we only include coaches that were members of Division I-A schools. Other divisions have to comply with different rules such as the number of scholarships that may be awarded. We gather data on schools beginning in 1925 because many schools began football programs after World War I. Our data set continues through 2003. Some data, however, are not available over the entire sample period. Because we include a player draft variable in Tables 3 and 4, we use data starting with the merger of the NFL and the American Football League in 1967 in order to avoid problems associated with two drafts. Table 1 below reports descriptive statistics for the sample period 1968-2003 used in the primary regression analysis shown in Tables 3 and 4.

To test for the existence of a coach-team match, it is necessary to look at output differences of a coach with different teams. We measure a coach's output with the winning percentage of his team. The average winning percentage in the sample is 0.51 (Table 1), which includes wins over non-Division I teams. The highest and lowest overall winning percentages in the sample between 1925 and 2003 belong to Tennessee (0.729) and Kent State (0.319), respectively. A team's performance also can be affected by the quality of the opponents a team faces during a season, by the quality of players on the team, and by the inherent traits of the particular program.

[Table 1: Means and Standard Deviations for Division I-A Schools, 1968-2003]

Because teams play against a subset of the Division I-A teams, the schedule can have a significant effect on team performance. We specify a strength-of-schedule variable to capture opponent difficulty.6 As shown in Table 1, the strength of schedule varies from 0.05 to 0.95, with the average strength of schedule equal to 0.56. Notre Dame has the strongest schedule, averaging 0.739 during the sample period from 1925-2003.

Because high school athletes have a choice of which college to attend, there is often an imbalance of talent among college teams.7 We use data on the NFL draft, obtained from the Football Encyclopedia, as a measure of player talent. While previous research focused on statistics as a proxy of player skill, the NFL draft order permits us to measure talent independent of coaching decisions. Players selected at the top of the draft typically receive far greater compensation, indicating the scarcity of their exceptional skills relative to players selected toward the bottom of the draft. We assume that talent level of the drafted players is nonlinear and is subject to far greater estimation error toward the later rounds of the draft. We, therefore, model the players drafted as an exponential decay function, namely $e^{-0.01(\text{Draft Choice})}$.8 To avoid problems associated with two drafts, we collect data starting with the merger of the NFL and the American Football League in 1967.9
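The decay weighting described above can be illustrated with a minimal sketch. The decay constant of 0.01 comes from the text; the function names are hypothetical, and summing the weights over a team's drafted players is an assumption, since the text does not state how individual picks are aggregated.

```python
import math

DECAY_RATE = 0.01  # decay constant stated in the text


def draft_weight(pick: int) -> float:
    """Talent weight for a player taken with overall draft choice `pick` (1 = first)."""
    if pick < 1:
        raise ValueError("draft choices are numbered from 1")
    return math.exp(-DECAY_RATE * pick)


def team_draft_talent(picks: list) -> float:
    """Assumed aggregation (not stated in the text): sum the weights of all
    players a team had drafted in a given year."""
    return sum(draft_weight(p) for p in picks)


if __name__ == "__main__":
    # Weights fall slowly at first and are small for late picks.
    for pick in (1, 30, 100, 250):
        print(pick, round(draft_weight(pick), 3))
```

Under this weighting an early pick counts almost fully, while a pick near the end of the draft contributes little, which matches the stated assumption that late-round talent estimates are noisier.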

The average number of players drafted in the sample is 2.29, as shown in Table 1. Joe Paterno has the most players drafted (215) over his last 35 years of tenure. His overall tenure in the sample is 38 years; however, we restricted the sample to 1968 and beyond for the draft variable. Larry Coker had 9.34 players drafted per year, on average, over his first three years of tenure at Miami. USC has the most players drafted overall (230). Additional statistics regarding the draft variable for coaches and teams are presented in Table 2.

[Table 2: NFL Draft Data by Coach and by Team for Division I-A Schools, 1968-2003]

We assume that the major difference in talent exists between star players and journeyman players. Among the journeyman players, talent differences tend to be minor. Many journeyman players are, to a considerable extent, interchangeable. Because recruiting is a major component of a college coach's job and talent is not independent of coaching, we also conduct the analysis without controlling for player talent.

Previous studies (Chapman and Southwick, 1991; Borland and Lye, 1996) have tried to test for the existence of team fixed effects. This controls for the possibility that any coach-team match can be confused with inherent differences in team abilities. Neither study finds a team-influence effect. Fort (2003), however, did find a significant team effect. He shows that there is a winning percent imbalance in college football that has persisted for at least 30 years. We therefore use a tradition variable based on a moving average of a team's previous ten years winning percentage.10 The tradition variable controls for coaches that switch between schools with sharply different traditions and allows for changes in a team's fixed effect in the model.
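The tradition variable can be sketched as follows. This is an illustration only: the function name and data layout are hypothetical, and averaging over whatever prior seasons are on record when fewer than ten exist is an assumption, since the text does not say how short histories are handled.

```python
from typing import Dict, Optional


def tradition(win_pct_by_year: Dict[int, float], year: int,
              window: int = 10) -> Optional[float]:
    """Mean winning percentage over the `window` seasons before `year`.

    The current season is excluded so the variable reflects only past
    performance. Returns None when no earlier seasons are on record.
    """
    prior = [win_pct_by_year[y]
             for y in range(year - window, year)
             if y in win_pct_by_year]
    if not prior:
        return None
    return sum(prior) / len(prior)


if __name__ == "__main__":
    # Hypothetical team history, for illustration only.
    history = {1990: 0.40, 1991: 0.50, 1992: 0.60}
    print(tradition(history, 1993))
```

Changing `window` to 5 or 15 gives the alternative proxies mentioned in the robustness footnote.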

Not surprisingly, the average tradition variable shown in Table 1 is close to fifty percent, but the maximum is 0.924 (Oklahoma in 1958) which suggests that some colleges in the sample have strong winning traditions. Despite the high strength of schedule, Notre Dame also has the highest average tradition of 0.747 calculated over the sample period 1925-2003. Kent State has the lowest average tradition of 0.36, although Texas-El Paso had the lowest tradition variable in a given year (1985) of 0.122.

Finally, we control for the fixed effect of each coach's ability. This captures the difference in a coach's output, positive or negative, across different teams. We create a dummy variable for each coach and an experience variable to control for coaches with different levels of experience.

[Equation (1)]

[Equation (2)]
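The two estimating equations survive only as image placeholders. As a rough guide, the specification described in the surrounding text might take a form like the one below. This is a hypothetical reconstruction, not the authors' equations: the symbols, subscripts, and the exact set of dummy variables are all assumptions inferred from the variables named in the text (strength of schedule, players drafted, tradition, a quadratic in coaching experience, coach dummies, and, in the second equation, coach-team match dummies).

```latex
% Hypothetical reconstruction; notation is assumed, not taken from the source.
% i indexes coaches, j indexes teams, t indexes seasons.
\begin{align}
\text{WinPct}_{ijt} &= \beta_0 + \beta_1\,\text{SOS}_{jt} + \beta_2\,\text{Draft}_{jt}
   + \beta_3\,\text{Tradition}_{jt} + \beta_4\,\text{Exp}_{it} + \beta_5\,\text{Exp}_{it}^{2}
   + \sum_{i} \gamma_i\,\text{Coach}_i + \varepsilon_{ijt} \tag{1} \\
\text{WinPct}_{ijt} &= \beta_0 + \beta_1\,\text{SOS}_{jt} + \beta_2\,\text{Draft}_{jt}
   + \beta_3\,\text{Tradition}_{jt} + \beta_4\,\text{Exp}_{it} + \beta_5\,\text{Exp}_{it}^{2}
   + \sum_{i} \gamma_i\,\text{Coach}_i + \sum_{(i,j)} \delta_{ij}\,\text{Match}_{ij}
   + \varepsilon_{ijt} \tag{2}
\end{align}
```

Under this reading, the test of the matching hypothesis is a joint test that every $\delta_{ij}$ equals zero.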

The results from estimating Equations (1) and (2) are reported in Table 3. The coefficients of the dummy variables have been suppressed for brevity. All nondummy variables are significant at the 1 percent level in both equations, except years of coaching experience in the first regression. The coefficients of the strength of schedule and players drafted are consistent with the expectation that a weaker set of opponents and higher quality players result in a higher winning percentage. The tradition variable, however, changes sign when the match dummy variables are included in Equation (2). Because each match dummy variable is associated with only one team, it influences the effect of the tradition variable.

The players drafted variable assumes that very talented players have a significantly greater impact on performance. The case also can be made that the players drafted variable should not be included because it reflects one of the important functions of a successful college coach, namely recruiting player talent. We, therefore, show the results of excluding the draft variable in columns four and five of Table 3. The exclusion of the player draft variable does not have a large impact on the remaining variables.12

[Table 3: OLS Regressions on Winning Percentage for Division I-A Schools, 1968-2003]

We calculate a Wald statistic, shown in the final row of Table 3, to test the significance of the inclusion of the match dummy variables. The Wald test rejects, at the one percent level in all model specifications, the hypothesis that the match dummy variables do not explain the variation in winning percentage. As expected, including the match dummy variables also improves the model fit: the R squared increases from 0.477 to 0.525 between columns two and three, with similar improvements in fit between columns four and five.13
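For intuition, one textbook form of the Wald statistic for a set of linear exclusion restrictions in a homoskedastic linear regression can be computed directly from the two R squared values, because both regressions share the same dependent variable. The sketch below uses that form; it is not necessarily the exact statistic the authors computed. The function name is hypothetical, and the sample size used in the example is a made-up figure, since the actual number of observations appears only in the table.

```python
def wald_from_r2(r2_restricted: float, r2_unrestricted: float, n_obs: int) -> float:
    """Wald statistic W = n * (RSS_r - RSS_u) / RSS_u, rewritten using
    RSS = (1 - R^2) * TSS, which holds when both models have the same
    dependent variable and sample.

    Under the null that the excluded variables have zero coefficients, W is
    asymptotically chi-square with degrees of freedom equal to the number of
    excluded variables (here, the number of match dummies).
    """
    if not (0.0 <= r2_restricted <= r2_unrestricted < 1.0):
        raise ValueError("expected 0 <= R2_restricted <= R2_unrestricted < 1")
    return n_obs * (r2_unrestricted - r2_restricted) / (1.0 - r2_unrestricted)


if __name__ == "__main__":
    # R squared values from the text; n_obs = 1000 is hypothetical.
    print(wald_from_r2(0.477, 0.525, n_obs=1000))
```

The statistic would then be compared with a chi-square critical value whose degrees of freedom equal the number of match dummies added in Equation (2).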

Because the average winning percentage for all teams during any given year must equal roughly fifty percent, the cross-sectional errors for each team will be correlated.14 This will cause the OLS estimates to be inefficient and may lead to upward-biased estimates of their variance. To correct for this bias, we re-estimate both Equations (1) and (2) using generalized least squares (GLS) for all model specifications in Table 3 and report the results in Table 4. The coefficients are similar to the OLS regressions, but the Wald test statistic is somewhat lower. The test statistic for the impact of the inclusion of the match dummy variables remains significant at the 1 percent level. Our evidence presented in Tables 3 and 4 is consistent with the matching hypothesis and robust to alternative specifications.

[Table 4: GLS Regressions on Winning Percentage for Division I-A Schools, 1968-2003]

An Alternative Test of the Coach-Team Match

In his description of the matching model, Jovanovic (1979) proposes that employers and employees learn about match quality over time. Initially, the probability of job separation will increase early in a worker's tenure. As information is revealed regarding match quality, the probability of job separation subsequently will decline.

We analyze the probability of separation by calculating the hazard rates for college football coaches over the sample period of 1968 through 2003 and document a pattern consistent with that predicted by Jovanovic (1979). Figure 1 displays college football coaches' hazard rates by tenure. An examination reveals that the hazard rate is increasing for the first five years and then subsequently decreasing over a coach's tenure. The hypothesis that the hazard rate is constant over tenure is rejected. Our evidence is consistent with a four- or five-year contracting period for college football coaches. Given the length of a college football player's eligibility, schools often allow a coach enough time to recruit his own players before judging his coaching effectiveness. This result is also consistent with the findings of Allgood and Farrell (2003) in their study of the CEO labor market, where the peak hazard rate for CEO tenure is also five years.
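A simple discrete-time (life-table) hazard illustrates the calculation: for each year of tenure, divide the number of coaching spells that end in that year by the number of spells that lasted at least that long. The sketch below is an assumption about the estimator, which the text does not specify; the function name and input layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def hazard_by_tenure(spells: List[Tuple[int, bool]]) -> Dict[int, float]:
    """Empirical hazard rate for each year of tenure.

    Each spell is (tenure_years, ended). ended=False marks a coach still with
    the team at the end of the sample (right censored): such a spell counts
    as at risk in every year it is observed but never as an exit.
    """
    exits = Counter(length for length, ended in spells if ended)
    max_tenure = max((length for length, _ in spells), default=0)
    hazards = {}
    for t in range(1, max_tenure + 1):
        at_risk = sum(1 for length, _ in spells if length >= t)
        if at_risk:
            hazards[t] = exits[t] / at_risk
    return hazards


if __name__ == "__main__":
    # Hypothetical spells, for illustration only.
    sample = [(1, True), (2, True), (2, False), (3, True)]
    for year, h in hazard_by_tenure(sample).items():
        print(year, round(h, 3))
```

A hazard that rises to a peak and then falls, as reported for Figure 1, would show up here as values that increase over the early tenure years and decline afterward.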

[Figure 1: Hazard Rates by Tenure]

While the peak does occur at the same time, the severity of the peak for college coaches is more than twice that of CEOs. Furthermore, both Chapman and Southwick (1991) and Borland and Lye (1996) find a peak of over 40 percent at the three year point in tenure with their data on baseball and Australian Rules football, respectively. This is consistent with the notion that performance is easier to measure in sports than it is in the corporate context, which accelerates the determination of a successful match.

To measure the effect of matching, we draw on the information provided by the hazard rate. The hazard indicates that if a coach's tenure endures beyond five years, the coach-team combination is a good match. To further analyze this relation, we calculate the average winning percentage for all coaches in the sample from 1925 to 2003 who have completed tenures that last between one and five years. Eight hundred and twenty-six coaches have tenures between one and five years in the sample and have an average winning percentage of 0.44. We also calculate the average winning percentage for all coaches that have a tenure that lasts beyond five years (these include incomplete spells). We have 484 coaches in the sample with tenures greater than five years and an average winning percentage across all years equal to 0.56. The average winning percentage of 0.56 for coaches with tenures greater than five years is significantly greater than the average winning percentage of 0.44 for coaches with tenures of five years or less, at the 1 percent level.

Based on the preceding analysis, we define a dummy variable equal to one if a coach's tenure is greater than five years with a given college. The dummy variable takes the value of one for every year the coach-team match is in the sample.15 If a coach is with a team for five years or less and then departs, we define the dummy variable equal to zero. We have a right censoring issue because some coaches have been with a team less than five years, but have not left the team at the end of our sample period. To capture a sufficient number of good and bad matches in the sample, we use the entire sample period to include all Division I-A schools from 1925-2003. Because our data for the players drafted variable begin in 1968, we drop this variable from the analysis.
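The classification rule can be written out as a short sketch. The five-year threshold and the zero/one coding follow the text; the function name is hypothetical, and returning `None` for short spells that are still in progress is an assumption, because the text notes the right censoring issue without stating how those spells are treated.

```python
from typing import Optional

GOOD_MATCH_YEARS = 5  # threshold from the text: tenure must exceed five years


def good_match(tenure_years: int, still_active: bool) -> Optional[int]:
    """Good match dummy for one coach-team spell.

    Returns 1 when tenure exceeds five years (complete or not), 0 when the
    coach departed after five years or less, and None when the spell is five
    years or less but still in progress, so the outcome is not yet known.
    """
    if tenure_years > GOOD_MATCH_YEARS:
        return 1
    if still_active:
        return None  # right censored: assumed left unclassified
    return 0


if __name__ == "__main__":
    print(good_match(12, still_active=False))
    print(good_match(3, still_active=False))
    print(good_match(3, still_active=True))
```

The resulting value applies to every season of the spell, which is what lets the dummy enter a season-level regression of winning percentage.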

In Table 5, we show the regression of winning percentage on the good match dummy variable, strength of schedule, tradition, coaching experience, and coaching experience squared. The signs and significance levels of the control variables are consistent with expectations, except for an insignificant relation between coaching experience and winning percentage. The good match dummy variable is positive and significant in all model specifications. The coefficients in Table 5 suggest that a good match between a coach and team increases the winning percentage of a team by 5.5 percent or 4.5 percent, according to the OLS (column two) or the GLS (column three) specification, respectively.16

The good match variable is correlated with the tradition variable because all good matches are defined as those that last longer than five years, and tradition is a ten year moving average. Therefore, we exclude the tradition variable from the analysis in columns four and five of Table 5. The significance levels of the good match variables are slightly higher in columns four and five relative to columns two and three of Table 5, although the inferences remain virtually unchanged. These results suggest that a good match results in a 5 percent improvement in a team's winning percentage.

[Table 5: Regressions on Winning Percentage while Controlling for a Good Match for Division I-A Schools, 1925-2003]

We caution that it is difficult to quantify the coach-team match effect. For example, in our original model specifications (Tables 3 and 4), our coach dummy variables contain information concerning the match with the deleted coach-team combination. In addition, the match dummy variable represents the quality of the coach-team match relative to the omitted coach-team match for that individual. It is also worth noting that coach-team matches such as Tom Osborne at Nebraska and Joe Paterno at Penn State will not have match dummies because both coaches have only one match in the sample period. Thus, following Chapman and Southwick (1991), we avoid ranking coaches from best to worst because a ranking would combine information about both the coach and his coach-team match.

Conclusion

Using a large data set for football coaches, a direct test of the matching hypothesis shows that the productivity of coaches varies across teams. Specifically, we find that a coach-team match significantly impacts the variation in collegiate football teams' winning percentages. We also show that coaches tend to face an increasing probability of turnover early in their tenure and then subsequently face a decreasing probability of turnover once they have been with a team for approximately four or five years, consistent with the pattern predicted by Jovanovic (1979). Defining a good match as one that endures beyond five years, we estimate the value of a good match as an improvement in a team's winning percentage of approximately 5 percent. We conclude that mobility in the market for college football coaches follows a process similar to that predicted by the matching model.

FOOTNOTE

1 This variable attempts to capture coaching efficiency. Expected wins is computed using a complex formula with such variables as the amount of runs scored and runs allowed. These numbers can be skewed due to the nature of a game, such as whether or not the game is close at the end. Managers tend to make different strategic decisions depending on the score that affect the number of runs scored/allowed.

2 The Heisman Trophy winner, presumably the best college football player, is often not the top draft choice. Performance of players in college is influenced, to a considerable extent, by coaching.

3 http://www.usatoday.com/sports/football/nfl/cowboys/2003-01-02-parcells-cover1_x.htm

4 For example, Florida State coach Bobby Bowden's success often is attributed in large part to his recruiting skills.

5 We are indebted to an anonymous referee for this point.

6 Strength of schedule is a common statistic in college football computer models used to determine the national champion. For our study, we used James Howell's calculation of SOS at http://www.jhowell.net/cf/cfindex.htm.

7 In professional sports, however, drafts and salary caps maintain a balance of talent.

8 This assumption is consistent with NFL teams using draft value charts. A draft value chart determines the numeric value of a draft pick to aid teams in trading draft picks. For example, one team's draft value chart assigns the top pick a value of 3000, the second pick 2600, third pick 2200, and fourth pick 1800. Then pick values fall less rapidly with the sixteenth pick worth 1000 points and the thirtieth pick worth 620. To see the decay of the function, the one hundredth pick is worth 100 points. See, for example, http://www.nfl.com/draft/story/9341444 for a discussion of one NFL team's value chart obtained by the NFL. Our choice of the decay function closely resembles this team's value chart.

9 The draft when the two leagues were competing not only reflected the player's talent, but also the likelihood that the team would be able to sign the player.

10 Results are robust to alternative proxies for the tradition variable, including calculating average winning percentage over five and fifteen years.

11 A quadratic is used because learning frequently is modeled as increasing at a decreasing rate.

12 We also expand the sample to include the data from 1925 through 2003 because the draft variable is the primary reason for restricting the data to post 1967. The results are qualitatively the same as those reported in Tables 3 and 4 and are available from the authors upon request.

13 The introduction of match dummies in our model increases the R squared in a manner comparable to that shown in Table 2 of Chapman and Southwick (1991), whose R squared increases from 0.765 to 0.823.

14 Due to schools playing teams from other divisions as well as some schools moving from Division I-A to lower divisions, the actual average winning percentage of the current Division I-A teams is 0.51, as shown in Table 1.

15 Allgood and Farrell (2003) define a good match variable equal to one if a CEO's tenure is greater than or equal to five years. They only analyze performance of the firm over the first three years of the CEO's tenure, not performance over the CEO's entire career. Their unit of observation is a CEO-firm match.

16 As a robustness test, we restrict the data to the sample period 1968-2003. The good match dummy variable remains significant, but the coefficient is between 4 percent and 3.5 percent depending on the model specification.

References

1. Allgood, S., and K.A. Farrell, "The Match between CEO and Firm," Journal of Business, 76, no. 2 (2003), pp. 317-342.

2. Borland, J., and J. Lye, "Matching and Mobility in the Market for Australian Rules Football Coaches," Industrial and Labor Relations Review, 50, no. 1 (1996), pp. 143-158.

3. Chapman, K.S., and L. Southwick, Jr., "Testing the Matching Hypothesis: The Case of Major-League Baseball," American Economic Review, 81 (December 1991), pp. 1352-1360.

4. Fort, R.D., Sports Economics (Upper Saddle River, New Jersey: Prentice Hall, 2003).

5. Garen, John E., "Empirical Studies of the Job Matching Hypothesis," Research in Labor Economics, 9 (1999), pp. 187-224.

6. Hersch, J., and P. Reagan, "Job Match, Tenure and Wages Paid by Firms," Economic Inquiry, 28 (1990), pp. 488-507.

7. Jovanovic, B., "Job Matching and the Theory of Turnover," Journal of Political Economy, 87 (1979), pp. 972-990.

8. Ohtake, F., and Y. Ohkusa, "Testing the Matching Hypothesis: The Case of Professional Baseball in Japan with Comparisons to the United States," Journal of the Japanese and International Economies, 8 (1994), pp. 204-219.

9. Prisinzano, R., "Investigation of the Matching Hypothesis: The Case of Major League Baseball," Journal of Sports Economics, 1 (August 2000), pp. 277-298.

AUTHOR_AFFILIATION

Todd Brown*

Stephen F. Austin State University

Kathleen A. Farrell

University of Nebraska-Lincoln

Thomas Zorn

University of Nebraska-Lincoln

* We appreciate the helpful comments and suggestions of Richard DeFusco, Mostafa Mashayekhi, Manferd Peterson, Colin Ramsay, Warren Luckner, and seminar participants at the University of Nebraska-Lincoln.