No sport in the world has a more contentious championship structure than American college football. We hand control of the championship to a complicated apparatus called the “BCS,” which combines the results of two subjective polls with a bunch of computer ratings whose inner workings almost no one knows, and few could understand anyway. This system eventually spits out the two teams that are supposed to be “the best,” they play each other, and we call the winner the champion.
It’s a lot better than the old system, where we just took a poll to determine the champion. USC-Texas in 2005-06 would never have happened under that system; USC would have played in the Rose Bowl and Texas in the Cotton or Fiesta Bowl. Unfortunately, years like that are the exception and not the rule. When there are exactly two undefeated teams, the BCS’ job is easy. When there aren’t, controversy is basically unavoidable. Everyone thinks we should have a real playoff, but no one can get it done.
In the meantime, I have my own addition to the college football rating pantheon.
We can’t trust polls. Polls have short memories, are biased, are impressed by running up the score, are sentimental, and are often based on things other than what happens on the field. In the first few years of the BCS, people blamed the computer ratings for problems picking champions, in part because almost all computer rating formulae are proprietary; critics often shrugged off strength of schedule as though we should reward teams for playing a bunch of scrubs. But Auburn’s inability to make the national title game in 2004-05, despite going undefeated, showed that the human polls can cause problems as well. A computer ranking can at least claim a modicum of objectivity by being based on fairly sound mathematical principles.
Of course, I don’t have enough grounding in mathematics to have a good grasp of sound mathematical principles, but I have read a number of resources. Many of them are here. Some articles on the thinking behind these systems are here. Soren Sorensen’s thinking on these matters, which has affected my own judgment, is here.
My rating is a three-part system that aims to unify the strengths of various systems while minimizing their problems.
A Rating. This is a basic rating on a scale of 0 to 1. 0 means you’ve lost every game by shutout, while 1 means you’ve won every game by shutout. When I was first formulating this, I had the results effectively multiplied based on the team’s Coaches and AP Poll results: for each poll, I would add 1/r (where r is the team’s rank in that poll) times the A Rating, producing a 0 to 3 scale. I dumped it due to increasing disillusionment with the polls, and because a scrub team was actually helped in the B Rating by getting blown out by a team with an A Rating over 1.
The A Rating is calculated as a team’s winning percentage times a team’s modified average score ratio. Following Sorensen, a team’s score ratio in a given game is the margin of victory divided by the winning score; for the loser, it is the negative of that same margin divided by the winning score. Since the score ratio for a shutout is always 1, and the score ratio for blowouts approaches 1, score ratio serves as a check on running up the score. (However, it is also somewhat biased toward defense. If you’re beating up your opponent 50-3 and your opponent kicks another field goal, you have to get to 100 points to make up the score ratio lost!)
For A Rating purposes, the average score ratio is rescaled from a -1 to 1 scale to a 0 to 1 scale; that is, the modified ratio is (average score ratio + 1)/2. Under this system, a tie has a score ratio of 0 and thus a modified ratio of .5.
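To make the arithmetic concrete, here is a minimal sketch of the A Rating in Python. The function names and the (points for, points against) game format are my own illustration, and I’m assuming a tie counts as half a win for winning percentage; the formulas themselves are the ones described above.

```python
def score_ratio(points_for, points_against):
    # Sorensen-style score ratio: the margin divided by the winning
    # score. Positive for a win, negative for a loss, 0 for a tie.
    if points_for == points_against:
        return 0.0
    return (points_for - points_against) / max(points_for, points_against)

def a_rating(games):
    # A Rating: winning percentage times the average score ratio,
    # rescaled from the -1 to 1 scale to the 0 to 1 scale.
    # `games` is a list of (points_for, points_against) tuples.
    wins = sum(1 for pf, pa in games if pf > pa)
    ties = sum(1 for pf, pa in games if pf == pa)  # assumption: tie = half a win
    win_pct = (wins + 0.5 * ties) / len(games)
    avg_ratio = sum(score_ratio(pf, pa) for pf, pa in games) / len(games)
    return win_pct * (avg_ratio + 1) / 2
```

As a sanity check, a team that wins every game by shutout comes out at exactly 1, and a team that loses every game by shutout comes out at exactly 0, matching the endpoints above.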
B Rating. B Rating is calculated by multiplying a team’s A Rating by its total B Points. If the total B Points are negative, teams would be helped by lower A Ratings, so the A Rating is subtracted from 1 before multiplying. Because having positive B Points results in a “purer” calculation, I give special recognition to all such teams on my report.
B Points are earned on a game-by-game basis and are supposed to be determined by the following game-by-game formula: B Points = MoV × AR ± 1
where MoV is the margin of victory or loss (negative for a loss), AR is the opponent’s A Rating (subtracted from 1 for a loss), and the ±1 factor is a home field modifier: add 1 for games played on the road, subtract 1 for games played at home. For games played on a neutral site, B Points are simply MoV × AR. B Points are recalculated from scratch every week.
This uses “pure” MoV, but it still mitigates the effect of RUTS (running up the score) by multiplying it by the A Rating. Who you have a given result against matters; I believe ratings should relate MoV to the quality of the teams beaten. If you beat up on a terrible team, the B Points you receive for it will be negligible. If you RUTS on a one-loss team with a fantastic score ratio, the mere fact that you were able to run up the score on a terrific team says volumes about the quality of your team. (Most computer rankings, in their zeal to curb RUTS, will give most of this game’s impact to the quality of the win alone.) A sketch of the calculation follows.
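Here is a minimal sketch of the per-game B Points and the B Rating, under a few assumptions of my own: the venue strings are my invention, ties aren’t addressed in the definition above so I leave them at the raw MoV × AR, and I apply the MoV cap (noted in the Michigan game below) only to wins, since its behavior for losses isn’t spelled out.

```python
def b_points(mov, opp_a_rating, venue):
    # Per-game B Points: MoV times the opponent's A Rating, plus 1 on
    # the road, minus 1 at home, unchanged on a neutral site.
    # MoV is negative for a loss, and for a loss the opponent's
    # A Rating is replaced by (1 - A Rating).
    ar = opp_a_rating if mov >= 0 else 1 - opp_a_rating
    b = mov * ar
    if venue == "road":
        b += 1
    elif venue == "home":
        b -= 1
    if mov > 0:
        b = min(b, mov)  # cap at MoV, per the Michigan game note below
    return b

def b_rating(a_rating, total_b_points):
    # B Rating: total B Points scaled by the A Rating; if the total is
    # negative, scale by (1 - A Rating) so a low A Rating doesn't help.
    factor = a_rating if total_b_points >= 0 else 1 - a_rating
    return factor * total_b_points
```

Two checks against the list below: if the Texas game was played on the road, 17 × AR + 1 = 9.69 implies Texas carried an A Rating of about 0.51; and Florida’s 19.84 for the title game works out as a neutral-site 27 × .735, using OSU’s A Rating quoted further down.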
However, in practice, this is not the actual formula. I use Access 2003 to calculate the ratings, and for some unknown reason it highballs the ratings to a ridiculous extent. I have isolated the problem to the summation of the B Points that prepares them for the B Rating calculation. At that point, some unknown factor causes the sum to come out far higher than the individual games’ B Points would indicate. (It is related to the existence of multiple games: the B Points sum correctly when there’s only one game, but skyrocket immediately after a second game appears.) I would like to believe the results scale to what they should be, but I am concerned about undervaluing the A Rating in calculating B Rating. As an example, consider the B Points earned by Ohio State in the 2006 season. I have manually sorted the results by date and rounded B Points to the hundredths place.
OSU def. Northern Illinois 35-12: 5.81 points
OSU def. Texas 24-7: 9.69 points
OSU def. Cincinnati 37-7: 9.44 points (Cincinnati had a rather strong season, and Texas, while clearly better, wasn’t at championship form without Vince Young)
OSU def. Penn State 28-6: 8.37 points
OSU def. Iowa 38-17: 6.14 points
OSU def. Bowling Green 35-7: 2.46 points (the value of B Points in curbing RUTS against weak opposition should be obvious)
OSU def. Michigan State 38-7: 5.54 points
OSU def. Indiana 44-3: 5.48 points
OSU def. Minnesota 44-0: 9.33 points
OSU def. Illinois 17-10: 1.42 points (that’s what you get for keeping an absolutely atrocious team within a touchdown)
OSU def. Northwestern 54-10: 6.94 points
OSU def. Michigan 42-39: .86 points (yes, Michigan was undefeated at the time, but thumbs down to letting them get within a field goal at home – B Points are capped at MoV)
Florida def. Ohio State 41-14: -8.92 points (for destroying what was to that point the best team in the land, Florida received nearly 20 points for this game)
These B Points should add up to 62.57 points. But Access records OSU’s total B Points as 94162.35. (The final B Rating was 69160.71. After Week 3, OSU’s B Rating was 4237.39.) The only thing I tell Access to do in the query in question is sum up the B Points. For reference, OSU’s A Rating was .735, and their opponents received the following B Points for their OSU games: Northern Illinois -5.10, Texas -5.51, Cincinnati -6.95, Penn State -4.83, Iowa -6.57, Bowling Green -6.42, Michigan State -9.22, Indiana -9.87, Minnesota -10.66 (OSU shutting out Minnesota hurt the Gophers more than it helped the Buckeyes), Illinois -2.86, Northwestern -12.66, Michigan .20 (it is possible to earn positive B Points for losing, but it has to be on the road), Florida 19.84. If anyone can point out what I can do differently to get Access to calculate total B Points correctly, let me know. (My query that calculates individual game B Points has one field for the team itself, and to aid Access in association, two fields for the opponent, one of which is taken from the base list of Division I-A teams. I am willing to e-mail my Access file to anyone interested in tackling the problem. A link to my e-mail should be available from the profile link at right.)
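I can’t diagnose the Access file from here, but the symptom described, a sum that is correct with one game and skyrockets the moment a second appears, is the classic signature of an unintended one-to-many join: if the query feeding the Sum matches each B Points row against more than one row of another table (the extra opponent field drawn from the base team list is a plausible culprit), every game gets counted multiple times before aggregation. A hypothetical Python illustration of the effect:

```python
# Hypothetical illustration of the suspected failure mode, NOT the
# actual Access query: an unrestricted join duplicates rows before
# the sum, inflating it by a factor of the other table's row count.
games = [("Texas", 9.69), ("Cincinnati", 9.44)]   # (opponent, B Points)
teams = ["Texas", "Cincinnati", "Penn State"]     # base list of I-A teams

# Correct: each game's B Points counted once.
correct = sum(b for _, b in games)                # 19.13

# Broken: pairing every game with every team in the base list counts
# each game's B Points len(teams) times.
broken = sum(b for _, b in games for _ in teams)  # 57.39
```

If that is indeed the cause, the usual fix is to compute the Sum in a query over the B Points records alone and only then join the total to the team list, or to make sure both opponent fields are part of the join condition rather than just one.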
C Rating. B Points do not take into account the unbalanced college football schedule. A team in a non-BCS conference can crush a bunch of scrubs and have its B Rating artificially inflated because the scrubs win more than they deserve by playing other scrubs in conference. This reduces the RUTS-mitigating effect of B Points. C Rating is a simple modification of B Rating that takes into account conference strength.
Each conference has a conference rating, which is simply the average of its component teams’ B Ratings. Independents are considered their own individual conferences, except Army and Navy, which are considered to comprise a “military” conference. (For clarification, the other two independents, Notre Dame and Western Kentucky, are their own one-team conferences, named after themselves.)
To calculate C Rating, take the difference between a team’s B Rating and its conference’s rating. Multiply that number by n/120, where n is the number of teams in conference. (The significance of 120 is that 120 is the total number of teams in Division I-A. Thus the fraction represents the portion of Division I-A that the conference takes up.) Drag the B Rating towards the conference rating by that amount. (If the B Rating is bigger, subtract. If the conference rating is bigger, add.)
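Both cases collapse into a single formula, C = B − (n/120) × (B − conference rating). A minimal sketch, with the input structure my own invention:

```python
def c_rating(team_b, conference_b_ratings):
    # C Rating: drag a team's B Rating toward its conference's average
    # B Rating, in proportion to the conference's share of the 120
    # Division I-A teams. `conference_b_ratings` includes the team's
    # own B Rating, since the conference rating averages all members.
    n = len(conference_b_ratings)
    conf_rating = sum(conference_b_ratings) / n
    return team_b - (n / 120) * (team_b - conf_rating)
```

One consequence worth noting: a one-team conference (an independent) is its own conference average, so the difference is zero and an independent’s C Rating is simply its B Rating.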
Note that, to return to my comment on the OSU-Minnesota game above, this serves as another curb on RUTS. Run up the score too much in conference and you are liable to drag down the conference rating by punishing your opponent’s B Rating, and thus hurt your own C Rating.
All three of these algorithms have their faults. A Rating does not factor in strength of schedule at all, B Points theoretically give a non-diminishing reward for RUTS, and the C Rating algorithm only makes sense as part of a larger system. But taken together, I believe they make a rather strong rating system that aims to crown a champion by C Rating at the end of the season. Last season, warts on the B Rating system and all, it crowned Louisville, thanks to a woefully underrated Big East that had the highest conference rating. OSU had the best pure B Rating even after losing to Florida. Florida was third and Boise State fourth, separated by only about ten points in the C Ratings: 51169.57 to 51159.34.
I won’t release my ratings for the 2007 season until Week 4, the soonest any team can be linked to any other team through a connected series of games (Team A played Team B played Team C played Team D…). That cutoff is a little arbitrary for my system compared to other systems where connectivity really matters, but let’s face it: the ratings are positively meaningless after Week 1 and only slowly coalesce into place. Last year the Week 3 ratings, which occurred after the cutoff point, were almost random, and the Week 4 ratings were more sensible but still a little wild near the bottom. Ratings will be posted on the Web site when they’re ready.