College Football Rankings

Author

Affiliation

Coby Darer

Loading the Necessary Packages and Data

library(ggplot2)
library(readxl)
library(dplyr)
library(knitr)
college_games <- read_excel("College_Football_2025_Season_Games.xlsx")

Introduction

This project will create a ranking system for college football and rank teams for the 2025 season, using data through week ten. It will also predict the winner of some week eleven games and compare it to the actual outcome.

Ranking System

For my ranking system I created variables that I thought are important and assigned a point value to each one. Then ranked the teams based on who had the most points.

If you win the game:

Win: 1 point
MOV: MOV / 30 * 0.4, 0.4 points max
Opponent’s current record: opponent win% * 0.6, 0.6 points max
Opponent’s rank: (26 - opponent rank) / 25 * 0.85, 0.85 points max

If you lose

Team that beats you rank: (26 - winner rank) / 25 * 1.2, 1.2 points max

Margin of Victory and Recent Games

I considered margin of victory in my rankings system because being able to beat a team by a large margin demonstrates that a team is better than if the majority of games they play are close. For my ranking system, I assigned teams extra points depending on their margin of victory. I capped the points assigned for margin of victory at 30. Winning by 30 and by 50 is the same amount of extra points, for example. Also I did not consider margin of victory for FCS games.

I gave equal weight to all games in my ranking system regardless of when the game was played.

Rankings vs AP top 25 (Week 11)

Code

# Find each teams overall record
home_record <- college_games %>%
  group_by(Home_Team) %>%
  mutate(wins = ifelse(Home_Team_Win == 1, 1, 0),
         losses = ifelse(Home_Team_Win == 1, 0, 1)) %>%
  summarize(wins = sum(wins),
            losses = sum(losses)) %>%
  rename(Team = Home_Team)

away_record <- college_games %>%
  group_by(Away_Team) %>%
  mutate(wins = ifelse(Home_Team_Win == 1, 0, 1),
         losses = ifelse(Home_Team_Win == 1, 1, 0)) %>%
  summarize(wins = sum(wins),
            losses = sum(losses)) %>%
  rename(Team = Away_Team)

team_record <- bind_rows(home_record, away_record) %>%
  group_by(Team) %>%
  summarize(Wins = sum(wins),
            Losses = sum(losses),
            win_pct = Wins / (Wins + Losses))


# Create variables for winner, loser, and the losing team win percentage
college_games <- college_games %>%
  mutate(winner = ifelse(Home_Team_Win == 1, Home_Team, Away_Team),
         loser = ifelse(Home_Team_Win == 1, Away_Team, Home_Team)) %>%
  left_join(team_record %>% select(Team, win_pct), by = c("loser" = "Team")) %>%
  rename(opp_win_pct = win_pct)
  
# create and use the variables for a win, MOV, the losing teams current record, and the losing teams rank (the points a team gets for winning)
win_points <- college_games %>%
  mutate(MOV = ifelse(Away_Team_Conference != "FCS", 
                ifelse(Home_Team_Margin_of_Victory <= 30, abs(Home_Team_Margin_of_Victory), 30), 0),
         win_points = 1 + ((MOV / 30) * 0.4),
         opponent_rank = as.numeric(ifelse(Home_Team_Win == 1, Away_Team_Rank, Home_Team_Rank)),
         opp_record_points = round(opp_win_pct * 0.6, 2),
         rank_points = ifelse(!is.na(opponent_rank), (26 - opponent_rank) / 25 * 0.85, 0),
         total_win_points = win_points + rank_points + opp_record_points) %>%
  group_by(winner) %>%
  summarize(overall_win_points = sum(total_win_points)) %>%
  arrange(desc(overall_win_points)) %>%
  mutate(Team = winner)
  
# Give points if a team lost to a ranked team
loss_points <- college_games %>%
  mutate(winner_rank = as.numeric(ifelse(Home_Team_Win == 1, Home_Team_Rank, Away_Team_Rank)),
         rank_loss_points = ifelse(!is.na(winner_rank), (26 - winner_rank) / 25 * 1.2, 0)) %>%
  group_by(loser) %>%
  summarize(overall_loss_points = sum(rank_loss_points)) %>%
  arrange(desc(overall_loss_points)) %>%
  mutate(Team = loser)

# combine and sort the overall points each team got
rankings <- left_join(win_points, loss_points, by = "Team") %>%
  mutate(overall_loss_points = ifelse(is.na(overall_loss_points), 0, overall_loss_points),
         overall_points = overall_win_points + overall_loss_points) %>%
  arrange(desc(overall_points)) %>%
  mutate(Rank = row_number()) %>%
  select(Rank, Team, overall_points)
  
colnames(rankings) <- c("Rank", "Team", "Points in my System")

rankings %>%
  head(n=25) %>%
  kable()

Rank	Team	Points in my System
1	Indiana	15.80667
2	Ohio State	13.74600
3	Ole Miss	13.59600
4	Texas	12.79867
5	Texas Tech	12.78000
6	Texas A&M	12.70267
7	Alabama	12.69533
8	Utah	12.62267
9	Vanderbilt	12.57733
10	Oklahoma	12.16533
11	Virginia	11.94733
12	Oregon	11.86067
13	Georgia	11.85333
14	Memphis	11.76533
15	North Texas	11.70600
16	Georgia Tech	11.46600
17	BYU	11.39533
18	Notre Dame	10.88200
19	Louisville	10.79533
20	Michigan	10.73733
21	Illinois	10.54600
22	Tennessee	10.48400
23	South Florida	10.44400
24	USC	10.43533
25	Houston	10.27800

Actual Ap Rankings

Code

include_graphics("top_rank.jpeg")

Notable Results

Underrated: North Texas was ranked 30 in the AP poll and jumped to 15 in my rankings
Overrated: Miami was ranked 18 in the AP poll and dropped to 28 in my rankings

When comparing my rankings to the AP poll, Oregon, Georgia, BYU, and Notre Dame dropped a noticeable amount of spots but all stayed in the top 25

Prediction for Week 11 Games Based on my Rankings

I chose seventeen games with teams ranked in the AP top 25 poll or ranked in the top 25 of my poll. I predicted the winner of each game based on which team ranked higher in my system. My ranking system accurately predicted the winner 76.47% of the time.

Code

library(kableExtra)

home_team <- c("Purdue", "Penn State", "Missouri", "Alabama", "Miss State", "Ole Miss", "Texas Tech", "Notre Dame", "Virginia", "Louisville", "Vanderbilt", "Miami", "USC", "Memphis", "Wisconsin", "Iowa", "South Florida")
away_team <- c("Ohio State", "Indiana", "Texas A&M", "LSU", "Georgia", "The Citadel", "BYU", "Navy", "Wake Forest", "California", "Auburn", "Syracuse", "Northwestern", "Tulane", "Washington", "Oregon", "UTSA")
predicted_winner <- c("Ohio State", "Indiana", "Texas A&M", "Alabama", "Georgia", "Ole Miss", "Texas Tech", "Notre Dame", "Virginia", "Louisville", "Vanderbilt", "Miami", "USC", "Memphis", "Washington", "Oregon", "South Florida")
actual_outcome <- c("Ohio State", "Indiana", "Texas A&M", "Alabama", "Georgia", "Ole Miss", "Texas Tech", "Notre Dame", "Wake Forest", "California", "Vanderbilt", "Miami", "USC", "Tulane", "Wisconsin", "Oregon", "South Florida")
df <- data.frame(home_team, away_team, predicted_winner, actual_outcome)

colnames(df) <- c("Home Team", "Away Team", "Predicted Winner", "Actual Winner")

colors <- c(
  "lightgreen", "lightgreen", "lightgreen", "lightgreen",
  "lightgreen", "lightgreen", "lightgreen", "lightgreen",
  "salmon", "salmon", "lightgreen", "lightgreen",
  "lightgreen", "salmon", "salmon", "lightgreen", "lightgreen"
)

df %>%
  kable() %>%
  kable_paper("striped") %>%
  column_spec(4, background = colors)

Home Team	Away Team	Predicted Winner	Actual Winner
Purdue	Ohio State	Ohio State	Ohio State
Penn State	Indiana	Indiana	Indiana
Missouri	Texas A&M	Texas A&M	Texas A&M
Alabama	LSU	Alabama	Alabama
Miss State	Georgia	Georgia	Georgia
Ole Miss	The Citadel	Ole Miss	Ole Miss
Texas Tech	BYU	Texas Tech	Texas Tech
Notre Dame	Navy	Notre Dame	Notre Dame
Virginia	Wake Forest	Virginia	Wake Forest
Louisville	California	Louisville	California
Vanderbilt	Auburn	Vanderbilt	Vanderbilt
Miami	Syracuse	Miami	Miami
USC	Northwestern	USC	USC
Memphis	Tulane	Memphis	Tulane
Wisconsin	Washington	Washington	Wisconsin
Iowa	Oregon	Oregon	Oregon
South Florida	UTSA	South Florida	South Florida