Model Methodology

Learn how our Premier League prediction models work.

Machine Learning Models

We use two primary machine learning algorithms to predict Premier League match outcomes:

K-Nearest Neighbors (KNN)

A simple, interpretable algorithm that classifies each match by majority vote among the K most similar historical matches, where similarity is measured as distance between feature vectors. We choose K by cross-validation.
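The idea can be sketched in a few lines of Python. This is a minimal illustration, not the code behind our models: the Euclidean distance, the interleaved five-fold split, the candidate values of K, and the toy data (two made-up pre-match features per match) are all assumptions chosen to keep the example short.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=5):
    """Mean accuracy of k-NN across `folds` interleaved hold-out folds."""
    scores = []
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [row for i, row in enumerate(X) if i not in test_idx]
        train_y = [lab for i, lab in enumerate(y) if i not in test_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds

def choose_k(X, y, candidates=(1, 3, 5, 7), folds=5):
    """Return the candidate K with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k, folds))

# Toy data (hypothetical): (home feature, away feature) -> over/under label.
X = [(0.8, 0.6), (1.0, 0.7), (0.9, 0.9), (1.1, 0.8), (0.7, 0.8),
     (1.9, 1.5), (2.1, 1.4), (1.8, 1.7), (2.0, 1.6), (2.2, 1.3)]
y = ["under"] * 5 + ["over"] * 5

best_k = choose_k(X, y)
prediction = knn_predict(X, y, (2.0, 1.5), best_k)
```

Because KNN relies on distances, features on very different scales (for example odds versus possession percentages) would normally be standardized first. For match data, a split that respects match dates may also be preferable to the interleaved folds used here.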

XGBoost

A gradient-boosted ensemble of decision trees that captures non-linear patterns and feature interactions in match data. We tune its hyperparameters to maximize predictive accuracy.
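To show what "gradient boosting" means, here is a deliberately small Python sketch. It is not XGBoost and not our production code: it boosts one-split decision stumps on the log-loss gradient, whereas XGBoost uses deeper trees, second-order gradient information, and regularization. The function names, the toy data, and the values of `rounds` and `learning_rate` are illustrative assumptions; the last two stand in for the kind of hyperparameters that tuning would search over.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Find the one-feature threshold split with the lowest squared error on residuals.
    Assumes at least one feature takes two or more distinct values."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)

def fit_boosted(X, y, rounds=20, learning_rate=0.3):
    """Each round fits a stump to the log-loss gradient (y - p) and adds it to the score."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for s, row in zip(scores, X)]
    return learning_rate, stumps

def predict_proba(model, row):
    """Probability of the positive class (here: 'over') for one feature row."""
    learning_rate, stumps = model
    score = sum(learning_rate * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)
    return sigmoid(score)

# Toy data (hypothetical): two pre-match features, label 1 = over, 0 = under.
X = [(0.8, 0.6), (1.0, 0.7), (0.9, 0.9), (1.1, 0.8), (0.7, 0.8),
     (1.9, 1.5), (2.1, 1.4), (1.8, 1.7), (2.0, 1.6), (2.2, 1.3)]
y = [0] * 5 + [1] * 5

model = fit_boosted(X, y)
p_over = predict_proba(model, (2.0, 1.5))
```

Each round corrects the errors left by the rounds before it, which is why the number of rounds and the learning rate interact and are usually tuned together.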

Features

Our models use historical and pre-match statistics including:

  • Betting odds (home win, away win, draw)
  • Points per game (PPG) for home and away teams
  • Expected Goals (xG) pre-match estimates
  • Average goals per match
  • Cumulative team statistics (shots, possession, etc.)
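Two of these features are simple enough to sketch in Python. The helper names are hypothetical, and the example assumes decimal odds and the standard 3/1/0 league points scheme; how our pipeline actually encodes odds is described in the full analysis, not here.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds to probabilities, rescaled so they sum to 1
    (this removes the bookmaker's margin in the simplest possible way)."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    return [p / total for p in raw]

def points_per_game(results):
    """PPG from a team's earlier results, given as 'W', 'D' or 'L' strings."""
    if not results:
        return 0.0  # no matches played yet
    points = {"W": 3, "D": 1, "L": 0}
    return sum(points[r] for r in results) / len(results)

# Hypothetical fixture: odds of 2.10 / 3.40 / 3.60 and each team's last four results.
p_home, p_draw, p_away = implied_probabilities(2.10, 3.40, 3.60)
home_ppg = points_per_game(["W", "D", "L", "W"])
away_ppg = points_per_game(["L", "L", "D", "W"])
feature_row = [p_home, p_draw, p_away, home_ppg, away_ppg]
```

Only results from matches played before the fixture should feed into PPG and the cumulative statistics; otherwise the model would see information that is not available at prediction time.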

Predictions

We primarily predict:

  • Over/Under 2.5 Goals - Over if the match has 3 or more total goals, Under otherwise
  • Over/Under 3.5 Goals - Over if the match has 4 or more total goals, Under otherwise
  • Over/Under 4.5 Goals - Over if the match has 5 or more total goals, Under otherwise
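Since goal counts are whole numbers, a half-goal line can never be hit exactly, so every match falls cleanly on one side. A minimal Python sketch of how a finished match maps to the three targets (the function name and the "over"/"under" strings are illustrative assumptions):

```python
def over_under_labels(home_goals, away_goals, lines=(2.5, 3.5, 4.5)):
    """Label a finished match as 'over' or 'under' for each goal line."""
    total = home_goals + away_goals
    return {line: "over" if total > line else "under" for line in lines}

# A 2-1 result has 3 total goals: over 2.5, under 3.5 and 4.5.
labels = over_under_labels(2, 1)
```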

Model Analysis

Detailed performance metrics and visualizations for our models are available on the Model Analysis and Charts page.

Want More Details?

Check the R Markdown file in our GitHub repository for the full analysis and model development process.
