Model Methodology
Learn how our Premier League prediction models work.
Machine Learning Models
We use two primary machine learning algorithms to predict Premier League match outcomes:
K-Nearest Neighbors (KNN)
A simple, interpretable algorithm that classifies each match by the outcomes of the K most similar historical matches. We choose K through cross-validation.
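To make the idea concrete, here is a minimal Python sketch of KNN classification with K chosen by cross-validation. It is an illustration only, not our production code (the full analysis is in R Markdown). The function names, the candidate values of K, the fold scheme, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=5):
    """Mean accuracy of KNN over `folds` interleaved folds (row i goes to fold i % folds)."""
    scores = []
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / folds

def best_k(X, y, candidates=(1, 3, 5, 7), folds=5):
    """Return the candidate K with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k, folds))

# Toy data (hypothetical): two well-separated groups of "matches" with one feature pair each.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 0.0] for i in range(10)]
y = [0] * 10 + [1] * 10
print(best_k(X, y))
```

Because KNN relies on distances, features measured on different scales (odds versus shot counts, for example) would normally be standardized before being passed to a routine like this.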
XGBoost
A gradient boosting ensemble method that captures complex patterns in match data. We use hyperparameter tuning to maximize predictive accuracy.
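The core idea of gradient boosting can be sketched in a few lines of Python. This is a toy version with squared-error loss and one-split trees ("stumps"), not XGBoost itself, which adds regularization, second-order gradient information, and deeper trees. All names and the toy data below are assumptions for illustration. `rounds` and `learning_rate` are examples of the kind of hyperparameters that tuning would search over.

```python
def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error on the residuals."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean, mean)  # fallback: no valid split
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]

def fit_boosting(X, y, rounds=20, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped copy to the ensemble."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate

def predict(model, row):
    """Sum the base value and every stump's damped contribution for one row."""
    base, stumps, lr = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)

# Toy data (hypothetical): one feature, label 1 when the feature is large.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
model = fit_boosting(X, y)
print(predict(model, [1.0]), predict(model, [4.0]))
```

Each stump corrects part of the error left by the stumps before it, which is how the ensemble comes to capture patterns that no single simple model could.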
Features
Our models use historical and pre-match statistics including:
- Betting odds (home win, away win, draw)
- Points per game (PPG) for home and away teams
- Expected Goals (xG) pre-match estimates
- Average goals per match
- Cumulative team statistics (shots, possession, etc.)
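As an illustration of how two of these features could be derived, the Python sketch below converts bookmaker odds into probabilities and computes points per game from past results. It assumes decimal odds and a simple proportional normalization to remove the bookmaker margin. The function names and input formats are hypothetical, and our actual feature pipeline may differ.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds to probabilities that sum to 1.

    The raw inverses sum to more than 1 because of the bookmaker margin,
    so each is divided by their total (an assumed, proportional adjustment).
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    return [r / total for r in raw]

def points_per_game(results):
    """PPG from a list of 'W', 'D', 'L' results played before the fixture."""
    if not results:
        return 0.0
    points = {"W": 3, "D": 1, "L": 0}
    return sum(points[r] for r in results) / len(results)

# Hypothetical fixture: odds of 2.10 / 3.40 / 3.60 and a home side with W, D, L, W so far.
probs = implied_probabilities(2.10, 3.40, 3.60)
ppg = points_per_game(["W", "D", "L", "W"])
print(probs, ppg)
```

Using only results from before the fixture keeps the feature genuinely pre-match, so no information from the match being predicted leaks into the model.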
Predictions
We primarily predict:
- Over/Under 2.5 Goals - Whether total goals in a match exceed 2.5 (3 or more)
- Over/Under 3.5 Goals - Whether total goals in a match exceed 3.5 (4 or more)
- Over/Under 4.5 Goals - Whether total goals in a match exceed 4.5 (5 or more)
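For clarity, the target labels for these three markets can be derived from a final score as in the short Python sketch below. The function name and return format are assumptions for illustration.

```python
def over_under_labels(home_goals, away_goals, lines=(2.5, 3.5, 4.5)):
    """Return {line: True if total goals exceed the line} for one match."""
    total = home_goals + away_goals
    return {line: total > line for line in lines}

# A 2-1 result has 3 total goals: over 2.5, but under 3.5 and 4.5.
labels = over_under_labels(2, 1)
print(labels)  # {2.5: True, 3.5: False, 4.5: False}
```

Since goal totals are whole numbers, a half-goal line can never be hit exactly, so every match lands cleanly on one side.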
Model Analysis
View detailed performance metrics and visualizations for our models:
View Model Analysis and Charts →
Want More Details?
Check the R Markdown file in our GitHub repository for the full analysis and model development process.
View Current Predictions →