I improved my old project "ScoreCast" after 3 years

costas_81 pts1 comments

GitHub - Costasgk/ScoreCast: https://costas.pythonanywhere.com/ · GitHub



ScoreCast

A football match prediction web app powered by the Dixon-Coles Poisson model, covering 11 leagues across Europe, South America, and Asia.

Live at costas.pythonanywhere.com

What it does

ScoreCast scrapes historical match data from FBref, fits a Dixon-Coles model per league, and generates predictions for all upcoming fixtures. For each match it produces:

Win / Draw / Loss probabilities

Expected goals (xG) for each team

Full scoreline distribution (0-0 through 5-5)

Most likely score & top 3 scorelines

Both Teams to Score %

Over 1.5 / 2.5 / 3.5 goals %
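As a rough illustration of how these outputs can all be read off one scoreline grid, here is a minimal sketch. It assumes independent Poisson goal counts for each team and renormalises over the 0-0 to 5-5 grid; the function names (`scoreline_matrix`, `summarise`) are hypothetical and not taken from the repository, and the Dixon-Coles low-score correction is left out for brevity.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def scoreline_matrix(home_xg: float, away_xg: float, max_goals: int = 5):
    """Grid of P(home=h, away=a) for 0..max_goals, renormalised to sum to 1.

    Assumption: goals are independent Poisson variables (no low-score correction).
    """
    grid = [
        [poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg) for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]
    total = sum(sum(row) for row in grid)
    return [[p / total for p in row] for row in grid]


def summarise(matrix):
    """Derive the headline markets from a scoreline grid."""
    n = len(matrix)
    cells = [(h, a, matrix[h][a]) for h in range(n) for a in range(n)]
    return {
        "home_win": sum(p for h, a, p in cells if h > a),
        "draw": sum(p for h, a, p in cells if h == a),
        "away_win": sum(p for h, a, p in cells if h < a),
        "btts": sum(p for h, a, p in cells if h > 0 and a > 0),
        "over": {
            line: sum(p for h, a, p in cells if h + a > line)
            for line in (1.5, 2.5, 3.5)
        },
        # Three most probable scorelines as (home_goals, away_goals, probability).
        "top_scorelines": sorted(cells, key=lambda c: c[2], reverse=True)[:3],
    }


if __name__ == "__main__":
    summary = summarise(scoreline_matrix(home_xg=1.8, away_xg=1.0))
    print(summary)
```

Truncating at five goals per side drops a small amount of probability mass, which is why the sketch renormalises; whether ScoreCast does the same is not stated in the README.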

Leagues covered

League         | Country
-------------- | -------
Premier League | England
Serie A        | Italy
La Liga        | Spain
Ligue 1        | France
Bundesliga     | Germany
Super League   | Greece
Serie A        | Brazil
Serie B        | Brazil
Eliteserien    | Norway
Veikkausliiga  | Finland
J1 League      | Japan

Features

Best Picks: fixtures where the model has 65%+ confidence, grouped by date

Simulator: pick any two teams from any league and run a custom matchup

Rankings: model strength ranking vs actual league standings, with over/underperformer highlights

Team DNA: attack/defence ratings, win %, goals scored/conceded, last 5 form

Model Accuracy: backtested accuracy across all leagues for the last 12 months

Visitor Stats: lightweight analytics dashboard at /stats
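The Best Picks filter could be sketched roughly as follows. This is an illustration only: it assumes "confidence" means the largest of the three outcome probabilities, and the dictionary keys (`date`, `home_win`, `draw`, `away_win`) are hypothetical field names, not the app's actual schema.

```python
from collections import defaultdict

CONFIDENCE_THRESHOLD = 0.65  # the "65%+" cut-off mentioned in the feature list


def best_picks(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Keep fixtures whose most likely outcome meets the threshold, grouped by date.

    Each prediction is a dict with hypothetical keys:
    'date', 'home', 'away', 'home_win', 'draw', 'away_win'.
    """
    grouped = defaultdict(list)
    for fixture in predictions:
        outcomes = {
            "home": fixture["home_win"],
            "draw": fixture["draw"],
            "away": fixture["away_win"],
        }
        # Assumption: confidence is the probability of the single likeliest outcome.
        pick, confidence = max(outcomes.items(), key=lambda item: item[1])
        if confidence >= threshold:
            grouped[fixture["date"]].append(
                {**fixture, "pick": pick, "confidence": confidence}
            )
    # Return dates in ascending order (ISO date strings sort chronologically).
    return dict(sorted(grouped.items()))
```

Called with a list of fixture dicts, it returns a date-keyed dict containing only the fixtures that clear the threshold.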

How it works

Scrapping.py  →  Cleaning.py  →  ScorelineModel.py  →  Flask web app
FBref            clean &         Dixon-Coles           serves
scraper          validate        Poisson fit           predictions

Scraping: undetected_chromedriver bypasses Cloudflare on FBref; incremental saves per season; persistent Chrome profile so Cloudflare only needs solving once

Cleaning: normalises columns, filters bad rows, exports per-league CSVs

Modelling: fits attack/defence parameters per team via maximum likelihood (L-BFGS-B); exponential time-decay (half-life ~107 days) weights recent matches far more than old ones; Dixon-Coles low-score correction adjusts 0-0, 1-0, 0-1, 1-1 probabilities

Pipeline: pipeline.py orchestrates all three steps with smart staleness detection
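To make the modelling step more concrete, here is a minimal sketch of the two ingredients named above: the exponential time-decay weight and the Dixon-Coles low-score correction. The function names are hypothetical and the code is not taken from ScorelineModel.py; it follows the standard Dixon-Coles formulation, with `lam` and `mu` as the home and away expected goals and `rho` as the dependence parameter.

```python
import math

HALF_LIFE_DAYS = 107.0  # "half-life ~107 days" from the description above


def time_weight(age_days: float, half_life: float = HALF_LIFE_DAYS) -> float:
    """Weight of a match played age_days ago; halves every half_life days."""
    return 0.5 ** (age_days / half_life)


def dc_tau(home_goals: int, away_goals: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles adjustment factor for the four low-scoring results."""
    if home_goals == 0 and away_goals == 0:
        return 1.0 - lam * mu * rho
    if home_goals == 0 and away_goals == 1:
        return 1.0 + lam * rho
    if home_goals == 1 and away_goals == 0:
        return 1.0 + mu * rho
    if home_goals == 1 and away_goals == 1:
        return 1.0 - rho
    return 1.0  # every other scoreline is left unchanged


def poisson_logpmf(k: int, lam: float) -> float:
    """Log of the Poisson probability of k goals with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def weighted_match_loglik(home_goals, away_goals, lam, mu, rho, age_days):
    """Time-weighted log-likelihood contribution of one match."""
    log_p = (
        math.log(dc_tau(home_goals, away_goals, lam, mu, rho))
        + poisson_logpmf(home_goals, lam)
        + poisson_logpmf(away_goals, mu)
    )
    return time_weight(age_days) * log_p
```

A fit would sum `weighted_match_loglik` over all matches in a league and hand the negated total to an optimiser such as L-BFGS-B, with `lam` and `mu` built from each team's attack and defence parameters. With a 107-day half-life, a match from about seven months ago (214 days) counts a quarter as much as one played today.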

Running locally

# 1. Clone and set up environment
git clone https://github.com/Costasgk/ScoreCast.git
cd ScoreCast
python -m venv env
env\Scripts\activate # Windows
pip install -r requirements.txt

# 2. Run the pipeline (scrape → clean → predict)
cd Scripts
python pipeline.py

# 3. Start the web app
cd WebApp
python app.py

The app runs at http://127.0.0.1:5000.

Note: First-time scraping takes several hours (FBref rate limits). Subsequent runs are incremental and much faster.

Project structure

Scripts/
├── Scrapping.py        # FBref scraper
├── Cleaning.py         # data cleaning
├── ScorelineModel.py   # Dixon-Coles model
├── pipeline.py         # orchestrates scrape → clean → predict
└── WebApp/
    ├── app.py          # Flask app
    ├── templates/      # HTML templates
    └── static/         # CSS, fonts, favicon

Datasets/
├── Scrapped Datasets/  # raw FBref output
├── Cleaned Datasets/   # cleaned per-league CSVs
├── Predictions/        # upcoming fixture predictions
└── Models/             # fitted model parameters (JSON)

Deployment

Hosted on PythonAnywhere (free tier). To redeploy after regenerating predictions:

Rebuild the deployment zip locally

Upload to PythonAnywhere Files tab

Unzip and hit Reload on the Web tab

License

MIT: free to use, modify, and...
