Talacote.com

Methodology & scientific sources

This page documents how the Talacote tools work under the hood: the three statistical models powering predictions, the academic papers that ground them, the data sources we consume, and the limitations we openly acknowledge. Everything here is verifiable. No black box.

Three public statistical models

Talacote does not deal in gut-feel tipping. Every win/draw/loss probability is computed from three statistical models published in peer-reviewed journals. Below is the role of each model and the exact academic reference you can download to verify it yourself.

1. Bivariate Poisson model

The Poisson distribution models the number of goals each team scores, with an expected goal count derived from the teams' seasonal attack and defense averages. For a given match, the model computes the probability of every possible scoreline (0-0, 1-0, 2-1, …), then sums those probabilities per outcome (1 = home win, X = draw, 2 = away win). It is best suited to sports with low, discrete scores, typically football.
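The scoreline-then-outcome computation described above can be sketched in a few lines. This is a minimal Python illustration, not Talacote's PHP code: the expected-goal values `lam_home` and `lam_away`, the `max_goals` truncation and all function names are assumptions made for the example, and goals are treated as independent between the two teams.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def score_matrix(lam_home: float, lam_away: float, max_goals: int = 10):
    """matrix[i][j] = P(home scores i, away scores j), goals assumed independent."""
    home = [poisson_pmf(i, lam_home) for i in range(max_goals + 1)]
    away = [poisson_pmf(j, lam_away) for j in range(max_goals + 1)]
    return [[h * a for a in away] for h in home]

def outcome_probs(matrix):
    """Sum scoreline probabilities into (home win, draw, away win)."""
    home_win = draw = away_win = 0.0
    for i, row in enumerate(matrix):
        for j, p in enumerate(row):
            if i > j:
                home_win += p
            elif i == j:
                draw += p
            else:
                away_win += p
    # Renormalize because the grid is truncated at max_goals.
    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total

# Hypothetical expected goals for one match (not real Talacote parameters).
matrix = score_matrix(lam_home=1.6, lam_away=1.1)
p1, px, p2 = outcome_probs(matrix)
print(f"1: {p1:.3f}  X: {px:.3f}  2: {p2:.3f}")
```

Truncating the grid at 10 goals per side discards a negligible tail for football-sized means, which is why the sketch renormalizes rather than extending the grid.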

References:

  • Maher, M.J. (1982). Modelling Association Football Scores. Statistica Neerlandica, 36(3), 109-118. DOI
  • Karlis, D., & Ntzoufras, I. (2000). On modelling soccer data. Student, 3(4), 229-244. Author page

2. Adjusted ELO rating system

The ELO system (designed by Arpad Elo to rank chess players and adopted by the US Chess Federation in 1960) assigns a strength rating to each team, updated after every match based on the result and the opponent's strength. Talacote uses a football-calibrated variant whose K-factor (the update step size, comparable to a learning rate) is tuned empirically for football. Hvattum & Arntzen (2010) evaluated this adaptation against six benchmark prediction methods.
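As an illustration of the update rule, here is a minimal Python sketch of the standard ELO formulas (logistic expected score on a 400-point scale, then a K-weighted correction). The values `k=20.0` and `home_advantage=0.0` are placeholders chosen for the example, not Talacote's tuned parameters, and the function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B on the usual 400-point logistic scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(home: float, away: float, result: float,
               k: float = 20.0, home_advantage: float = 0.0):
    """Return the new (home, away) ratings after one match.

    result is the home side's score: 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k and home_advantage are illustrative values only.
    """
    exp_home = expected_score(home + home_advantage, away)
    delta = k * (result - exp_home)
    # Zero-sum update: what one team gains, the other loses.
    return home + delta, away - delta

# Two equally rated teams; the home side wins.
new_home, new_away = update_elo(1500.0, 1500.0, result=1.0)
print(new_home, new_away)
```

A larger K makes ratings react faster to recent results but also makes them noisier, which is the trade-off an empirical tuning has to settle.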

References:

  • Hvattum, L.M., & Arntzen, H. (2010). Using ELO ratings for match result prediction in association football. International Journal of Forecasting, 26(3), 460-470. DOI
  • Elo, A.E. (1978). The Rating of Chessplayers, Past and Present. Arco Pub. (the system's founding text, applicable well beyond chess).

3. Dixon-Coles correction

A Poisson model that treats both teams' goals as independent misprices the four lowest scorelines (0-0, 1-0, 0-1, 1-1): independence does not hold at low scores (teams tighten their defense), so 0-0 and 1-1 draws are typically under-estimated while 1-0 and 0-1 are over-estimated. The Dixon-Coles correction adds a correction factor τ, governed by a dependence parameter ρ estimated from empirical data, that re-balances those four scorelines and leaves all others unchanged. Without this correction, exact-score bets on low scores would be systematically mispriced.
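The τ term can be written down directly from the Dixon & Coles (1997) paper. The Python sketch below applies it to independent Poisson probabilities; `lam` and `mu` are the home and away expected goals, and the value of the dependence parameter `rho` (here -0.1) is an assumption for illustration, not a fitted Talacote parameter.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles factor: differs from 1 only for 0-0, 0-1, 1-0 and 1-1."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0

def dixon_coles_prob(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """P(home scores x, away scores y) with the low-score correction applied."""
    return tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)

# Hypothetical parameters; rho = 0 recovers the independent model.
lam, mu, rho = 1.6, 1.1, -0.1
for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    raw = dixon_coles_prob(x, y, lam, mu, 0.0)
    adj = dixon_coles_prob(x, y, lam, mu, rho)
    print(f"{x}-{y}: independent {raw:.4f} -> corrected {adj:.4f}")
```

With a negative `rho`, the factor raises 0-0 and 1-1 and lowers 1-0 and 0-1. The four adjustments cancel out, so the corrected grid still sums to the same total.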

References:

  • Dixon, M.J., & Coles, S.G. (1997). Modelling Association Football Scores and Inefficiencies in the Football Betting Market. Journal of the Royal Statistical Society. Series C (Applied Statistics), 46(2), 265-280. DOI · JSTOR

Data sources

No data is fabricated or scraped illegally. Talacote consumes only official public sources or licensed ones. The complete list, with usage terms, follows.

  • football-data.org — Schedules, results, standings, team statistics for 12 major competitions (Premier League, La Liga, Bundesliga, Serie A, Ligue 1, etc.).
    football-data.org · License: CC BY 4.0
  • the-odds-api.com — Real-time bookmaker odds for multi-operator comparison and Value Bet detection. Commercial plan subscribed, no scraping.
    the-odds-api.com · License: Commercial API (paid access key)
  • IP geolocation — ipapi.co (free, anonymized) to adapt Premium pricing to the visitor's currency zone. No IP is stored — the request runs in memory and the result is cached for 24h.
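The page does not spell out how Value Bet detection works. A common definition, assumed here, is that a bet has value when the model probability multiplied by the decimal odds exceeds 1. The Python sketch below uses that definition with a hypothetical dictionary of operator odds; it does not call any real odds API, and the operator names and threshold `min_edge` are invented for the example.

```python
def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, ignoring the bookmaker margin."""
    return 1.0 / decimal_odds

def expected_value(model_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked if model_prob is the true probability."""
    return model_prob * decimal_odds - 1.0

def is_value_bet(model_prob: float, decimal_odds: float, min_edge: float = 0.0) -> bool:
    """True when the expected value exceeds the chosen threshold."""
    return expected_value(model_prob, decimal_odds) > min_edge

def best_odds(odds_by_operator: dict) -> tuple:
    """Return the (operator, odds) pair with the highest price."""
    return max(odds_by_operator.items(), key=lambda item: item[1])

# Hypothetical odds for a home win from three made-up operators.
odds = {"operator_a": 1.90, "operator_b": 2.05, "operator_c": 1.95}
operator, price = best_odds(odds)
model_prob = 0.52  # illustrative model output
print(operator, price, is_value_bet(model_prob, price))
```

Comparing several operators matters because the same model probability can be a value bet at one price and not at another.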

Tech stack

No opaque ML framework. No model trained on a proprietary dataset. Calculations run server-side in PHP (Poisson, ELO and Dixon-Coles are closed-form formulas, not machine-learning models), and the result is rendered client-side in vanilla JavaScript.

  • WordPress + custom child theme of Astra (~80 PHP files)
  • PHP 8.x for the statistical models, the REST API and server-side rendering
  • Vanilla JavaScript on the client side (no React, no Vue, no global jQuery)
  • Stripe Payment Links for Premium payments (PCI-DSS Level 1, no banking data stored on our side)
  • MySQL via the WordPress `$wpdb` object (users, cached predictions, sign-up log)
  • Multilingual JSON for i18n (17 languages, fallback chain language → FR → EN → raw key)
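The i18n fallback chain in the last item (requested language, then FR, then EN, then the raw key) can be sketched as a simple ordered lookup. This is a Python illustration of the idea only: the site itself runs on PHP, and the `catalogs` structure, keys and function name below are hypothetical.

```python
def translate(key: str, lang: str, catalogs: dict) -> str:
    """Look up key in lang, then French, then English; fall back to the raw key."""
    for code in (lang, "fr", "en"):
        value = catalogs.get(code, {}).get(key)
        if value is not None:
            return value
    return key

# Hypothetical catalogs, as they might look once loaded from per-language JSON files.
catalogs = {
    "es": {"nav.home": "Inicio"},
    "fr": {"nav.home": "Accueil", "nav.tools": "Outils"},
    "en": {"nav.home": "Home", "nav.tools": "Tools", "nav.about": "About"},
}

print(translate("nav.home", "es", catalogs))     # found in the requested language
print(translate("nav.tools", "es", catalogs))    # falls back to French
print(translate("nav.about", "es", catalogs))    # falls back to English
print(translate("nav.missing", "es", catalogs))  # no translation: raw key
```

Returning the raw key as the last resort keeps a missing translation visible on the page instead of failing silently.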

Acknowledged limitations

Intellectual honesty: a statistical model is not an oracle. Below is what the Talacote tools CANNOT do, and what you should keep in mind before basing a stake on their output.

  1. Last-minute injuries and events. The models are computed from aggregated results. An injury announced 30 minutes before kickoff won't be reflected until the seasonal data is updated.
  2. Motivational context. A qualified team fielding their reserves at the end of the season, a derby where motivation transcends the statistical balance: our models do not capture these signals.
  3. Variance and sample size. An estimated probability of 60% means that across 100 comparable matches, around 60 would end in a win; it does not mean that THIS specific match will. Over a long series of bets, streaks of 5-10 consecutive losses are statistically expected, especially on selections with lower win probabilities.
  4. Calibration vs overconfidence. Model reliability depends on the quality and volume of the underlying data. On major competitions (Big-5 European leagues), the models are well-calibrated. On minor or exotic leagues, data is sparser and reliability inevitably decreases.
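To put a number on the variance point in the list above, the probability of meeting at least one losing streak of a given length can be computed exactly. The Python sketch below assumes independent bets that all share the same win probability, which is a simplification; the function name and the example inputs are illustrative.

```python
def prob_losing_streak(n_bets: int, win_prob: float, streak_len: int) -> float:
    """Probability of at least one run of streak_len consecutive losses in n_bets.

    Assumes independent bets with an identical win probability (a simplification).
    """
    loss_prob = 1.0 - win_prob
    # state[r] = probability of being on a run of r losses without having hit the streak yet
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0
    for _ in range(n_bets):
        new_state = [0.0] * streak_len
        for run, p in enumerate(state):
            new_state[0] += p * win_prob          # a win resets the run
            if run + 1 == streak_len:
                hit += p * loss_prob              # the streak is completed
            else:
                new_state[run + 1] += p * loss_prob
        state = new_state
    return hit

# Illustrative inputs: 200 bets, streak of 5 losses, two different win probabilities.
for win_prob in (0.60, 0.40):
    p = prob_losing_streak(n_bets=200, win_prob=win_prob, streak_len=5)
    print(f"win probability {win_prob:.2f}: P(streak of 5 losses) = {p:.3f}")
```

The result rises quickly as the win probability falls or the number of bets grows, which is why a losing run on its own says little about whether the underlying estimates are wrong.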

Talacote is a decision-support tool, not an automated betting system or a profit guarantee. Sports betting carries real financial risk. Bet responsibly.

Why this is auditable

The three models above are documented in publications dating from 1978 to 2010, listed in the references of each section. Anyone with an undergraduate-level statistics background can download the papers, reconstruct the formulas from the mathematical appendices, and obtain the same result as Talacote on an identical dataset. That is the definition of a reproducible method. No secret sauce, no hidden parameter.

If you are a researcher, journalist or student wanting to dig deeper (re-implementation, comparison with another model, reproducibility audit), email contact@talacote.com — we will gladly share implementation details, validation sets used, and limitations observed in production.
