Import Pipeline
The import pipeline uses a 5-tier scheduling system, orchestrated automatically by Celery Beat. Each tier serves a different freshness requirement, from monthly full imports down to live deltas every 10 seconds.
Architecture Overview
The 5-tier system ensures that each data type is refreshed at an appropriate frequency, without unnecessary API calls.
| Tier | Command | Schedule | Purpose |
|---|---|---|---|
| 1 | sync_all --full | 1st Sunday of the month, 02:00 UTC | Full import of all data |
| 2 | sync_all --daily | Daily, 04:00 UTC | Current-season refresh |
| 3a | sync_all --hot | Every 3 h during the day (06–21 UTC) | Hot window ±48 h |
| 3b | import_fixture_details | Every 15 min (10–23 UTC) | Details for recently finished matches |
| 4 | import_fixtures_latest | Every 10 seconds | Live-score delta sync |
Data Flow
Each tier processes a different slice of the data, from complete reference data down to live deltas.
Tier 1 (monthly): ALL reference data → ALL fixtures → ALL details → ALL standings → ALL odds → ALL features
Tier 2 (daily): Leagues → Structure → Teams → Fixtures → Players → Coaches → Details → Standings → Odds → Features
Tier 3a (3h): Fixtures ±48h → Details (last 48h) → Standings → Odds → Features
Tier 3b (15min): Fixture details (kickoff < 4h ago)
Tier 4 (10s): fixtures/latest → delta upsert (tracked leagues only)
FK Dependency Order
All commands must respect this order. sync_all handles this automatically.
| # | Command | Entities | Dependencies |
|---|---|---|---|
| 1 | import_core | Countries, Types, States, Cities | None |
| 2 | import_leagues | Leagues, Seasons | Countries |
| 3 | import_season_structure | Stages, Rounds | Seasons, Leagues, Types |
| 4 | import_teams | Teams, Venues | Seasons, Countries, Cities |
| 5 | import_fixtures | Fixtures | Seasons, Rounds, Teams, Venues |
| 6 | import_players | Players, TeamSquads | Seasons, Teams |
| 7 | import_coaches | Coaches | Countries |
| 8 | import_referees | Referees | Countries |
| 9 | import_fixture_details | Stats, Events, Lineups, Coaches, Sidelined | Fixtures, Players, Teams |
| 10 | import_standings | Standings | Seasons, Teams, Rounds |
| 11 | import_topscorers | TopScorers | Seasons, Players, Teams |
| 12 | import_odds_reference | Markets, Bookmakers | None |
| 13 | import_odds | OddValues | Fixtures, Markets, Bookmakers |
| 14 | import_transfers | Transfers | Players, Teams |
| 15 | compute_features | FixtureFeatures, EloRatings | Fixtures, Odds, Standings |
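The ordering rule in the table above can be checked mechanically. The following is a minimal sketch, not part of the pipeline itself: the `DEPENDS_ON` mapping is derived by hand from the table (each dependency entity is mapped to the command that imports it), and `order_violations` is a hypothetical helper name.

```python
# Prerequisite commands per command, derived by hand from the dependency table.
# Assumption: "Odds" in the compute_features row refers to import_odds.
DEPENDS_ON = {
    "import_core": [],
    "import_leagues": ["import_core"],
    "import_season_structure": ["import_leagues", "import_core"],
    "import_teams": ["import_leagues", "import_core"],
    "import_fixtures": ["import_leagues", "import_season_structure", "import_teams"],
    "import_players": ["import_leagues", "import_teams"],
    "import_coaches": ["import_core"],
    "import_referees": ["import_core"],
    "import_fixture_details": ["import_fixtures", "import_players", "import_teams"],
    "import_standings": ["import_leagues", "import_teams", "import_season_structure"],
    "import_topscorers": ["import_leagues", "import_players", "import_teams"],
    "import_odds_reference": [],
    "import_odds": ["import_fixtures", "import_odds_reference"],
    "import_transfers": ["import_players", "import_teams"],
    "compute_features": ["import_fixtures", "import_odds", "import_standings"],
}

# Dicts preserve insertion order, so this is the order the table lists.
DOCUMENTED_ORDER = list(DEPENDS_ON)


def order_violations(order, depends_on=DEPENDS_ON):
    """Return (command, prerequisite) pairs where the prerequisite does not run earlier."""
    position = {command: index for index, command in enumerate(order)}
    violations = []
    for command in order:
        for prerequisite in depends_on[command]:
            # A missing prerequisite is treated as "runs too late".
            if position.get(prerequisite, len(order)) >= position[command]:
                violations.append((command, prerequisite))
    return violations
```

Calling `order_violations(DOCUMENTED_ORDER)` returns an empty list, which confirms that the documented sequence satisfies the hand-derived dependencies. A check like this is mainly useful when running individual commands manually instead of through sync_all.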
sync_all --full
Full import of all data. Runs on the first Sunday of each month at 02:00 UTC.
| Phase | Command | Scope |
|---|---|---|
| 1 | import_core | All lookup data |
| 1 | import_leagues | All leagues + seasons |
| 1 | import_season_structure | All seasons (stages + rounds) |
| 1 | import_teams | All seasons (teams + venues) |
| 1 | import_fixtures | All tracked seasons |
| 1 | import_players | All seasons (players + squads) |
| 1 | import_coaches | All coaches |
| 1 | import_referees | All referees |
| 2 | import_fixture_details | All finished fixtures without details (by date) |
| 3 | import_standings --all-tracked | All seasons of tracked leagues |
| 3 | import_topscorers --all-tracked | All seasons of tracked leagues |
| 4 | import_odds_reference | Markets + bookmakers |
| 4 | import_odds --backfill | All fixtures with odds |
| 4 | import_transfers | All transfers |
| 5 | compute_features --backfill | Odds Features |
| 6 | compute_features --elo-backfill | Elo Ratings |
| 7 | compute_features --form-backfill | Form (PPG) Features |
| 7 | compute_features --goals-backfill | Average Goals Features |
| 7 | compute_features --rates-backfill | CS/FTS/BTTS Rates |
| 7 | compute_features --context-backfill | Rest Days + Season Progress |
sync_all --daily
Current-season sync. Runs daily at 04:00 UTC.
| Phase | Command | Scope |
|---|---|---|
| 1 | import_leagues | Refresh is_current/finished flags |
| 2 | import_season_structure --current-only | New rounds for current tracked seasons |
| 3 | import_teams --current-only | New/updated teams for current tracked seasons |
| 4 | import_fixtures --current-only | All fixtures for current tracked seasons |
| 5 | import_players --current-only | New signings for current tracked seasons |
| 6 | import_coaches | All coaches (global, cheap) |
| 6 | import_referees | All referees (global, cheap) |
| 7 | import_fixture_details | Finished fixtures without details (current seasons) |
| 8 | import_standings | Current tracked seasons |
| 8 | import_topscorers | Current tracked seasons |
| 9 | import_odds --days-ahead 3 | Odds for the next 3 days |
| 10 | compute_features | Incremental ML features |
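To illustrate how a phased plan like the one above might be executed, here is a minimal sketch. It is not the actual sync_all implementation: `run_plan` and the `run_command` callable are hypothetical, and stopping at the first failed phase is an assumption about the desired behaviour.

```python
from itertools import groupby

# (phase, command line) pairs copied from the --daily table, already sorted by phase.
DAILY_PLAN = [
    (1, "import_leagues"),
    (2, "import_season_structure --current-only"),
    (3, "import_teams --current-only"),
    (4, "import_fixtures --current-only"),
    (5, "import_players --current-only"),
    (6, "import_coaches"),
    (6, "import_referees"),
    (7, "import_fixture_details"),
    (8, "import_standings"),
    (8, "import_topscorers"),
    (9, "import_odds --days-ahead 3"),
    (10, "compute_features"),
]


def run_plan(plan, run_command):
    """Run commands phase by phase and return the phases that completed.

    `run_command` is a hypothetical callable that receives the command line
    and returns True on success. All commands of a phase are attempted;
    later phases are skipped once any command in a phase has failed
    (assumption: later phases depend on earlier ones).
    """
    completed = []
    for phase, steps in groupby(plan, key=lambda step: step[0]):
        results = [run_command(command) for _, command in steps]
        if not all(results):
            break
        completed.append(phase)
    return completed
```

In a Django project, `run_command` could wrap `django.core.management.call_command`. Because every command is described as idempotent, rerunning the whole plan after a failure should be safe.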
sync_all --hot
Hot-window sync for the ±48h window. Runs every 3 hours during the day (06, 09, 12, 15, 18, 21 UTC).
| Phase | Command | Scope |
|---|---|---|
| 1 | import_fixtures | ±48h date range |
| 2 | import_fixture_details | Finished fixtures from the last 48h |
| 3 | import_standings | Current tracked seasons |
| 4 | import_odds --days-ahead 3 | Odds for the next 3 days |
| 5 | compute_features | Incremental ML features |
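The ±48h date range used in phase 1 can be derived from the current UTC time. The sketch below assumes that the `--date-from`/`--date-to` options of import_fixtures take ISO dates (YYYY-MM-DD) and that the window is rounded to whole days. Neither detail is confirmed here, and `hot_window_args` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(hours=48)


def hot_window_args(now):
    """Build --date-from/--date-to arguments for the ±48h window around `now`."""
    start = (now - HOT_WINDOW).date()
    end = (now + HOT_WINDOW).date()
    # Assumption: both flags accept ISO-formatted dates.
    return ["--date-from", start.isoformat(), "--date-to", end.isoformat()]


if __name__ == "__main__":
    # Example: the window around the current UTC time.
    print(hot_window_args(datetime.now(timezone.utc)))
```

For a run at 09:00 UTC on 2024-03-01 this yields the range 2024-02-28 to 2024-03-03.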
Automatic Scheduling
Celery Beat drives all automatic imports. The schedule is defined in config/settings/base.py.
| Task | Schedule | Celery Task |
|---|---|---|
| Full Monthly Sync | 1st Sunday of the month, 02:00 UTC | sync_monthly |
| Daily Sync | Daily, 04:00 UTC | sync_daily |
| Hot Window ±48h | Every 3h (06, 09, 12, 15, 18, 21 UTC) | sync_hot_window |
| Recent Details | Every 15 min (10–23 UTC) | sync_recent_details |
| Live Delta-Sync | Every 10 seconds | sync_fixtures_latest |
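The schedule rules in the table can be expressed as a small pure-Python matcher, which is handy for reasoning about which tasks coincide at a given moment. This is a sketch using only the standard library, not the Celery Beat configuration itself. The `due_tasks` helper is hypothetical, and two points are assumptions: "10–23 UTC" is read as a last run at 22:45, and the 10-second task is treated as aligned to multiples of 10 seconds with no time-of-day restriction.

```python
from datetime import datetime


def is_first_sunday(moment):
    # The first Sunday of a month always falls on day 1..7; weekday() == 6 is Sunday.
    return moment.weekday() == 6 and moment.day <= 7


def due_tasks(moment):
    """Return the Celery task names from the table that fire at `moment` (UTC)."""
    on_the_hour = moment.minute == 0 and moment.second == 0
    due = []
    if on_the_hour and moment.hour == 2 and is_first_sunday(moment):
        due.append("sync_monthly")
    if on_the_hour and moment.hour == 4:
        due.append("sync_daily")
    if on_the_hour and moment.hour in (6, 9, 12, 15, 18, 21):
        due.append("sync_hot_window")
    # Assumption: the 15-minute task runs from 10:00 up to and including 22:45.
    if moment.second == 0 and moment.minute % 15 == 0 and 10 <= moment.hour < 23:
        due.append("sync_recent_details")
    # Assumption: the live delta sync is aligned to multiples of 10 seconds.
    if moment.second % 10 == 0:
        due.append("sync_fixtures_latest")
    return due
```

For example, 2 June 2024 was the first Sunday of that month, so `due_tasks(datetime(2024, 6, 2, 2, 0))` includes `sync_monthly`, while the same time one week later does not.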
Typical Daily Schedule (UTC)
02:00 [1st Sun/month] Full import starts (sync_monthly)
04:00 Daily Sync starts (sync_daily)
06:00 Hot Window Sync (sync_hot_window)
09:00 Hot Window Sync
10:00 Recent Details start, running every 15 min
10:00 Live Delta-Sync runs every 10 seconds (continuously)
12:00 Hot Window Sync
15:00 Hot Window Sync
18:00 Hot Window Sync
21:00 Hot Window Sync
23:00 Recent Details and Live Delta-Sync end for the day
League Tracking (is_tracked)
The is_tracked flag on League controls which leagues the automatic commands import.
| Scenario | Behavior |
|---|---|
| Bulk import (without --league-id) | Does not set is_tracked. New leagues get is_tracked=False. |
| Single import (--league-id 271) | Sets is_tracked=True for that league. |
| Fixture imports | Process tracked leagues only. |
| Standings/Topscorers (default) | Only current seasons of tracked leagues. |
| Standings/Topscorers (--all-tracked) | All seasons of tracked leagues. |
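The two standings/topscorers scopes in the table boil down to one filter with an optional relaxation. The sketch below models this with plain dataclasses rather than the project's Django models. The `League` and `Season` classes here are simplified stand-ins, and `seasons_in_scope` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class League:
    id: int
    is_tracked: bool


@dataclass(frozen=True)
class Season:
    id: int
    league: League
    is_current: bool


def seasons_in_scope(seasons, all_tracked=False):
    """Select seasons of tracked leagues; current ones only unless all_tracked is set."""
    return [
        season
        for season in seasons
        if season.league.is_tracked and (all_tracked or season.is_current)
    ]
```

With `all_tracked=False` the result matches the default behaviour described in the table (current seasons of tracked leagues). With `all_tracked=True` it also includes past seasons, but still never untracked leagues.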
Track a New League
python manage.py import_leagues --league-id 271 # Imports the league and sets is_tracked=True
python manage.py sync_all --daily # The league is picked up by the next daily sync
Command Reference
Tier 0: Core Lookup Data
| Command | Options | Entities |
|---|---|---|
| import_core | --skip-cities, --only {countries,types,states,cities} | Countries, Types, States, Cities |
Tier 1: League Structure
| Command | Options | Entities |
|---|---|---|
| import_leagues | --league-id ID | Leagues, Seasons |
| import_season_structure | --season-id ID, --current-only | Stages, Rounds |
Tier 2: Entities
| Command | Options | Entities |
|---|---|---|
| import_teams | --season-id ID, --current-only | Teams, Venues |
| import_players | --season-id ID, --current-only | Players, TeamSquads |
| import_coaches | --country-id ID | Coaches |
| import_referees | --country-id ID | Referees |
Tier 3: Fixtures
| Command | Options | Entities |
|---|---|---|
| import_fixtures | --season-id, --date, --date-from/--date-to, --current-only, --light-check | Fixtures |
| import_fixtures_latest | None (only --task-id) | Fixtures (Delta) |
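The delta behaviour of import_fixtures_latest (upsert only what changed, and only for tracked leagues) can be illustrated with an in-memory store. This is a sketch under stated assumptions: the payload field names (`id`, `league_id`, `score`) are hypothetical, the dict stands in for the database, and `apply_delta` is not the command's real code.

```python
def apply_delta(store, payload, tracked_league_ids):
    """Upsert changed fixtures into `store` (a dict keyed by fixture id).

    Returns the number of fixtures that were created or updated.
    """
    changed = 0
    for fixture in payload:
        if fixture["league_id"] not in tracked_league_ids:
            continue  # fixtures of untracked leagues are ignored
        if store.get(fixture["id"]) != fixture:
            # Store a copy so later changes to the payload do not leak in.
            store[fixture["id"]] = dict(fixture)
            changed += 1
    return changed


if __name__ == "__main__":
    store = {}
    payload = [
        {"id": 1, "league_id": 271, "score": "1-0"},
        {"id": 2, "league_id": 999, "score": "0-0"},  # untracked league
    ]
    print(apply_delta(store, payload, {271}))  # first run writes fixture 1
    print(apply_delta(store, payload, {271}))  # second run changes nothing
```

Applying the same payload twice leaves the store unchanged on the second pass, which mirrors the idempotency the commands are described as having.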
Tier 4: Fixture Enrichment
| Command | Options | Entities |
|---|---|---|
| import_fixture_details | --date, --since-hours N | Stats, Events, Lineups, Coaches, Sidelined |
| import_fixture_stats | --season-id, --date, --batch-size | FixtureStatistic |
| import_fixture_events | --season-id, --date | FixtureEvent |
| import_fixture_enrichment | --season-id, --date | Lineups, Formations, Coaches, Sidelined |
Tier 5: Standings & Top Scorers
| Command | Options | Entities |
|---|---|---|
| import_standings | --season-id, --all-tracked | Standings |
| import_topscorers | --season-id, --all-tracked | TopScorers |
Tier 6: Odds
| Command | Options | Entities |
|---|---|---|
| import_odds_reference | None | Markets, Bookmakers |
| import_odds | --days-ahead N, --fixture-id, --date, --backfill | OddValues |
Tier 7: Transfers
| Command | Options | Entities |
|---|---|---|
| import_transfers | None (only --task-id) | Transfers |
Tier 8: ML Features
| Command | Options | Entities |
|---|---|---|
| compute_features | --backfill, --fixture-id, --date, --elo-backfill, --form-backfill, --goals-backfill, --rates-backfill, --context-backfill, --force, --batch-size | FixtureFeatures, EloRatings |
Notes
- Idempotency: All commands use update_or_create() and can safely be run multiple times.
- FK validation: All commands check that foreign keys exist before setting references.
- ImportLog: All commands accept --task-id for ImportLog integration (passed automatically by the Celery tasks).
- Rate limits: Sportmonks rate limits apply per entity type (3000 calls/hour). Different entity types have separate budgets.
- --current-only: Filters on Season.is_current=True AND League.is_tracked=True.
- --all-tracked: Filters on League.is_tracked=True (all seasons, not just current ones).
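The per-entity-type rate limit mentioned in the notes can be modelled as one independent budget per entity type. The sketch below is an illustration, not the pipeline's actual rate limiter: `HourlyBudget` is a hypothetical class, and the fixed one-hour window that starts with the first call is an assumption about how the limit resets.

```python
import time


class HourlyBudget:
    """Track API calls per entity type against a fixed hourly limit."""

    def __init__(self, limit=3000, window_seconds=3600, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # entity type -> (window start, calls used)

    def try_acquire(self, entity_type):
        """Record one call if budget is left; return False when it is exhausted."""
        now = self.clock()
        start, used = self._windows.get(entity_type, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # the window expired, start a fresh one
        if used >= self.limit:
            self._windows[entity_type] = (start, used)
            return False
        self._windows[entity_type] = (start, used + 1)
        return True
```

Because each entity type has its own window, exhausting the budget for fixtures leaves the budget for teams untouched, which matches the "separate budgets" note above.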