CSV Fallback Bug Fix — Data Reliability Restoration
Ticket #139: Silent CSV Fallback Causing Stale Portfolio Prices
Type: Debugging / Fix / Data Integrity
Affected Component: code_source_simule/pipeline.py, tests/test_pipeline.py, Database (historique table)
1. Context and Symptoms
Portfolio prices in the UI remained frozen at their initial CSV values for 5+ consecutive days (March 23–27, 2026), despite live market data being available. For example, VWAGY's cad_value stayed at 14.85 CAD across all dates, which is inconsistent with expected daily market volatility.
2. Investigation Process
Database query confirmed the stale values:
SELECT ticker, date_releve, valeur, cad_value FROM historique
WHERE ticker_id = (SELECT id FROM titres WHERE ticker = 'VWAGY')
ORDER BY date_releve DESC LIMIT 5;
valeur = 10.61 USD and cad_value = 14.85 CAD for March 23–27.
Mathematical correlation: The CSV file contains VWAGY Price = 10.61 USD. With config.ini setting usd_to_cad_rate = 1.40, this gives precisely 10.61 × 1.40 = 14.854 CAD. The stale value matched CSV data perfectly.
API availability: Manual Marketstack API testing for the same dates confirmed live data was available (e.g., close = 10.08 for March 27, 2026). The data never reached the database.
Root cause: The cron job runs at 05:03 UTC (intended: 21:05 UTC). At 05:03 UTC, end-of-day data for the current day does not yet exist on Marketstack's servers → the API returns {"data": []} (empty response). The pipeline silently substituted these missing values with CSV prices instead of alerting the user.
3. Root Causes Identified
| # | Cause | Impact | Status |
|---|---|---|---|
| 1 | Cron timing off by 16 hours (05:03 UTC vs 21:05 UTC) |
API unavailable for current-day EOD data, returns empty | Out of scope — requires manual VPS cron fix |
| 2 | Silent CSV fallback in enrich_data_with_marketstack() |
Empty API results trigger fillna(df['price']), inserting stale CSV data without alerts |
Fixed |
| 3 | No alerting mechanism | Users unaware that API data was unavailable | Fixed — email alerts added |
4. Implemented Solutions
4.1 - Remove CSV Fallback
Removed the fallback logic that silently substituted CSV prices for missing API data:
# BEFORE (buggy)
df['market_price'].fillna(df['price'], inplace=True)
# AFTER (fixed)
# No fallback — missing API prices remain NaN
df['market_price'] = df['marketstack_ticker'].map(all_prices)
missing = df[df['market_price'].isna()]['marketstack_ticker'].dropna().tolist()
if missing:
print(f"AVERTISSEMENT: {len(missing)} ticker(s) sans prix API...")
4.2 - Add Email Alert Function
Implemented send_alert_email() to notify the operator when tickers lack API prices. Requires .env configuration:
- SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD
- ALERT_EMAIL_TO
If SMTP is not configured, the function logs a warning instead.
4.3 - Skip Database Insertion for Missing Prices
Updated insert_data() to skip any ticker without a valid API price:
market_price = row.get('market_price')
if pd.isna(market_price):
continue # Skip this ticker — no insertion
4.4 - Test Lock-In
Added regression test test_pipeline_does_not_fallback_to_csv_when_api_key_exists (tc-pipe-nocsv01):
- Given: CSV with prices + API key configured + API returns empty
- Then: No rows inserted into historique (fallback prevented)
- Status: Test passes ✓
5. Data Cleanup
All historical data from January 20, 2026 onwards was purged to eliminate records created under the buggy fallback logic:
Result: 11,977 rows deleted
Rationale: These records contain CSV prices masquerading as market prices, making them unreliable for future analysis.
6. Verification and Results
- Local test suite: 11/11 tests passing (including new
tc-pipe-nocsv01). - Data consistency: Confirmed API data availability (manual requests successful) vs. empty pipeline results (root cause validated).
- Fallback prevention: New test confirms pipeline does not fall back to CSV when API key is present.
7. Documentation Synchronization (EN/FR)
As part of this bugfix closure, project documentation was reviewed and synchronized in both languages to match actual runtime behavior:
- Temporal behavior updated from D-1 wording to current processing date where applicable.
- Pipeline behavior updated to reflect: no CSV fallback when API key is present, missing prices skipped, operator alerting path.
- Operational cron guidance updated to recommend post-market execution timing.
- Test documentation updated to include the new regression test (tc-pipe-nocsv01) and related activity-report links.
8. Dependencies and Next Steps
Immediate (manual VPS action):
- Correct cron schedule:
03 05 * * 1-5→05 21 * * 1-5(16-hour shift) - Configure SMTP variables in
.envfor alert emails (optional but recommended)
Follow-up discussions:
- Review import frequency and date-targeting logic (same-day vs. D-1)
- Establish completeness thresholds (e.g., alert if <95% of tickers have API prices)