Published: March 18, 2026 • Part of the Philippine Food Price Prediction series
This post documents the upgraded system as deployed on March 18, 2026. For what changed vs. the original, see the upgrade changelog. For the original baseline, see the old system post.
System Overview (March 18, 2026)
The upgraded Philippine Food Price Prediction system is a multi-model ensemble with real-time exogenous data integration, climate scenario modeling, and an automated early warning system. It processes the same 153,404 WFP price observations as the original system, but now augments them with NOAA climate data, USD/PHP exchange rates, and the FAO Food Price Index.
Model Architecture
| Component | Specification |
|---|---|
| Traditional ML | 5 scikit-learn families (Random Forest, Gradient Boosting, Extra Trees, Ridge, SVR) — 25 variants trained in parallel |
| LSTM Network | PyTorch 2-layer LSTM: hidden_size=128, dropout=0.2, AdamW optimizer, HuberLoss, ReduceLROnPlateau scheduler, early stopping (patience=10), gradient clipping (max_norm=1.0), 12-month sliding windows, 6 input features, batch_size=64 |
| Ensemble | StackingRegressor: GradientBoosting (200 trees, depth 4, lr=0.05) + ExtraTrees (200 trees, depth 10) + RandomForest (200 trees, depth 10) → Ridge(α=1.0) meta-learner, 5-fold CV |
| Features | Price history + cyclical month (sin/cos) + normalized year + region/commodity encodings + NOAA ONI + USD/PHP + FAO FPI + lagged values (1–6 months) + ENSO state dummies + momentum indicators |
| Exogenous Sources | NOAA Oceanic Niño Index (ENSO), Frankfurter USD/PHP exchange rates, FAO Food Price Index — all cached locally with configurable TTL |
| Validation | 174 automated tests (pytest) across 4 test files, cross-validated by independent AI agents |
Complete Capability Stack
1. LSTM Deep Learning (lstm_model.py — 639 lines)
Per-commodity LSTM models are trained on 12-month sliding windows with 6 engineered features. Cyclical month encoding (sin/cos transforms) captures seasonal patterns. HuberLoss provides robustness to price spike outliers. ReduceLROnPlateau scheduling and gradient clipping help keep training stable. Models are saved with the full state dict plus training metadata for reproducibility.
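The cyclical encoding and windowing described above can be sketched in plain Python. This is a minimal illustration, not the code in lstm_model.py: the function names are hypothetical, and it uses only three features per step (price, month sine, month cosine) because the post does not list the six features the real models use.

```python
import math

def month_sin_cos(month):
    """Map a calendar month (1-12) onto the unit circle so December sits next to January."""
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)

def sliding_windows(rows, window=12):
    """Build (inputs, target) pairs from chronologically ordered (price, month) tuples.

    Each sample holds `window` consecutive feature rows; the target is the
    price of the month that immediately follows the window.
    """
    features = [(price, *month_sin_cos(month)) for price, month in rows]
    samples = []
    for start in range(len(features) - window):
        inputs = features[start:start + window]
        target = rows[start + window][0]
        samples.append((inputs, target))
    return samples

# Hypothetical 15-month price series starting in January.
rows = [(40.0 + i, (i % 12) + 1) for i in range(15)]
samples = sliding_windows(rows)
```

With 15 months of data and a 12-month window, this yields three training samples. The sin/cos pair matters because a raw month number would place December (12) far from January (1), while on the circle they are neighbors.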
2. Exogenous Feature Integration (exogenous_features.py — ~500 lines)
Three external APIs feed real-time data into forecasting: NOAA ONI for ENSO state, Frankfurter for USD/PHP exchange rates, and FAO for the global Food Price Index. Derived features include 1–6 month lags, ENSO state dummies (El Niño/La Niña/Neutral), and momentum indicators. Local JSON caching with a configurable TTL minimizes API calls, and embedded reference data allows offline operation.
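As a rough sketch of the derived features, the snippet below builds lags, ENSO state dummies, and a momentum value from a monthly ONI series. The function names, the feature names, and the 3-month momentum definition are assumptions made for illustration; the ±0.5 cutoff follows the threshold mentioned in the bug table later in this post.

```python
def enso_state(oni, threshold=0.5):
    """Classify one ONI value; strict comparisons at the boundary are an assumption."""
    if oni > threshold:
        return "el_nino"
    if oni < -threshold:
        return "la_nina"
    return "neutral"

def derive_features(oni_series, max_lag=6):
    """Return one feature dict per month that has a full lag history."""
    rows = []
    for t in range(max_lag, len(oni_series)):
        # Lagged values: ONI 1..max_lag months before month t.
        row = {f"oni_lag_{k}": oni_series[t - k] for k in range(1, max_lag + 1)}
        # One-hot ENSO state dummies for month t.
        state = enso_state(oni_series[t])
        for name in ("el_nino", "la_nina", "neutral"):
            row[f"enso_{name}"] = 1 if state == name else 0
        # Momentum: change over the previous three months (hypothetical definition).
        row["oni_momentum_3m"] = oni_series[t] - oni_series[t - 3]
        rows.append(row)
    return rows

# Hypothetical ONI readings, oldest first.
oni = [-0.8, -0.6, -0.3, 0.0, 0.2, 0.4, 0.6, 0.9]
feature_rows = derive_features(oni)
```

The same pattern would apply to the exchange rate and FAO index series; only the ENSO dummies are specific to ONI.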
3. Climate Scenario Analysis (climate_scenarios.py — ~430 lines)
Correlation modeling covers 7 ENSO phases (Strong/Moderate/Weak × El Niño/La Niña + Neutral). Historical price changes are analyzed by commodity and region and tested for statistical significance. Impact lags of 0, 3, and 6 months are tested to capture delayed climate effects. Scenario projections with 95% confidence ranges estimate future price impacts under different climate conditions.
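One way to picture the phase and lag analysis is shown below. It is a simplified sketch: the strength bands (weak below 1.0, moderate below 1.5, strong at 1.5 and above) are a commonly used convention assumed here rather than taken from climate_scenarios.py, the function names are hypothetical, and the significance testing and confidence ranges are left out.

```python
from collections import defaultdict
from statistics import mean

def enso_phase(oni):
    """Map an ONI value to one of the 7 phases (strength bands are an assumption)."""
    magnitude = abs(oni)
    if magnitude <= 0.5:
        return "Neutral"
    kind = "El Niño" if oni > 0 else "La Niña"
    if magnitude >= 1.5:
        strength = "Strong"
    elif magnitude >= 1.0:
        strength = "Moderate"
    else:
        strength = "Weak"
    return f"{strength} {kind}"

def mean_change_by_phase(oni_series, pct_changes, lag=0):
    """Average the price change observed `lag` months after each ENSO phase reading."""
    grouped = defaultdict(list)
    for t in range(len(oni_series) - lag):
        grouped[enso_phase(oni_series[t])].append(pct_changes[t + lag])
    return {phase: mean(values) for phase, values in grouped.items()}

# Hypothetical monthly ONI values and month-over-month price changes (%).
oni = [1.6, 1.2, 0.7, 0.1, -0.6, -1.1, -1.8]
pct = [2.0, 1.0, 0.5, 0.0, -0.5, 1.5, 3.0]
same_month = mean_change_by_phase(oni, pct, lag=0)
three_months_later = mean_change_by_phase(oni, pct, lag=3)
```

Running the same grouping at lags of 0, 3, and 6 months shows whether a phase's effect on prices appears immediately or only after a delay.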
4. Early Warning System (early_warning.py — ~460 lines)
Four anomaly detectors run simultaneously:
- Spike Detector — month-over-month price surge identification
- Year-over-Year Detector — annual trend anomaly detection
- Regional Divergence Detector — cross-region price consistency monitoring
- Model Divergence Detector — actual vs. predicted price gap alerts
Alerts use a FEWS NET-inspired 4-level severity scale: low, medium, high, and critical. Each level maps to specific policy recommendations. An interactive Leaflet.js map dashboard (early_warning.html) visualizes 949 alerts geographically with drill-down panels.
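To make the spike detector and severity mapping concrete, here is a small sketch. The percentage thresholds and function names are hypothetical placeholders; the post does not state the cutoffs that early_warning.py actually uses.

```python
# (minimum month-over-month % increase, severity level), checked from most to least severe.
# These cutoffs are illustrative assumptions, not the deployed values.
SEVERITY_THRESHOLDS = [
    (50.0, "critical"),
    (30.0, "high"),
    (20.0, "medium"),
    (10.0, "low"),
]

def classify_spike(previous_price, current_price):
    """Return a severity level for a month-over-month surge, or None if below every cutoff."""
    if previous_price <= 0:
        return None  # cannot compute a percentage change
    pct_change = (current_price - previous_price) / previous_price * 100.0
    for minimum, level in SEVERITY_THRESHOLDS:
        if pct_change >= minimum:
            return level
    return None

def detect_spikes(prices):
    """Scan a chronological price list and collect one alert per flagged month."""
    alerts = []
    for i in range(1, len(prices)):
        level = classify_spike(prices[i - 1], prices[i])
        if level is not None:
            alerts.append({"index": i, "severity": level})
    return alerts

# Hypothetical monthly prices for one commodity in one region.
alerts = detect_spikes([50.0, 51.0, 58.0, 60.0, 93.0])
```

The other three detectors would follow the same shape, swapping the comparison baseline: the price 12 months earlier, the cross-region average, or the model's prediction.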
5. Ensemble Stacking (ensemble_model.py — 577 lines)
Three diverse base estimators (GB, ExtraTrees, RF) feed a Ridge meta-learner through 5-fold cross-validation. The ensemble combines complementary strengths: gradient boosting for sequential error correction, extra trees for randomized split thresholds that decorrelate the trees, and random forest for bootstrap stability. Using Ridge regression as the meta-learner adds L2 regularization against overfitting the stacked predictions.
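The configuration in the architecture table translates fairly directly into scikit-learn. The sketch below uses the hyperparameters listed there, but the synthetic data, the random seeds, and the variable names are assumptions for illustration, and it is not a copy of ensemble_model.py.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

stack = StackingRegressor(
    estimators=[
        ("gb", GradientBoostingRegressor(
            n_estimators=200, max_depth=4, learning_rate=0.05, random_state=0)),
        ("et", ExtraTreesRegressor(n_estimators=200, max_depth=10, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=200, max_depth=10, random_state=0)),
    ],
    final_estimator=Ridge(alpha=1.0),  # meta-learner fitted on out-of-fold predictions
    cv=KFold(n_splits=5),
)

# Synthetic stand-in for the real feature matrix (assumption for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=150)

stack.fit(X, y)
predictions = stack.predict(X[:5])
```

After fitting, `stack.final_estimator_.coef_` holds one weight per base estimator, which gives a quick read on how much the meta-learner relies on each model.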
6. REST API (api_server.py — ~300 lines)
Seven endpoints on port 8787 with 30-second cache TTL:
| Endpoint | Purpose |
|---|---|
| /api/predictions | Current model predictions with confidence intervals |
| /api/commodities | Available commodity types and metadata |
| /api/regions | Philippine region list with statistics |
| /api/model-info | Model architecture and training metadata |
| /api/health | Server health check |
| /api/alerts | Early warning alerts (active) |
| /api/climate | ENSO scenario projections |
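
The 30-second cache TTL can be illustrated with a small in-memory cache. This is a sketch under assumptions: the class name, the injectable clock, and the response payload are hypothetical, and the post does not say how api_server.py implements its caching.

```python
import time

class TTLCache:
    """Cache computed responses and reuse them until they are `ttl_seconds` old."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]      # still fresh: serve the cached value
        value = compute()        # missing or expired: rebuild and store
        self._entries[key] = (now + self.ttl_seconds, value)
        return value

# Demonstration with a fake clock and a hypothetical health payload.
fake_now = [0.0]
calls = []

def load_health():
    calls.append(1)
    return {"status": "ok", "calls": len(calls)}

cache = TTLCache(ttl_seconds=30.0, clock=lambda: fake_now[0])
first = cache.get_or_compute("/api/health", load_health)
fake_now[0] = 29.0
second = cache.get_or_compute("/api/health", load_health)   # served from cache
fake_now[0] = 31.0
third = cache.get_or_compute("/api/health", load_health)    # recomputed after expiry
```

A short TTL like this keeps repeated dashboard requests cheap while bounding how stale a response can be.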
7. Infrastructure Upgrades
- Dark/Light Theme — CSS variable-based toggle with localStorage persistence
- Progressive Web App — manifest.json + service worker for offline capability
- Multi-format Export — Excel (.xlsx via SheetJS), CSV, PNG (html2canvas), JSON
- Interactive Charts — zoom/pan via Chart.js zoom plugin, tooltip enhancements
- Parallel Training — 25 model variants across 5 families trained concurrently
- Daily Auto-Update — WFP data refresh with retry logic (3 attempts, exponential backoff) and data quality gates
- Data Quality Dashboard — 8 validation metrics including completeness, freshness, distribution checks
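The retry behavior of the daily update (3 attempts with exponential backoff) can be sketched as follows. The function name, the 1-second base delay, and the broad exception handling are assumptions for illustration; daily_update.py may differ on all three.

```python
import time

def fetch_with_retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `fetch` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # A real pipeline would likely catch only network or parsing errors.
            if attempt == attempts - 1:
                raise                                 # out of attempts: surface the error
            sleep(base_delay * (2 ** attempt))        # 1s, 2s, ... between attempts

# Demonstration: a hypothetical download that fails twice, then succeeds.
state = {"calls": 0}
recorded_delays = []

def flaky_download():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "wfp_prices.csv"

result = fetch_with_retry(flaky_download, sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff schedule easy to verify in tests.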
Test Suite (174 Tests)
| Test File | Tests | Coverage |
|---|---|---|
| test_retrain_model.py | 48 | Core RF training, parallel pipeline, feature engineering |
| test_daily_update.py | 43 | Data download, retry logic, quality gates |
| test_data_quality.py | 47 | Validation metrics, freshness, completeness |
| test_ensemble.py | 36 | Stacking regressor, LSTM integration, model report |
Bugs Found and Fixed During Cross-Validation
Four independent AI agents (Alpha, Beta, Gamma, Delta) cross-validated each other’s code. Three bugs were caught:
| Bug | File | Found By | Fix |
|---|---|---|---|
| La Niña ONI mask inverted | climate_scenarios.py | Alpha | Corrected ONI threshold from >-0.5 to <-0.5 |
| Ensemble NameError | ensemble_model.py | Delta (self-fix) | Fixed undefined variable in stacking pipeline |
| TimeSeriesSplit incompatible with stacking | ensemble_model.py | Delta (self-fix) | Switched to KFold(5) for StackingRegressor compatibility |
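
The third bug is easier to see with indices. StackingRegressor needs an out-of-fold prediction for every training row, so the test folds must cover each sample exactly once. The pure-Python sketch below mimics the default fold layout of KFold and of an expanding-window split like TimeSeriesSplit; the helper names are hypothetical, and it reproduces the index logic only, not scikit-learn itself.

```python
def kfold_test_indices(n_samples, n_splits=5):
    """Contiguous K-fold: the test folds together cover every sample exactly once."""
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)   # first `extra` folds get one more sample
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def expanding_window_test_indices(n_samples, n_splits=5):
    """Expanding-window split: only the last n_splits blocks ever serve as test folds."""
    test_size = n_samples // (n_splits + 1)
    first_test = n_samples - n_splits * test_size
    return [
        list(range(first_test + i * test_size, first_test + (i + 1) * test_size))
        for i in range(n_splits)
    ]

def covers_every_sample(folds, n_samples):
    """True when the test folds form a partition of range(n_samples)."""
    seen = sorted(index for fold in folds for index in fold)
    return seen == list(range(n_samples))

n = 24
kfold_ok = covers_every_sample(kfold_test_indices(n), n)
expanding_ok = covers_every_sample(expanding_window_test_indices(n), n)
never_tested = sorted(
    set(range(n)) - {i for fold in expanding_window_test_indices(n) for i in fold}
)
```

With 24 samples, the expanding-window layout never places the earliest block in a test fold, so those rows would have no out-of-fold prediction for the meta-learner. The trade-off of switching to KFold is that some folds train on data that comes after the rows they predict.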
File Inventory (12 files, ~6,200 lines)
| File | Lines | Purpose | Status |
|---|---|---|---|
| lstm_model.py | 639 | PyTorch LSTM forecaster | NEW |
| ensemble_model.py | 577 | Stacking ensemble | NEW |
| exogenous_features.py | ~500 | NOAA/Frankfurter/FAO integration | NEW |
| climate_scenarios.py | ~430 | ENSO correlation analysis | NEW |
| early_warning.py | ~460 | 4-detector anomaly system | NEW |
| early_warning.html | ~300 | Leaflet.js alert map | NEW |
| api_server.py | ~300 | REST API (7 endpoints) | NEW |
| model_report.py | 465 | Automated model comparison | NEW |
| retrain_model.py | ~600 | RF training + parallel pipeline | UPGRADED |
| daily_update.py | ~400 | Auto-update with retry + quality gates | UPGRADED |
| index.html | ~1200 | Dashboard (dark mode, PWA, export) | UPGRADED |
| README.md | 355 | Full API reference + 15 sections | UPGRADED |
What Comes Next
- Satellite imagery (NDVI) integration for agricultural yield prediction
- FastAPI migration for real-time rolling forecasts
- Philippine ePrice system integration for automated early warning
- Transformer-based architectures (temporal fusion transformers) for multi-horizon forecasting
- Automated model retraining on FEWS NET severity classification updates