A global LightGBM model forecasting daily sales quantity for 20,000+ SKUs across five product categories, trained on 2+ years of transaction history with 8 promotion features.
Five models ranging from classical statistical methods to deep learning, each with distinct mathematical foundations. All share the same 22-feature input set and use recursive multi-step forecasting over the 46-day horizon.
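Recursive multi-step forecasting can be sketched as a loop that feeds each prediction back in as the newest observation. This is a minimal illustration, not the project's pipeline: `predict_one_step`, `mean_of_last_week`, and the 28-value `window` default are hypothetical stand-ins, and only the 46-day horizon comes from the text above.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_one_step: Callable[[Sequence[float]], float],
    horizon: int = 46,   # horizon length taken from the text
    window: int = 28,    # assumed look-back length, for illustration only
) -> List[float]:
    """Forecast `horizon` steps, reusing each prediction as a future input."""
    buffer = list(history)
    forecasts: List[float] = []
    for _ in range(horizon):
        y_hat = predict_one_step(buffer[-window:])
        forecasts.append(y_hat)
        buffer.append(y_hat)  # the prediction becomes a lag for the next step
    return forecasts


def mean_of_last_week(recent: Sequence[float]) -> float:
    """Hypothetical one-step model: mean of the last 7 values."""
    last = recent[-7:]
    return sum(last) / len(last)


forecast = recursive_forecast([10.0] * 28, mean_of_last_week)
```

Because every step consumes earlier predictions, errors can compound toward the end of the horizon, which is the usual trade-off of the recursive strategy.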
Gradient boosted decision trees. At each round a new tree fits the gradient of the loss, iteratively correcting prior mistakes. Leaf-wise (best-first) growth and GOSS sampling keep training fast.
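The boosting loop itself can be illustrated with a toy version in plain Python. This sketch assumes squared-error loss (so the negative gradient is just the residual), a single feature, and depth-1 trees; it does not implement LightGBM's leaf-wise growth or GOSS, and all function names are hypothetical.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left_value, right_value)


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Pick the threshold on x that minimises squared error against targets r."""
    xs = sorted(set(x))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, (t, lv, rv))
    if best is None:  # only one distinct x value: fall back to a constant
        mean = sum(r) / len(r)
        return (xs[0], mean, mean)
    return best[1]


def predict_stump(stump: Stump, xi: float) -> float:
    t, lv, rv = stump
    return lv if xi <= t else rv


def boost(x: List[float], y: List[float], rounds: int = 50, lr: float = 0.1):
    """Each round fits a stump to the residuals (negative gradient of squared loss)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * predict_stump(stump, xi) for pi, xi in zip(pred, x)]
    return base, lr, stumps


def predict(model, xi: float) -> float:
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, xi) for s in stumps)


model = boost([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
```

The learning rate `lr` shrinks each tree's contribution, so the remaining residual decays gradually rather than being fitted in one step.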
Long Short-Term Memory network. Three learned gates control what the cell forgets, writes, and reads at each time step, enabling it to capture long-range seasonal dependencies across a 28-day input window.
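To make the three gates concrete, here is a scalar (hidden size 1) LSTM step in plain Python. It is a sketch of the standard LSTM equations only; the parameter names (`w_f`, `u_f`, `b_f`, and so on) and the `encode_window` helper are hypothetical, and a real model would use vector states and trained weights on scaled inputs.

```python
import math
from typing import Dict, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(
    x: float, h_prev: float, c_prev: float, p: Dict[str, float]
) -> Tuple[float, float]:
    """One time step of a scalar LSTM cell."""
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input (write) gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output (read) gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c = f * c_prev + i * g   # forget part of the old cell, write the new candidate
    h = o * math.tanh(c)     # read a gated view of the cell state
    return h, c


def encode_window(window: Sequence[float], p: Dict[str, float]) -> float:
    """Run the cell over an input window (e.g. 28 days) and return the last hidden state."""
    h, c = 0.0, 0.0
    for x in window:
        h, c = lstm_step(x, h, c, p)
    return h


# Untrained, all-zero parameters, purely to exercise the function.
zero_params = {f"{kind}_{gate}": 0.0 for kind in "wub" for gate in "fiog"}
h1, c1 = lstm_step(1.0, 0.0, 1.0, zero_params)
```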
Additive decomposition model (Meta). Separates the signal into a piecewise linear trend with automatic changepoints, a Fourier-series weekly seasonality, and optional holiday effects.
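The Fourier-series weekly seasonality amounts to regressing on sine and cosine terms with a 7-day period. The sketch below only builds those terms; the `order=3` default is an illustrative choice, and the function name is hypothetical rather than part of any library.

```python
import math
from typing import List


def fourier_features(t: float, period: float = 7.0, order: int = 3) -> List[float]:
    """Return [sin, cos] pairs for harmonics 1..order at time index t (in days)."""
    feats: List[float] = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


monday = fourier_features(3)
next_monday = fourier_features(10)  # exactly one period later
```

The seasonal component is then a learned linear combination of these features, so days one week apart receive the same seasonal effect.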
Triple exponential smoothing. Three exponentially weighted equations track level, trend, and seasonality separately. Optimised smoothing parameters α, β, γ are fit per SKU.
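The three update equations can be written out directly. This is a sketch of the additive variant with a simple first-season initialisation; the smoothing parameters are passed in by hand here, whereas the per-SKU optimisation mentioned above is not shown. The function name and initialisation scheme are assumptions.

```python
from typing import List, Sequence


def holt_winters_additive(
    y: Sequence[float],
    season_len: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Additive triple exponential smoothing with a basic initialisation."""
    m = season_len
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Initial level, trend and seasonal offsets from the first two seasons.
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        s = seasonal[t % m]
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s

    n = len(y)
    return [level + h * trend + seasonal[(n + h - 1) % m] for h in range(1, horizon + 1)]


weekly = [float(v) for v in range(1, 8)] * 4  # a perfectly periodic toy series
next_week = holt_winters_additive(weekly, 7, alpha=0.3, beta=0.1, gamma=0.2, horizon=7)
```

On this toy series, which has no trend and an exact weekly pattern, the forecast reproduces the pattern for the following week.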
Simple averaging of selected model outputs. When the K individual models make uncorrelated errors of similar variance, the variance of the ensemble error shrinks as 1/K, producing a more stable forecast.
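The 1/K claim can be checked numerically. The sketch below averages forecasts element-wise and then simulates independent, equal-variance Gaussian errors, which is an idealised assumption; real model errors are usually positively correlated, so the actual reduction tends to be smaller. Both function names are hypothetical.

```python
import random
import statistics
from typing import List, Sequence


def ensemble_mean(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average K model forecasts element-wise (one inner sequence per model)."""
    k = len(predictions)
    return [sum(step) / k for step in zip(*predictions)]


def simulated_error_variance(
    k: int, n: int = 20000, sigma: float = 1.0, seed: int = 0
) -> float:
    """Variance of the mean of k independent N(0, sigma^2) errors, by simulation."""
    rng = random.Random(seed)
    errors = [sum(rng.gauss(0.0, sigma) for _ in range(k)) / k for _ in range(n)]
    return statistics.pvariance(errors)


combined = ensemble_mean([[1.0, 2.0], [3.0, 4.0]])
var_single = simulated_error_variance(1)
var_four = simulated_error_variance(4)  # close to sigma^2 / 4 under these assumptions
```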
The current 22-feature set is hand-crafted from domain knowledge. The next phase integrates a large language model as an automatic feature engineering agent: given a natural-language description of the forecasting task and the available data schema, the LLM proposes, evaluates, and selects the most predictive feature combinations — without manual specification.
The LLM reads the data schema and task description, then generates a ranked list of candidate features (lag windows, interaction terms, external signals) in natural language.
Each candidate feature is automatically constructed and scored via cross-validated SHAP importance or permutation importance on a held-out fold.
The LLM synthesises the evaluation results and outputs the final feature set, with a plain-English rationale for each inclusion or exclusion decision.
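The scoring step in the middle of this loop can be illustrated with permutation importance: shuffle one feature column on the held-out fold and measure how much the error grows. This is a plain-Python sketch with a hypothetical stand-in model; the cross-validation loop, the SHAP alternative, and the LLM calls are not shown.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def mse(model: Callable[[Row], float], X: Sequence[Row], y: Sequence[float]) -> float:
    return sum((model(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


def permutation_importance(
    model: Callable[[Row], float],
    X: Sequence[Row],
    y: Sequence[float],
    n_repeats: int = 5,
    seed: int = 0,
) -> List[float]:
    """Mean increase in held-out MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances: List[float] = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical fitted model that only uses feature 0.
def toy_model(row: Row) -> float:
    return 2.0 * row[0]


X_val = [[float(i), float((i * 7) % 5)] for i in range(20)]
y_val = [2.0 * row[0] for row in X_val]
scores = permutation_importance(toy_model, X_val, y_val)
```

A feature the model ignores scores zero, while shuffling a feature it relies on raises the error, which gives a ranking the selection step can consume.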
F = weekly forecast total per SKU · A = weekly actual total per SKU
Weighted aggregation (3 steps):
Select a model, SKU and date range — run live predictions via the local Python server, or browse pre-computed results in static mode.
Run uvicorn api_server:app --port 8000 for live multi-model predictions.
Upload your actual sales file. The FA metric is computed in-browser using the exact POC formula — no data leaves your machine.
Any column order. Auto-detected column names:
日期 / date (date) · 条码 / barcode / 条形码 (barcode) · 当天全部销售数量 / 销量 / quantity (daily sales quantity)
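Column auto-detection of this kind can be sketched as an alias lookup over the file's header row. The aliases below are the ones listed above; the canonical keys (`date`, `barcode`, `quantity`), the case-insensitive matching, and the function name are assumptions, and the sketch is in Python for illustration even though the tool computes the metric in the browser.

```python
from typing import Dict, List, Sequence

# Aliases taken from the list above; the canonical keys are assumed names.
COLUMN_ALIASES: Dict[str, List[str]] = {
    "date": ["日期", "date"],
    "barcode": ["条码", "barcode", "条形码"],
    "quantity": ["当天全部销售数量", "销量", "quantity"],
}


def detect_columns(header: Sequence[str]) -> Dict[str, int]:
    """Map each canonical column to its index in `header`, in any column order."""
    normalised = [name.strip().lower() for name in header]
    mapping: Dict[str, int] = {}
    for canonical, aliases in COLUMN_ALIASES.items():
        for alias in aliases:
            key = alias.lower()
            if key in normalised:
                mapping[canonical] = normalised.index(key)
                break
        else:
            raise ValueError(f"no column found for '{canonical}'")
    return mapping


columns = detect_columns(["销量", " Date ", "条形码"])
```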
| Period | Weighted FA | SKUs | Forecast Σ | Actual Σ | Bias |
|---|---|---|---|---|---|