NEM Price Forecasting

The NEM Price Forecasting model uses LightGBM to produce spot price forecasts for all five NEM regions across six forecast horizons. The model is trained on 18 months of historical data (chronological split) and registered in MLflow with the production alias.

Rather than training six separate models per region (one per horizon), a single model per region is trained with forecast_horizon as an integer feature. This approach:

  • Reduces model registry footprint (5 models instead of 30)
  • Improves generalisation across horizons (shared temporal patterns)
  • Enables cross-horizon consistency in predictions

Forecast horizons (integer value → lead time, in 5-minute dispatch intervals):
1 → 5 minutes ahead
4 → 20 minutes ahead
8 → 40 minutes ahead
12 → 60 minutes ahead
24 → 120 minutes ahead (2 hours)
48 → 240 minutes ahead (4 hours)
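
One way to picture the single-model approach: each timestamp contributes one training row per horizon, with the horizon carried as an integer feature and the target taken that many intervals ahead. The sketch below is illustrative only. The `stack_horizons` helper and its row layout are assumptions, not the code in feature_engineering.py.

```python
from typing import List, Tuple

DISPATCH_INTERVAL_MINUTES = 5          # one horizon step = one 5-minute interval
HORIZONS = [1, 4, 8, 12, 24, 48]       # horizon values listed above


def horizon_minutes(horizon: int) -> int:
    """Convert an integer horizon into minutes ahead."""
    return horizon * DISPATCH_INTERVAL_MINUTES


def stack_horizons(prices: List[float],
                   horizons: List[int] = HORIZONS) -> List[Tuple[int, int, float, float]]:
    """Build rows of (t, horizon, price_at_t, target_price_at_t_plus_horizon).

    Hypothetical helper: every timestamp t yields one row per horizon, so a
    single model can learn all horizons from one stacked training set.
    """
    rows = []
    for t in range(len(prices)):
        for h in horizons:
            if t + h < len(prices):    # skip rows whose target is not yet observed
                rows.append((t, h, prices[t], prices[t + h]))
    return rows
```

Stacking multiplies the row count by up to six, so the chronological split still has to be applied on the timestamp of the target, not of the feature row, to avoid leaking future prices into training.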

Features are computed in models/price_forecast/feature_engineering.py and stored in gold.feature_store_price:

| Feature Category | Features |
|---|---|
| Temporal | Hour of day, day of week, month, quarter, is_weekend, is_holiday |
| Lag prices | Price at t-1, t-2, t-3, t-6, t-12, t-24, t-48 intervals |
| Rolling stats | 1h, 6h, 24h rolling mean, std, min, max price |
| Generation | Coal, gas, wind, solar, hydro generation (MW) |
| Interconnectors | Flow on all 5 interconnectors (MW) |
| Weather | Temperature, humidity, wind speed, solar irradiance |
| NWP forecasts | BOM +1h, +4h, +24h temperature and irradiance forecasts |
| Cross-regional | Price spread to adjacent regions |
| Horizon | Integer forecast horizon (1, 4, 8, 12, 24, 48) |
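
The lag and rolling-statistic rows can be sketched in plain Python as follows. This is a minimal illustration, not the contents of feature_engineering.py: the function name, the feature names, the use of population standard deviation, and the choice to build rolling windows from strictly past intervals (ending at t-1) are all assumptions. Window lengths assume 5-minute intervals, so 1h = 12 intervals.

```python
import statistics
from typing import Dict, List, Optional

LAGS = [1, 2, 3, 6, 12, 24, 48]                      # lag offsets from the table
ROLLING_WINDOWS = {"1h": 12, "6h": 72, "24h": 288}   # window length in 5-min intervals


def price_features(prices: List[float], t: int) -> Dict[str, Optional[float]]:
    """Lag and rolling price features for interval t (hypothetical helper).

    Features that need more history than is available are returned as None.
    """
    feats: Dict[str, Optional[float]] = {}
    for lag in LAGS:
        feats[f"price_lag_{lag}"] = prices[t - lag] if t - lag >= 0 else None

    stats = (("mean", statistics.mean), ("std", statistics.pstdev),
             ("min", min), ("max", max))
    for name, n in ROLLING_WINDOWS.items():
        # Assumption: the window covers the n intervals ending at t-1.
        window = prices[t - n:t] if t >= n else []
        for stat_name, fn in stats:
            feats[f"price_roll_{name}_{stat_name}"] = fn(window) if window else None
    return feats
```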

# models/price_forecast/train.py (excerpt)
import lightgbm as lgb
import optuna
import mlflow

REGIONS = ['NSW1', 'QLD1', 'SA1', 'TAS1', 'VIC1']

def objective(trial):
    # Hyperparameter search space for LightGBM
    params = {
        'num_leaves': trial.suggest_int('num_leaves', 31, 256),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.1, log=True),
        'n_estimators': trial.suggest_int('n_estimators', 200, 1000),
        'min_child_samples': trial.suggest_int('min_child_samples', 10, 50),
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0),
    }
    # ... train + evaluate
    return val_mape  # validation MAPE, minimised by the study

# 50-trial Optuna HPO
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
best_params = study.best_params

# Train final model and register
# (`region` is one of REGIONS; X_train and y_train come from the elided data-loading code)
with mlflow.start_run():
    final_model = lgb.LGBMRegressor(**best_params)
    final_model.fit(X_train, y_train)
    mlflow.lightgbm.log_model(final_model, "model")
    mlflow.register_model("runs:/.../model", f"price_forecast_{region}")

Chronological split (stored as MLflow run tags):

  • Train: 18 months (oldest)
  • Validation: 3 months (middle — used for HPO)
  • Test: 3 months (most recent — held out)
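
The split above can be sketched as three consecutive month-aligned windows counted back from the end of the data. This is a sketch under assumptions: the helper names are hypothetical, boundaries are placed at month starts, and `end` is an exclusive upper bound. The source does not say how train.py computes its cut-offs.

```python
from datetime import datetime
from typing import Dict, List


def month_start_offset(ts: datetime, months: int) -> datetime:
    """Return the first day of the month `months` away from ts's month."""
    idx = ts.year * 12 + (ts.month - 1) + months
    return datetime(idx // 12, idx % 12 + 1, 1)


def chronological_split(timestamps: List[datetime], end: datetime,
                        train_months: int = 18, val_months: int = 3,
                        test_months: int = 3) -> Dict[str, List[datetime]]:
    """Assign timestamps to train/val/test by date, never by random sampling.

    `end` is assumed to be a month start; anything outside the 24-month
    span that ends there is dropped.
    """
    test_start = month_start_offset(end, -test_months)
    val_start = month_start_offset(test_start, -val_months)
    train_start = month_start_offset(val_start, -train_months)

    splits: Dict[str, List[datetime]] = {"train": [], "val": [], "test": []}
    for ts in sorted(timestamps):
        if train_start <= ts < val_start:
            splits["train"].append(ts)
        elif val_start <= ts < test_start:
            splits["val"].append(ts)
        elif test_start <= ts < end:
            splits["test"].append(ts)
    return splits
```

Keeping the windows contiguous and ordered means the validation set used for HPO and the held-out test set both lie strictly after the training data.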
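
| Region | MAE ($/MWh) | MAPE | Spike Recall (>$300/MWh) | Spike Precision |
|---|---|---|---|---|
| NSW1 | 18.4 | 12.3% | 68% | 72% |
| QLD1 | 22.1 | 14.8% | 64% | 68% |
| SA1 | 45.2 | 18.7% | 71% | 65% |
| TAS1 | 12.8 | 9.4% | 52% | 78% |
| VIC1 | 19.7 | 13.1% | 66% | 70% |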

SA1 has the highest MAPE due to its high price volatility and frequent spike events, which are inherently difficult to forecast. Spike recall (detecting >$300/MWh events) is a dedicated evaluation metric alongside standard MAPE.
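
For reference, the four reported metrics could be computed along these lines. This is a sketch, not evaluate.py: it assumes a spike is an interval whose actual price exceeds the threshold, that a forecast "detects" it when the forecast price also exceeds the threshold, and that intervals with a zero actual price are skipped in MAPE. The source states none of these conventions.

```python
from typing import Dict, Sequence


def forecast_metrics(actual: Sequence[float], forecast: Sequence[float],
                     spike_threshold: float = 300.0) -> Dict[str, float]:
    """MAE, MAPE and spike precision/recall for paired price series."""
    abs_err = [abs(a - f) for a, f in zip(actual, forecast)]
    mae = sum(abs_err) / len(abs_err)

    # Assumption: skip zero-price intervals, where percentage error is undefined.
    pct_err = [e / abs(a) for e, a in zip(abs_err, actual) if a != 0]
    mape = sum(pct_err) / len(pct_err) if pct_err else float("nan")

    tp = fp = fn = 0
    for a, f in zip(actual, forecast):
        actual_spike, predicted_spike = a > spike_threshold, f > spike_threshold
        if actual_spike and predicted_spike:
            tp += 1
        elif predicted_spike:
            fp += 1
        elif actual_spike:
            fn += 1

    return {
        "mae": mae,
        "mape": mape,
        "spike_precision": tp / (tp + fp) if tp + fp else 0.0,
        "spike_recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Passing a different `spike_threshold` gives the same precision/recall breakdown at higher price levels.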

The models/price_forecast/evaluate.py script provides:

  • Rolling walk-forward backtest: train on T, predict T+1, walk forward over test period
  • AEMO pre-dispatch baseline: comparison against AEMO’s own pre-dispatch price forecasts
  • Spike performance: precision/recall/F1 for price spikes across thresholds ($300, $1000, $5000)
  • Horizon degradation: how accuracy degrades at longer horizons
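
The walk-forward backtest can be pictured as a sequence of folds in which the training window always ends where the prediction window begins. The generator below is a minimal sketch assuming an expanding training window and index-based folds. The name `walk_forward_folds` and its parameters are hypothetical, and evaluate.py may use a sliding window instead.

```python
from typing import Iterator, Tuple


def walk_forward_folds(n_samples: int, initial_train: int,
                       step: int) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything before `train_end` and predicts the next
    `step` samples, then the boundary moves forward by `step`.
    """
    train_end = initial_train
    while train_end + step <= n_samples:
        yield range(0, train_end), range(train_end, train_end + step)
        train_end += step
```

For example, `list(walk_forward_folds(10, 4, 2))` produces three folds whose test ranges are 4-5, 6-7 and 8-9, each trained only on earlier indices.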

The copilot’s get_price_forecast tool calls the Model Serving endpoint:

async def get_price_forecast(region: str, horizon: str) -> dict:
    """
    Fetch ML price forecast for specified region and horizon.
    """
    response = await serving_client.predict(
        endpoint="nem-price-forecaster",
        features={
            "region_id": region,
            "forecast_horizon": HORIZON_MAP[horizon],
            **current_market_features
        }
    )
    return {
        "region": region,
        "horizon": horizon,
        "forecast_price": response["prediction"],
        "confidence_interval": response["prediction_interval"],
        "spike_probability": response["spike_probability"]
    }
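
The tool relies on HORIZON_MAP, which the excerpt does not show. One plausible shape, offered purely as an assumption, is a conversion from strings such as "4h" or "20min" to a count of 5-minute intervals. The function names and the accepted string format below are hypothetical. How the service treats values outside the six trained horizons is not described in the source, so the sketch only reports whether a value is one of them.

```python
import re

TRAINED_HORIZONS = {1, 4, 8, 12, 24, 48}   # integer horizons the model is trained on
INTERVAL_MINUTES = 5                        # length of one horizon step


def horizon_to_intervals(horizon: str) -> int:
    """Convert a string like '4h' or '20min' into 5-minute intervals.

    Raises ValueError for unparseable input or for durations that are not a
    positive multiple of 5 minutes.
    """
    match = re.fullmatch(r"(\d+)(min|h)", horizon.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognised horizon: {horizon!r}")
    value, unit = int(match.group(1)), match.group(2)
    minutes = value * 60 if unit == "h" else value
    if minutes == 0 or minutes % INTERVAL_MINUTES != 0:
        raise ValueError(f"Horizon must be a positive multiple of 5 minutes: {horizon!r}")
    return minutes // INTERVAL_MINUTES


def is_trained_horizon(horizon: str) -> bool:
    """True when the horizon matches one of the six trained values."""
    return horizon_to_intervals(horizon) in TRAINED_HORIZONS
```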

Claude then places this forecast in a broader market analysis, combining it with weather forecasts, generation mix data, and constraint information to provide a contextualised outlook.

Price Forecast Dashboard (/ai-ml/price-forecast)

  • 24-hour ahead forecast chart for all 5 regions
  • Forecast vs actual comparison (past 24h)
  • Horizon accuracy chart: MAPE by horizon
  • Spike probability indicator: probability of >$300/MWh in next 4 hours

# Short-term price forecast (30-min)
GET /api/forecasts/prices?region=SA1&horizon=30min

# 24-hour price forecast
GET /api/forecasts/prices?region=NSW1&horizon=24h

# Forecast with confidence interval
GET /api/forecasts/prices?region=VIC1&horizon=4h&include_interval=true

# Spike probability
GET /api/forecasts/spike-probability?region=QLD1&threshold=300&horizon=4h
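
A client could assemble the price-forecast requests above with the standard library. The helper below is a hypothetical convenience, not part of the documented API. It only builds the path and query string shown in the examples, and `base_url` is a placeholder the caller must supply.

```python
from urllib.parse import urlencode


def build_forecast_url(region: str, horizon: str,
                       include_interval: bool = False,
                       base_url: str = "") -> str:
    """Build a /api/forecasts/prices URL from the documented query parameters."""
    params = {"region": region, "horizon": horizon}
    if include_interval:
        # The examples pass the flag as the lowercase string "true".
        params["include_interval"] = "true"
    return f"{base_url}/api/forecasts/prices?{urlencode(params)}"
```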