Asset Failure Prediction
Overview
The Asset Failure Prediction model is an XGBoost binary classifier that scores every DNSP distribution network asset on its probability of failing within the next 12 months. It runs as a scheduled job (monthly) and as a real-time Model Serving endpoint for on-demand scoring.
Model Performance
| Metric | Value |
|---|---|
| Accuracy | 92.3% |
| AUC (ROC) | 0.961 |
| Precision (High risk) | 88.4% |
| Recall (High risk) | 91.2% |
| F1 (High risk) | 89.8% |
| False positive rate | 8.1% |
The high AUC (0.961) indicates strong discriminative power: the model reliably ranks high-risk assets above low-risk ones. The 8.1% false positive rate is acceptable for this use case because missing a failure costs more than inspecting an asset that was flagged in error.
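As a quick consistency check on the table above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported high-risk precision and recall; the helper name `f1_score` is illustrative and not part of the project code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the Model Performance table (high-risk class).
f1 = f1_score(precision=0.884, recall=0.912)
print(f"F1 (High risk): {f1:.1%}")  # prints "F1 (High risk): 89.8%"
```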
Feature Set
The model uses 7 features:

| Feature | Type | Importance Rank | Description |
|---|---|---|---|
| `age_years` | Float | 1 (highest) | Asset age from commissioning date |
| `health_index` | Float | 2 | Composite health score (0–100) |
| `fault_count_5yr` | Integer | 3 | Number of faults in past 5 years |
| `peak_load_ratio` | Float | 4 | Peak load / thermal rating (0–1+) |
| `days_since_maintenance` | Integer | 5 | Days since last inspection or test |
| `insulation_condition_score` | Float | 6 | Insulation condition (0–100) |
| `weather_exposure_index` | Float | 7 (lowest) | Climate and geographic exposure score |

Feature importance is computed via SHAP values and displayed per-asset in the Asset Intelligence Hub UI.
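
One common way to turn per-asset SHAP values into a global ranking is to average the absolute SHAP value of each feature across assets. The sketch below illustrates that aggregation in plain Python. The SHAP values are made-up numbers for three hypothetical assets and three of the seven features, and `rank_features` is an illustrative helper, not the project's implementation; the source does not say which aggregation the model actually uses.

```python
from statistics import mean

# Hypothetical per-asset SHAP values (one dict per asset); not real model output.
shap_rows = [
    {"age_years": 0.40, "health_index": -0.35, "fault_count_5yr": 0.10},
    {"age_years": -0.30, "health_index": 0.25, "fault_count_5yr": 0.05},
    {"age_years": 0.50, "health_index": 0.30, "fault_count_5yr": -0.15},
]


def rank_features(rows: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Rank features by mean absolute SHAP value, largest first."""
    features = rows[0].keys()
    scores = {f: mean(abs(row[f]) for row in rows) for f in features}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for rank, (feature, score) in enumerate(rank_features(shap_rows), start=1):
    print(f"{rank}. {feature}: mean |SHAP| = {score:.3f}")
```

Taking the absolute value first matters: a feature that pushes some assets up and others down would otherwise average out to near zero and look unimportant.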
Model Training
Training data: 3 years of asset failure records across 6 Australian DNSPs (de-identified).

```python
# models/ -- XGBoost asset failure model training excerpt
import mlflow
import xgboost as xgb
from mlflow import MlflowClient
from sklearn.model_selection import train_test_split

FEATURES = [
    'age_years',
    'health_index',
    'fault_count_5yr',
    'peak_load_ratio',
    'days_since_maintenance',
    'insulation_condition_score',
    'weather_exposure_index',
]

# Excerpt: X_train, y_train, X_val and y_val are assumed to be prepared
# earlier in the script (not shown), e.g. with train_test_split on FEATURES.
with mlflow.start_run(run_name="asset_failure_v3.2"):
    model = xgb.XGBClassifier(
        n_estimators=300,
        max_depth=6,
        learning_rate=0.05,
        scale_pos_weight=4.0,  # Handle class imbalance (failures are rare)
        eval_metric='auc',
        # Passed to the constructor; fit() no longer accepts it in XGBoost 2.0+
        early_stopping_rounds=20,
    )
    model.fit(X_train, y_train, eval_set=[(X_val, y_val)])

    mlflow.log_metrics({
        "auc": 0.961,
        "accuracy": 0.923,
        "precision_high": 0.884,
        "recall_high": 0.912,
    })
    mlflow.xgboost.log_model(model, "model")

    # Register with production alias
    registered = mlflow.register_model("runs:/.../model", "asset_failure_predictor")
    client = MlflowClient()
    client.set_registered_model_alias(
        "asset_failure_predictor", "production", registered.version
    )
```

MLflow Model Registry
The model is registered as `asset_failure_predictor` with the `production` alias in Unity Catalog:

```
energy_copilot_catalog.ml.asset_failure_predictor (production alias → v3)
```

The inference pipeline loads via:

```python
model = mlflow.xgboost.load_model("models:/asset_failure_predictor@production")
```

Model Serving Endpoint
A Model Serving endpoint (`asset-failure-predictor`) provides real-time scoring:

```
POST /serving-endpoints/asset-failure-predictor/invocations
Content-Type: application/json
Authorization: Bearer <DATABRICKS_TOKEN>

{
  "dataframe_records": [
    {
      "age_years": 45,
      "health_index": 28,
      "fault_count_5yr": 3,
      "peak_load_ratio": 0.87,
      "days_since_maintenance": 820,
      "insulation_condition_score": 31,
      "weather_exposure_index": 0.72
    }
  ]
}

# Response:
{
  "predictions": [
    {
      "failure_probability_12m": 0.834,
      "risk_class": "High",
      "top_risk_factors": [
        {"feature": "health_index", "shap_value": 0.42},
        {"feature": "days_since_maintenance", "shap_value": 0.28},
        {"feature": "age_years", "shap_value": 0.21}
      ]
    }
  ]
}
```

UI Integration in Asset Intelligence Hub
The Asset Intelligence Hub (`/dnsp/asset-intelligence`) integrates the model predictions:
- Risk matrix: assets plotted by failure probability vs consequence
- Failure probability gauge: per-asset percentage with confidence interval
- SHAP waterfall chart: which features contributed most to this asset’s score
- Peer comparison: how does this asset’s risk compare to similar assets?
Screenshot: Asset Intelligence Hub showing an individual asset’s failure prediction with SHAP waterfall chart explaining the key risk drivers.
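
A waterfall chart of this kind stacks each feature's contribution on top of a baseline value. The sketch below shows that accumulation using the three `top_risk_factors` from the sample endpoint response. The baseline of 0.0 and the helper name `waterfall_steps` are assumptions for illustration; a real chart would start from the explainer's expected value and cover all seven features.

```python
def waterfall_steps(base_value, contributions):
    """Return (feature, shap_value, running_total) rows, largest |SHAP| first."""
    ordered = sorted(contributions, key=lambda c: abs(c["shap_value"]), reverse=True)
    running = base_value
    steps = []
    for c in ordered:
        running += c["shap_value"]
        steps.append((c["feature"], c["shap_value"], round(running, 4)))
    return steps


# top_risk_factors copied from the sample endpoint response.
top_risk_factors = [
    {"feature": "health_index", "shap_value": 0.42},
    {"feature": "days_since_maintenance", "shap_value": 0.28},
    {"feature": "age_years", "shap_value": 0.21},
]

BASE_VALUE = 0.0  # assumed baseline; the real value would come from the explainer

for feature, value, total in waterfall_steps(BASE_VALUE, top_risk_factors):
    print(f"{feature:<24} {value:+.2f} -> {total:.2f}")
```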
Interpreting Predictions
| Failure Probability | Risk Class | Recommended Action |
|---|---|---|
| 0–0.10 | Low | Routine maintenance per schedule |
| 0.10–0.30 | Medium | Enhanced monitoring, accelerate next inspection |
| 0.30–0.60 | High | Prioritise for condition assessment this year |
| 0.60–0.80 | Very High | Include in next maintenance program |
| 0.80–1.00 | Critical | Urgent inspection and potential replacement |
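
The bands above can be expressed as a small lookup. This is a sketch rather than the project's code: the function name `classify_risk` is hypothetical, and because the table's band edges overlap (0.10 appears in both Low and Medium), it assumes each band includes its lower bound and excludes its upper bound, with 1.00 counted as Critical.

```python
from bisect import bisect_right

# Upper edges of the Low, Medium, High and Very High bands from the table.
THRESHOLDS = [0.10, 0.30, 0.60, 0.80]
RISK_CLASSES = ["Low", "Medium", "High", "Very High", "Critical"]


def classify_risk(failure_probability: float) -> str:
    """Map a 12-month failure probability to a risk class from the table."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    # bisect_right counts thresholds <= p, so a probability equal to a band
    # edge falls into the higher band (assumed tie-breaking rule).
    return RISK_CLASSES[bisect_right(THRESHOLDS, failure_probability)]


for p in (0.05, 0.10, 0.45, 0.834):
    print(p, classify_risk(p))
```

Under this mapping the 0.834 score from the sample endpoint response lands in the Critical band, whereas that response labels it "High", so the endpoint's `risk_class` field may follow a different banding than this table.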