# STPIS Anomaly Detection

## Overview

The STPIS Anomaly Detection model identifies unusual reliability performance patterns in SAIDI, SAIFI, and MAIFI metrics. It uses an ensemble of Isolation Forest (unsupervised anomaly detection) and Z-score statistical testing to distinguish genuine performance outliers from normal variation across the 6-DNSP peer group.
The model serves two purposes:
- Data quality assurance: identifying reporting errors (impossible values, duplicated data)
- Performance monitoring: detecting genuine reliability degradation or unusually good performance that should be investigated
## Model Architecture

### Isolation Forest

Isolation Forest isolates anomalies by randomly splitting the feature space:
```python
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = [
    'saidi_monthly', 'saifi_monthly', 'maifi_monthly',
    'planned_pct', 'storm_events_count',
    'peer_saidi_percentile', 'peer_saifi_percentile',
]

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('iso_forest', IsolationForest(
        contamination=0.05,  # Assume 5% of observations are anomalous
        n_estimators=200,
        random_state=42,
    )),
])
```

Isolation Forest produces an anomaly score (more negative = more anomalous). The threshold is calibrated on labelled historical data (known reporting errors and genuine events).
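As a minimal sketch of how the pipeline might be used, the following fits it on synthetic stand-in data (real inputs would be the monthly FEATURES rows) and scores each observation. The injected outliers and data shapes are illustrative only:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for monthly feature rows (7 features, as in FEATURES)
X = rng.normal(size=(500, 7))
X[:5] += 8.0  # inject a few obvious outliers

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('iso_forest', IsolationForest(
        contamination=0.05, n_estimators=200, random_state=42)),
])
pipeline.fit(X)

# decision_function: negative values indicate anomalies
scores = pipeline.decision_function(X)
labels = pipeline.predict(X)  # -1 = anomaly, 1 = normal
```

With `contamination=0.05`, the model flags roughly the 5% most isolated observations; the injected outliers receive the most negative scores.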
### Z-Score Testing

Z-scores are computed against the 6-DNSP peer group for each metric:

Z_SAIDI = (SAIDI_dnsp − Mean_SAIDI_peer_group) / StdDev_SAIDI_peer_group

The model flags an observation when |Z| > 2.5 standard deviations:

```python
import numpy as np

def z_score_anomaly(value, peer_values):
    mean = np.mean(peer_values)
    std = np.std(peer_values)
    z = (value - mean) / std
    return abs(z) > 2.5, z
```

### Ensemble Decision

```python
def ensemble_decision(iso_score, z_score_flag):
    """
    Flag as anomaly if BOTH methods agree (reduces false positives)
    OR if Isolation Forest score is extreme (catches edge cases).
    """
    iso_anomaly = iso_score < -0.3
    if iso_anomaly and z_score_flag:
        return "ANOMALY_HIGH_CONFIDENCE"
    elif iso_anomaly and not z_score_flag:
        return "ANOMALY_LOW_CONFIDENCE"
    elif not iso_anomaly and z_score_flag:
        return "STATISTICAL_OUTLIER"
    else:
        return "NORMAL"
```

## Model Performance
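To show the two detectors working together, here is a self-contained sketch that runs both on a single observation. The peer-group values and the Isolation Forest score are hypothetical; the function bodies mirror the definitions above:

```python
import numpy as np

def z_score_anomaly(value, peer_values):
    # Mirrors the z_score_anomaly definition above
    mean = np.mean(peer_values)
    std = np.std(peer_values)
    z = (value - mean) / std
    return abs(z) > 2.5, z

def ensemble_decision(iso_score, z_score_flag):
    # Mirrors the ensemble_decision definition above
    iso_anomaly = iso_score < -0.3
    if iso_anomaly and z_score_flag:
        return "ANOMALY_HIGH_CONFIDENCE"
    elif iso_anomaly:
        return "ANOMALY_LOW_CONFIDENCE"
    elif z_score_flag:
        return "STATISTICAL_OUTLIER"
    return "NORMAL"

# Hypothetical monthly SAIDI values (minutes) for the peer group
peer_saidi = [95.0, 102.0, 110.0, 98.0, 105.0]
flagged, z = z_score_anomaly(240.0, peer_saidi)  # far above the peer mean

print(ensemble_decision(iso_score=-0.45, z_score_flag=flagged))
# Both methods agree here: prints ANOMALY_HIGH_CONFIDENCE
```

Note how a statistical outlier that the Isolation Forest does not consider extreme (for example, during a peer-group-wide storm month) lands in the lower-confidence `STATISTICAL_OUTLIER` bucket rather than being flagged outright.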
| Metric | Value |
|---|---|
| Detection rate (True Positive Rate) | 93.4% |
| False positive rate | 4.7% |
| True negative rate | 95.3% |
| Precision | 89.1% |
| F1 | 91.2% |
| Z-score threshold | 2.5σ |
| Contamination parameter | 0.05 |
Evaluated on two years of STPIS data labelled by domain experts for known reporting errors and genuine events.
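The reported F1 follows directly from the precision and detection rate (recall) in the table; a quick consistency check:

```python
precision = 0.891
recall = 0.934  # detection rate (true positive rate)

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.912, matching the table
```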
## 6-DNSP Peer Group

The peer group used for Z-score benchmarking:
| DNSP | Customers | Network Length (km) | Climate Zone |
|---|---|---|---|
| Ausgrid | 1.7M | 51,000 | Temperate/Summer |
| Endeavour Energy | 1.0M | 48,000 | Temperate/Summer |
| Essential Energy | 870K | 187,000 | Mixed/Rural |
| Energex | 1.4M | 52,000 | Subtropical |
| Ergon Energy | 730K | 155,000 | Mixed/Tropical |
| AusNet Services | 720K | 97,000 | Temperate |
## Anomaly Categories

| Category | Description | Typical Cause |
|---|---|---|
| Reporting error | Value statistically impossible | Data entry mistake, system error |
| Major storm event | SAIDI spike tracked to storm data | Cyclone, bushfire, flood |
| Equipment failure | High SAIDI from single feeder failure | Major fault on critical asset |
| Sustained degradation | Multi-month SAIDI above target | Deferred maintenance catching up |
| Outperformance | Significant improvement vs peer group | Network investment, new automation |
## Revenue Impact Quantification

When an anomaly is detected, the model estimates the revenue impact:

```python
def estimate_revenue_impact(
    saidi_delta_minutes: float,
    customer_count: int,
    incentive_rate_per_customer_minute: float,
) -> float:
    """Estimate STPIS revenue adjustment from SAIDI deviation."""
    return saidi_delta_minutes * customer_count * incentive_rate_per_customer_minute
```

For a typical DNSP (1M customers), each minute of SAIDI deviation ≈ $100,000–$200,000 in STPIS revenue impact.
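A worked example of the figure quoted above. The $0.15 per customer-minute rate is an illustrative assumption (actual STPIS rates are set per revenue determination), and the function body mirrors the one above:

```python
def estimate_revenue_impact(saidi_delta_minutes, customer_count,
                            incentive_rate_per_customer_minute):
    # Mirrors the estimate_revenue_impact definition above
    return saidi_delta_minutes * customer_count * incentive_rate_per_customer_minute

# Assumed rate of $0.15 per customer-minute, illustrative only
impact = estimate_revenue_impact(1.0, 1_000_000, 0.15)
print(f"${impact:,.0f}")  # $150,000, inside the quoted $100k-$200k range
```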
## Dashboard Pages

### Anomaly Feed (/dnsp/stpis/anomalies)

- Live feed of detected anomalies with confidence level
- Anomaly type classification (reporting error vs genuine event)
- Revenue impact estimate
- Recommended investigation actions

### Peer Benchmarking (/dnsp/stpis/peer-benchmark)

- Scatter plot: all 6 DNSPs plotted by SAIDI vs SAIFI (current year)
- Z-score distribution: shows where each DNSP sits relative to peer group
- Historical trend: peer group performance over 5 years
## API Endpoints

```
# Latest anomaly detections
GET /api/dnsp/stpis/anomalies?dnsp=ausgrid&months=3

# Specific anomaly detail
GET /api/dnsp/stpis/anomalies/{anomaly_id}

# Peer group Z-scores
GET /api/dnsp/stpis/peer-scores?metric=saidi&year=2025

# Revenue impact estimate
GET /api/dnsp/stpis/revenue-impact?dnsp=ergon&saidi_actual=108&saidi_target=120
```
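A minimal sketch of constructing these request URLs from a Python client, using only the paths listed above (a real client would prepend its own host and handle authentication):

```python
from urllib.parse import urlencode

BASE = "/api/dnsp/stpis"  # path prefix from the endpoint list above

def anomalies_url(dnsp: str, months: int) -> str:
    # Build the query string for the latest-anomalies endpoint
    return f"{BASE}/anomalies?{urlencode({'dnsp': dnsp, 'months': months})}"

print(anomalies_url("ausgrid", 3))
# /api/dnsp/stpis/anomalies?dnsp=ausgrid&months=3
```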