# STPIS Anomaly Detection

## Overview

The STPIS Anomaly Detection model identifies unusual reliability performance patterns in SAIDI, SAIFI, and MAIFI metrics. It uses an ensemble of Isolation Forest (unsupervised anomaly detection) and Z-score statistical testing to distinguish genuine performance outliers from normal variation across the 6-DNSP peer group.
The model serves two purposes:
- Data quality assurance: identifying reporting errors (impossible values, duplicated data)
- Performance monitoring: detecting genuine reliability degradation or unusually good performance that should be investigated
## Model Architecture

### Isolation Forest

Isolation Forest isolates anomalies by randomly splitting the feature space:
```python
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = [
    'saidi_monthly', 'saifi_monthly', 'maifi_monthly',
    'planned_pct', 'storm_events_count',
    'peer_saidi_percentile', 'peer_saifi_percentile',
]

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('iso_forest', IsolationForest(
        contamination=0.05,  # Assume 5% of observations are anomalous
        n_estimators=200,
        random_state=42,
    )),
])
```

Isolation Forest produces an anomaly score (more negative = more anomalous). The threshold is calibrated on labelled historical data (known reporting errors and genuine events).
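As a minimal sketch of how the pipeline might be used, the following fits it on synthetic stand-in data (real inputs would be the monthly FEATURES rows) and scores each observation. The injected outliers and data shapes are illustrative only:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for monthly feature rows (7 features, as in FEATURES)
X = rng.normal(size=(500, 7))
X[:5] += 8.0  # inject a few obvious outliers

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('iso_forest', IsolationForest(
        contamination=0.05, n_estimators=200, random_state=42)),
])
pipeline.fit(X)

# decision_function: negative values indicate anomalies
scores = pipeline.decision_function(X)
labels = pipeline.predict(X)  # -1 = anomaly, 1 = normal
```

With `contamination=0.05`, the model flags roughly the 5% most isolated observations; the injected outliers receive the most negative scores.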
### Z-Score Testing

Z-scores are computed against the 6-DNSP peer group for each metric:

Z_SAIDI = (SAIDI_dnsp − Mean_SAIDI_peer_group) / StdDev_SAIDI_peer_group

The model flags an observation when |Z| > 2.5 standard deviations:

```python
import numpy as np

def z_score_anomaly(value, peer_values):
    mean = np.mean(peer_values)
    std = np.std(peer_values)
    z = (value - mean) / std
    return abs(z) > 2.5, z
```

### Ensemble Decision

```python
def ensemble_decision(iso_score, z_score_flag):
    """
    Flag as anomaly if BOTH methods agree (reduces false positives)
    OR if Isolation Forest score is extreme (catches edge cases).
    """
    iso_anomaly = iso_score < -0.3
    if iso_anomaly and z_score_flag:
        return "ANOMALY_HIGH_CONFIDENCE"
    elif iso_anomaly and not z_score_flag:
        return "ANOMALY_LOW_CONFIDENCE"
    elif not iso_anomaly and z_score_flag:
        return "STATISTICAL_OUTLIER"
    else:
        return "NORMAL"
```

## Model Performance
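To show the two detectors working together, here is a self-contained sketch that runs both on a single observation. The peer-group values and the Isolation Forest score are hypothetical; the function bodies mirror the definitions above:

```python
import numpy as np

def z_score_anomaly(value, peer_values):
    # Mirrors the z_score_anomaly definition above
    mean = np.mean(peer_values)
    std = np.std(peer_values)
    z = (value - mean) / std
    return abs(z) > 2.5, z

def ensemble_decision(iso_score, z_score_flag):
    # Mirrors the ensemble_decision definition above
    iso_anomaly = iso_score < -0.3
    if iso_anomaly and z_score_flag:
        return "ANOMALY_HIGH_CONFIDENCE"
    elif iso_anomaly:
        return "ANOMALY_LOW_CONFIDENCE"
    elif z_score_flag:
        return "STATISTICAL_OUTLIER"
    return "NORMAL"

# Hypothetical monthly SAIDI values (minutes) for the peer group
peer_saidi = [95.0, 102.0, 110.0, 98.0, 105.0]
flagged, z = z_score_anomaly(240.0, peer_saidi)  # far above the peer mean

print(ensemble_decision(iso_score=-0.45, z_score_flag=flagged))
# Both methods agree here: prints ANOMALY_HIGH_CONFIDENCE
```

Note how a statistical outlier that the Isolation Forest does not consider extreme (for example, during a peer-group-wide storm month) lands in the lower-confidence `STATISTICAL_OUTLIER` bucket rather than being flagged outright.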
| Metric | Value |
|---|---|
| Detection rate (True Positive Rate) | 93.4% |
| False positive rate | 4.7% |
| True negative rate | 95.3% |
| Precision | 89.1% |
| F1 | 91.2% |
| Z-score threshold | 2.5σ |
| Contamination parameter | 0.05 |
Evaluated on two years of STPIS data labelled by domain experts for known reporting errors and genuine events.
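The reported F1 follows directly from the precision and detection rate (recall) in the table; a quick consistency check:

```python
precision = 0.891
recall = 0.934  # detection rate (true positive rate)

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.912, matching the table
```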
## 6-DNSP Peer Group

The peer group used for Z-score benchmarking:
| DNSP | Customers | Network Length (km) | Climate Zone |
|---|---|---|---|
| Ausgrid | 1.7M | 51,000 | Temperate/Summer |
| Endeavour Energy | 1.0M | 48,000 | Temperate/Summer |
| Essential Energy | 870K | 187,000 | Mixed/Rural |
| Energex | 1.4M | 52,000 | Subtropical |
| Ergon Energy | 730K | 155,000 | Mixed/Tropical |
| AusNet Services | 720K | 97,000 | Temperate |
## Anomaly Categories

| Category | Description | Typical Cause |
|---|---|---|
| Reporting error | Value statistically impossible | Data entry mistake, system error |
| Major storm event | SAIDI spike tracked to storm data | Cyclone, bushfire, flood |
| Equipment failure | High SAIDI from single feeder failure | Major fault on critical asset |
| Sustained degradation | Multi-month SAIDI above target | Deferred maintenance catching up |
| Outperformance | Significant improvement vs peer group | Network investment, new automation |
## Revenue Impact Quantification

When an anomaly is detected, the model estimates the revenue impact:

```python
def estimate_revenue_impact(
    saidi_delta_minutes: float,
    customer_count: int,
    incentive_rate_per_customer_minute: float,
) -> float:
    """Estimate STPIS revenue adjustment from SAIDI deviation."""
    return saidi_delta_minutes * customer_count * incentive_rate_per_customer_minute
```

For a typical DNSP (1M customers), each minute of SAIDI deviation ≈ $100,000–$200,000 in STPIS revenue impact.
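A worked example of the figure quoted above. The $0.15 per customer-minute rate is an illustrative assumption (actual STPIS rates are set per revenue determination), and the function body mirrors the one above:

```python
def estimate_revenue_impact(saidi_delta_minutes, customer_count,
                            incentive_rate_per_customer_minute):
    # Mirrors the estimate_revenue_impact definition above
    return saidi_delta_minutes * customer_count * incentive_rate_per_customer_minute

# Assumed rate of $0.15 per customer-minute, illustrative only
impact = estimate_revenue_impact(1.0, 1_000_000, 0.15)
print(f"${impact:,.0f}")  # $150,000, inside the quoted $100k-$200k range
```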
## Dashboard Pages

### Anomaly Feed (/dnsp/stpis/anomalies)

- Live feed of detected anomalies with confidence level
- Anomaly type classification (reporting error vs genuine event)
- Revenue impact estimate
- Recommended investigation actions

### Peer Benchmarking (/dnsp/stpis/peer-benchmark)

- Scatter plot: all 6 DNSPs plotted by SAIDI vs SAIFI (current year)
- Z-score distribution: shows where each DNSP sits relative to peer group
- Historical trend: peer group performance over 5 years
## API Endpoints

```
# Latest anomaly detections
GET /api/dnsp/stpis/anomalies?dnsp=ausgrid&months=3

# Specific anomaly detail
GET /api/dnsp/stpis/anomalies/{anomaly_id}

# Peer group Z-scores
GET /api/dnsp/stpis/peer-scores?metric=saidi&year=2025

# Revenue impact estimate
GET /api/dnsp/stpis/revenue-impact?dnsp=ergon&saidi_actual=108&saidi_target=120
```
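A minimal sketch of constructing these request URLs from a Python client, using only the paths listed above (a real client would prepend its own host and handle authentication):

```python
from urllib.parse import urlencode

BASE = "/api/dnsp/stpis"  # path prefix from the endpoint list above

def anomalies_url(dnsp: str, months: int) -> str:
    # Build the query string for the latest-anomalies endpoint
    return f"{BASE}/anomalies?{urlencode({'dnsp': dnsp, 'months': months})}"

print(anomalies_url("ausgrid", 3))
# /api/dnsp/stpis/anomalies?dnsp=ausgrid&months=3
```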