Detection Ensemble
Seerflow runs multiple streaming ML detectors in parallel — each catching a different type of anomaly (content novelty, volume deviation, mean shifts, sequence anomalies). Their scores are blended via weighted average and tested against an adaptive EVT-based threshold.
Security: Compromised Service Account
A stolen svc-deploy credential is used for lateral movement. Each detector catches a different aspect: HST flags novel SSH patterns, Holt-Winters catches the 3 AM volume spike, CUSUM detects the sustained auth-failure shift, and Markov flags the impossible command sequence.
Operations: Memory Leak Cascade
A v2.3.1 deploy introduces a memory leak → OOM kills → connection-pool exhaustion → cascading timeouts. HST flags novel stack traces, Holt-Winters catches connection-count divergence, CUSUM detects the error-rate shift, and Markov flags abnormal restart sequences.
Follow either scenario through each detector's deep-dive page to see how the ensemble provides defense in depth.
How It Works
Each SeerflowEvent is first parsed by Drain3 into a log template, then numeric features are extracted from that template and the surrounding context. Four detectors score the event independently and in parallel — each watching a different signal type.
Their raw scores are z-score normalized using a per-detector Welford online accumulator, then combined into a single blended score via weighted average. When two or more detectors converge, meaning they flag elevated z-scores at the same time, the blended score is amplified: 1.5× when at least half of the detectors converge, 2× when at least two-thirds do. Finally, DSPOT applies an EVT-derived adaptive threshold to decide whether the blended score constitutes an anomaly.
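The normalization step can be sketched with a small Welford accumulator. This is a minimal illustration, not Seerflow's actual class: the name `WelfordZScore`, the use of sample variance, and the zero-variance fallback are assumptions; only the raw-score fallback below 2 observations comes from this page.

```python
import math


class WelfordZScore:
    """Online mean/variance (Welford) used to z-score a stream of raw scores."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Standard Welford update: one pass, no stored history.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def normalize(self, x: float) -> float:
        # With fewer than 2 observations the raw score is passed through.
        if self.n < 2:
            return x
        std = math.sqrt(self.m2 / (self.n - 1))  # assumption: sample variance
        if std == 0.0:
            return 0.0  # assumption: a constant history yields a neutral z-score
        return (x - self.mean) / std


acc = WelfordZScore()
for raw in [0.10, 0.12, 0.11, 0.09, 0.10]:
    acc.update(raw)
z = acc.normalize(0.50)  # a raw score far above the recent history
```

Because the accumulator keeps only three numbers per detector, the normalization cost stays constant regardless of how long the stream runs.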
Detector Summary
| Detector | Signal Type | What It Catches | Memory | Warmup | Deep Dive |
|---|---|---|---|---|---|
| HST | Content | Novel patterns in feature space | ~50 KB | None (scores immediately) | Half-Space Trees |
| Holt-Winters | Volume | Deviations from seasonal volume | ~12 KB | 1440 min (24 h) | Holt-Winters |
| CUSUM | Change-point | Sustained mean shifts | ~200 B | 30 min | CUSUM |
| Markov | Sequence | Low-probability event transitions | ~10 KB/entity | 100 events/entity | Markov Chains |
Note
DSPOT is the threshold layer, not a detector. It receives the blended score and applies an adaptive upper/lower threshold derived from Extreme Value Theory. See DSPOT for details.
Model Persistence
Detector state is checkpointed periodically to ModelStore (every model_save_interval_seconds, default 300 s / 5 min). The serialization strategy differs by detector type:
- HST: uses a restricted pickle unpickler with an explicit allowlist. Only River's HalfSpaceTrees internals are allowed through; arbitrary code execution is blocked.
- Holt-Winters, CUSUM, Markov, DSPOT: serialized with msgspec to JSON. No pickle, no arbitrary code execution.
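The restricted-unpickler idea for HST follows the pattern in Python's `pickle` documentation: subclass `pickle.Unpickler` and override `find_class`. The sketch below is illustrative only. The allowlist entry is a stand-in, because the actual set of River classes Seerflow permits is not listed here, and `restricted_loads` is a hypothetical helper name.

```python
import io
import os
import pickle
from collections import OrderedDict

# Hypothetical allowlist of (module, name) pairs; the real one would name
# River's HalfSpaceTrees internals instead of this stand-in entry.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module: str, name: str):
        # Called for every global the pickle stream references.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# An allowlisted type round-trips normally.
safe = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# A reference to os.system is rejected before anything can be called.
try:
    restricted_loads(pickle.dumps(os.system))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Rejecting unknown globals at `find_class` means a tampered checkpoint fails during loading instead of executing attacker-chosen callables.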
The ensemble writes an ensemble:manifest key last, after all individual model keys are written. On restart, the manifest is read first; if it is missing (e.g., after a crash during a save), the partial data is ignored and detectors start fresh. When the manifest is present, the ensemble loads each detector's saved state and resumes online learning with no cold start and no lost history.
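The manifest-last ordering can be illustrated with a plain dictionary standing in for ModelStore. Everything except the `ensemble:manifest` key name is an assumption made for the sketch: the `model:<name>` key scheme, the manifest fields, and the function names are hypothetical.

```python
import json
import time

MANIFEST_KEY = "ensemble:manifest"


def save_ensemble(store: dict, states: dict) -> None:
    # 1. Write every model blob first (hypothetical "model:<name>" keys).
    for name, blob in states.items():
        store[f"model:{name}"] = blob
    # 2. Write the manifest last; its presence marks the save as complete.
    manifest = {"saved_at": time.time(), "models": sorted(states)}
    store[MANIFEST_KEY] = json.dumps(manifest).encode()


def load_ensemble(store: dict):
    raw = store.get(MANIFEST_KEY)
    if raw is None:
        return None  # no manifest: ignore partial data and start fresh
    manifest = json.loads(raw)
    return {name: store[f"model:{name}"] for name in manifest["models"]}


# A crash after one model write leaves no manifest, so nothing is loaded.
partial = {"model:hst": b"..."}
fresh_start = load_ensemble(partial)

# A completed save is restored in full.
store = {}
save_ensemble(store, {"hst": b"h", "cusum": b"c"})
restored = load_ensemble(store)
```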
Source Management
Each unique source_type value (e.g., "nginx", "auth", "k8s-pod") gets its own isolated set of 4 detectors plus a DSPOT threshold. Sources are tracked in an OrderedDict that acts as an LRU cache:
- When max_sources (default 256) is reached, the least-recently-scored source is evicted: its detectors, threshold, score windows, and associated template/entity Holt-Winters instances are all removed.
- Template-level and entity-level Holt-Winters pools have their own separate LRU limits (max_template_hw, max_entity_hw), so a single noisy source cannot crowd out all HW capacity.
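A minimal sketch of the OrderedDict-as-LRU pattern described above. The `SourceCache` class and the placeholder per-source state are hypothetical; only `max_sources` and the eviction rule come from this page.

```python
from collections import OrderedDict


class SourceCache:
    """Tracks per-source state and evicts the least-recently-scored source."""

    def __init__(self, max_sources: int = 256) -> None:
        self.max_sources = max_sources
        self._sources = OrderedDict()

    def get(self, source_type: str) -> dict:
        if source_type in self._sources:
            # Scoring a known source marks it as most recently used.
            self._sources.move_to_end(source_type)
        else:
            if len(self._sources) >= self.max_sources:
                # Oldest entry sits at the front; dropping it discards all its state.
                self._sources.popitem(last=False)
            # Placeholder for the detectors, threshold, and score windows.
            self._sources[source_type] = {"detectors": None, "threshold": None}
        return self._sources[source_type]

    def tracked(self) -> list:
        return list(self._sources)


cache = SourceCache(max_sources=2)
cache.get("nginx")
cache.get("auth")
cache.get("nginx")    # refreshes nginx, so auth becomes the oldest
cache.get("k8s-pod")  # cache is full: auth is evicted
```

Both `move_to_end` and `popitem(last=False)` run in constant time, so the bookkeeping adds negligible overhead per scored event.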
Memory budget per source (approximate):
| Component | Per-source footprint |
|---|---|
| HST | ~50 KB |
| Holt-Winters (source-level) | ~12 KB |
| CUSUM | ~200 B |
| DSPOT | ~8 KB |
| Markov | ~10 KB × tracked entities |
At defaults: 256 sources × ~70 KB base = ~18 MB, plus Markov entity overhead (varies by workload).
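The estimate can be reproduced, and extended with the Markov term, using the per-component figures from the table. The function name and the choice of decimal megabytes (1 MB = 1000 KB) are assumptions for illustration.

```python
def estimate_memory_mb(sources: int, entities_per_source: int = 0) -> float:
    # Approximate per-source footprints in KB, taken from the table above.
    base_kb = 50 + 12 + 0.2 + 8            # HST + Holt-Winters + CUSUM + DSPOT
    markov_kb = 10 * entities_per_source   # ~10 KB per tracked entity
    return sources * (base_kb + markov_kb) / 1000  # decimal MB


base_only = estimate_memory_mb(256)        # about 18 MB at defaults
with_markov = estimate_memory_mb(256, 100) # 100 tracked entities per source
```

With 100 tracked entities per source the Markov term dominates, which is why the entity count matters more than the base footprint when sizing a deployment.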
Scoring Pipeline
A brief overview — full detail including weights and amplification math is in Scoring & Attack Mapping:
- Each of the four detectors produces a raw score in [0, 1].
- Raw scores are z-score normalized against their historical distribution using a per-detector Welford online accumulator. During warmup (fewer than 2 observations) the raw score is used directly.
- Weighted average across all active (non-NaN) detectors using configurable weights (weights_content, weights_volume, weights_pattern, weights_sequence, plus weights_template_volume and weights_entity_volume for the granular HW channels).
- Signal amplification when multiple detectors converge: 1.5× when ≥ half fire, 2× when ≥ two-thirds fire (minimum 2 detectors required for either multiplier).
- DSPOT receives the amplified blended score and applies an adaptive upper/lower threshold to make the final anomaly decision.
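The blending and amplification steps above can be sketched as follows. This is an illustration under stated assumptions: the function name `blend`, the dictionary keys, and the `fire_z` cutoff that defines when a detector "fires" are hypothetical, and the fraction is computed over active detectors. The exact rules are in Scoring & Attack Mapping.

```python
import math


def blend(z_scores: dict, weights: dict, fire_z: float = 2.0) -> float:
    # Skip detectors that are still warming up (reported as NaN).
    active = {k: z for k, z in z_scores.items() if not math.isnan(z)}
    total_weight = sum(weights[k] for k in active)
    if not active or total_weight == 0:
        return 0.0
    blended = sum(weights[k] * z for k, z in active.items()) / total_weight

    # Assumption: a detector "fires" when its z-score reaches fire_z.
    firing = sum(1 for z in active.values() if z >= fire_z)
    fraction = firing / len(active)
    if firing >= 2 and fraction >= 2 / 3:
        return blended * 2.0
    if firing >= 2 and fraction >= 0.5:
        return blended * 1.5
    return blended


weights = {"content": 1.0, "volume": 1.0, "pattern": 1.0, "sequence": 1.0}

# Two of three active detectors fire: two-thirds converge, so 2x.
strong = blend({"content": 3.0, "volume": 3.0, "pattern": 0.0,
                "sequence": float("nan")}, weights)

# Two of four fire: half converge, so 1.5x.
moderate = blend({"content": 3.0, "volume": 3.0, "pattern": 0.0,
                  "sequence": 0.0}, weights)

# A single firing detector never triggers amplification.
single = blend({"content": 3.0, "volume": 0.0, "pattern": 0.0,
                "sequence": 0.0}, weights)
```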
See also:
- Security Primer: Anomaly Detection — background concepts behind streaming ML detection
- Architecture: Pipeline — where the detection ensemble fits in the broader processing pipeline
- Configuration Reference — all parameters controlling the ensemble
Next: Half-Space Trees → — streaming content anomaly detection via random half-space partitions.