Markov Chains¶

Security: Compromised Service Account — Impossible Command Sequence

svc-deploy normally follows predictable patterns: login → pull image → start container → health check. The attacker's sequence — login → sudo → cat /etc/shadow → scp — has near-zero transition probability in the learned model. Each step is individually plausible, but the sequence is impossible for this entity. Score: 0.95.

Operations: Service Restart Sequence Deviation

After OOM kills at T+30, api-gateway pods restart — but the init sequence is abnormal. The container starts the health check endpoint before the database migration completes, causing a cascade of failed readiness probes.

EVENT  03:30:14  Pod api-gateway-7f8d9 OOMKilled (exit code 137)
EVENT  03:30:16  Pod api-gateway-7f8d9 Pulling image api-gateway:v2.3.1
EVENT  03:30:22  Pod api-gateway-7f8d9 Started container api-gateway
EVENT  03:30:23  Pod api-gateway-7f8d9 Readiness probe failed: connection refused
EVENT  03:30:24  Pod api-gateway-7f8d9 Started migration runner

The expected restart sequence is pull → start → migrate → healthcheck. The observed sequence — start → healthcheck → migrate — means P(healthcheck | start) is near-zero in the learned model: healthcheck has never followed start directly. Each event is individually familiar, but the order is anomalous. Markov score: 0.88. See the Ops Primer for more on sequence-based failure detection.

Interactive: Markov sequence anomaly

Transition probability scores. A rare login → write transition at minute 180 scores above the threshold.

Theory¶

Intuition¶

A first-order Markov chain asks: "Given the last thing that happened, how surprising is this?" It models the probability of each template_id following another. If svc-deploy always follows "pull image" with "start container", but suddenly follows "pull image" with "cat /etc/shadow", the transition probability is near zero.

Per-entity tracking is critical: what's normal for a user account is abnormal for a service account. A human developer might legitimately sudo occasionally — but a CI/CD service account that has only ever pulled images and started containers has no business escalating privileges. The detector maintains a separate learned transition matrix for every entity it has observed, so each entity's baseline is judged on its own history.

Unlike content-based detectors (HST) or volume-based detectors (Holt-Winters), the Markov detector is blind to what any single event looks like in isolation — it only cares about order. This makes it uniquely sensitive to behavioral sequencing attacks that blend individually normal events into an impossible narrative.

Key Equations¶

Transition probability — the probability of template B following template A, with Laplace smoothing to handle unseen transitions:

\[ P(B \mid A) = \frac{\text{count}(A \rightarrow B) + \varepsilon}{\text{count}(A \rightarrow *) + \varepsilon \cdot |V|} \]

Anomaly score — normalized negative log-probability, clamped to \([0, 1]\):

\[ \text{score} = \min\!\left(1.0, \; \frac{-\log P(B \mid A)}{-\log \varepsilon}\right) \]

Where:

\( A, B \) = consecutive template_id values for the same entity
\( \varepsilon \) = Laplace smoothing constant (default 1e-6)
\( |V| \) = vocabulary size — number of distinct "from" templates seen for this entity
\( \text{count}(A \rightarrow *) \) = total transitions out of template A
The denominator \( -\log \varepsilon \) is the maximum possible surprisal — the score an entirely unseen transition would receive — which normalizes the score to \([0, 1]\)

A fully unseen transition (count = 0) yields score ≈ 1.0. A frequent, expected transition yields score ≈ 0.0. Smoothing ensures the score never literally reaches 1.0 or produces a division-by-zero error.

Seerflow Implementation¶

Configuration¶

Parameter	Type	Default	Range	Description
`markov_smoothing`	`float`	`1e-6`	1e-9–0.01	Laplace smoothing for unseen transitions. Lower values make unseen transitions score higher (closer to 1.0).
`markov_min_events`	`int`	`100`	10–1000	Minimum events per entity before scoring begins. Prevents noisy early scores when the transition matrix is sparse.
`markov_max_entities`	`int`	`1000`	100–10000	LRU cap on the number of tracked entities. When the cap is reached, the least-recently-used entity is evicted.

Per-Entity Tracking¶

Each entity gets its own _EntityModel instance containing:

Field	Type	Description
`prev_template`	`int`	The `template_id` of the most recent event for this entity (`-1` until first event)
`transitions`	`dict[int, dict[int, int]]`	Nested dict: `transitions[A][B]` = count of A→B transitions observed
`total_from`	`dict[int, int]`	`total_from[A]` = total transitions out of template A
`event_count`	`int`	Total events seen for this entity

Entity selection: The primary entity is entity_refs[0]. Events with no entity references, or with template_id == -1 (unrecognized by Drain3), return a score of 0.0 and are not learned.

Warmup: Score returns 0.0 until event_count >= min_events. This prevents anomaly noise during the initial observation period when the transition matrix contains very few counts.

LRU Eviction¶

Entity models are stored in a collections.OrderedDict keyed by entity ID. When max_entities is reached, the least-recently-used entity (the leftmost entry) is evicted via popitem(last=False). Each call to score() moves the entity to the end of the dict, refreshing its recency without creating the entity.

Note: score() performs a read-only lookup (no LRU promotion) — only _get_model() called from learn() moves an entity to the end. This matches the ensemble pipeline's pattern of calling score() before learn().

Serialization¶

Model state serializes to msgpack bytes via serialize() / deserialize(). The serialized payload includes:

All hyperparameters (smoothing, min_events, max_entities)
All per-entity models (transition dicts, total_from, prev_template, event_count)
Entity insertion order is preserved (OrderedDict semantics)

Unlike the HST detector (which uses a restricted pickle unpickler), the Markov detector uses msgspec.msgpack for safe, schema-aware serialization with no deserialization attack surface.

Memory Footprint¶

Approximately 10 KB per entity, depending on vocabulary size. At default max_entities = 1000, total memory is approximately 10 MB. Each additional unique template_id in the transition matrix adds roughly 80 bytes (two int keys + one int value in the nested dict).

Practical Examples¶

Security Walkthrough¶

svc-deploy has been observed for 10,000 events. Its transition matrix reflects a highly repetitive, predictable CI/CD workflow:

From template	To template	Count	P(B \| A)
login (T1)	pull image (T2)	4,982	≈ 0.998
pull image (T2)	start container (T3)	4,981	≈ 0.998
start container (T3)	health check (T4)	4,979	≈ 0.997
login (T1)	sudo (T99)	0	≈ 1.25e-10

When the attacker follows login with sudo: P(sudo | login) ≈ 1.25e-10. Anomaly score:

\[ \text{score} = \min\!\left(1.0,\; \frac{-\log(1.25 \times 10^{-10})}{-\log(10^{-6})}\right) \approx \frac{22.8}{13.8} \approx \min(1.0, 1.65) = \mathbf{0.95} \]

Sample detector output:

{
  "detector": "markov",
  "score": 0.95,
  "entity": "svc-deploy",
  "prev_template_id": 1,
  "curr_template_id": 99,
  "prev_template_label": "Accepted password for * from * port *",
  "curr_template_label": "sudo: * : TTY=* ; PWD=* ; USER=root ; COMMAND=*",
  "transition_count": 0,
  "transition_probability": 1.25e-10,
  "interpretation": "Near-zero sequence probability — this transition has never been observed for this entity"
}

Ops Walkthrough¶

api-gateway restarts normally follow pull → start → migrate → healthcheck. After the OOM kill, the container starts health checking before migration completes. The transition start → healthcheck has count 0 in the learned model (healthcheck has never directly followed start — migrate always comes first).

P(healthcheck | start) ≈ 1.25e-10. Score ≈ 0.88, firing well before the cascade of readiness probe failures produces volume-level signals in Holt-Winters or CUSUM.

Tuning Guide¶

When to Adjust¶

False positives from new entities: New entities start with empty transition matrices and reach min_events before scoring begins, but early counts may still be sparse. Increase min_events to 200 to require a more populated baseline before the detector fires.
Missing sequence anomalies (novel transitions scoring too low): Decrease markov_smoothing to 1e-9. Lower smoothing increases the surprisal of unseen transitions, pushing scores closer to 1.0 for events the entity has never performed.
High memory usage: Decrease max_entities to 500. Entities with infrequent access will be evicted sooner. For deployments with a large number of ephemeral entities (short-lived containers, transient sessions), consider a smaller cap combined with a higher min_events so that ephemeral entities are unlikely to reach the scoring threshold before eviction.

Sensitivity Tradeoffs¶

Smoothing	Unseen transition score	Best For
`1e-9`	≈ 1.0	High-security environments; maximum sensitivity to novel sequences
`1e-6` (default)	≈ 0.95	Balanced — catches rare transitions, tolerates minor vocabulary gaps
`1e-3`	≈ 0.5	Noisy environments with frequent template vocabulary churn

Common Patterns¶

Service accounts with predictable workflows: The Markov detector is most powerful here. A service account with 10,000 events and 5 distinct templates will have near-certainty for observed transitions and near-zero probability for anything else.
Human users with variable workflows: Increase min_events (e.g., 500) and smoothing (e.g., 1e-4). Human behavior is more varied; a lower smoothing value will generate false positives as users explore less common but legitimate paths.
Ephemeral entities (pods, containers): These rarely accumulate enough events to clear min_events. The detector naturally skips them. If sequence anomalies in ephemeral entities are important, reduce min_events and accept more noisy early scores.
Post-deployment template shifts: New software versions often introduce new template_id values. The Markov detector will score new templates as anomalous until their transitions are learned. This is expected — use HST's novelty signal for the initial burst, and allow Markov to catch sequence deviations once the new templates are established.