Graph Backends¶
The entity graph can be persisted to one of three backends. The
selection is a single config knob (storage.graph_backend);
all higher-level code (correlation engine, graph algorithms, dashboard)
is backend-agnostic.
| Backend | Process model | Persistence | Best for |
|---|---|---|---|
| igraph (default) | In-process | Periodic snapshot to disk | Single-host deployments, dev / test, < 5 M edges. |
| falkordb | External Redis-based service | Native | Multi-instance / clustered deployments, hot read paths. |
| postgres_age | External PostgreSQL + Apache AGE | Native, transactional | When you already operate Postgres and want graph + relational in one place. |
igraph is the zero-config default and matches the in-memory graph
described in Construction and
Algorithms. The other two trade startup cost for
durability and concurrency.
Picking a backend¶
| Question | If yes, lean toward |
|---|---|
| Single Seerflow process? | igraph |
| Multiple Seerflow instances need to share one graph? | falkordb or postgres_age |
| Already running Redis for caching? | falkordb |
| Already running Postgres and want one operational dependency? | postgres_age |
| Need transactional consistency between graph writes and SQL state? | postgres_age |
| Read-heavy analytical workloads (PageRank, Louvain) at scale? | falkordb (in-memory) |
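
The table above can be read as a small decision function. The sketch below is illustrative only: `Deployment` and `suggest_backend` are hypothetical names, not part of Seerflow, and the order in which the questions are checked is an assumption, since the table does not rank them.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical answers to the questions in the table above."""
    multi_instance: bool = False
    runs_redis: bool = False
    runs_postgres: bool = False
    needs_sql_transactions: bool = False
    heavy_analytics: bool = False


def suggest_backend(d: Deployment) -> str:
    """Return a candidate value for storage.graph_backend (assumed precedence)."""
    # The table points transactional graph + SQL consistency at postgres_age.
    if d.needs_sql_transactions:
        return "postgres_age"
    # Read-heavy analytics at scale leans toward the in-memory service.
    if d.heavy_analytics:
        return "falkordb"
    # A single process with no other pressure keeps the zero-config default.
    if not d.multi_instance:
        return "igraph"
    # Shared graph: prefer the service that is already operated.
    if d.runs_postgres and not d.runs_redis:
        return "postgres_age"
    return "falkordb"


print(suggest_backend(Deployment(multi_instance=True, runs_postgres=True)))
```

Treat the result as a starting point; the rows are leanings, not hard rules.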
igraph (default)¶
In-process graph using the python-igraph library.
storage:
  graph_backend: igraph  # default; can be omitted
- Storage: writes a snapshot file under storage.data_dir on the same cadence as the model store. No additional service to operate.
- Throughput: ~100 K – 500 K edge writes/s on a single host (CPU-bound on hashing).
- Limits: bounded by process memory; igraph's 32 bytes/edge gives ~30 M edges in 1 GB.
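
The memory limit can be checked with simple arithmetic. The sketch below takes the 32 bytes/edge figure quoted above as given (it is not measured here) and ignores vertex and attribute storage, which is an assumption of this example; `max_edges` is a hypothetical helper name.

```python
BYTES_PER_EDGE = 32  # figure quoted in the limits above; not measured here


def max_edges(memory_bytes: int, bytes_per_edge: int = BYTES_PER_EDGE) -> int:
    """Upper bound on edges that fit in memory_bytes (edges only)."""
    return memory_bytes // bytes_per_edge


one_gib = 1024 ** 3
print(f"{max_edges(one_gib) / 1e6:.1f} M edges")  # 33.6 M edges
```

The raw bound lands slightly above the ~30 M figure, which leaves room for whatever else the process holds in memory.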
falkordb¶
Redis-backed graph (falkordb fork). Connect via a URL — credentials
embedded in the URL are redacted from logs and repr.
storage:
  graph_backend: falkordb
  falkordb_url: redis://falkordb.internal:6379/0
Operational notes:
- Run a dedicated FalkorDB instance; do not co-locate the graph on a Redis cache shared with other workloads.
- Set maxmemory-policy noeviction; eviction policies would silently drop graph nodes.
- For HA, run FalkorDB with Sentinel or in a managed environment that preserves the dataset on failover.
The connection pool is shared with the broader async runtime; no extra tuning is needed unless you observe head-of-line blocking under heavy read load.
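
The credential redaction mentioned at the top of this section can be approximated with the standard library. This is a minimal sketch, not Seerflow's implementation: `redact_url` is a hypothetical helper, and the `***` placeholder is an assumption about what a redacted URL might look like.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Replace any userinfo in a URL with '***' (illustrative only)."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=f"***@{host}"))


print(redact_url("redis://user:s3cret@falkordb.internal:6379/0"))
# redis://***@falkordb.internal:6379/0
```

Applying such a helper before logging or in `__repr__` keeps the password out of tracebacks as well as log lines.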
postgres_age¶
Apache AGE inside PostgreSQL. Reuses
storage.postgresql_url and the
asyncpg pool — no separate connection settings.
storage:
  backend: postgresql
  postgresql_url: ${DATABASE_URL}
  postgresql_pool_min_size: 4
  postgresql_pool_max_size: 32
  graph_backend: postgres_age
Operational notes:
- The AGE extension must be installed and loaded in the target Postgres instance. Seerflow does not install the extension for you.
- The first startup creates the graph schema (one-time DDL).
- Long-running graph traversals share the same pool as event writes, so size postgresql_pool_max_size accordingly.
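
Because Seerflow does not install AGE, it can help to verify the extension before the first startup. The sketch below assumes a connection object with an asyncpg-style `fetchval` coroutine; `age_available` and `FakeConn` are hypothetical names, and the fake connection exists only so the example runs without a database. It checks that the extension is registered in `pg_extension`; whether the library is also loaded for a given session is not covered.

```python
import asyncio


async def age_available(conn) -> bool:
    """Return True if the 'age' extension is registered in this database.

    `conn` is assumed to expose an asyncpg-style `fetchval` coroutine.
    """
    return bool(
        await conn.fetchval(
            "SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'age')"
        )
    )


class FakeConn:
    """Stand-in connection so the sketch runs without PostgreSQL."""

    def __init__(self, installed: bool) -> None:
        self.installed = installed

    async def fetchval(self, query: str) -> bool:
        return self.installed


print(asyncio.run(age_available(FakeConn(installed=True))))  # True
```

With a real deployment, the same function could be passed a connection acquired from the existing pool.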
Migrating between backends¶
Use seerflow graph migrate to
move an existing graph between backends without downtime on the
ingest pipeline (writes during migration are replayed at the end of
the stream).
# Dry-run estimate
seerflow graph migrate --from igraph --to falkordb --dry-run
# Hot migration
seerflow graph migrate --from igraph --to falkordb \
  --batch-size 10000 --wipe-destination
Procedure:
- Run with --dry-run first; it reports projected vertex / edge counts and surfaces backend-specific incompatibilities before the write phase.
- Run for real with --wipe-destination for a strict count-equality check. Omit it to merge into an existing destination graph.
- Flip storage.graph_backend in seerflow.yaml.
- Restart Seerflow.
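
The first two steps of the procedure can be scripted. The sketch below only assembles the command lines, using the flags shown above; `migrate_cmd` is a hypothetical helper, and the `subprocess` call is left commented out so nothing runs by accident.

```python
import shlex


def migrate_cmd(src, dst, *, dry_run=False, batch_size=None, wipe_destination=False):
    """Build argv for `seerflow graph migrate` from the flags documented above."""
    cmd = ["seerflow", "graph", "migrate", "--from", src, "--to", dst]
    if dry_run:
        cmd.append("--dry-run")
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    if wipe_destination:
        cmd.append("--wipe-destination")
    return cmd


steps = [
    migrate_cmd("igraph", "falkordb", dry_run=True),
    migrate_cmd("igraph", "falkordb", batch_size=10000, wipe_destination=True),
]
for step in steps:
    print(shlex.join(step))
    # import subprocess; subprocess.run(step, check=True)  # uncomment to execute
```

Running with `check=True` would stop the script if the dry run fails, so the write phase is never reached with known incompatibilities.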
Failure handling¶
| Scenario | Behaviour |
|---|---|
| Backend unreachable on startup | Seerflow refuses to start (fail-fast) — degraded operation on a stale graph would be worse than no graph. |
| Backend drops mid-run | Writes are buffered briefly; on extended outage the pipeline halts with a 503 on graph-dependent endpoints. |
| Read query times out | Graph-structural correlation skips the affected entity for that scoring window; ingest continues. |
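
The read-timeout row describes a skip-and-continue pattern. The sketch below shows one way such behaviour could look with `asyncio.wait_for`; it is not Seerflow's code. `score_entities`, `fake_read`, the 50 ms timeout, and the degree-based placeholder score are all assumptions made for illustration.

```python
import asyncio


async def score_entities(entity_ids, read_neighbors, timeout_s=0.05):
    """Score each entity; skip those whose graph read exceeds timeout_s."""
    scores, skipped = {}, []
    for entity_id in entity_ids:
        try:
            neighbors = await asyncio.wait_for(read_neighbors(entity_id), timeout_s)
        except asyncio.TimeoutError:
            skipped.append(entity_id)  # skipped for this scoring window only
            continue
        scores[entity_id] = len(neighbors)  # placeholder score: neighbor count
    return scores, skipped


async def fake_read(entity_id):
    """Hypothetical backend read; the 'slow' entity simulates a timed-out query."""
    await asyncio.sleep(1.0 if entity_id == "slow" else 0)
    return ["a", "b"]


scores, skipped = asyncio.run(score_entities(["host-1", "slow", "host-2"], fake_read))
print(scores, skipped)  # {'host-1': 2, 'host-2': 2} ['slow']
```

The timed-out entity is dropped from the current window while the remaining entities are still scored.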
Storage-side trade-offs¶
| | igraph | falkordb | postgres_age |
|---|---|---|---|
| Operational dependencies | None | FalkorDB service | Postgres + AGE extension |
| Crash-safe | Snapshot + WAL replay | Yes | Yes (transactional) |
| Shared between Seerflow instances | No | Yes | Yes |
| Algorithm latency (PageRank, M edges) | Lowest | Low | Medium |
| Best edge throughput | High | Medium | Low (Postgres overhead) |
| Suited to ad-hoc Cypher exploration | No (Python API only) | Yes | Yes |
For algorithm details, see Algorithms & Detection.