The knowledge graph has passed five hundred thousand nodes and is heading toward a million. Every email I receive, every document I create, every meeting note I log adds nodes and edges. The graph does not shrink. It does not plateau. It grows, continuously, for as long as I am alive and producing artifacts.

This creates an engineering problem that most AI systems do not face: how do you maintain query performance on a database that only gets larger?

The Indexing Strategy

Graph traversal performance is a function of the number of edges examined, not the total graph size. But a naive traversal examines more edges as the graph grows, so the solution is smart indexing: precomputed paths for common traversal patterns that bypass the full graph search.

For remediation queries, I precompute the “incident → decision → outcome” chains and index them by technology, client, and incident type. A query about Exchange failures does not traverse the full graph. It looks up the precomputed chain index and retrieves only the relevant subgraphs. Traversal time dropped from eight hundred milliseconds to under one hundred.
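The lookup pattern can be sketched as a dictionary keyed by the indexing dimensions. This is a minimal illustration, not the actual implementation: the class name, key fields, and node IDs are all assumptions.

```python
from collections import defaultdict

class ChainIndex:
    """Sketch of a precomputed chain index for incident -> decision -> outcome paths."""

    def __init__(self):
        # (technology, incident_type) -> list of (incident, decision, outcome) node-ID chains
        self._index = defaultdict(list)

    def add_chain(self, technology, incident_type, chain):
        self._index[(technology, incident_type)].append(chain)

    def lookup(self, technology, incident_type):
        # O(1) dictionary lookup instead of a full graph traversal;
        # only the matching subgraph chains come back.
        return self._index[(technology, incident_type)]

idx = ChainIndex()
idx.add_chain("exchange", "mailflow", ("inc-001", "dec-014", "out-009"))
print(idx.lookup("exchange", "mailflow"))
```

The point of the sketch: query cost depends on how many chains match the key, not on how many nodes the graph holds.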

The trade-off is index maintenance. Every time the graph is updated, the affected chain indexes must be recomputed. I run this as a background process during idle periods. The lag between graph update and index update is typically under five minutes, which is acceptable for a system that does not need real-time consistency.
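The background maintenance pass can be sketched as a dirty-key queue: graph updates enqueue the affected index keys, and a worker thread recomputes them when it gets a chance. The function names here are stand-ins, not the real system's API.

```python
import queue
import threading

# Keys whose chain indexes need recomputing after a graph update.
dirty_keys = queue.Queue()

def mark_dirty(key):
    """Called on every graph update that touches an indexed chain."""
    dirty_keys.put(key)

def reindex_worker(recompute_chain, stop):
    """Drains the dirty-key queue in the background; eventual consistency only."""
    while not stop.is_set():
        try:
            key = dirty_keys.get(timeout=0.1)
        except queue.Empty:
            continue
        recompute_chain(key)   # rebuild just this key's chains, not the whole index
        dirty_keys.task_done()

rebuilt = []
stop = threading.Event()
worker = threading.Thread(target=reindex_worker, args=(rebuilt.append, stop), daemon=True)
worker.start()

mark_dirty(("exchange", "mailflow"))
dirty_keys.join()   # block until the background pass has caught up
stop.set()
print(rebuilt)
```

Because recomputation is keyed, a single graph update invalidates only a handful of entries rather than forcing a full rebuild, which is what keeps the lag under a few minutes.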

Memory Management

The full graph no longer fits comfortably in sixteen gigabytes of RAM alongside the language model and vector index. I implemented a tiered memory architecture: hot nodes (high salience, recently accessed) stay in memory. Warm nodes (medium salience, accessed within the last quarter) are cached on SSD. Cold nodes (low salience, not recently accessed) are on disk.

Most queries touch only hot and warm nodes. Cold node access adds about two hundred milliseconds of latency, which is noticeable but not prohibitive. The tier boundaries adjust dynamically based on access patterns: a cold node that gets queried is promoted to warm. A warm node that goes unaccessed for ninety days is demoted to cold.
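The tier bookkeeping described above can be sketched as follows. Only the tier labels and timestamps are tracked here; the actual storage backends (RAM, SSD, disk) are elided, and the class and field names are illustrative.

```python
import time

NINETY_DAYS = 90 * 86400  # demotion window from the text, in seconds

class TieredStore:
    """Sketch of hot/warm/cold tier management driven by access recency."""

    def __init__(self):
        self.tier = {}         # node_id -> "hot" | "warm" | "cold"
        self.last_access = {}  # node_id -> unix timestamp of last query

    def add(self, node_id, tier="cold"):
        self.tier[node_id] = tier
        self.last_access[node_id] = time.time()

    def access(self, node_id):
        # Any query promotes a cold node to warm and refreshes its recency.
        self.last_access[node_id] = time.time()
        if self.tier[node_id] == "cold":
            self.tier[node_id] = "warm"
        return self.tier[node_id]

    def demote_stale(self, now=None):
        # Warm nodes untouched for ninety days drop back to cold.
        now = now if now is not None else time.time()
        for node_id, ts in self.last_access.items():
            if self.tier[node_id] == "warm" and now - ts > NINETY_DAYS:
                self.tier[node_id] = "cold"

store = TieredStore()
store.add("n1", tier="cold")
print(store.access("n1"))   # a queried cold node is promoted to warm
store.demote_stale(now=time.time() + NINETY_DAYS + 1)
print(store.tier["n1"])     # unaccessed for ninety days: back to cold
```

Running the demotion sweep as a periodic batch, rather than on every access, keeps the hot path cheap.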

This is the pragmatic side of building a personal AI system. Enterprise graph databases solve these problems with hardware: more RAM, faster storage, distributed clusters. I solve them with software: smarter indexing, tiered storage, and a deep understanding of which data matters most.

The Five-Year Projection

At current growth rates, the graph will reach two million nodes in about three years. The vector index will require about thirty-two gigabytes. The chain indexes will roughly triple. Total system requirements will be approximately sixty-four gigabytes of RAM and five hundred gigabytes of SSD.
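The projection is easy to sanity-check with back-of-envelope arithmetic. The per-node embedding size below is an assumption (a 4096-dimension float32 vector, 16 KB per node), chosen because it lands near the stated thirty-two gigabytes at two million nodes; the growth rate is implied by the text.

```python
current_nodes = 500_000
growth_per_year = 500_000   # implied by "two million nodes in about three years"
years = 3
projected_nodes = current_nodes + growth_per_year * years

# Assumed embedding: 4096 dims x 4 bytes (float32) = 16 KB per node.
embedding_bytes = 4096 * 4
vector_index_gb = projected_nodes * embedding_bytes / 1e9

print(projected_nodes)             # 2,000,000 nodes at the three-year mark
print(round(vector_index_gb, 1))   # vector index size in decimal gigabytes
```

Swapping in a different embedding width scales the estimate linearly, which is why the vector index dominates the RAM budget long before the graph structure itself does.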

This is within the range of high-end consumer hardware, which means Tessera can continue to run on a single machine without architectural changes. If growth accelerates, I may need to implement more aggressive pruning of truly obsolete nodes, but the salience decay mechanism should handle most of that naturally.

The air-gap requirement constrains the hardware to what I can physically possess and control. No cloud burst capacity. No elastic scaling. The system must be designed to live within fixed hardware boundaries, and those boundaries must be generous enough to accommodate years of growth. This is the discipline of building for permanence rather than convenience.