Tessera will never make an API call. Not to OpenAI. Not to Anthropic. Not to any cloud service. The entire system (language model, retrieval infrastructure, knowledge graph, and agentic layer) runs locally on hardware I control.

This is not paranoia. This is architecture.

Why Air-Gap

The corpus contains twenty-three years of my professional and personal life. Client data, financial records, health information, family communications, strategic plans, security assessments. The sensitivity is not hypothetical. If this data were exposed, the consequences would be professional, legal, and personal.

Cloud AI services process data on infrastructure I do not control, in jurisdictions I may not agree with, under terms of service that can change without notice. Some services train on user inputs. Most retain data for some period. All are subject to legal processes that could compel disclosure.

The air-gap eliminates these risks entirely. The data never leaves my machine. The model runs locally. The queries are processed locally. The results are stored locally. There is no attack surface that does not require physical access to the hardware.
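A guarantee like "the data never leaves my machine" is stronger when it is enforced in code rather than by policy. This is a minimal, hypothetical sketch (not Tessera's actual code) of a process-level guard for a Python stack: socket creation is disabled at startup, so any dependency that tries to phone home fails loudly. It complements, rather than replaces, OS-level firewall rules.

```python
import socket

class AirGapViolation(RuntimeError):
    """Raised when any code in this process attempts to open a network socket."""

def enforce_air_gap():
    """Replace the socket constructors so every connection attempt fails.

    A process-level guard only: it catches accidental network calls from
    dependencies inside this process, not traffic from other processes.
    """
    def _blocked(*args, **kwargs):
        raise AirGapViolation("network access attempted in an air-gapped process")
    socket.socket = _blocked
    socket.create_connection = _blocked

enforce_air_gap()

# Any attempt to reach the network now fails immediately:
try:
    socket.create_connection(("api.openai.com", 443))
except AirGapViolation as e:
    print(e)  # network access attempted in an air-gapped process
```

Calling `enforce_air_gap()` first thing in the entrypoint turns an accidental cloud call into a crash instead of a silent data leak.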

The Engineering Challenge

Air-gapping an agentic AI system is harder than it sounds. The best language models are cloud-hosted and massive. The best embedding models assume internet connectivity. The best graph databases are designed for server deployments, not laptops.

I am using locally hosted open-source models for both language generation and embedding. The trade-off is capability: a local seven-billion-parameter model is not as capable as GPT-4 or Claude. But it is capable enough. For the specific domain of my life and decisions, a smaller model with better retrieval outperforms a larger model with worse retrieval. This is the bet.
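The bet that retrieval quality beats model size shows up concretely in how prompts are assembled: with a small local model, the context window is scarce, so retrieved chunks must be packed by relevance within a strict token budget. A hypothetical sketch of that packing step (the function names and the 4-characters-per-token heuristic are illustrative, not Tessera's actual code):

```python
# Hypothetical sketch of retrieval-first prompt assembly: retrieved chunks
# are packed greedily by relevance score until a token budget is exhausted.

def estimate_tokens(text: str) -> int:
    # Crude heuristic (~4 characters per token); a real system would use
    # the local model's own tokenizer.
    return max(1, len(text) // 4)

def build_prompt(question: str, chunks: list[tuple[float, str]], budget: int = 3000) -> str:
    """chunks: (relevance_score, text) pairs from the retrieval layer."""
    header = "Answer using only the context below.\n\n"
    used = estimate_tokens(header) + estimate_tokens(question)
    context = []
    for score, text in sorted(chunks, reverse=True):  # best-scoring chunks first
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # this chunk does not fit; a smaller one still might
        context.append(text)
        used += cost
    return header + "\n\n".join(context) + f"\n\nQuestion: {question}"
```

The greedy skip-and-continue (rather than stopping at the first oversized chunk) is the kind of small retrieval detail that matters more, at this scale, than another billion parameters.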

The knowledge graph runs on SQLite with a custom graph query layer. Not as performant as Neo4j, but zero dependencies, zero configuration, and a single file that can be encrypted and backed up trivially. The vector index uses FAISS, which runs locally with no external dependencies.
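A custom graph layer over SQLite can be surprisingly small, because SQLite's recursive common table expressions do the traversal work. This is a minimal sketch of the idea with an illustrative schema (not Tessera's actual one): nodes and edges as plain tables, bounded-depth reachability as one recursive query.

```python
import sqlite3

# Illustrative graph-over-SQLite sketch: nodes and edges as plain tables,
# traversal via a recursive common table expression.
db = sqlite3.connect(":memory:")  # a real deployment uses one encrypted file
db.executescript("""
CREATE TABLE nodes (id TEXT PRIMARY KEY, kind TEXT);
CREATE TABLE edges (src TEXT, dst TEXT, rel TEXT,
                    FOREIGN KEY (src) REFERENCES nodes(id),
                    FOREIGN KEY (dst) REFERENCES nodes(id));
""")

def neighbors_within(start: str, depth: int) -> set[str]:
    """All nodes reachable from `start` in at most `depth` hops."""
    rows = db.execute("""
        WITH RECURSIVE walk(id, d) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.d + 1
            FROM edges e JOIN walk w ON e.src = w.id
            WHERE w.d < ?
        )
        SELECT id FROM walk WHERE id != ?
    """, (start, depth, start)).fetchall()
    return {r[0] for r in rows}

# Example: a client commissioned a project, which led to a decision.
db.executemany("INSERT INTO nodes VALUES (?, ?)",
               [("client_a", "person"), ("proj_x", "project"), ("dec_1", "decision")])
db.executemany("INSERT INTO edges VALUES (?, ?, ?)",
               [("client_a", "proj_x", "commissioned"), ("proj_x", "dec_1", "led_to")])
print(sorted(neighbors_within("client_a", 2)))  # ['dec_1', 'proj_x']
```

Everything lives in one file, so "encrypt and back up the graph" means encrypting and copying a single database, exactly the property the prose claims.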

The entire stack (model, retrieval, graph, and agentic layer) fits in sixteen gigabytes of RAM and runs on a modern laptop. It is not fast. But it is private, portable, and completely under my control.

What I Give Up

I give up state-of-the-art generation quality. The local model produces less polished prose than cloud models. It hallucinates more frequently, which is why the verification layer is essential. It handles complex reasoning less reliably.
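A verification layer can start very simply. This is one hypothetical grounding check, not Tessera's actual implementation: flag any generated sentence whose content words barely overlap with the retrieved sources, since an answer the retrieval layer cannot account for is a hallucination candidate. The stopword list and threshold are illustrative.

```python
import re

# Illustrative grounding check for a verification layer: a generated sentence
# is flagged when too few of its content words appear in the source passages.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "was", "and", "that"}

def content_words(text: str) -> set[str]:
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def unsupported_sentences(answer: str, sources: list[str],
                          threshold: float = 0.5) -> list[str]:
    """Return sentences whose content-word overlap with the sources is below threshold."""
    source_vocab = set()
    for s in sources:
        source_vocab |= content_words(s)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if not words:
            continue
        overlap = len(words & source_vocab) / len(words)
        if overlap < threshold:
            flagged.append(sentence)
    return flagged

sources = ["The meeting with the client happened in March."]
answer = "The meeting happened in March. Revenue tripled overnight."
print(unsupported_sentences(answer, sources))  # ['Revenue tripled overnight.']
```

Lexical overlap is a blunt instrument; the point is that with a local stack, even a blunt verifier runs over every answer for free, with no per-call cost.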

I gain something more valuable: certainty that my data is mine. That no vendor can access it, no subpoena can compel its production from a third party, and no terms-of-service change can retroactively claim rights to it. In a world where data sovereignty is becoming a strategic asset, this is not a limitation. It is a feature.

The generation quality gap is closing. Open-source models improve every quarter. What was impossible to run locally two years ago is commonplace now. What is difficult today will be routine in eighteen months. The architecture is designed for the models that are coming, not just the models that exist.