The first prototype of Tessera used pure vector search. Embed the query, find the nearest neighbors, feed them to the model. It worked about sixty percent of the time, which is worse than it sounds because the forty percent it missed were the cases where I needed it most.
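The pipeline really was that simple. A minimal sketch of the vector-only approach, with a toy bag-of-words `embed()` standing in for a real embedding model (all names here are illustrative, not Tessera's actual code):

```python
# Vector-only retrieval: embed the query, rank documents by cosine
# similarity, return the top k. The embed() below is a hypothetical
# stand-in for a dense embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vector_search(query: str, docs: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

With a real model the similarity is semantic rather than lexical, but the shape of the failure is the same: the ranking only sees what the query and document say, never why they are related.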
The failures followed a pattern. Vector similarity is excellent at finding documents that say similar things. It is terrible at finding documents that are relevant for different reasons. When I am remediating a network failure at a client site, the most relevant prior experience might be a security incident that had nothing to do with networking but followed the same escalation pattern and stakeholder communication cadence.
Embeddings cannot see that. The words are different. The domains are different. The relevance is structural, not semantic.
Adding the Graph
The second layer is a knowledge graph. Every ingested artifact creates nodes: people, organizations, technologies, decisions, outcomes. Edges represent relationships: authored, decided, affected, preceded, contradicted. The graph does not care about semantic similarity. It cares about connections.
When I query “how did I handle the last critical outage at a financial services client,” the graph traverses from the concept of critical outage through incident nodes, through client nodes filtered by industry, through decision nodes that capture what I actually did. The path through the graph surfaces artifacts that vector search would never find because the language is completely different.
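That traversal can be sketched as a breadth-first walk from a seed concept, collecting nodes of the type the query is after. The tiny in-memory graph and its field names are illustrative assumptions:

```python
# Breadth-first traversal from a concept node, collecting nodes of a
# target type within a hop limit. Mirrors the path described above:
# concept -> incident -> client / decision.
from collections import deque

nodes = {
    "critical_outage": {"type": "concept"},
    "incident_42":     {"type": "incident"},
    "acme_bank":       {"type": "client", "industry": "financial"},
    "decision_7":      {"type": "decision", "artifact": "postmortem"},
}
edges = {
    "critical_outage": ["incident_42"],
    "incident_42":     ["acme_bank", "decision_7"],
}

def traverse(seed, want_type, max_depth=3):
    seen, out = {seed}, []
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if nodes[node]["type"] == want_type:
            out.append(node)
        if depth < max_depth:
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return out

# Decision nodes reachable from the "critical outage" concept:
traverse("critical_outage", "decision")
```

Note that the language of `decision_7`'s artifact never has to resemble the query at all; reachability does the work.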
But the graph alone has its own failure mode. It returns too much. Everything is connected to everything if you follow enough edges. Without a relevance filter, graph traversal produces noise that drowns the signal.
The Lexical Verification Layer
The third mode is lexical: old-fashioned keyword and phrase matching, but applied as a verification filter rather than a primary search mechanism. After vector search and graph traversal each produce candidate sets, the lexical layer cross-references them against the original query and against each other.
A candidate that appears in both the vector set and the graph set, and contains lexical matches to key terms in the query, gets the highest confidence score. A candidate that appears in only one set gets a lower score. A candidate with no lexical validation gets flagged for review rather than presented as a result.
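The scoring rule above can be sketched directly. The thresholds and labels are illustrative assumptions; the logic is just set agreement plus lexical overlap:

```python
# Confidence scoring: candidates present in both the vector and graph
# sets with a lexical match to the query score "high"; lexical-only
# agreement scores "low"; no lexical validation gets flagged.
def score_candidates(query, vector_set, graph_set, texts):
    q_terms = set(query.lower().split())
    scored = {}
    for doc in vector_set | graph_set:
        in_both = doc in vector_set and doc in graph_set
        lexical = bool(q_terms & set(texts[doc].lower().split()))
        if in_both and lexical:
            scored[doc] = "high"
        elif lexical:
            scored[doc] = "low"
        else:
            scored[doc] = "flagged"   # review, don't present as a result
    return scored
```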
The effect is dramatic. In my testing, trimodal retrieval produces relevant results in about eighty-five percent of queries compared to sixty percent for vector-only. More importantly, the failures are different. Vector-only fails silently, returning plausible but wrong results. Trimodal fails loudly, returning nothing rather than returning noise. I would rather have no answer than a confident wrong answer.
The Engineering Reality
Building this is not simple. Three retrieval paths means three indexes to maintain, three sets of update logic, and a fusion layer that weights results from each path based on query characteristics. A factual lookup should weight vector search heavily. A pattern-matching query should weight the graph. A specific-document retrieval should weight lexical.
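The fusion step can be sketched as a weighted blend of per-path scores, with one weight profile per query kind. The weight values here are illustrative assumptions, not tuned numbers:

```python
# Fusion layer: each retrieval path returns scored candidates; the
# query kind selects a weight profile; fused scores rank the results.
WEIGHTS = {
    "factual":  {"vector": 0.7, "graph": 0.2, "lexical": 0.1},
    "pattern":  {"vector": 0.2, "graph": 0.7, "lexical": 0.1},
    "specific": {"vector": 0.1, "graph": 0.2, "lexical": 0.7},
}

def fuse(query_kind, path_scores):
    """path_scores: {"vector": {doc: score}, "graph": {...}, "lexical": {...}}"""
    w = WEIGHTS[query_kind]
    fused = {}
    for path, scores in path_scores.items():
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + w[path] * s
    return sorted(fused, key=fused.get, reverse=True)
```

The same candidates rank differently depending on the profile, which is the whole point: the query's shape, not a global constant, decides which path to trust.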
The query classifier that assigns these weights is itself a model, and it needs training data I do not have yet. For now, I am using a heuristic approach: analyze the query structure, detect whether it is asking for facts, patterns, or specific artifacts, and weight accordingly. It works well enough to validate the architecture while I build the training set.
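A heuristic router of that kind might look like the sketch below. The cue lists are illustrative assumptions, not the actual heuristics, and they are exactly the kind of brittle rule a trained classifier would replace:

```python
# Heuristic query router: inspect query structure and pick one of the
# three weighting profiles. Cue phrases are illustrative only.
import re

def classify_query(query: str) -> str:
    q = query.lower()
    # Quoted phrases or filename-like tokens suggest a specific artifact.
    if re.search(r'"[^"]+"', q) or any(t in q for t in (".md", ".pdf", "document")):
        return "specific"   # lexical-heavy
    # Experiential phrasing suggests pattern matching across past work.
    if any(t in q for t in ("how did i", "last time", "pattern", "similar to")):
        return "pattern"    # graph-heavy
    return "factual"        # default: vector-heavy
```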
This is the foundation. Everything else in Tessera (the agentic capabilities, the life management features, the technical remediation support) depends on retrieval working at this level. If I get this right, the rest follows. If I get this wrong, nothing else matters.