The Case for Context Graphs

Talking about ontologies, RAG, embeddings and whatnot is great and all, but it’s easy to get lost without talking about the why. So let me make the case for why the discussion around context graphs is not just hype, and plant a few ideas that I’ll either dissect in the future or grudgingly change my mind about.

Here’s the thing: the more things change, the more they stay the same. If there’s one principle that held fast in my decade in the AI space, it’s “garbage in/garbage out”.

And the point we’re at was inevitable: not because the industry tried different solutions and failed, but because the sequence of innovation in the LLM space (or what we now broadly call AI) had to lead here. The order of the steps could have been different. Maybe the industry landed on tools before addressing the substrate because tool usage makes for better demos. However plausible that lineage, it doesn’t change the need for the discussion.

Think about it. When GPT-3 made it big, users’ intellectual work was focused on finding the right wording to help the model surface knowledge. It’s funny because if you’ve had your finger on the pulse of this field for the past 3-4 years, prompt engineering - as a discipline - looks almost comical in retrospect.

Learning to become more intentional with the prompt layer led to the next logical step: wrapping these LLMs into what we now call Agents. Give them tools. Give them personas. Remember those GitHub repos of .md files masquerading as employees? Have them play theatre on stage like believable humans. Build harnesses and orchestrators, think about protocols, observability and so on. All valuable topics, and there’s now a glut of tooling and companies in this space.

But agents lack a fundamental skill: what you’d loosely call common sense and intuition. Common sense and intuition arise when you operate in an environment that has a certain structural truth. Most discussions with agents happen “in principle”: they’re not directly relevant unless you inject an entire trip down memory lane into the context. And that structural truth isn’t on Notion. It isn’t in BambooHR. It isn’t in the way GitHub conventions are encoded or in which Actions are set up. It’s encoded in how things work together. In how decisions lead to outcomes that lead to new decisions - which lead to better outcomes, or maybe worse ones.

Humans pick up on this involuntarily, and they develop an organizational intuition. They have random coffee chats that uncover more of this structural truth. Sometimes it’s conflicting. That’s data too.

We’ve built a lot of dirt roads masquerading as paved roads because human intuition and communication were doing the heavy lifting. An agent navigating these roads lurches from pothole to pothole. Policies that are outdated. Org structures that don’t exist anymore, or exist but do something different now. Decision-level context that changed. There’s a gradient of fidelity, from org structure all the way down to decision-level structure, that we haven’t quite solved for yet. And it sits at the intersection of systems rather than in the systems themselves.

A refund over $25,000 looks policy-compliant in the finance system, but approval actually depends on contract carveouts in CRM, an active fraud review in Zendesk, a regional exception the ops lead granted in Slack, and a temporary controller delegation recorded in email. The agent can retrieve each artifact separately, but we’re relying on it to figure out how they relate. Before you know it, you’re writing if-else statements disguised as prompts so it knows who actually has authority, which rule takes precedence, or whether the exception is still in force.
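To make that concrete, here’s a minimal sketch of what the same refund decision could look like when those relationships are data rather than prompt glue. Everything in it - the node and edge names, the relation types, the tiny traversal - is made up for illustration, not any particular product’s schema. The point is only that edges like constrains, suspends and overrides become first-class facts an agent can query, instead of facts it has to re-derive from five systems every time.

```python
from dataclasses import dataclass, field

# Hypothetical context-graph fragment for the refund scenario above.
# Each node is an artifact living in its own system; each edge is a
# cross-system relationship that normally exists only in people's heads.

@dataclass
class Node:
    id: str
    system: str                      # where the artifact actually lives
    attrs: dict = field(default_factory=dict)

@dataclass
class Edge:
    src: str
    dst: str
    relation: str                    # e.g. "constrains", "suspends", "overrides"

nodes = {
    "refund-4417":     Node("refund-4417", "finance", {"amount": 25_000}),
    "carveout-ACME":   Node("carveout-ACME", "crm", {"cap": 10_000}),
    "fraud-review-88": Node("fraud-review-88", "zendesk", {"status": "open"}),
    "exception-EMEA":  Node("exception-EMEA", "slack", {"expires": "2025-06-30"}),
    "delegation-CTRL": Node("delegation-CTRL", "email", {"delegate": "j.doe"}),
}

edges = [
    Edge("carveout-ACME", "refund-4417", "constrains"),
    Edge("fraud-review-88", "refund-4417", "suspends"),
    Edge("exception-EMEA", "carveout-ACME", "overrides"),
    Edge("delegation-CTRL", "refund-4417", "grants_authority"),
]

def blockers(artifact_id: str) -> list[str]:
    """Walk inbound edges to find what must be resolved before approval."""
    return [e.src for e in edges
            if e.dst == artifact_id and e.relation in {"constrains", "suspends"}]

print(blockers("refund-4417"))       # ['carveout-ACME', 'fraud-review-88']
```

Whether this lives in a graph database, a handful of tables or something fancier matters far less than the fact that the cross-system edges exist somewhere queryable, with enough metadata (who granted the exception, when it expires) to tell whether they still hold.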

That’s the shape of the problem. It’s why we’re starting to talk about human-in-the-loop (HITL) systems. HITL in principle is fine. HITL as it’s getting deployed is a band-aid and should be recognized as such. It positions humans as janitors rather than orchestrators. As we flood some poor knowledge worker’s inbox with all sorts of decisions to verify, greenlight and contextualize, it’s time we also had a discussion about how to model the environment and address this bottleneck in model cognition.

We’re at the point where we a) can actually do something meaningful about this and b) can’t keep waving our hands at the topic. Some semblance of cross-ontology reasoning that goes beyond the horrible piles of mappings and rules - the “old world”, if you will - is now feasible. It’s not perfect. Mapping ontologies to context graphs and doing something useful with them is an art form, and most tooling out there offloads the actual hard part onto developers. But it’s possible. And without it, we’ll have agents with flimsy intent compounding problems at machine speed.