Institutional Context as the Missing Layer in Agent Systems
In the unfolding narrative of artificial intelligence, the conversation around agents (autonomous software entities capable of perceiving, reasoning about, and acting in complex environments) has largely centered on model capabilities, orchestration frameworks, and function-call interoperability. Yet even as foundational LLM architectures and agent orchestration platforms advance, an often overlooked component risks hindering large-scale, multi-agent systems: institutional knowledge, the tacit, contextual, procedurally embedded know-how that lives in the heads of people and in the informal practices of organizations. Treated as a first-class construct in the software stack, something that agents carry, share, and reason over, this knowledge becomes the core substrate of effective agentic reasoning. Without this layer, multi-agent systems will remain brittle, limited in scope, or stuck in proof-of-concept purgatory. With it, they can evolve into resilient, interoperable infrastructures that encode not only procedures but organizational judgment. Call it the OrgMem Layer.
The Blind Spot in Current Agent Architectures
When developers discuss AI agent interoperability, they often borrow mental models from web services or microservice architectures: agents are treated as stateless endpoints with well-defined APIs. They expose functions, accept inputs, return outputs. By wiring these together with orchestration engines, developers achieve workflows that look impressive in demos.
While technically functional, this view is deeply reductive. It obscures the rich, implicit protocols that govern human coordination—rules not found in process docs, but in daily conversations, unspoken escalation paths, and learned risk tolerances, passed along through mentorship and apprenticeship. In a human team, hand-offs carry not just the raw data but a set of expectations: when to notify a manager, what risk levels are tolerable, which stakeholders must be informed, and what tone to adopt with external parties. Current agent systems replicate the form of workflows but not their function. They follow the letter of the protocol but miss the hidden organizational context.
As enterprises adopt agentic workflows, they initially mirror human processes by encoding business rules in prompts, integrating vector databases for contextual retrieval, or adding audit logs for regulatory compliance. Yet these efforts stop at structural fidelity. They capture documents, templates, and transactional logs, but they do not internalize the mores of the organization. Though some platforms—like Microsoft Copilot or ServiceNow—attempt to layer context into tasks via retrieval or metadata, they still fall short of embedding role-specific judgment or cross-team memory. Agents today may “call functions,” but they don’t “call judgment.”
Deconstructing 'OrgMem Layer'
Drawing from organizational theory (March & Simon, Karl Weick, Nonaka & Takeuchi) and from distributed systems to bridge the gap between rigid workflows and human-like coordination, I see institutional knowledge, the OrgMem Layer, as comprising the following constituents:
Normative frameworks: the implicit rules about who speaks to whom and when, acceptable risk thresholds, and unwritten hierarchies of decision-making. These norms govern whether an issue is escalated to a manager or resolved autonomously by a frontline agent, whether external vendors get looped into conversations, and which data classifications demand special handling.
Cultural signifiers: every organization has its own lexicon of metaphors, jargon, and idioms that shapes how meaning is constructed. These signifiers affect tone, branding, and how agents interpret unstructured input or compose communication.
Procedural memory: beyond flowcharts and SOPs, the informal heuristics and operational shortcuts accumulated over time, such as common workarounds, anticipated failure modes, and the “best practices” not captured in official documentation.
Relational context: who trusts whom, which teams or individuals have overlapping responsibilities, and the network of trust and interdependencies that permeates large organizations. Agents that operate blindly across these boundaries risk duplicating work, creating friction, or violating access controls.
Historical precedents: knowledge of past incidents, root-cause analyses, and “lessons learned” from prior projects that inform policy exceptions and operational risk postures. These precedents often reside in case studies and internal wikis, or are retained solely in human memory.
Unlike RAG, which recalls facts, institutional context encodes the unspoken norms that guide interpretation and decision-making (RAG can recall that “legal must review vendor X,” but it cannot infer that a similar vendor might also need the same scrutiny). These layers cannot be neatly reduced to a JSON schema or a structured database table. They live in a semantic organizational context comprising narrative fragments, role-based rules, graph structures, and evolving annotations, as the sketch below illustrates.
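As a concrete, deliberately simplified illustration, here is one way these constituents might be carried as a typed context record in Python. All class and field names below are hypothetical, not a proposed standard:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the OrgMem constituents as a typed record.
# Field names and structures are assumptions, not a proposed schema.

@dataclass
class OrgMemContext:
    # Normative frameworks: escalation rules, risk thresholds, decision rights
    norms: dict[str, str] = field(default_factory=dict)
    # Cultural signifiers: organization-specific lexicon and tone guidance
    lexicon: dict[str, str] = field(default_factory=dict)
    # Procedural memory: informal heuristics and known workarounds
    heuristics: list[str] = field(default_factory=list)
    # Relational context: trust edges between roles or teams, with a weight
    trust_edges: list[tuple[str, str, float]] = field(default_factory=list)
    # Historical precedents: pointers to past incidents and lessons learned
    precedents: list[str] = field(default_factory=list)

ctx = OrgMemContext(
    norms={"vendor_spend_over_50k": "requires legal pre-review"},
    heuristics=["batch invoice retries overnight to avoid ERP lock contention"],
    trust_edges=[("finance-ops", "legal", 0.9)],
    precedents=["2023-07 vendor dispute: see postmortem wiki page"],
)
```

Even this toy record makes the point: most of its content is narrative and relational, not transactional, which is exactly what conventional schemas fail to capture.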
Why Context-Free Hand-Offs Break in Practice
Consider an automated workflow that processes vendor invoices. Agent A extracts line items and tags expense categories. Agent B routes them for approval. Technically, the hand-off is clean: schema-compliant data flows from one agent to another.
But what if the vendor has a history of legal exceptions that require pre-review above a spending threshold? Humans would know this through institutional precedent. If Agent A lacks that context, it won’t flag the invoice. Agent B, unaware of legal involvement rules, approves based on surface data. The outcome can be a compliance breach or an operational delay as humans intervene retroactively.
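A minimal sketch of this failure mode in Python, where a hypothetical precedent list and spend threshold stand in for the missing institutional context:

```python
# Hypothetical sketch of the invoice hand-off above. The precedent set and
# threshold are illustrative stand-ins for institutional context.

LEGAL_PRE_REVIEW = {"vendor-x"}   # vendors with a history of legal exceptions
PRE_REVIEW_THRESHOLD = 50_000     # assumed spend threshold

def route_invoice(vendor_id: str, total: float) -> str:
    """Agent B's routing decision, given access to institutional precedent."""
    if vendor_id in LEGAL_PRE_REVIEW and total > PRE_REVIEW_THRESHOLD:
        return "escalate: legal pre-review required"
    # Without the precedent data above, every invoice falls through to here.
    return "auto-approve"

print(route_invoice("vendor-x", 82_000))  # escalate: legal pre-review required
```

The point is not the conditional itself but where its inputs come from: without an OrgMem layer, nothing populates LEGAL_PRE_REVIEW, and Agent B's decision collapses to the surface data.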
Similar breakdowns can occur in customer support workflows. An agent may escalate a customer query because it detects a high-value account, but it may not know that this particular customer’s account has an ongoing dispute and that communications are handled by a dedicated team. The next agent, uninformed of that nuance, might default to generic templates, damaging the customer relationship.
These failures are not rare fringe events. They reflect the systemic brittleness that arises when agents operate on syntactic interoperability (shared formats and protocols) without semantic interoperability (shared meaning and context).
Encoding OrgMem Layer as a First-Class Construct
To bridge these gaps, we must treat institutional knowledge as a discrete data model and runtime abstraction, on par with documents, code artifacts, and user profiles. There are several architectural strategies that support this:
Memory graphs that, unlike conventional knowledge graphs, are dynamic, agent-internal representations where nodes represent people, teams, policies, and heuristics, while edges encode relational context, procedural annotations, and trust signals. They capture organizational directory structures and evolve as agents interact, logging failures, corrective actions, and informal shortcuts (e.g., AriGraph, Zep, G-Memory). Agents can query these graphs to retrieve not just facts but heuristics: “which workflows require legal sign-off when anomalies exceed threshold X?” or “which teams typically handle exceptions of this type?” Implementing memory graphs involves instrumenting conversational exchanges and workflow logs to extract context signals, defining a lightweight schema for tactical annotations (e.g., exception rules, trust levels), and injecting graph embeddings into agent prompts for context-aware decision-making. Challenges remain: schema drift, privacy segmentation, and update consensus must be addressed with techniques borrowed from distributed systems (e.g., CRDTs or graph sharding).
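To make this concrete, here is a minimal memory-graph sketch using networkx; the node kinds, edge relations, and the query below are illustrative assumptions, not a fixed schema:

```python
import networkx as nx

# A toy memory graph: nodes for teams, policies, and workflows; edges carry
# relational context and procedural annotations. Vocabulary is illustrative.
g = nx.MultiDiGraph()
g.add_node("legal", kind="team")
g.add_node("anomaly_policy", kind="policy", threshold=0.2)
g.add_node("invoice_workflow", kind="workflow")

# An annotated edge: a procedural rule plus a trust signal.
g.add_edge("invoice_workflow", "legal", relation="requires_signoff",
           condition="anomaly_score > policy threshold", trust=0.9)

def workflows_needing_signoff(graph: nx.MultiDiGraph, team: str) -> list[str]:
    """Answer a heuristic query: which workflows require this team's sign-off?"""
    return [u for u, v, d in graph.edges(data=True)
            if v == team and d.get("relation") == "requires_signoff"]

print(workflows_needing_signoff(g, "legal"))  # ['invoice_workflow']
```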
Semantic state transfer objects that build on ideas from distributed systems: data structures that bundle payloads with context metadata (e.g., SHIMI, which bridges document-centric RAG and structured context transfer). A state object for an invoice might include fields for raw line items alongside pointers to policy documents, trust scores for the supplier, and escalation flags derived from historical precedent. When Agent A hands this object to Agent B, it provides both data and semantic guidance on how to interpret and process it. Key considerations include a standardized metadata envelope that can carry variable context fields without schema bloat, versioning mechanisms to ensure agents agree on the semantics of metadata fields, and lightweight cryptographic signing to protect the integrity of context metadata.
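A sketch of such an object, assuming a simple versioned envelope and shared-key HMAC signing (a real deployment would source keys from the enterprise KMS rather than a constant):

```python
import hashlib, hmac, json
from dataclasses import dataclass, field, asdict

SHARED_KEY = b"demo-key"  # placeholder; in practice managed by KMS/IAM

@dataclass
class ContextEnvelope:
    # Versioned so sender and receiver agree on field semantics.
    schema_version: str = "0.1"
    policy_refs: list[str] = field(default_factory=list)  # pointers, not copies
    supplier_trust: float | None = None
    escalation_flags: list[str] = field(default_factory=list)

@dataclass
class StateTransferObject:
    payload: dict              # raw line items, schema-compliant as before
    context: ContextEnvelope   # semantic guidance riding alongside the data
    signature: str = ""

    def sign(self) -> None:
        body = json.dumps(asdict(self.context), sort_keys=True).encode()
        self.signature = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()

    def verify(self) -> bool:
        body = json.dumps(asdict(self.context), sort_keys=True).encode()
        expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, self.signature)

sto = StateTransferObject(
    payload={"invoice_id": "INV-1042", "total": 82_000},
    context=ContextEnvelope(policy_refs=["policy/vendor-pre-review"],
                            supplier_trust=0.4,
                            escalation_flags=["legal_pre_review"]),
)
sto.sign()
assert sto.verify()  # Agent B checks context integrity before acting
```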
Protocol extensions across existing interoperability protocols such as the Model Context Protocol (MCP) or Agent-to-Agent (A2A) messaging, which provide function-call interfaces but can be extended with context headers. These headers serve a role analogous to HTTP headers, conveying routing, authentication, and behavioral hints. By defining standard header fields for context tokens, agents can perform policy checks and trust assessments before executing calls. For example, a header might carry a contextual-integrity-token that encodes a hash of the memory-graph sub-graph relevant to the task at hand. Receiving agents can validate the token, fetch the referenced context, and align their behavior accordingly. Over time, this mechanism creates a distributed context bus, where agents negotiate not just on data and task specs but on governance semantics.
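Illustratively, constructing and validating such a header might look like the following; the header field names and the orgmem:// reference scheme are invented for this sketch and are not part of MCP or A2A:

```python
import hashlib, json

def make_context_header(subgraph: dict) -> dict:
    """Sender: hash the relevant memory-graph fragment into an integrity token."""
    token = hashlib.sha256(
        json.dumps(subgraph, sort_keys=True).encode()).hexdigest()
    return {"x-context-integrity-token": token,
            "x-context-ref": "orgmem://graphs/invoice-workflow"}

def validate_and_fetch(header: dict, fetch) -> dict:
    """Receiver: fetch the referenced context and check it matches the token."""
    subgraph = fetch(header["x-context-ref"])
    token = hashlib.sha256(
        json.dumps(subgraph, sort_keys=True).encode()).hexdigest()
    if token != header["x-context-integrity-token"]:
        raise ValueError("context drifted since the call was issued")
    return subgraph
```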
From Proof-of-Concept to Enterprise-Grade Resilience
Proof-of-concepts in innovation labs and startup pilots show initial promise, but scaling this approach in large enterprises tends to raise distinct operational challenges.
Governance of context schemas: organizations will need ContextOps teams, similar to data governance councils, that define context taxonomies, resolve semantic collisions, and approve context mutations. These teams will be tasked with remediation workflows for outdated or incorrect context, audit controls over who can inject or modify context metadata, and lifecycle policies for memory graphs and state transfer schemas.
Querying a distributed memory graph or fetching context artifacts can introduce latency, undermining the responsiveness of agent pipelines. Solutions may involve local caching of frequently accessed context fragments, asynchronous pre-fetching based on predictive workload models, and prioritizing context bundles by criticality. Event-sourced design patterns may additionally support efficient replay and rollback of memory states.
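As one illustration of the caching idea, a toy TTL cache for hot context fragments (a production system would add shared storage and event-driven invalidation on top):

```python
import time

class ContextCache:
    """Toy TTL cache for frequently accessed context fragments."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, dict]] = {}

    def get(self, key: str, fetch):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit and now - hit[0] < self.ttl:
            return hit[1]              # fresh local copy, no graph round-trip
        fragment = fetch(key)          # fall back to the distributed memory graph
        self._store[key] = (now, fragment)
        return fragment
```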
Institutional knowledge almost always encompasses sensitive data: corporate strategies and policies, risk postures, or customer dispute histories. The context infrastructure must enforce fine-grained access controls, encryption at rest and in transit, and token-based access that aligns with enterprise IAM systems (e.g., SCIM, SAML, OAuth2 scopes).
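A minimal sketch of scope-gated access to context fragments, loosely mirroring OAuth2-style scopes; the scope names and classifications here are invented for illustration:

```python
# Map each context fragment to the scopes its readers must hold.
REQUIRED_SCOPES = {
    "customer_disputes": {"orgmem.read.sensitive"},
    "risk_postures":     {"orgmem.read.sensitive", "orgmem.read.risk"},
    "public_glossary":   set(),
}

def authorize(fragment: str, token_scopes: set[str]) -> bool:
    """Allow access only if the agent's token carries every required scope.
    Unknown fragments default to requiring the sensitive scope (deny-ish)."""
    required = REQUIRED_SCOPES.get(fragment, {"orgmem.read.sensitive"})
    return required <= token_scopes

assert authorize("public_glossary", set())
assert not authorize("risk_postures", {"orgmem.read.sensitive"})
```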
Ultimately, I surmise enterprises will adopt agents from multiple vendors, stitching together multiple AI platforms and custom agent frameworks. Establishing a cross-platform context standard will be critical, and it will require industry consortia or open-source efforts to define schema registries, serialization formats, and reference implementations (see, for example, the Internet-Draft proposing a formal AI agent protocol stack submitted to the IETF). Without such a standard, organizations will simply recreate semantic silos, effectively undoing interoperability gains.
Lastly, unlike accuracy metrics for text extraction or classification, evaluating whether an agent correctly interprets institutional context will require new KPIs (reduction in human intervention rate, improvement in first-pass resolution, decline in exception retries, etc.) that help attribute ROI to context-driven workflows versus baseline automation.
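As a toy example of such a KPI, the human intervention rate before and after context-driven routing; all numbers below are placeholders:

```python
def intervention_rate(handled_by_humans: int, total: int) -> float:
    """Fraction of workflow runs that required a human to intervene."""
    return handled_by_humans / total

baseline = intervention_rate(180, 1_000)       # baseline automation
with_context = intervention_rate(95, 1_000)    # context-driven workflows
reduction = (baseline - with_context) / baseline
print(f"human intervention reduced by {reduction:.0%}")  # ~47%
```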
Toward an Interoperable Future
Agentic systems promise to transform enterprise productivity by automating complex, cross-functional tasks. But without an adequate model of institutional knowledge, these systems will remain brittle, prone to breakdowns whenever they step outside the narrow corridors of pre-defined scenarios. The next generation of agents must not only process data but also interpret norms, negotiate exceptions, and remember precedents. By elevating context to the same level of abstraction as tools, data, and models, we enable agents to reason like seasoned professionals rather than rote processors.
Just as the early Internet scaled by layering trust, identity, and protocol negotiation onto foundations like TCP/IP and HTTP, so too must enterprise AI systems encode institutional judgment into their foundation. In the end, agents will only be as capable as the context they collectively share. Institutional knowledge is the core layer without which multi-agent ecosystems cannot scale beyond brittle co-pilots. Treating that knowledge as a first-class construct is the next step that unlocks resilient, interoperable, and enterprise-grade AI at scale.