Agentic design patterns: The missing link between AI demos and enterprise value

The enterprise AI market is currently nursing a massive hangover. For the past two years, decision-makers have been inundated with demos of autonomous agents booking flights, writing code, and analyzing data. Yet, the reality on the ground is starkly different. While experimentation is at an all-time high, deployment of reliable, autonomous agents in production remains challenging. 

A recent study by MIT’s Project NANDA highlighted a sobering statistic: roughly 95% of enterprise generative AI pilots fail to deliver measurable bottom-line value. Projects hit walls when moved from the sandbox to the real world, often breaking under the weight of edge cases, hallucinations, or integration failures.

According to Antonio Gulli, a senior engineer at Google and the Director of the Engineering Office of the CTO, the industry is suffering from a fundamental misunderstanding of what agents actually are. We have treated them as magic boxes rather than complex software systems. "AI engineering, especially with large models and agents, is really no different from any form of engineering, like software or civil engineering," Gulli said in an exclusive interview with VentureBeat. "To build something lasting, you cannot just chase the latest model or framework."

Gulli argues that the solution to the "trough of disillusionment" is not a smarter model, but better architecture. His recent book, "Agentic Design Patterns," provides repeatable, rigorous architectural standards that turn "toy" agents into reliable enterprise tools. The book pays homage to the original "Design Patterns" (one of my favorite books on software engineering), which brought order to object-oriented programming in the 1990s.

Gulli introduces 21 fundamental patterns that serve as the building blocks for reliable agentic systems. These are practical engineering structures that dictate how an agent thinks, remembers, and acts. "Of course, it's important to have the state-of-the-art, but you need to step back and reflect on the fundamental principles driving AI systems," Gulli said. "These patterns are the engineering foundation that improves the solution quality."

The enterprise survival kit

For enterprise leaders looking to stabilize their AI stack, Gulli identifies five "low-hanging fruit" patterns that offer the highest immediate impact: Reflection, Routing, Communication, Guardrails, and Memory.

The most critical shift in agent design is the move from simple "stimulus-response" bots to systems capable of Reflection. A standard LLM tries to answer a query immediately, which often leads to hallucination. A reflective agent, however, mimics human reasoning by creating a plan, executing it, and then critiquing its own output before presenting it to the user. This internal feedback loop is often the difference between a wrong answer and a correct one.
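In code, the reflection loop is simple to express. The sketch below assumes a hypothetical `call_model()` helper standing in for any LLM client, and its stopping heuristic is deliberately naive:

```python
# Reflection pattern: draft an answer, critique it, and revise before
# responding. `call_model` is a hypothetical stand-in for any LLM client.

def call_model(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM client here")

def reflective_answer(query: str, max_rounds: int = 2) -> str:
    draft = call_model(f"Answer the question:\n{query}")
    for _ in range(max_rounds):
        critique = call_model(
            "Critique this answer for factual errors, missing steps, and "
            "unsupported claims. Reply 'NO ISSUES' if it is sound.\n"
            f"Question: {query}\nAnswer: {draft}"
        )
        if "no issues" in critique.lower():
            break  # the self-check passed; stop iterating
        draft = call_model(
            "Revise the answer to address this critique.\n"
            f"Question: {query}\nAnswer: {draft}\nCritique: {critique}"
        )
    return draft
```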

Once an agent can think, it needs to be efficient. This is where Routing becomes essential for cost control. Instead of sending every query to a massive, expensive "God model," a routing layer analyzes the complexity of the request. Simple tasks are directed to faster, cheaper models, while complex reasoning is reserved for the heavy hitters. This architecture allows enterprises to scale without blowing up their inference budgets. “A model can act as a router to other models, or even the same model with different system prompts and functions,” Gulli said.
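A routing layer can be as thin as a triage step in front of a model registry. In this minimal sketch, the model names and the keyword-based complexity heuristic are purely illustrative; a production router would more likely use a small classifier model:

```python
# Routing pattern: a cheap triage step decides which model handles each
# query. Model names and the keyword heuristic are illustrative placeholders.

CHEAP_MODEL = "small-fast-model"
EXPENSIVE_MODEL = "large-reasoning-model"

def classify_complexity(query: str) -> str:
    """Naive stand-in for a learned router or a small classifier model."""
    hard_markers = ("why", "plan", "analyze", "compare", "prove")
    return "complex" if any(m in query.lower() for m in hard_markers) else "simple"

def route(query: str) -> str:
    """Return the model that should serve this query."""
    if classify_complexity(query) == "complex":
        return EXPENSIVE_MODEL  # reserve the heavy hitter for hard reasoning
    return CHEAP_MODEL          # everything else goes to the fast, cheap tier
```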

Connecting these agents to the outside world requires standardized Communication: giving models access to tools such as search, database queries, and code execution. In the past, connecting an LLM to a database meant writing custom, brittle code. Gulli points to the rise of the Model Context Protocol (MCP) as a pivotal moment. MCP acts like a USB port for AI, providing a standardized way for agents to plug into data sources and tools. This standardization extends to "Agent-to-Agent" (A2A) communication, allowing specialized agents to collaborate on complex tasks without custom integration overhead.
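As a rough illustration, exposing a single capability over MCP can look like the following, assuming the FastMCP helper from the official MCP Python SDK (`pip install mcp`); the `lookup_order` tool and its in-memory "database" are toy placeholders:

```python
# Exposing a database lookup as an MCP tool, sketched with the FastMCP
# helper from the MCP Python SDK. The tool and its backing data are toys.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("orders")

_FAKE_DB = {"A-100": "shipped", "A-101": "pending"}

@mcp.tool()
def lookup_order(order_id: str) -> str:
    """Return the fulfillment status for an order ID."""
    return _FAKE_DB.get(order_id, "unknown order")

if __name__ == "__main__":
    mcp.run()  # any MCP-capable agent can now discover and call the tool
```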

However, even a smart, efficient agent is useless if it cannot retain information. Memory patterns solve the "goldfish" problem, where agents forget instructions over long conversations. By structuring how an agent stores and retrieves past interactions and experiences, developers can create persistent, context-aware assistants. “The way you create memory is fundamental for the quality of the agents,” Gulli said.
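A memory layer can start as simply as a rolling short-term window plus a retrievable long-term store. The sketch below is dependency-free and uses crude word overlap where a real system would use embeddings and a vector index:

```python
# Memory pattern: short-term window plus keyword-retrievable long-term store.
# Word-overlap scoring is a crude stand-in for embedding-based vector search.
from collections import deque

class AgentMemory:
    def __init__(self, window: int = 10):
        self.short_term = deque(maxlen=window)  # recent turns, always in context
        self.long_term: list[str] = []          # everything, retrieved on demand

    def remember(self, turn: str) -> None:
        self.short_term.append(turn)
        self.long_term.append(turn)

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Rank stored turns by shared words with the query."""
        words = set(query.lower().split())
        scored = sorted(
            self.long_term,
            key=lambda t: len(words & set(t.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def context(self, query: str) -> str:
        """Assemble retrieved memories plus the recent window for the prompt."""
        return "\n".join(self.recall(query) + list(self.short_term))
```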

Finally, none of this matters if the agent is a liability. Guardrails provide the necessary constraints to ensure an agent operates within safety and compliance boundaries. This goes beyond a simple system prompt asking the model to "be nice"; it involves architectural checks and escalation policies that prevent data leakage or unauthorized actions. Gulli emphasizes that defining these "hard" boundaries is "extremely important" for security, ensuring that an agent trying to be helpful doesn't accidentally expose private data or execute irreversible commands outside its authorized scope.
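Architecturally, a guardrail is a check that runs before an action executes, with an escalation path instead of a silent failure. The blocked-action list and PII regex below are illustrative stand-ins for a real policy engine:

```python
# Guardrails pattern: pre-execution checks with an escalation path.
# The rules here are illustrative, not a complete compliance policy.
import re

BLOCKED_ACTIONS = {"delete_database", "transfer_funds"}
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # e.g., SSN-shaped strings

class GuardrailViolation(Exception):
    pass

def check_action(action: str, payload: str) -> None:
    if action in BLOCKED_ACTIONS:
        raise GuardrailViolation(f"'{action}' requires human approval")
    if PII_PATTERN.search(payload):
        raise GuardrailViolation("payload appears to contain PII")

def guarded_execute(action: str, payload: str, executor, escalate) -> str:
    try:
        check_action(action, payload)
    except GuardrailViolation as err:
        return escalate(action, payload, str(err))  # route to a human
    return executor(action, payload)
```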

Fixing reliability with transactional safety

For many CIOs, the hesitation to deploy agents stems from fear. An autonomous agent that can read emails or modify files poses a significant risk if it goes off the rails. Gulli addresses this by borrowing a concept from database management: transactional safety. "If an agent takes an action, we must implement checkpoints and rollbacks, just as we do for transactional safety in databases," Gulli said.
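A minimal sketch of the idea, with agent state modeled as a plain dictionary; real systems would checkpoint databases, files, or external services:

```python
# Transactional safety: snapshot state before the agent acts, validate the
# result, and roll back on any anomaly or error.
import copy

def run_with_rollback(state: dict, agent_step, validate) -> dict:
    checkpoint = copy.deepcopy(state)   # snapshot before any mutation
    try:
        new_state = agent_step(state)
        if not validate(new_state):
            raise ValueError("post-action validation failed")
        return new_state                # commit the agent's changes
    except Exception:
        return checkpoint               # roll back to the last safe state
```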

In this model, an agent’s actions are tentative until validated. If the system detects an anomaly or an error, it can roll back to a previous safe state, undoing the agent’s actions. This safety net allows enterprises to trust agents with write access to systems, knowing there is an undo button.

Testing these systems requires a new approach as well. Traditional unit tests check whether a function returns the right value, but an agent might arrive at the right answer via a flawed, dangerous process. Gulli advocates for evaluating Agent Trajectories, metrics that assess how agents behave over time.

“[Agent Trajectories] involves analyzing the entire sequence of decisions and tools used to reach a conclusion, ensuring the full process is sound, not just the final answer,” he said.

This is often augmented by the Critique pattern, where a separate, specialized agent is tasked with judging the performance of the primary agent. This mutual check is fundamental to preventing the propagation of errors, essentially creating an automated peer-review system for AI decisions.
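Put together, trajectory evaluation plus critique can look roughly like the sketch below, where `call_judge_model()` is a hypothetical stub for a separate judge model:

```python
# Trajectory evaluation with a Critique agent: record every step the primary
# agent takes, then have a separate judge review the full sequence of
# decisions, not just the final answer.
import json

def call_judge_model(prompt: str) -> str:
    raise NotImplementedError("wire up a separate judge model here")

def record_step(trajectory: list, tool: str, args: dict, result: str) -> None:
    trajectory.append({"tool": tool, "args": args, "result": result})

def critique_trajectory(task: str, trajectory: list, final_answer: str) -> str:
    return call_judge_model(
        "You are a reviewer. Judge whether each step below was necessary, "
        "safe, and logically sound, then rate the overall process.\n"
        f"Task: {task}\n"
        f"Steps: {json.dumps(trajectory, indent=2)}\n"
        f"Final answer: {final_answer}"
    )
```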

Future-proofing: From prompt engineering to context engineering

Looking toward 2026, the era of the single, general-purpose model is likely ending. Gulli predicts a shift toward a landscape dominated by fleets of specialized agents. "I strongly believe we will see a specialization of agents," he said. "The model will still be the brain... but the agents will become truly multi-agent systems with specialized tasks—agents focusing on retrieval, image generation, video creation — communicating with each other."

In this future, the primary skill for developers will not be coaxing a model into working with clever phrasing and prompt engineering. Instead, they will need to focus on context engineering: the discipline of designing information flow, managing state, and curating the context that the model "sees."
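A toy sketch of the shift: instead of tuning wording, the engineer writes the assembly logic that decides what enters the model's window. The character-based budget below is a rough stand-in for real token counting:

```python
# Context engineering: assemble the prompt from curated sources under a
# budget, rather than hand-tuning phrasing. Budget figures are illustrative.
def build_context(system_rules: str, retrieved_docs: list[str],
                  recent_turns: list[str], query: str,
                  budget_chars: int = 16000) -> str:
    parts = [system_rules, query]
    used = sum(len(p) for p in parts)
    for item in retrieved_docs + recent_turns:  # caller ranks by relevance
        if used + len(item) > budget_chars:
            continue  # skip what doesn't fit rather than truncating mid-document
        parts.insert(-1, item)  # keep the user query as the final section
        used += len(item)
    return "\n\n".join(parts)
```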

It is a move from linguistic trickery to systems engineering. By adopting these patterns and focusing on the "plumbing" of AI rather than just the models, enterprises can finally bridge the gap between the hype and the bottom line. "We should not use AI just for the sake of AI," Gulli warns. "We must start with a clear definition of the business problem and how to best leverage the technology to solve it."
