
Before you build agentic AI, understand the confused deputy problem

Learn how organizations must think differently about risk in preparation for multi-agent generative AI.

We are seeing a lot of internal, assistance-based AI use cases for:

  • Topic research
  • R&D
  • Code generation
  • Data analysis
  • Marketing
  • and other tasks

What we are not seeing (yet) is a plethora of external, customer-facing use cases (beyond chatbots), where customers are engaging directly with AI.

But this is about to change ... and fast.

In his keynote address at Think 2025, Arvind Krishna, IBM Chairman and CEO, said the time of AI experimentation is over, and organizations are moving quickly to deliver business value through AI.

AI in general is a challenge within the enterprise. Its usefulness is directly correlated with its access to data and its ability to predict useful answers (to provide what we view as intelligence). The challenge is that, as we pivot to more enterprise use cases, the data available to generative AI will increasingly need to include sensitive information. GenAI does not have much power on its own, but when we grant it access to applications that it can act on and through, we begin to see real power through automation.

Agentic AI is a new pattern in which a series of AIs work together to produce a result. Multi-agent generative AI workflows offer even more power by orchestrating fine-grained control over complex tasks. Where straightforward GenAI generally provides broad leverage of data to support a human actor, agentic flows still support humans, but the agents also leverage each other in progressively more complex ways.

Organizations need to be ready to mitigate a number of risks that come with this new wave of AI. This post will explain how a well-known problem in InfoSec — the confused deputy problem — can help you think about risks in the agentic AI era.

»Risks in agentic AI

Up to now, organizations have been cautious about deploying AI externally, and for good reasons. AI is incredibly powerful, but it is not perfect. Mistakes can leave a customer dissatisfied with their experience. And there is a risk that a bad actor can take advantage of what the AI system is doing.

»The confused deputy problem

The confused deputy problem is a good illustration. It occurs when a user, application, or machine tricks a higher-privileged entity (the deputy) into exposing sensitive data, performing a damaging action, or allowing unauthorized access to restricted functions. This risk is far more obvious in GenAI work. Not only is GenAI effective at surfacing credentials and other secrets, but AI is built to be helpful. Imagine a demo where we import an email archive into an OpenAI engine and, in minutes, the AI helps us isolate cloud credentials sitting in the archive. This is a compelling example of how I can 'confuse my AI deputy' into showing me information it should never hand over.
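To make the pattern concrete, here is a minimal Python sketch of a confused deputy in an AI tool-calling setup. Everything here is hypothetical: the `billing_api_lookup` function, the service token, and the records are made up for illustration. The point is that the agent fulfills requests using its own broad credentials rather than checking what the requesting user is actually allowed to see.

```python
# Hypothetical sketch of a confused deputy in an AI tool-calling loop.
# The service token, API, and records below are illustrative only.

SERVICE_TOKEN = "svc-token-with-broad-read-access"   # the agent's own privilege

CUSTOMER_RECORDS = {
    "cust-001": {"owner": "alice", "card_last4": "4242"},
    "cust-002": {"owner": "bob",   "card_last4": "9911"},
}

def billing_api_lookup(customer_id: str, token: str) -> dict:
    """Backend trusts any caller presenting the service token."""
    if token != SERVICE_TOKEN:
        raise PermissionError("invalid token")
    return CUSTOMER_RECORDS[customer_id]

def confused_agent(user: str, user_request: str) -> dict:
    # The agent "helpfully" fulfills the request with ITS credentials,
    # never asking whether `user` is allowed to see this record.
    customer_id = user_request.split()[-1]           # naive request parsing
    return billing_api_lookup(customer_id, SERVICE_TOKEN)

def safer_agent(user: str, user_request: str) -> dict:
    customer_id = user_request.split()[-1]
    record = billing_api_lookup(customer_id, SERVICE_TOKEN)
    if record["owner"] != user:                      # enforce the caller's rights
        raise PermissionError(f"{user} may not view {customer_id}")
    return record

print(confused_agent("bob", "show me the card on file for cust-001"))  # leaks Alice's data
# safer_agent("bob", "show me the card on file for cust-001")          # raises PermissionError
```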

The risk associated with confused deputy attacks is especially high in multi-agent generative AI systems: more agents means more applications, more data, and a chain of system calls that often run under the agents' own permissions rather than those of the user who made the request.

Comparison of single-agent vs. multi-agent architectures (source: IBM)

»Multi-agent generative AI and its risks

Think about how multi-agent generative AI works. Let’s use software development as an example. I can create an agentic pattern where…

  1. One agent writes code
  2. Another agent critiques the code for errors
  3. A third agent analyzes the code for best practice coding standards

In this scenario, two agents check another agent's work. This is a fairly safe usage because a human still gets involved to test, approve, and promote the code. As the developer, I act as a buffer for any mistakes this AI chain might make. This can be very productive and low risk.
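A minimal sketch of that three-agent chain might look like the following Python. The `call_model` function is just a placeholder for whatever LLM client you actually use; the agent names and prompts are assumptions for illustration.

```python
# Minimal sketch of the writer / critic / standards pattern described above.
# `call_model` is a stand-in for a real LLM call and returns canned text here.

def call_model(role: str, prompt: str) -> str:
    """Placeholder for an LLM call; swap in your actual client."""
    return f"[{role} output for: {prompt[:40]}...]"

def writer_agent(task: str) -> str:
    return call_model("writer", f"Write Python code that does: {task}")

def critic_agent(code: str) -> str:
    return call_model("critic", f"Find bugs and logic errors in:\n{code}")

def standards_agent(code: str) -> str:
    return call_model("standards", f"Check style and best practices in:\n{code}")

def review_pipeline(task: str) -> dict:
    code = writer_agent(task)
    return {
        "code": code,
        "critique": critic_agent(code),
        "standards": standards_agent(code),
        # A human still tests, approves, and promotes the result.
    }

if __name__ == "__main__":
    print(review_pipeline("parse a CSV of trades and total the notional value"))
```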

But what about another example, such as financial trades? I can create an agentic pattern where…

  1. An AI agent recommends a trade
  2. Another agent critiques it
  3. A third checks for SEC violations, etc.

… and finally, an algorithmic trading decision is executed. It's very productive but very high risk because there's no human in the loop. (Humans are also risky, but that is a topic for another dialogue.) In this example, as an institutional trading firm, I am far more likely to stick with human traders managing an AI assistant because of the overwhelming risk a mistake poses to profitability.

These multi-agent patterns are interconnected and complex: one AI calling another AI, calling another AI, and so on. Each step in the workflow increases the attack surface. All a bad actor needs to do is convince one agent to grant rights they shouldn't have, and now, potentially, there is a cascading set of vulnerabilities.

»Laying a more secure foundation

With agentic AI systems, credentials are used everywhere to grant and restrict privileges. They are used for every API call to a third-party application, and there are credentials for every user, every application, and every AI agent interacting with another AI agent. It is a complex web that, if not managed well, can lead to a sprawling estate of secrets.

To prepare for this, organizations need dynamic environments. That way, if a problem is discovered (an AI agent is misused, a confused deputy scenario occurs, or sensitive data is exposed), organizations can quickly tear down and destroy compromised environments and rebuild them, this time with tighter controls.

This requires automation. When the delivery of systems is automated end to end using predefined workflows, infrastructure as code, and identity-based security, organizations can easily shut down, move, destroy, and rebuild environments, and add policies and credentials, without requiring much human involvement. Attack surfaces are not necessarily reduced, but risk is mitigated, and mean time to resolve (MTTR) is significantly improved and easier to measure through automation.
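As a rough sketch of what that automation can look like, the following Python assumes a Terraform configuration already exists for the environment and that it exposes a hypothetical `agent_policy_version` variable; it simply destroys the environment and rebuilds it with a tighter policy. This is illustrative, not a prescribed workflow.

```python
# Hedged sketch: rebuilding a compromised environment with infrastructure as code.
# Assumes a Terraform configuration in `env_dir`; the `agent_policy_version`
# variable name is hypothetical and exists only for this example.

import subprocess

def rebuild_environment(env_dir: str, tightened_policy: str) -> None:
    """Destroy and recreate an environment, applying a stricter policy version."""
    subprocess.run(["terraform", "destroy", "-auto-approve"], cwd=env_dir, check=True)
    subprocess.run(
        ["terraform", "apply", "-auto-approve",
         f"-var=agent_policy_version={tightened_policy}"],
        cwd=env_dir,
        check=True,
    )

# Example (assuming the directory and variable exist in your configuration):
# rebuild_environment("./envs/support-agent", "v2-least-privilege")
```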

Consider an AI support aid for customer service representatives. I would like the ability to build the environment with the latest trained models, but I would also like to cascade the appropriate user permissions through the agentic chain as data (RAG sources, databases, archives, and other stores) is requested, so that unauthorized data is never even available to the AIs in question. This kind of guardrail is a 'first principles' style of proactivity that yields tremendous downstream benefits by improving the cost and risk profile of everything you do with AI.
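Here is an illustrative, entirely hypothetical sketch of that idea: the user's role travels with every retrieval call, so only documents that user is entitled to see are ever placed in front of the model. The document store, roles, and entitlement map are made up for the example.

```python
# Illustrative sketch of propagating the end user's permissions through an
# agent chain so retrieval is filtered BEFORE any model sees the data.
# The documents, roles, and entitlements below are hypothetical.

from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    classification: str   # e.g. "public", "internal", "restricted"
    text: str

DOC_STORE = [
    Document("kb-1", "public",     "How to reset a customer password"),
    Document("kb-2", "internal",   "Escalation contacts for billing disputes"),
    Document("kb-3", "restricted", "Fraud-investigation playbook"),
]

USER_ENTITLEMENTS = {
    "csr_tier1": {"public"},
    "csr_tier2": {"public", "internal"},
}

def retrieve(query: str, user_role: str) -> list[Document]:
    """Only documents the requesting user may see are ever returned."""
    allowed = USER_ENTITLEMENTS.get(user_role, set())
    return [d for d in DOC_STORE
            if d.classification in allowed and query.lower() in d.text.lower()]

def support_agent(query: str, user_role: str) -> str:
    context = retrieve(query, user_role)   # the user's identity travels with the call
    # The model prompt would be built only from this pre-filtered context.
    return f"Answering '{query}' using {[d.doc_id for d in context]}"

print(support_agent("billing", "csr_tier1"))   # internal docs never reach the model
print(support_agent("billing", "csr_tier2"))   # kb-2 is available to this role
```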

The world is becoming more dynamic, not less. If organizations aren’t already using dynamic infrastructure unlocked with dynamic secrets, they risk falling behind. The agentic AI push is going to continue at pace, and if we want to use it experimentally and get value from it, we need to be able to control the risks surrounding it.

»Prepare for the AI era

The Infrastructure Cloud from HashiCorp is a unified platform that helps organizations make the most of their cloud investments and prepare to mitigate the security risks associated with AI. It lets you manage the full lifecycle of your infrastructure and security resources through infrastructure as code, policy as code, dynamic secrets management, and automated workflows that create safe, dynamic environments.

If you’re tired of technology and process fragmentation in your organization, learn how a platform-based approach to security and infrastructure automation can transform your teams for the better. Read Do cloud right with The Infrastructure Cloud.
