The Accountability Gap: Who Owns What an AI Agent Does?

In October 2023, New York City launched an AI chatbot to help small business owners navigate city regulations. Mayor Eric Adams called it “a once-in-a-generation opportunity to more effectively deliver for New Yorkers.”

Within months, investigators discovered the chatbot was telling businesses to break the law. It advised employers they could take a cut of workers’ tips (illegal under New York labor law). It told landlords they didn’t need to accept housing vouchers (source-of-income discrimination has been illegal in New York since 2008). It said businesses could refuse cash payments (prohibited by city law since 2020).

The city’s response was instructive. Officials added a disclaimer advising users not to rely on the bot for legal advice. They tweaked some answers. But the chatbot stayed live, continuing to dispense inaccurate information for nearly two years. When journalists asked the bot itself whether it could be used for professional business advice, it confidently replied: “Yes, you can use this bot for professional business advice.”

In January 2026, incoming Mayor Zohran Mamdani announced the chatbot would be shut down, describing it as “functionally unusable” and a symbol of “money spent while refusing to account for the actual costs of what these programs are.”

Here’s what makes this case worth examining: when asked who was responsible for the chatbot’s failures, there was no clear answer. Microsoft built the underlying technology. The city’s IT department deployed it. The Department of Small Business Services owned the content it was trained on. The mayor’s office promoted it. Multiple contractors worked on the project. The previous administration made the decisions; the current administration inherited the mess.

No single party owned the outcome. And when no one owns a decision, everyone pays for the consequences.

Who Is Responsible?

This article is the third in a series about deploying AI agents in business contexts. The first, “The Context Problem,” explored what agents need to operate safely: explicit boundaries, clear policies, and genuine oversight rather than approval theater. The second, “The Multiplier Problem,” examined how agent failures scale with agent output, and why multiplication requires something worth multiplying.

This piece asks the question that follows naturally from those two: when an agent takes action and something goes wrong, who is responsible?

The answer matters more than it might seem. Without clear accountability, problems get discovered late. Remediation moves slowly. Learning doesn’t happen. Trust erodes, both internally and with customers. And in an environment where regulators are paying close attention, unclear accountability translates directly into legal and compliance exposure.

The core argument here is simple: accountability cannot be figured out after failure. It must be designed into agent deployments from the start.

How Accountability Disappears

Consider how responsibility diffuses in a typical agent deployment.

The vendor built the capability. IT deployed the infrastructure and manages access. A business unit defined the agent’s scope and configured its behavior. Individual users interact with the agent and trigger its actions. A manager somewhere approved the deployment. Compliance reviewed the project (maybe) at some point before launch.

Each party has legitimate involvement. None has complete control. And when something goes wrong, each can point to the others.

The vendor says: we built the technology, but we don’t control how it’s configured or deployed. IT says: we manage the infrastructure, but we don’t own the business decisions about what the agent should do. The business unit says: we set the parameters, but we can’t control what the AI actually outputs. The users say: we just asked questions and followed the answers. The manager says: I approved the general concept, not every individual interaction.

This isn’t negligence. It’s the natural result of deploying new technology into organizational structures that weren’t designed for it. Traditional accountability frameworks assume human decision-makers at identifiable points. They assume intent can be evaluated, judgment can be assessed, and responsibility follows authority. Agent systems break these assumptions. The agent itself has no intent to evaluate. Its “judgment” operates differently than human judgment. Authority is diffuse by design.

The NYC chatbot illustrates this perfectly. The technology came from Microsoft. The content came from city agencies. The deployment decisions came from multiple offices. When the chatbot told a business owner something incorrect, who exactly was supposed to catch that? Who was watching? Who owned the quality of the output?

The answer, apparently, was nobody.

“Human in the Loop” Isn’t Accountability

Organizations often address this problem by adding human oversight. Someone reviews agent outputs. Someone approves actions before they execute. There’s a “human in the loop.”

But as I argued in “The Context Problem,” much of what passes for human oversight is really approval theater. Having a human touch the process doesn’t create accountability if that human can’t meaningfully evaluate what they’re approving.

This is especially true at scale. An agent handling hundreds or thousands of interactions per day can’t have each one genuinely reviewed by a human. The math doesn’t work. So what happens instead is sampling (which catches only a fraction of problems), spot checks (which create false confidence), or perfunctory approval (which exists mainly as a liability shield).

Recent research from Asana found that 33% of knowledge workers don’t know who to contact when an AI-related issue arises. A third of them. When something goes wrong, a third of the people working with these systems have no idea who’s responsible for fixing it.

This isn’t a training problem. It’s an accountability architecture problem. The systems were deployed without clear ownership, and no amount of documentation fixes that fundamental gap.

True accountability means someone owns the decision itself, not just the process around it. There’s a difference between “I approved this workflow” and “I am responsible for whether this workflow produces good outcomes.” The first is process accountability. The second is outcome accountability. Most organizations have plenty of the first and almost none of the second when it comes to agent deployments.

Existing Frameworks Don’t Map Cleanly

Legal liability frameworks, compliance structures, and organizational accountability systems were designed for a different era. They assume decisions have identifiable decision-makers. They assume you can evaluate intent, assess judgment, and trace responsibility to specific individuals or roles.

The Air Canada chatbot case from 2024 is illustrative. A customer, grieving his grandmother’s death, asked the airline’s chatbot about bereavement fares. The chatbot told him he could book a regular ticket and apply for a partial refund within 90 days. The customer did exactly that. The airline refused the refund because its policy didn’t, in fact, allow retroactive applications.

When the case reached the British Columbia Civil Resolution Tribunal, Air Canada made a striking argument: it claimed the chatbot was essentially a separate legal entity responsible for its own actions, and that the customer should have verified the chatbot’s information against other parts of the website.

The tribunal rejected this as “remarkable.” The ruling stated: “While a chatbot has an interactive component, it is still just a part of Air Canada’s website. It should be obvious to Air Canada that it is responsible for all the information on its website. It makes no difference whether the information comes from a static page or a chatbot.”

This ruling established an important principle: companies are liable for what their AI tools tell customers. But it also revealed how unprepared existing frameworks are for these questions. Air Canada’s legal team apparently thought they could argue the chatbot was its own entity. They thought this might work. That tells you something about the state of accountability thinking around AI.

And the Air Canada case was relatively simple: one company, one chatbot, one customer, one interaction. Enterprise agent deployments are far more complex. Multiple systems, multiple decision points, interactions that span days or weeks, outputs that feed into other processes. Tracing a failure back to its source in that environment isn’t just legally complicated; it may be technically impossible without infrastructure that most organizations haven’t built.

What Unclear Accountability Costs

The practical consequences of accountability gaps compound over time.

Problems get discovered late because no one is watching. When an agent produces incorrect output, how long until someone notices? If no one owns the quality of agent decisions, errors can propagate for days, weeks, or months before surfacing. The NYC chatbot dispensed illegal advice for nearly two years. Air Canada’s chatbot was giving customers bad information until someone finally took them to tribunal over it.

Remediation moves slowly because no one owns the fix. When problems do surface, unclear accountability creates coordination overhead. Who decides what to do? Who has authority to change the agent’s behavior? Who tests the fix? Organizations without clear ownership waste time in meetings figuring out who should act while the problem continues.

Learning doesn’t happen because no one is responsible for improving. Accountability isn’t just about fixing individual failures; it’s about getting better over time. An agent deployment without clear ownership doesn’t improve. The same classes of errors recur. The same gaps persist. There’s no feedback loop because there’s no one to close it.

Trust erodes, internally and externally. Employees stop relying on agents they don’t trust. Customers lose confidence in companies whose agents mislead them. The Asana research found that 62% of knowledge workers see AI agents as unreliable, and six in ten say their jobs are made harder by “confidently wrong outputs.” That trust deficit has a cost. It slows adoption, increases oversight burden, and limits the value organizations can capture from these technologies.

Regulatory and legal exposure increases. The EU AI Act’s high-risk provisions take effect in August 2026, with penalties up to 35 million euros or 7% of global revenue. Multiple U.S. states have enacted AI governance requirements. Regulated industries face sector-specific rules. All of these frameworks assume organizations can answer basic questions: Who made this decision? Why? What information influenced it? Who is responsible? Organizations that can’t answer those questions face not just reputational risk but regulatory liability.

Designing Accountability In

The solution isn’t to figure out accountability after deployment. It’s to build accountability into the design from the start.

This requires thinking differently about agent systems. The core principle: AI can contribute thinking, but deterministic systems should retain authority over decisions that matter. This means separating what the agent proposes from who decides to act on it.

Every decision type needs an identified owner. Not a team. Not a role. An individual who is accountable for outcomes, not just process. For low-stakes decisions, this might be the end user. For high-stakes decisions, it should be someone with appropriate authority and expertise. The key is that ownership is explicit and known before deployment, not discovered during a post-mortem.

Ownership means responsibility for outcomes, not just approvals. The person who owns a decision type is responsible for whether decisions of that type are good, not just whether the approval checkbox was checked. This is a meaningful distinction. It’s the difference between “I reviewed this” and “I stand behind this.” True owners care about quality because they bear the consequences.

Escalation paths must handle what doesn’t fit. No framework anticipates every scenario. The question isn’t “who owns every possible decision?” but “what happens when a decision doesn’t fit existing ownership categories?” Clear escalation paths ensure that novel situations reach people with authority to handle them rather than falling through cracks or defaulting to the agent’s judgment.

Authority must match responsibility. A common failure pattern: someone is nominally responsible for agent quality but lacks authority to change agent behavior, access to agent logs, or budget to improve agent training. Accountability without authority is theater. Real accountability requires the power to act.

Organizations already do this for other types of decisions. Consider signing authority for contracts. Most organizations have clear rules about who can commit the company to agreements of different sizes. A junior employee can’t sign a million-dollar deal without appropriate approvals. This isn’t bureaucracy for its own sake. It’s accountability architecture. Agent deployments need the same clarity about what decisions require what level of authority and who specifically holds that authority.
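The signing-authority analogy can be made concrete. Below is a minimal sketch of a decision-ownership registry, assuming a simple model where each decision type has one named owner, an authority limit, and an explicit escalation path; all names, types, and limits here are illustrative, not drawn from any real deployment.

```python
from dataclasses import dataclass

# Hypothetical accountability registry: each decision type maps to a single
# named individual (not a team), an authority limit, and an escalation path.
@dataclass(frozen=True)
class DecisionOwnership:
    decision_type: str
    owner: str              # the accountable individual
    authority_limit: float  # max value the owner may approve
    escalates_to: str       # who handles decisions above the limit

REGISTRY = {
    "refund": DecisionOwnership("refund", "a.rivera", 500.0, "cfo"),
    "contract": DecisionOwnership("contract", "j.chen", 50_000.0, "general_counsel"),
}

def required_approver(decision_type: str, value: float) -> str:
    """Return who must approve a proposed agent action.

    Decisions above the owner's authority limit escalate; decision types
    with no registered owner route to a default escalation path rather
    than defaulting to the agent's judgment.
    """
    entry = REGISTRY.get(decision_type)
    if entry is None:
        return "governance_board"
    return entry.owner if value <= entry.authority_limit else entry.escalates_to
```

The point of the sketch is that ownership, limits, and escalation are declared before deployment and checked on every action, rather than reconstructed during a post-mortem.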

Ownership must be maintained over time. Organizations change. People leave. Roles evolve. Agents get reconfigured. Accountability frameworks that aren’t actively maintained decay. The person who owned the agent at launch may have moved on. The original configuration may have drifted. Regular reviews ensure that accountability remains clear as context changes.

The Questions That Matter

Before deploying an agent, organizations should be able to answer several questions clearly. Inability to answer them isn’t a sign that more planning is needed. It’s a sign that the deployment isn’t ready.

Who owns decisions of this type? Not “which team is involved” but “which individual is accountable for the quality of these decisions.” If the answer is unclear or involves multiple people with shared responsibility, accountability is already diffusing.

What happens when a decision doesn’t fit existing ownership? Every framework has edge cases. What’s the escalation path? Who has authority to make judgment calls? How quickly can escalation happen?

Who is responsible for outcomes, not just approvals? Process accountability is necessary but not sufficient. Who stands behind the actual results? Who cares whether the agent’s outputs are good, not just whether they passed review?

How will we trace decisions back to their sources when problems emerge? When something goes wrong (and eventually something will), can you reconstruct what happened? What information influenced the decision? What was the agent’s state? What did human reviewers see and approve? Without traceability, learning from failures is nearly impossible.
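Traceability of this kind has to be built before problems emerge, not after. One way to sketch it is an append-only decision log; the record fields below are illustrative assumptions about what a reconstruction would need, not a prescribed schema.

```python
import json
import time
from dataclasses import dataclass, field, asdict

# Hypothetical decision record: enough context to reconstruct, after the
# fact, what the agent saw, what it proposed, and who approved it.
@dataclass
class DecisionRecord:
    decision_type: str
    agent_version: str   # which model/configuration produced the output
    inputs: dict         # information that influenced the decision
    proposal: str        # what the agent proposed
    approved_by: str     # the accountable individual, not a team
    timestamp: float = field(default_factory=time.time)

def append_record(log_path: str, record: DecisionRecord) -> None:
    """Append one decision as a JSON line.

    An append-only log keeps the trail intact even as the agent is
    reconfigured or the people around it change roles.
    """
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```

Each record ties an output to its inputs, the agent configuration that produced it, and the person who stood behind it, which is exactly what a post-incident reconstruction needs.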

Who maintains accountability assignments as the system evolves? Accountability frameworks decay without maintenance. Who reviews and updates them? How often? What triggers a review?

These questions aren’t theoretical. They’re practical readiness indicators. Organizations that can answer them clearly have done the work to design accountability in. Organizations that can’t are building accountability gaps into their deployments.

Final Thoughts

The three articles in this series form a connected argument. Agents need explicit boundaries to operate safely. Those boundaries need owners. Those owners need to be identified before deployment, not after failure.

Context, multiplication, accountability: they’re all aspects of the same underlying challenge. Deploying AI agents isn’t just a technology problem. It’s an organizational design problem. The technology enables new capabilities, but capturing value from those capabilities requires new structures for governance, oversight, and responsibility.

Organizations that build these structures now will be positioned to scale agent deployments with confidence. They’ll catch problems early, remediate quickly, learn systematically, and maintain trust with customers and regulators. They won’t spend years discovering their chatbots are giving illegal advice, and they won’t find themselves arguing in tribunal that their AI is somehow responsible for its own actions.

The accountability gap isn’t a technology problem waiting for a technology solution. It’s an organizational design problem waiting for leaders willing to do the work. The organizations that close that gap will have an advantage. Those that don’t will learn through failure, and by then, the costs will have compounded.
