The Agent Census Comes Before Agent Governance
The governance conversation around AI agents is finally becoming serious, but it is still starting one step too late. Most companies want a policy, a control layer, a procurement rule, or a risk framework. Those matter. They just do not answer the first operational question.
What agents are already acting inside the business?
That question sounds basic until someone tries to answer it. A customer-support agent is answering calls. A finance agent is preparing reconciliations. A sales agent is drafting follow-ups. A coding agent is opening pull requests. A workflow builder has three automations with model calls buried inside them. A department head has connected a tool to email, calendar, CRM, and files because the demo saved time on Friday afternoon.
None of that is automatically bad. But none of it is automatically governed just because someone said “human in the loop” in a meeting.
The first governance object is not the policy. It is the roster.
The visible signal
Forbes put the current market anxiety plainly on May 21: “the agents are already running. The governance is not.” The same piece used the phrase “a fleet it cannot fully account for” to describe what companies now hold. That is the phrase to pay attention to. A fleet you cannot account for is not a strategy problem yet. It is an inventory problem.
Writer’s 2026 enterprise AI adoption report points at the same structural tension from another direction. Nearly all executives in its survey said their company had deployed AI agents in the past year, while 79% of organizations still face challenges adopting AI. The gap is not enthusiasm. The gap is operating capacity. Agents are easy to announce and increasingly easy to connect. They are harder to account for across roles, tools, decisions, data access, review gates, and consequences.
Microsoft’s guidance for real-time voice agents gives a more practical clue. A production deployment should be tested for escalation behavior, latency under load, and handoff context preservation before customer traffic is routed to it. That is not abstract AI ethics language. That is operational readiness language. It asks whether the agent can fail safely, move work to the right human, and preserve enough context that the next person does not have to reconstruct the scene from ashes.
Finance and compliance buyers are pulling in the same direction. AI audit-control language is getting more concrete: prompts, inputs, outputs, model and configuration versions, and evidence of human review. In plain English, the system has to leave enough evidence that someone can reconstruct what happened.
This is the shape of the market now. The question is moving from “can agents do work?” to “can we account for the work agents do?”
Why policy-first governance misses the root problem
A policy assumes the organization can identify the thing being governed. That assumption breaks quickly with agents.
An agent is not just a model. It is not just a chatbot. It is a model connected to a job, a set of tools, a memory or state layer, a data boundary, a trigger, a human routing line, and some amount of decision authority. Change any one of those and the operating risk changes.
A harmless drafting assistant becomes a customer-facing actor when it can send. A research helper becomes a compliance concern when it can read private files. A scheduling bot becomes an operational liability when it can move appointments without preserving the reason for the change. A coding agent becomes a production risk when its output can merge, deploy, or alter dependencies.
The policy may say “agents require approval.” Fine. Which agents? Approved by whom? For what job? With what tools? Under what stop condition? Where does the evidence live? Who owns the agent when it fails on a Sunday?
If those questions cannot be answered, the policy is theater. The organization has governed a category in theory while leaving the actual working objects unnamed.
The correct sequence is simpler: census, then control.
What an agent census is
An agent census is a live operating roster of every agent or agent-like automation that can act inside the business.
It does not need to be complicated at the start. In fact, it should be boring enough that people will keep it current. The first version should answer ten questions for each agent:
- What is this agent called?
- Who owns it?
- What job is it supposed to perform?
- What product or output is it expected to create?
- What tools, systems, and data can it access?
- What decisions can it make without approval?
- Where must a human approve, review, or take over?
- What handoff path does it use when it cannot complete the work?
- What logs or evidence does it leave behind?
- What condition stops it?
That list is not bureaucracy. It is how a business turns a vague autonomous actor into a manageable post.
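To make the ten questions concrete, here is a minimal sketch of one roster entry as a Python dataclass. The field names, the `unanswered` helper, and the sample agent name are illustrative assumptions, not a standard schema. The sketch only shows that each question can become a field, and that blank fields can be listed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AgentRecord:
    """One roster entry. Each field answers one census question (names are hypothetical)."""
    name: Optional[str] = None
    owner: Optional[str] = None
    job: Optional[str] = None
    product: Optional[str] = None
    access: Optional[List[str]] = None                      # tools, systems, and data
    decisions_without_approval: Optional[List[str]] = None  # [] means "none", None means "unknown"
    human_gates: Optional[List[str]] = None                 # approve, review, or take over
    handoff_path: Optional[str] = None
    evidence: Optional[List[str]] = None                    # logs it leaves behind
    stop_condition: Optional[str] = None


def unanswered(record: AgentRecord) -> List[str]:
    """Return the census questions that are still blank (None or empty string)."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


# A partially completed entry for a hypothetical drafting agent.
draft_agent = AgentRecord(
    name="sales-followup-drafter",
    owner="revenue lead",
    job="Prepare first-draft renewal follow-ups from approved account data",
    product="first-draft follow-up with CRM references",
    access=["crm:read"],
    decisions_without_approval=[],
    human_gates=["account owner approves before send"],
)

print(unanswered(draft_agent))
# ['handoff_path', 'evidence', 'stop_condition']
```

An empty list counts as an answer ("this agent decides nothing on its own"), while `None` counts as a question nobody has answered yet. That distinction keeps the roster honest about what is known.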
The roster exposes the difference between three things that are often blurred together: workflow, assistant, and agent. A workflow follows defined steps. An assistant helps a human perform a task. An agent carries a job across steps, tools, and decisions with some degree of independence. Public builders are already naming this distinction in their own language: many business processes are really workflows with a few decision branches. That line matters because it protects companies from buying autonomy where structure would work better.
The census prevents the most expensive kind of AI confusion: treating every automation as an agent, then treating every agent as if it should be more autonomous.
The operating protocol
The practical protocol has four passes.
First, find the actors. Search the obvious places: workflow tools, customer-support systems, CRM automations, finance platforms, internal chatbots, coding tools, browser automations, and departmental experiments. Do not start with blame. Start with visibility. People hide shadow AI when the first question sounds like punishment. They disclose it when the first question sounds like operational support.
Second, assign the post. Every agent needs a job description. Not a feature description. A job description. “Draft outbound emails” is not enough. “Prepare first-draft renewal follow-ups from approved account data, route to account owner for approval, and log source records used” is closer. The job should name the product, the inputs, the routing line, and the point where authority stops.
Third, map authority. Access is not the same thing as authority. An agent may be allowed to read a CRM record without being allowed to edit it. It may draft an invoice exception without approving it. It may recommend a support refund without issuing it. These lines need to be explicit because the model will not infer the business’s risk tolerance from vibes.
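The gap between access and authority can be sketched as a three-way check. This is an illustration under assumptions: the `Grant` type, the `decide` function, and the action strings such as `exception:approve` are hypothetical names, not any real platform's permission model.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Grant:
    """What an agent can technically reach versus what it may do alone."""
    access: FrozenSet[str]     # actions its credentials permit
    authority: FrozenSet[str]  # actions delegated without human approval


def decide(grant: Grant, action: str) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a requested action."""
    if action not in grant.access:
        return "deny"            # the agent cannot reach this at all
    if action in grant.authority:
        return "allow"           # reachable and explicitly delegated
    return "needs_approval"      # reachable, but a human must sign off


# Hypothetical finance agent: it may draft an invoice exception but not approve one.
finance_agent = Grant(
    access=frozenset({"invoice:read", "exception:draft", "exception:approve"}),
    authority=frozenset({"invoice:read", "exception:draft"}),
)

print(decide(finance_agent, "exception:draft"))    # allow
print(decide(finance_agent, "exception:approve"))  # needs_approval
print(decide(finance_agent, "refund:issue"))       # deny
```

Keeping the two sets separate forces the explicit line the paragraph asks for. Anything reachable but not delegated routes to a human by default.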
Fourth, require evidence. Agents that actually work leave trails. The trail does not have to be a surveillance museum. It has to be enough to reconstruct important actions: what the agent saw, what it produced, what version or configuration was involved, what it changed, what human approved it, and where it handed off. If the business cannot reconstruct an agent’s action, it does not yet have operational control of that action.
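One way to hold that line is to refuse to record an action unless every reconstruction field is present. The sketch below assumes a JSON-lines log. The field names, the `evidence_entry` helper, and the sample values are hypothetical.

```python
import json
from datetime import datetime, timezone

# Fields needed to reconstruct an action: what the agent saw, what it produced,
# which configuration ran, what it changed, who approved, and where it handed off.
REQUIRED = (
    "agent", "inputs", "output", "config_version",
    "changes", "approved_by", "handed_off_to",
)


def evidence_entry(**facts) -> str:
    """Serialize one agent action as a JSON line, rejecting incomplete records."""
    missing = [key for key in REQUIRED if key not in facts]
    if missing:
        raise ValueError(f"cannot reconstruct action, missing: {missing}")
    record = {"recorded_at": datetime.now(timezone.utc).isoformat(), **facts}
    return json.dumps(record, sort_keys=True)


# Hypothetical example: a flag raised by a finance agent and reviewed by a manager.
line = evidence_entry(
    agent="finance-anomaly-flagger",
    inputs=["invoice fields: amount, vendor, date"],
    output="flag: unusual invoice pattern, with rationale",
    config_version="rules-v3",
    changes=[],                 # the agent changed nothing itself
    approved_by="finance manager",
    handed_off_to=None,         # no handoff was needed for this action
)
print(line)
```

A key may hold `None` or an empty list when nothing happened, but it must be present. The record then says "no handoff occurred" instead of leaving the question open.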
This is where governance becomes real. The policy can now attach to named objects. Risk review can focus on actual authority. Security can evaluate real tool access. Operators can test handoffs and failure modes. Leaders can decide where autonomy is worth it and where a workflow is the better design.
The human mirror
The same pattern exists in human organizations. A company that cannot name who owns a responsibility does not fix the problem by writing a better values statement. It fixes the org chart, the routing lines, and the product expectations.
Agents do not remove that requirement. They make it easier to ignore for a while.
That is why agent governance feels slippery. The business sees output and assumes a role exists behind it. But output is not a role. A role has a post, a product, authority limits, supervision, escalation, and consequences. Without those, the agent is just activity with a friendly interface.
The danger is not that agents become too intelligent. The ordinary danger is that they become operationally ambiguous. Nobody knows whether the agent is a tool, a teammate, a workflow, a recommendation engine, or a decision-maker. That ambiguity is where accountability leaks out.
What this looks like in practice
Imagine a professional-services firm using AI across sales, delivery, and finance. The sales team has an agent drafting follow-ups. Delivery has an agent summarizing client calls and creating action items. Finance has an agent flagging unusual invoice patterns. Leadership wants “AI governance.”
The weak move is to hold a meeting about acceptable AI use and circulate a policy. Useful, but incomplete.
The stronger move is to run the census.
The sales agent is allowed to draft but not send. Its owner is the revenue lead. Its product is a first-draft follow-up with CRM references. Its approval gate is the account owner. Its stop condition is any pricing, legal, or delivery promise.
The delivery agent can summarize calls and create internal tasks, but cannot assign client commitments without a project lead. Its required evidence is the transcript reference and the human who accepted the task.
The finance agent can flag anomalies and prepare rationale, but cannot approve adjustments. It must log the data fields used, the rule or model output that triggered the flag, and the manager’s decision.
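Written down as data, the three entries above might look like the sketch below. Only facts stated in the scenario are filled in. Everything the scenario leaves unnamed is marked `None`, and the dictionary keys and the `gaps` helper are illustrative assumptions.

```python
# None marks a roster field the scenario has not named yet.
ROSTER = {
    "sales agent": {
        "owner": "revenue lead",
        "may_do": ["draft follow-ups"],
        "may_not_do": ["send"],
        "approval_gate": "account owner",
        "evidence": None,
        "stop_condition": "any pricing, legal, or delivery promise",
    },
    "delivery agent": {
        "owner": None,
        "may_do": ["summarize calls", "create internal tasks"],
        "may_not_do": ["assign client commitments without a project lead"],
        "approval_gate": "project lead",
        "evidence": ["transcript reference", "human who accepted the task"],
        "stop_condition": None,
    },
    "finance agent": {
        "owner": None,
        "may_do": ["flag anomalies", "prepare rationale"],
        "may_not_do": ["approve adjustments"],
        "approval_gate": "manager",
        "evidence": ["data fields used", "rule or model output behind the flag",
                     "manager's decision"],
        "stop_condition": None,
    },
}


def gaps(roster):
    """Map each agent to the roster fields that are still unnamed (None)."""
    return {
        agent: [name for name, value in entry.items() if value is None]
        for agent, entry in roster.items()
        if any(value is None for value in entry.values())
    }


print(gaps(ROSTER))
# {'sales agent': ['evidence'], 'delivery agent': ['owner', 'stop_condition'],
#  'finance agent': ['owner', 'stop_condition']}
```

The gaps are the next census questions to ask. They are not failures, because the roster's first job is to show which answers do not exist yet.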
Now governance has something to govern. The company can test handoffs. It can audit actions. It can improve the agents. It can decide where more autonomy is safe. It can also decide where autonomy is unnecessary because a simple workflow is cheaper and clearer.
That is the difference between AI adoption and AI operations.
Stephen’s operating view
Stephen’s “agents that actually work” frame starts here: an agent is not real operating capacity until it has a defined job, a measurable product, safe authority, clean routing, and review where it matters.
The agent census is the first artifact because it makes the invisible visible. It turns scattered experiments into an accountable operating fleet. It also stops the organization from mistaking enthusiasm for infrastructure.
This is not anti-agent. It is pro-agent in the only way that survives contact with real work.
More autonomy can be useful. But autonomy compounds whatever operating design already exists. If the job is vague, autonomy compounds vagueness. If the handoff is weak, autonomy compounds dropped context. If the logs are missing, autonomy compounds unprovable work. If nobody owns the agent, autonomy compounds orphaned responsibility.
The census does not slow the organization down. It prevents the false speed of deploying actors nobody can account for.
Bottom line
Governance starts with naming what exists.
Before the next control layer, before the next agent platform, before the next executive AI roadmap, make the roster. Name the agents. Name their jobs. Name their owners. Name their authority. Name their handoffs. Name the evidence they leave behind. Name the stop condition.
A company cannot govern a fleet it cannot see.
And if the agents are already running, the census is not paperwork. It is the first act of control.
Stephen Nickerson.
Built for operators who need AI agents they can test, trust, and improve.
