The Protocol Is Not the Operating Model

MCP is becoming the common road system for AI agents. That matters. Roads make movement possible. They do not decide where the business should go, who is allowed to drive, what cargo can move, what gets logged, or when traffic has to stop.

That distinction is getting more important as the agent market matures. Toloka's May 15 summary of the Model Context Protocol roadmap describes a protocol crossing from developer experiment into enterprise infrastructure: remote transports, scalable sessions, async tasks, agent-to-agent communication, governance, audit trails, authentication, gateways, and configuration portability. VentureBeat's May 15 enterprise orchestration piece makes the same larger point from the platform side. The fight is moving away from raw model quality and toward the layer where agents plan, call tools, access data, run workflows, and prove to security teams that they stayed inside what they were allowed to do.

That is the right fight. It is also easy to misunderstand.

A protocol can connect an agent to tools. It cannot give the agent a job description. It cannot decide who owns the outcome. It cannot know which action is low-risk in your business, which one requires human review, which credential should be revoked, or which exception means the agent must stop. Those are not protocol questions. They are operating model questions.

Agents that actually work need both.

The connection layer is finally becoming real

The reason MCP is getting attention is simple: agents need a standard way to reach the world. A model sitting alone in a chat window can generate advice. An agent has to touch calendars, documents, databases, CRMs, ticketing systems, code repositories, search indexes, internal tools, and customer records. Without a connection layer, every agent project becomes a pile of custom glue.

MCP gives the market a shared pattern for that glue. It lets tools describe themselves to agent clients. It gives builders a cleaner way to expose data and actions. It allows ecosystems of servers, clients, directories, and gateways to emerge. That is why the adoption numbers are moving so quickly. Toloka reported that MCP TypeScript and Python SDKs reached 97 million monthly downloads in March 2026, with more than 9,400 public servers and support from major AI providers.

Those numbers do not mean production agent work is solved. They mean the connection layer is becoming legible.

That is still a big step. Once a common protocol exists, teams can stop rebuilding the same adapters over and over. They can expose tools in a more portable way. They can reason about the agent surface as infrastructure instead of treating every integration as a one-off experiment.

But infrastructure creates a new responsibility. The moment the road system becomes easier to use, more traffic appears on it. More tools. More agents. More delegated actions. More credentials. More possible failures that do not look like model errors at all.

The protocol makes movement easier. It also makes operating discipline harder to avoid.

Production problems are showing up in the roadmap

The useful part of the MCP roadmap is not that it sounds advanced. The useful part is that it reveals where real deployments are hurting.

Transport has to evolve because production systems run behind load balancers, proxies, restarts, and horizontal scaling. Sessions have to be resumable because losing a session can mean losing the working context of the agent. Tasks matter because serious workflows do not always finish inside a single request. Streaming matters because large results arrive progressively. Agent-to-agent communication matters because production work is rarely one model talking to one tool in isolation.

Then the roadmap reaches the enterprise readiness problems, and the signal gets sharper.

Toloka lists audit trails as a known blocker: there is no standardized way to log which tools were called, by which agent, with what arguments, and what results were returned. It lists authentication and identity because static secrets are not enough for browser-based agents or enterprise teams. It lists gateway behavior because companies need to know how agent traffic moves through their network infrastructure. It lists configuration portability because a setup that works in one client does not automatically transfer to another.
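To make the audit gap concrete, here is a minimal sketch of what one log entry could capture: the agent, the tool, the arguments, and the result. The schema, the field names, and the example agent and tool names are assumptions for illustration. They are not part of MCP or any vendor's format.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class ToolCallRecord:
    """One audit entry per tool call (hypothetical schema, not defined by MCP)."""
    agent_id: str               # which agent made the call
    tool_name: str              # which tool was called
    arguments: dict[str, Any]   # what arguments were passed
    result_summary: str         # what came back, summarized to keep logs small
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: ToolCallRecord) -> str:
    """Serialize a record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical agent and tool names, used only to show the shape of an entry.
record = ToolCallRecord(
    agent_id="invoice-triage-agent",
    tool_name="crm.update_ticket",
    arguments={"ticket_id": "T-1001", "status": "pending_review"},
    result_summary="ok",
)
line = to_log_line(record)
print(line)
```

One JSON object per line keeps the log easy to append to and easy to search later, which is the point: the record exists for the person who has to reconstruct what happened.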

None of those are glamorous. All of them are what separate a clever demo from operating capacity.

This is the pattern operators should pay attention to. The market keeps discovering that agent work fails in the spaces between capabilities: between model and tool, tool and identity, identity and policy, policy and evidence, evidence and review, review and improvement. The weak point is rarely one dramatic AI mistake. It is the missing operating structure around a system that can now take action.

Agentic means action in a loop

Agentic.ai's current definition is useful because it keeps the line clean. Agentic AI is not a smarter chatbot. It is a system that pursues goals, decides what to do next, takes action through tools, observes the result, and loops until the task is done, blocked, or escalated.

That loop is the product promise. It is also the risk.

A chatbot can be wrong and remain mostly contained. A copilot can suggest something a human may ignore. An agent connected to tools can change records, move files, send drafts, update tickets, call APIs, run code, and trigger other systems. The question is no longer only whether the output was good. The question is what happened while the system was trying to reach the goal.

That is why one of Agentic.ai's scoring dimensions lands harder than it may first appear: a real agent knows when to stop.

Knowing when to stop is not a personality trait. It is an operating design. The agent needs a job boundary. It needs a definition of done. It needs uncertainty thresholds. It needs retry limits. It needs categories of action that always escalate. It needs a way to recognize that a request has left its assigned lane. It needs evidence that a human or another agent can inspect after the fact.
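As a rough sketch, those stop conditions can be written down as data and checked before every action. Everything here is an assumption for illustration: the threshold values, the action names, and the simplification that "done" means the planned steps ran out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StopRules:
    """Hypothetical stop rules for one agent; all values are illustrative."""
    allowed_actions: frozenset = frozenset({"read_ticket", "draft_reply"})
    always_escalate: frozenset = frozenset({"issue_refund", "delete_record"})
    min_confidence: float = 0.7   # uncertainty threshold
    max_failures: int = 2         # retry limit across the run


def stop_reason(rules: StopRules, action: str, confidence: float,
                failures: int) -> Optional[str]:
    """Return why the agent must stop before this action, or None to proceed."""
    if action in rules.always_escalate:
        return f"'{action}' always requires human review"
    if action not in rules.allowed_actions:
        return f"'{action}' is outside the assigned lane"
    if confidence < rules.min_confidence:
        return f"confidence {confidence:.2f} is below threshold"
    if failures >= rules.max_failures:
        return "retry limit reached"
    return None


def run(planned: list[tuple[str, float, bool]], rules: StopRules):
    """Walk (action, confidence, succeeded) steps until done or escalated.

    Returns the outcome plus an evidence trail a reviewer can inspect.
    """
    evidence: list[str] = []
    failures = 0
    for action, confidence, succeeded in planned:
        reason = stop_reason(rules, action, confidence, failures)
        if reason is not None:
            evidence.append(f"STOP: {reason}")
            return "escalated", evidence
        evidence.append(f"ran {action} (ok={succeeded})")
        if not succeeded:
            failures += 1
    return "done", evidence


outcome, trail = run(
    [("read_ticket", 0.9, True), ("issue_refund", 0.95, True)],
    StopRules(),
)
print(outcome, trail)
```

In this sketch the check runs before the action, not after it, so an always-escalate action is never attempted, however confident the agent is.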

If those pieces are missing, the loop is not operational maturity. It is automated drift.

The operating model wraps the protocol

The simplest way to avoid that drift is to separate two artifacts that are often blurred together.

The protocol answers: how can the agent reach tools and data?

The operating model answers: what work is the agent assigned, under whose authority, with which permissions, leaving what evidence, reviewed by whom, and stopped under what conditions?

Both are necessary. They are not interchangeable.

For a small or mid-sized business, the first operating model does not need to be heavy. It can start as a plain-language agent card. One page. One agent. One job. The card names the owner, the trigger, the tools it can read, the tools it can change, the credential or identity it uses, the output it must produce, the log it must leave, the review gate, and the stop rule.

That card sounds simple because it is. It is also the difference between building an agent and accumulating automation fog.
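The card can live on paper, but a small structured version shows how little it takes. The sketch below mirrors the fields named above. The class name, field names, and example values are hypothetical, and treating an empty list of changeable tools as valid (a read-only agent) is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AgentCard:
    """One page, one agent, one job (illustrative structure only)."""
    job: str
    owner: str
    trigger: str
    read_tools: tuple[str, ...]    # tools the agent can read
    write_tools: tuple[str, ...]   # tools the agent can change
    identity: str                  # credential or identity it uses
    output: str                    # output it must produce
    log: str                       # log it must leave
    review_gate: str
    stop_rule: str

    def missing_fields(self) -> list[str]:
        """Names of blank fields. write_tools may be empty for a read-only agent."""
        return [
            f.name for f in fields(self)
            if f.name != "write_tools" and not getattr(self, f.name)
        ]

    def may_change(self, tool: str) -> bool:
        """True only if the card explicitly grants change access to the tool."""
        return tool in self.write_tools


# Example values are invented for illustration.
card = AgentCard(
    job="Draft first replies to inbound support tickets",
    owner="Support lead",
    trigger="New ticket in the support queue",
    read_tools=("helpdesk.read_ticket", "kb.search"),
    write_tools=("helpdesk.add_draft",),
    identity="service account: support-drafter",
    output="One draft reply attached to the ticket",
    log="One entry per tool call in the support audit log",
    review_gate="Human approves every draft before it is sent",
    stop_rule="Escalate on refund requests, legal language, or two failed lookups",
)

print(card.missing_fields())
print(card.may_change("helpdesk.add_draft"), card.may_change("crm.delete_contact"))
```

Keeping read and change access as separate lists is the design choice that matters most here: adding a tool to the second list should require a reason.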

When the team adds MCP servers, the card becomes more important, not less. Each new server expands the surface the agent might touch. Each tool call becomes a real operational event. Each permission needs a reason. Each output needs an owner. Each failure needs somewhere to land.

The businesses that treat the agent card as bureaucracy will eventually have to reconstruct it during an incident. The businesses that write it first get to use it as a design tool.

Governance is not the brake

A lot of teams still hear governance as a synonym for delay. That is understandable. Bad governance is slow, vague, and disconnected from the work. It produces policies no one reads and approval steps that do not improve the outcome.

Good governance is different. Good governance is the operating substrate that lets autonomy increase without making the system less visible.

Toloka's line is the right one: the organizations closing the deployment gap fastest treat evaluation, governance, and human oversight as infrastructure requirements, not afterthoughts. That is not a compliance slogan. It is the practical shape of production AI.

Evaluation tells the team whether the agent's work is getting better. Governance tells the team what authority the agent has and why. Human oversight tells the system where judgment still belongs. When those are built into the workflow, the agent can move faster inside its lane because the boundary is clear.

Salesforce uses different language on its Agentforce page, but the market signal is similar. It emphasizes grounding checks against business policies before responses are sent and human handoff when the agent should not continue. Vendor pages will always frame this in product-friendly terms. The underlying buyer desire is still useful: people want automation that is tied to policy, context, and escalation instead of free-floating intelligence.

That desire points to the real message: agents become trustworthy through designed operating boundaries.

The practical test before expanding agent access

Before a team expands an agent's protocol surface, it should answer six questions.

  1. What job is this agent assigned?
  2. Who owns the outcome?
  3. What tools and data can it read?
  4. What tools and data can it change?
  5. What evidence does every meaningful action leave behind?
  6. What exact condition makes it stop and escalate?

Those questions are not a substitute for technical architecture. They are the minimum operating architecture that tells the technical work where to go.
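One way to make the six questions binding is to treat them as a gate that runs before any new tool connection is approved. The sketch below is a minimal version under stated assumptions: the dictionary keys and function names are hypothetical, and "answered" only means a non-blank answer exists, not that the answer is good.

```python
# The six questions from the list above, keyed by short hypothetical names.
QUESTIONS = {
    "job": "What job is this agent assigned?",
    "owner": "Who owns the outcome?",
    "read_access": "What tools and data can it read?",
    "write_access": "What tools and data can it change?",
    "evidence": "What evidence does every meaningful action leave behind?",
    "stop_condition": "What exact condition makes it stop and escalate?",
}


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the questions that have no answer or only a blank one."""
    return [
        question
        for key, question in QUESTIONS.items()
        if not answers.get(key, "").strip()
    ]


def may_expand_access(answers: dict[str, str]) -> bool:
    """Allow new tool connections only when all six questions are answered."""
    return not unanswered(answers)


# Invented example: four questions answered, two left open.
draft_answers = {
    "job": "Reconcile vendor invoices against purchase orders",
    "owner": "Accounts payable manager",
    "read_access": "Invoice inbox, purchase order database",
    "write_access": "Reconciliation notes field only",
    "evidence": "",
}

print(may_expand_access(draft_answers))
for question in unanswered(draft_answers):
    print("Still open:", question)
```

A check like this cannot judge the quality of an answer. It only makes a missing answer visible before the access is granted instead of after an incident.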

If the team cannot answer them, adding more MCP servers will not create a better agent. It will create a wider blur. The agent may become more capable in the abstract while becoming less manageable in the business.

If the team can answer them, the protocol becomes powerful. Tool connections have a purpose. Permissions have a boundary. Logs have a reader. Review gates have a reason. Stop rules protect both the business and the agent from being asked to do work it was never designed to carry.

That is how autonomy earns its next layer.

The protocol is the road; the operating model is the work

The agent market is going to keep building roads. MCP will get better. Enterprise gateways will mature. Agent communication will become less awkward. Runtimes will become stickier. Vendors will compete to own the place where workflows, memory, permissions, audit logs, and monitoring live.

That is all useful. None of it removes the operator's job.

The operator's job is to define work clearly enough that a non-human worker can help without becoming invisible. Name the job. Name the owner. Name the access. Name the evidence. Name the review gate. Name the stop rule. Then choose the protocol and platform that make those decisions executable.

A protocol can make an agent reachable. An operating model makes it responsible.

That is the distinction. Once it is visible, the hype gets quieter and the work gets simpler.

Sources

  • Toloka, "The future of MCP: 2026 roadmap, enterprise adoption, and what comes next," May 15, 2026: https://toloka.ai/blog/the-future-of-mcp-enterprise-adoption/
  • VentureBeat, "Claude's next enterprise battle is not models: it's the agent control plane," May 15, 2026: https://venturebeat.com/orchestration/claudes-next-enterprise-battle-is-not-models-its-the-agent-control-plane
  • Agentic.ai, "What Is Agentic AI?", updated May 18, 2026: https://agentic.ai/what-is-agentic-ai
  • Salesforce, "Best AI Agents: A Guide to the Leading Autonomous Platforms," accessed May 18, 2026: https://www.salesforce.com/agentforce/ai-agents/best-ai-agents/

Stephen Nickerson.
Built for operators who need agents they can test, trust, and improve.