The new promise of AI agents is not just that they answer. It is that they keep working when nobody is sitting in the chair.

OpenAI’s current Codex positioning makes the shift explicit. Codex is described as a command center for multi-agent coding, with worktrees, cloud environments, and automations for “always-on background work” like issue triage, alert monitoring, CI/CD, and review. That is not a chatbot promise. That is an operating promise: assign work, let the system move, and return when something meaningful is ready.

The promise is real. It is also where many agent projects quietly break.

A background worker creates a new kind of operational surface. If the agent completes the task, everyone sees the magic. If it cannot complete the task, needs approval, lacks context, hits a permission boundary, finds conflicting instructions, or produces something that requires judgment, the system has to know where that partial work goes. Without that answer, autonomy does not remove work. It hides the unfinished part until it becomes backlog.

That hidden backlog is the agent queue problem.

The failure is often downstream of the model

The public language around agents is getting more practical because more people have now tried to place agents inside real workflows. The complaint is not simply that models hallucinate. The sharper complaint is that workflows do not have clean access, reliable actions, notification paths, review states, or escalation rules.

One Reddit discussion, surfaced through public search results, put it plainly: “Biggest issue is not the model, it’s integration.” The same snippet went on to say that agents either do not have clean access to live data or cannot take reliable actions, so teams end up building glue code instead of using them. Another complaint named the stall point even better: “The 10% that needs review piles up silently. Two days later, 47 things are waiting and the whole pipeline is stalled. The workflow needed a notification system before it needed the AI.”

That last sentence is the whole operating lesson.

An agent can draft, classify, summarize, triage, patch, compare, and route. But every workflow has exceptions. A customer record is missing. A policy is ambiguous. A test fails for a reason the agent cannot safely decide. A suggested change touches another team’s code. A message should be sent, but the tone is risky. A document can be prepared, but not filed. A refund can be recommended, but not issued.

None of those are model failures by default. They are queue failures when the system has no visible place for unresolved work.

Always-on work needs visible states

Human teams survive messy workflows because people improvise. Someone remembers that Denise needs to approve refunds over $500. Someone knows the client hates long emails. Someone sees the Slack thread and nudges the engineer who forgot to review the pull request. The workflow looks stable because humans are quietly carrying state in their heads.

Agents expose that hidden coordination. They cannot reliably depend on vibes, hallway memory, or unofficial exceptions unless those things are written down as explicit rules the system can act on. When the agent reaches the edge of its authority, the business has to decide what state the work enters.

A useful agent queue has more states than done and failed.

It needs draft, waiting for data, waiting for approval, needs human judgment, blocked by permission, retry scheduled, escalated, paused, expired, and stopped. Each state should have an owner, a notification path, a time rule, and an evidence trail. Otherwise the agent can appear busy while the business loses sight of the work that matters.
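
As a rough illustration, the states above can be written down as data, with one policy per state. This is a minimal sketch, not a prescribed schema: the names `WorkState`, `StatePolicy`, and `states_without_policy`, plus the owner and channel values, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum
from typing import Dict, List, Optional


class WorkState(Enum):
    DRAFT = "draft"
    WAITING_FOR_DATA = "waiting for data"
    WAITING_FOR_APPROVAL = "waiting for approval"
    NEEDS_HUMAN_JUDGMENT = "needs human judgment"
    BLOCKED_BY_PERMISSION = "blocked by permission"
    RETRY_SCHEDULED = "retry scheduled"
    ESCALATED = "escalated"
    PAUSED = "paused"
    EXPIRED = "expired"
    STOPPED = "stopped"
    DONE = "done"
    FAILED = "failed"


@dataclass(frozen=True)
class StatePolicy:
    owner: str                    # who is accountable while work sits in this state
    notify: str                   # where the alert appears (hypothetical channel name)
    max_age: Optional[timedelta]  # time rule; None means no deadline
    evidence: str                 # artifact that must accompany the item


def states_without_policy(policies: Dict[WorkState, StatePolicy]) -> List[WorkState]:
    """Return every state that nobody has assigned a policy to yet."""
    return [state for state in WorkState if state not in policies]


# Hypothetical starting point: only one state has been thought through.
POLICIES = {
    WorkState.WAITING_FOR_APPROVAL: StatePolicy(
        owner="refund-approver",
        notify="#approvals",
        max_age=timedelta(hours=24),
        evidence="draft refund plus the order record",
    ),
}

gaps = states_without_policy(POLICIES)
```

Here `gaps` lists the states that still have no owner, notification path, time rule, or evidence requirement. An empty list is one plausible readiness check before the agent runs unattended.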

This is the difference between automation and operations. Automation asks whether a task can be performed. Operations asks what happens to the task across its whole life: intake, assignment, execution, exception, review, approval, handoff, evidence, and closure.

Agents that actually work need the second frame.

Security is part of the queue, not a separate department

AWS’s May 2026 AI Security Framework gives operators a useful principle: “You aren’t adding security to AI. You’re building AI on top of security.” The same executive summary tells teams to establish agentic identity and fine-grained access on day one, then harden for production with threat detection, data classification, and AI-specific monitoring.

That advice is usually read as security architecture. It is also queue architecture.

Access rules determine which work the agent may finish and which work must be routed elsewhere. Identity determines whose authority is being used when the agent takes action. Monitoring determines whether unresolved work is visible. Guardrails determine which states are allowed. A kill switch is a queue rule too: under these conditions, stop the worker and surface the issue.
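
One way to picture access rules acting as queue rules is a routing function that turns every attempted action into a state. The sketch below is an assumption-laden illustration: `AgentIdentity`, `route_action`, the action names, and the refund example are hypothetical and are not drawn from the AWS framework or any vendor API.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    may_do: FrozenSet[str]          # actions the agent may finish alone
    needs_approval: FrozenSet[str]  # actions it may prepare but not execute


def route_action(agent: AgentIdentity, action: str, kill_switch: bool = False) -> str:
    """Return the queue state an attempted action should enter."""
    if kill_switch:
        return "stopped"                # stop the worker and surface the issue
    if action in agent.may_do:
        return "proceed"
    if action in agent.needs_approval:
        return "waiting for approval"
    return "blocked by permission"      # default deny: unknown actions are routed, not improvised


# Hypothetical agent that may recommend a refund but not issue one.
refund_agent = AgentIdentity(
    name="refund-agent",
    may_do=frozenset({"read_order", "recommend_refund"}),
    needs_approval=frozenset({"issue_refund"}),
)

print(route_action(refund_agent, "recommend_refund"))
print(route_action(refund_agent, "issue_refund"))
print(route_action(refund_agent, "delete_customer"))
```

The design choice worth noting is the last branch: anything not explicitly granted lands in a visible state instead of being attempted.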

If security is added after the prototype, the queue usually inherits bad assumptions from the demo. The demo agent has broad access so it can look impressive. It uses a human’s credentials because that was convenient. It writes directly to the system of record because a review layer felt slow. It logs vaguely because nobody asked what evidence would be needed later.

Then the demo becomes useful, and the team starts depending on it.

By the time the system is called production, the queue is already full of implied authority. That is when “governance” feels like friction. Not because governance is the problem, but because the operating protocol was missing when the agent first touched real work.

The protocol is simple

The agent queue protocol is the named path for work the agent cannot or should not finish alone.

Before an agent becomes an always-on worker, define seven things.

  1. The work state. What states can a task occupy besides done and failed?
  2. The owner. Who is accountable for each state when the agent stops moving?
  3. The authority boundary. What may the agent read, draft, recommend, change, send, delete, or approve?
  4. The notification rule. When does a human get pulled in, and where does that alert appear?
  5. The evidence trail. What artifact shows what the agent saw, decided, attempted, and changed?
  6. The escalation rule. What happens when the queue ages, repeats, or hits a high-risk condition?
  7. The stop condition. What makes the agent pause rather than improvise?
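
The escalation rule in item 6 can be sketched as a small check run over the queue on a schedule. Everything here is an illustrative assumption: `QueueItem`, `escalation_reasons`, the 24-hour age limit, and the three-attempt limit are hypothetical defaults, not recommended values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class QueueItem:
    item_id: str
    state: str
    entered_state_at: datetime
    attempts: int = 0
    high_risk: bool = False


def escalation_reasons(
    item: QueueItem,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed time rule
    max_attempts: int = 3,                     # assumed retry limit
) -> List[str]:
    """Return why an item should be escalated; an empty list means leave it alone."""
    reasons = []
    if now - item.entered_state_at > max_age:
        reasons.append("aged")
    if item.attempts >= max_attempts:
        reasons.append("repeated")
    if item.high_risk:
        reasons.append("high risk")
    return reasons


now = datetime(2026, 5, 22, 12, 0)
stale = QueueItem("T-1", "waiting for approval", now - timedelta(days=2))
fresh = QueueItem("T-2", "waiting for data", now - timedelta(hours=1))

print(escalation_reasons(stale, now))  # the two-day-old item is flagged as aged
print(escalation_reasons(fresh, now))  # the one-hour-old item is not flagged
```

A check like this only helps if its output goes to the owner named for that state; otherwise the pileup has simply moved into a log.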

Those seven pieces do not need to be complicated. A small business can start with a Trello column, a Slack alert, an owner field, a review checklist, and a simple stop rule. A larger company may need ticketing integration, identity brokering, policy enforcement, audit logs, and monitoring. The scale changes. The shape does not.

The point is to make unfinished work visible before it becomes operational debt.

The first useful agent may be the one that reveals the queue

There is a temptation to begin with the most impressive automation. Let the agent process the inbox. Let it triage every ticket. Let it update the CRM. Let it review every pull request. Let it run in the background.

A better first move is often more boring and more valuable: have the agent map where work currently gets stuck.

Ask it to classify the last 100 tickets by state. Ask where human judgment was required. Ask which approvals were predictable. Ask which records were missing. Ask which tasks waited without anyone noticing. Ask which exceptions repeated. Ask where the system relied on someone remembering a rule that was never written down.
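
Once the tickets have been labeled, the mapping pass is mostly counting. The following sketch assumes the labels already exist: `TicketRecord`, its fields, `summarize`, and the 24-hour threshold for a "silent" wait are hypothetical, and the four sample tickets are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class TicketRecord:
    ticket_id: str
    state: str            # label assigned by the agent or a reviewer
    hours_waiting: float  # time spent in that state
    human_notified: bool  # whether anyone was told it was waiting


def summarize(
    tickets: Iterable[TicketRecord], silent_after_hours: float = 24.0
) -> Tuple[Counter, List[str]]:
    """Count tickets per state and list those that waited without anyone noticing."""
    tickets = list(tickets)
    by_state = Counter(t.state for t in tickets)
    silent = [
        t.ticket_id
        for t in tickets
        if t.hours_waiting > silent_after_hours and not t.human_notified
    ]
    return by_state, silent


# Invented sample data, standing in for a real ticket export.
sample = [
    TicketRecord("T1", "done", 0.0, False),
    TicketRecord("T2", "waiting for approval", 50.0, False),
    TicketRecord("T3", "waiting for approval", 30.0, True),
    TicketRecord("T4", "waiting for data", 2.0, False),
]

by_state, silent = summarize(sample)
print(by_state.most_common())
print(silent)
```

The per-state counts show where work accumulates. The silent list shows where the notification rule is missing.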

That pass turns the agent into a workflow x-ray. It shows the queue that already existed before automation. Then the business can decide which parts should be automated, which parts should be drafted for review, and which parts should stay human because the judgment is the product.

This is how agents move from theater to operating capacity. Not by pretending the work is fully autonomous, but by designing the path for every part that is not.

Bottom line

Always-on agents make background work possible. They also make invisible work dangerous.

If an agent can start work while nobody is watching, the business needs a protocol for the work it leaves behind. That protocol is not extra administration. It is the operating surface that keeps autonomy from becoming a silent pileup.

The agent’s job is not only to complete tasks. Its job is to move work into the right state with the right evidence, owner, authority, and next action.
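
That requirement can be made mechanical by refusing any state change that lacks an owner, evidence, or a next action. This is a minimal sketch under stated assumptions: `WorkItem`, `Transition`, and `move` are hypothetical names, and a real system would persist the history rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Transition:
    at: datetime
    from_state: str
    to_state: str
    owner: str
    evidence: str     # what the agent saw, decided, attempted, or changed
    next_action: str


@dataclass
class WorkItem:
    item_id: str
    state: str = "draft"
    history: List[Transition] = field(default_factory=list)

    def move(self, to_state: str, owner: str, evidence: str, next_action: str) -> None:
        """Change state only when the handoff is fully described."""
        if not (owner and evidence and next_action):
            raise ValueError("a transition needs an owner, evidence, and a next action")
        self.history.append(
            Transition(
                at=datetime.now(timezone.utc),
                from_state=self.state,
                to_state=to_state,
                owner=owner,
                evidence=evidence,
                next_action=next_action,
            )
        )
        self.state = to_state


item = WorkItem("refund-123")
item.move(
    to_state="waiting for approval",
    owner="refund-approver",
    evidence="order record attached; refund amount above the agent's limit",
    next_action="approve or reject the drafted refund",
)
```

After the call, `item.state` is "waiting for approval" and `item.history` holds one record of who owns the work and what should happen next.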

That is when background work becomes trustworthy.

Sources

  • OpenAI, “Codex,” accessed May 22, 2026, https://openai.com/codex/
  • OpenAI Developers, “Codex,” accessed May 22, 2026, https://developers.openai.com/codex
  • AWS Security Blog, “The AWS AI Security Framework: Securing AI with the right controls, at the right layers, at the right phases,” May 15, 2026, https://aws.amazon.com/blogs/security/the-aws-ai-security-framework-securing-ai-with-the-right-controls-at-the-right-layers-at-the-right-phases/
  • Anthropic, “Project Glasswing: Securing critical software for the AI era,” accessed May 22, 2026, https://www.anthropic.com/glasswing
  • Reddit indexed snippets from r/AI_Agents and r/artificial, accessed through public search results May 22, 2026.

Stephen Nickerson.
Built for operators who need agents they can test, trust, and improve.