Agent Action Approval Queue: Build the UI for Reviewing What Your AI Wants to Do
Approval Queue Strategy for Your New SaaS
Goal: Once your in-product AI agent can do real work — send emails, write to a CRM, charge cards, modify records, post to external systems — you need a UI where humans review and approve sensitive proposed actions before they execute. Done well, an approval queue is the difference between "customers trust the agent because they always know what's about to happen" and "customers turned the agent off because it sent the wrong email once." Done badly, the queue becomes a bottleneck that defeats the agent's value, drowns reviewers in noise, or creates a false sense of safety because reviewers click "approve" without reading. Avoid the founder traps of approving all actions individually with a generic modal (reviewer fatigue), batch-approving everything (defeats the purpose), routing every action to the queue (latency kills agent value), or routing nothing to the queue (the first wrong-email incident is a question of when).
Process: Follow this chat pattern with your AI coding tool, such as Claude or v0.app. Pay attention to the notes in [brackets] and replace the bracketed text with your own content.
Timeframe: Tier classification + first approval flow shipped in week 1. Queue UI + batched approval + audit trail in weeks 2-3. SLA tracking + escalation + auto-approval rules in week 4. Continuous tuning of what's auto vs. queued for months after.
Why This Matters Now
You shipped an in-product AI agent. Customers love it for the read-only and low-stakes-write actions. Then someone notices the agent can also send emails / modify records / make payments — and the support escalation that follows ("the AI emailed the wrong CFO; please remove this feature") teaches you that trust is asymmetric: one bad action erases dozens of good ones.
The right answer is not "ship without sensitive actions" (you lose the value) and not "ship with everything auto-executing" (you take the trust hit eventually). The right answer is a tiered approval architecture: low-stakes actions execute immediately; high-stakes actions queue for human approval; critical-stakes actions require additional review. Combined with a good queue UI that makes review fast, accurate, and audit-traceable, this gets you the productivity wins without the trust risk.
This guide assumes you have already shipped In-Product AI Agent Implementation (the broader agent build) and considered AI Memory & Context Retention. Cross-reference Approval Workflows / Multi-Step Routing (similar pattern, non-AI focused), Audit Logs, Customer-Facing Audit Logs, Long-Running Operations / Job Status UI, and Background Jobs / Queue Management. Reference VibeReference: Agent Reliability & Production Operations for the broader operational discipline this UX sits inside.
1. Tier the Actions FIRST
Not every agent action needs the same gating. Tier them explicitly before building any UI.
Help me classify every action my agent can take into the right tier.
The tiers:
**Tier 0: No approval (auto-execute)**
- Read-only operations: search, lookup, retrieve
- Idempotent low-stakes writes: toggle a UI preference, add a tag, mark as read
- Rationale: zero blast radius; reviewing them creates pure friction
**Tier 1: Notify-only (auto-execute, surface what happened)**
- Low-stakes writes: create internal note, add to-do, log activity
- Auto-execute; surface in a "what your agent did" feed
- Reversible by the user with one click
- Rationale: small productivity wins should not be friction-gated
**Tier 2: Soft confirmation (in-flow approval)**
- Medium-stakes writes: schedule meeting, send DM to teammate, update
shared record
- User sees the action proposal in the agent's reply with a "do it" /
"edit" / "cancel" button
- No separate queue; resolved within the conversation
- Latency budget: seconds (user is engaged in the conversation)
**Tier 3: Queued approval (defer to a queue)**
- High-stakes writes: send external email, post to public feed, modify
customer-facing data, large transactions
- Routed to an approval queue; reviewer (the user OR a team member with
approval authority) reviews + approves/rejects
- Latency budget: minutes-to-hours; users / customers know this is async
**Tier 4: Multi-party approval / out-of-band**
- Critical-stakes writes: large financial transactions, account-level
changes, irreversible deletes, sending to high-risk recipients
- Two-person rule: requires two distinct approvers
- Out-of-band confirmation (email link with token, SMS, etc.)
- Time delay: optional cool-off period (e.g., 30 minutes) before execution
- Latency budget: hours
For my product, list every tool / action my agent can call. Help me
classify each into the right tier.
Defaults that founders get wrong:
- "Send email" — almost always Tier 3 (external party affected)
- "Add internal note" — almost always Tier 1 (low blast radius; reversible)
- "Charge card" — Tier 4 (financial; usually two-person + out-of-band)
- "Delete record" — Tier 3 minimum; Tier 4 if irreversible
- "Update shared record" — Tier 2 if the user requested it; Tier 3 if
the agent initiated proactively
- "Reply to customer in shared inbox" — Tier 3 (external party)
- "Schedule a meeting" — Tier 2 if internal; Tier 3 if invites external
Output: a table of [tool_name, default_tier, rationale, override_rules].
Discipline: write this table down once. It is the spec. Every new tool added must be classified before it's enabled.
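A minimal typed sketch of that table, with illustrative tool names (`search_records`, `send_email`, etc. are placeholders for your own registry); the lookup throws when a tool was never classified, which turns the discipline into a mechanical check:

```typescript
type ActionTier = 0 | 1 | 2 | 3 | 4;

interface ToolTierSpec {
  toolName: string;
  defaultTier: ActionTier;
  rationale: string;
  overrideRules?: string; // when the default tier shifts up or down
}

// Illustrative entries only; substitute your agent's actual tool registry.
const TOOL_TIERS: ToolTierSpec[] = [
  { toolName: "search_records", defaultTier: 0, rationale: "Read-only; zero blast radius" },
  { toolName: "add_internal_note", defaultTier: 1, rationale: "Low blast radius; one-click reversible" },
  {
    toolName: "schedule_meeting",
    defaultTier: 2,
    rationale: "Medium stakes; user is in the conversation",
    overrideRules: "Tier 3 if any attendee is external",
  },
  { toolName: "send_email", defaultTier: 3, rationale: "External party affected" },
  { toolName: "charge_card", defaultTier: 4, rationale: "Financial; two-person + out-of-band" },
];

// Enforce the discipline: a tool with no classification cannot be enabled.
function tierFor(toolName: string): ActionTier {
  const spec = TOOL_TIERS.find((t) => t.toolName === toolName);
  if (!spec) throw new Error(`${toolName} has no tier classification; classify it before enabling`);
  return spec.defaultTier;
}
```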
2. The Queue Data Model
A clean schema makes the rest tractable.
Build me the data model for an agent action approval queue.
Tables:
**`agent_action_proposals`**
- id (UUID)
- workspace_id (tenant scope; NEVER let one workspace see another's queue)
- proposing_agent_run_id (FK to your agent_runs table)
- proposed_by_user_id (the customer's user who triggered the agent)
- tool_name (string; the action being proposed)
- tier (enum: 0/1/2/3/4)
- payload (jsonb; the full proposed arguments)
- summary (string; human-readable: "Send email to john@acme.com about
Q3 contract")
- impact_preview (jsonb; the diff / preview of what will change)
- proposed_at (timestamp)
- expires_at (timestamp; proposals stale after N hours)
- status (enum: pending / approved / rejected / executed / failed /
expired / superseded)
- approver_user_id (nullable; populated on resolve)
- approver_action (nullable; "approve" / "reject" / "edit" / "edit-and-approve")
- resolution_at (nullable timestamp)
- rejection_reason (nullable string)
- edited_payload (nullable jsonb; if approver edited the proposal)
- execution_id (nullable; populated when actually executed)
- execution_result (jsonb; the result of the action; success/error/payload)
- escalation_required (bool; true if Tier 4 needed second approver)
- second_approver_user_id (nullable)
- audit_metadata (jsonb; user agent, IP, etc. of the approver)
**`agent_action_proposal_attachments`**
- id, proposal_id, kind (preview_image / diff / sample / context_excerpt),
payload
**`agent_action_proposal_comments`**
- id, proposal_id, author_user_id, body, created_at
- Allows reviewers to ask questions or add context before resolving
**`agent_action_approval_rules`**
- id, workspace_id, tool_name, tier, condition (jsonb), action (auto-approve
/ require-approval / escalate)
- Rules engine: per-workspace overrides ("auto-approve emails to
internal-domain recipients up to $0 cost"; "always require approval for
any send to a non-customer recipient")
**Indexes**:
- (workspace_id, status, proposed_at) for the queue listing
- (workspace_id, approver_user_id) for "my queue"
- (proposed_by_user_id, status) for the proposer to see their pending
proposals
**Tenant isolation**:
- Every query MUST filter by workspace_id; row-level security via Postgres
policies if your stack supports it
- Test: two workspaces; user from A cannot see proposals from B (write the
test before shipping)
Build the migration, the indexes, and the simple repository functions
(create, list-pending, list-mine, get, resolve, escalate).
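A sketch of two of those repository functions, assuming node-postgres (`pg`) and the table above. Note that workspace_id is an explicit parameter on every query, and that resolve only transitions rows still in `pending`, so two reviewers racing on the same proposal cannot both win:

```typescript
import { Pool } from "pg";

const pool = new Pool(); // connection config from PG* environment variables

// Queue listing: highest-stakes oldest first, matching the queue sort order.
export async function listPendingProposals(workspaceId: string, limit = 50) {
  const { rows } = await pool.query(
    `SELECT id, tool_name, tier, summary, proposed_at, expires_at, status
       FROM agent_action_proposals
      WHERE workspace_id = $1
        AND status = 'pending'
      ORDER BY tier DESC, proposed_at ASC
      LIMIT $2`,
    [workspaceId, limit]
  );
  return rows;
}

// Resolve atomically: the WHERE clause requires status = 'pending', so a
// second reviewer's concurrent approve/reject updates zero rows and fails loudly.
export async function resolveProposal(
  workspaceId: string,
  proposalId: string,
  approverUserId: string,
  action: "approve" | "reject",
  reason?: string
) {
  const status = action === "approve" ? "approved" : "rejected";
  const { rowCount } = await pool.query(
    `UPDATE agent_action_proposals
        SET status = $4, approver_user_id = $3, approver_action = $5,
            resolution_at = now(), rejection_reason = $6
      WHERE workspace_id = $1 AND id = $2 AND status = 'pending'`,
    [workspaceId, proposalId, approverUserId, status, action, reason ?? null]
  );
  if (rowCount === 0) {
    throw new Error("Proposal not pending (already resolved, expired, or wrong workspace)");
  }
}
```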
3. The Queue UI: The Reviewer's Experience
This is where most teams under-invest. The queue UI is the product. If reviewing is slow, reviewers approve without reading; the queue is theater.
Build the agent action approval queue UI. The components:
**Queue list view**
- Default: "My queue" (proposals routed to the current user as approver)
- Tabs: My queue / Team queue / All queues / Resolved
- Each row shows:
- Time queued (relative: "2m ago")
- Tier badge (color-coded: green Tier 1, yellow Tier 2/3, red Tier 4)
- Tool / action name
- Summary line ("Send email to john@acme.com")
- Proposed by (the customer / agent run)
- Status indicator
- Quick-action buttons: Approve / Reject / Open
- Sort: tier desc, then proposed_at asc (highest-stakes oldest first)
- Filters: tool, tier, proposer, date range, status
- Bulk-select: "approve all selected" with count + tier check
**Single proposal review view**
- Header: tool name, summary, tier, time queued, expires_at countdown
- Section: "What the agent proposes to do"
- Plain-English description
- The full payload (collapsed by default; expand for details)
- Section: "Impact preview"
- For an email: rendered email preview (subject, recipient, body)
- For a record update: side-by-side diff (current vs. proposed)
- For a transaction: amount, recipient, currency, scheduled date
- For a delete: what's being deleted; what depends on it; reversibility
- Section: "Why the agent proposes this"
- The agent's reasoning (chain-of-thought / scratchpad excerpt)
- The conversation context that triggered the proposal
- Section: "Reviewer actions"
- Approve (executes immediately)
- Reject (with optional reason)
- Edit and approve (modify payload, then execute the modified version)
  - Defer (come back to it later; re-queues with a tag)
- Escalate (route to a higher-authority approver)
- Section: "Comments / discussion"
- Reviewer can ask questions in-line; tag teammates; gather context
- Section: "Related context"
- Links to: the conversation where this was proposed; the customer's
record; recent similar proposals; relevant policy docs
**Tier 4 multi-party flow**
- After first approver approves, status transitions to "awaiting second
approval"
- Second approver sees: original proposal + first approver's identity +
any comments
- Second approver: approve (executes) / reject (final) / request-changes
(re-queues to first approver with notes)
- Optional cool-off period: timer starts after second approval; can be
cancelled by either approver during the period
**Keyboard shortcuts**
- a → approve
- r → reject
- e → edit
- d → defer
- j/k → next/previous proposal
- shift+enter → batch approve selected
- Reviewers WILL want these; they're approving dozens per day at scale
**Mobile**
- The queue must work on mobile; reviewers will approve on the go
- Compact list view + same single-proposal review on a smaller screen
- One-handed approve / reject with confirmation step
Build me each of these as React components with the data hooks they need.
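A sketch of the keyboard-shortcut hook, assuming React; the handler names are placeholders for whatever your queue view exposes. It ignores keystrokes while the reviewer is typing, so an "a" in a rejection-reason field never approves anything:

```tsx
import { useEffect } from "react";

interface QueueShortcutHandlers {
  onApprove: () => void;
  onReject: () => void;
  onEdit: () => void;
  onDefer: () => void;
  onNext: () => void;
  onPrev: () => void;
  onBatchApprove: () => void;
}

// Global key handler for the queue view. Memoize the handlers object
// (useMemo/useCallback) in the caller to avoid re-subscribing every render.
export function useQueueShortcuts(h: QueueShortcutHandlers) {
  useEffect(() => {
    const onKeyDown = (e: KeyboardEvent) => {
      const target = e.target as HTMLElement;
      // Skip keystrokes while the reviewer is typing in a form field.
      if (/^(INPUT|TEXTAREA|SELECT)$/.test(target.tagName) || target.isContentEditable) return;
      if (e.key === "Enter" && e.shiftKey) {
        e.preventDefault();
        h.onBatchApprove();
        return;
      }
      switch (e.key) {
        case "a": h.onApprove(); break;
        case "r": h.onReject(); break;
        case "e": h.onEdit(); break;
        case "d": h.onDefer(); break;
        case "j": h.onNext(); break;
        case "k": h.onPrev(); break;
      }
    };
    window.addEventListener("keydown", onKeyDown);
    return () => window.removeEventListener("keydown", onKeyDown);
  }, [h]);
}
```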
Critical UX principle: the proposal view shows EXACTLY what will happen, in the form it will happen. Email previews render the actual email. Record diffs show the actual fields. Don't summarize away the details that the reviewer needs to make a real decision.
4. The Notification Layer
Reviewers won't sit in the queue waiting. Notify them, but not too much.
Build the notification system for the approval queue.
Channels:
- In-app badge on the navigation: count of pending proposals routed to me
- In-app push: real-time update on new proposals in my queue
- Email digest: hourly batch summary (respects each reviewer's opt-out)
- Slack/Teams: optional integration; per-proposal message OR digest
- Mobile push: for Tier 3/4 only; configurable
Notification rules:
- Tier 1: never notify (in-app feed only; user inspects when curious)
- Tier 2: in-flow; no separate notification
- Tier 3: in-app + email digest (hourly)
- Tier 4: in-app + immediate email + optional SMS/push
Per-proposal rules:
- Aged proposals (>1 hour): re-notify with elevated emphasis
- Expiring proposals (<30 min to expiry): high-priority notification
- Critical-stakes proposals (Tier 4 or amount > threshold): immediate
notification regardless of timing
Per-reviewer rules:
- Quiet hours: don't ping outside business hours unless Tier 4
- Out-of-office: route to backup approver automatically; original
reviewer notified on return
- Notification preferences: each reviewer configures their channels and
thresholds
Unsubscribe / mute flows:
- Per-channel mute (no email but still in-app)
- Snooze for N hours
- Permanent opt-out of digest emails (still get critical notifications)
Build me:
- The notification fan-out worker
- The per-user preference schema
- The Slack integration
- The email digest template
- The mobile push registration flow
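The tier-based routing rules above reduce to one small pure function at the heart of the fan-out worker; a sketch, with channel names as assumptions:

```typescript
type Channel = "in_app" | "email_digest" | "email_immediate" | "push" | "sms";

interface NotifyContext {
  tier: 0 | 1 | 2 | 3 | 4;
  quietHours: boolean;      // reviewer is outside business hours
  expiresInMinutes: number; // time remaining before the proposal expires
}

// Encodes the rules above: Tier 1 stays in the feed, Tier 2 is in-flow,
// Tier 3 batches into the digest, Tier 4 pages immediately.
function channelsFor(ctx: NotifyContext): Channel[] {
  if (ctx.tier <= 2) return []; // feed-only or in-flow; no fan-out
  if (ctx.quietHours && ctx.tier < 4) return ["in_app"]; // quiet hours: badge only below Tier 4
  if (ctx.tier === 4) return ["in_app", "email_immediate", "push"];
  // Tier 3: digest by default; escalate when the proposal is about to expire
  return ctx.expiresInMinutes < 30
    ? ["in_app", "email_immediate"]
    : ["in_app", "email_digest"];
}
```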
5. SLA Tracking and Stale Proposal Handling
Proposals that sit too long are a failure mode. Track and act on them.
Build the SLA layer.
Per-tier SLA:
- Tier 2: implicit (in-flow; resolved within seconds-minutes)
- Tier 3: target 4 business hours; expires at 24 hours
- Tier 4: target 1 business hour; expires at 8 hours
Stale handling:
- At 75% of the SLA target: notify the reviewer "this proposal needs your attention"
- At SLA target breach: route to backup approver if configured
- At hard expiry (expires_at reached): status → "expired"; the proposal is dead; agent must re-propose
Customer-facing visibility:
- Customer sees: "your agent has 2 proposals awaiting approval; one expires
in 30 min" in the agent's UI
- Allows the customer to chase their internal approver if needed
Reporting:
- Per-reviewer: median resolution time, approval rate, rejection rate
- Per-tool: % of proposals approved, % rejected, % expired
- Per-tenant: total proposals processed, throughput, bottlenecks
- Surface in admin dashboard for the tenant + globally for ops
Auto-rejection rules (optional, per-tool):
- "If a Tier 3 proposal expires, auto-reject and notify proposer"
- "If a Tier 4 proposal goes 24 hours unresolved, auto-reject"
- Rules are explicit; opt-in per workspace policy
Build me the SLA tracker, the stale-handling job, and the reporting
queries.
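The stale-handling job reduces to a pure state function evaluated per pending proposal; a sketch that uses wall-clock minutes for brevity (a real implementation should count business hours, per the SLA definitions above):

```typescript
interface SlaConfig {
  targetMinutes: number; // e.g. Tier 3: 4 business hours = 240
  expiryMinutes: number; // e.g. Tier 3: 24 hours = 1440
}

type SlaState = "ok" | "needs_attention" | "breached" | "expired";

// Evaluated by the periodic job; each non-"ok" state maps to an action:
// needs_attention → nudge reviewer; breached → route to backup approver;
// expired → mark expired, agent must re-propose.
function slaStateFor(proposedAt: Date, now: Date, cfg: SlaConfig): SlaState {
  const ageMinutes = (now.getTime() - proposedAt.getTime()) / 60_000;
  if (ageMinutes >= cfg.expiryMinutes) return "expired";
  if (ageMinutes >= cfg.targetMinutes) return "breached";
  if (ageMinutes >= cfg.targetMinutes * 0.75) return "needs_attention";
  return "ok";
}
```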
6. Auto-Approval Rules — The Productivity Lever
The queue can become a bottleneck. Auto-approve where the risk is genuinely low, but only with explicit, audit-traceable rules.
Build a rules engine for auto-approval. The principle: a workspace admin
explicitly opts into auto-approval rules; the rules are visible, audit-logged,
and reversible.
Rule shape:
- Scope: per-tool, per-tier
- Condition (jsonb): predicate evaluated on the proposal's payload
- Action: auto-approve / require-approval / escalate
- Created_by, created_at (audit)
- Active (bool; can be disabled without deletion)
- Effective_until (optional expiry; rules can be time-bounded)
Examples:
- "Auto-approve emails to recipients in my domain (matches @acme.com)"
- "Auto-approve CRM updates of fields [notes, last_contacted, tags]
but NOT [owner, account_type]"
- "Require approval for any transaction > $1,000 even if Tier 2"
- "Auto-approve scheduled meetings with internal attendees only"
- "Require Tier 4 for any delete affecting > 100 records"
Rule evaluation:
- On every proposal: evaluate matching rules in order; first match wins
- Result logged on the proposal: which rule matched; what tier the rule
set; whether auto-approved or queued
- Every auto-approved action is in the audit log; auditor can review the
rule that approved it
UI for rules:
- Settings → Approval Rules page in the workspace admin
- List of active rules with edit/disable
- Rule builder: dropdown for tool, condition builder, action selector
- Test mode: paste a sample proposal payload; show what would happen
- "Recent auto-approvals from this rule" link for spot-checking
Discipline:
- Auto-approval rules expand the trust footprint; require admin role
- Audit log of rule creation + modification (separate from proposal log)
- Periodic review: surface rules that haven't matched in 90 days for
pruning; surface rules with high match rate for additional scrutiny
Build me:
- The rule schema
- The rule evaluator (including test mode)
- The Settings → Approval Rules UI
- The audit log entries for rule creation/modification
- A monthly digest for admins: "your auto-approval rules approved N
actions; here are the top patterns"
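A sketch of the evaluator's core, assuming the jsonb condition has already been compiled into a predicate function; first match wins, and the decision records which rule fired so the audit trail can always answer "what rule approved this?":

```typescript
interface ApprovalRule {
  id: string;
  toolName: string;
  active: boolean;
  effectiveUntil?: Date;
  condition: (payload: Record<string, unknown>) => boolean; // compiled from the jsonb predicate
  action: "auto-approve" | "require-approval" | "escalate";
}

interface RuleDecision {
  action: ApprovalRule["action"] | "default"; // "default" = fall through to the tier's normal flow
  matchedRuleId?: string; // persisted on the proposal for the audit trail
}

// First match wins; inactive or lapsed rules are skipped entirely.
function evaluateRules(
  rules: ApprovalRule[],
  toolName: string,
  payload: Record<string, unknown>,
  now = new Date()
): RuleDecision {
  for (const rule of rules) {
    if (!rule.active || rule.toolName !== toolName) continue;
    if (rule.effectiveUntil && rule.effectiveUntil < now) continue;
    if (rule.condition(payload)) return { action: rule.action, matchedRuleId: rule.id };
  }
  return { action: "default" };
}
```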
Critical: auto-approval is the most dangerous feature in this whole article. Make rule creation a deliberate, auditable, admin-only act. The default for new workspaces should be NO auto-approval; rules opt-in over time as trust builds.
7. The Audit Trail
Every approval action is potentially evidence. Log accordingly.
Build the audit trail for all approval-queue activity.
Event types:
- proposal_created (with proposed payload + tier classification)
- proposal_viewed (who viewed it; not blocking; useful for compliance)
- proposal_commented (comment author + body)
- proposal_resolved (approver + decision + decision_at + decision_reason)
- proposal_executed (execution_at + result)
- proposal_failed (execution_at + error)
- proposal_expired (expired_at + reason)
- rule_matched (which rule; what action it took; whether escalated)
- rule_created/updated/deleted (the rule lifecycle)
- escalation_triggered (from whom to whom; reason)
Storage:
- Immutable log; append-only; cannot be edited (only marked retracted with
a reason)
- Per-tenant; queryable by date range, user, tool, action_type
- Retention: per workspace policy; minimum 12 months for SOC 2; 7 years
for regulated industries
- See [Audit Logs](audit-logs-chat.md) for the broader pattern
Customer-visibility (optional):
- The customer admin can see the queue's audit history for their workspace
- Surfaces who approved what, when
- See [Customer-Facing Audit Logs](customer-facing-audit-logs-chat.md)
Reviewer accountability:
- Per-reviewer report: their approvals, their rejections, their average
resolution time
- Surfaces patterns: a reviewer who approves 100% of proposals in <5
seconds is rubber-stamping; that signal needs visibility
- Spot-check: random sample of approvals re-reviewed by a second person
monthly (audit hygiene)
Compliance integration:
- Export audit logs to SIEM (Splunk, Datadog, etc.)
- Per-jurisdiction retention rules
- Search interface for audits / compliance / legal use
Build me:
- The audit-log schema
- The fan-out from proposal events to the audit log
- The reviewer-accountability report
- The spot-check sampling job (random monthly)
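A sketch of the append-only write, assuming node-postgres and a hypothetical `agent_approval_audit_log` table; the module deliberately exposes no update or delete, which is how "append-only" stays true in practice:

```typescript
import { Pool } from "pg";

const pool = new Pool();

type AuditEventType =
  | "proposal_created" | "proposal_viewed" | "proposal_commented"
  | "proposal_resolved" | "proposal_executed" | "proposal_failed"
  | "proposal_expired" | "rule_matched" | "rule_created"
  | "rule_updated" | "rule_deleted" | "escalation_triggered";

// Append-only by construction: this module contains no UPDATE or DELETE.
// Retraction is modeled as a new event referencing the old one, never an edit.
export async function appendAuditEvent(
  workspaceId: string,
  eventType: AuditEventType,
  actorUserId: string | null, // null for system events (expiry, rule match)
  subject: { proposalId?: string; ruleId?: string },
  detail: Record<string, unknown>
) {
  await pool.query(
    `INSERT INTO agent_approval_audit_log
       (workspace_id, event_type, actor_user_id, proposal_id, rule_id, detail, created_at)
     VALUES ($1, $2, $3, $4, $5, $6, now())`,
    [
      workspaceId,
      eventType,
      actorUserId,
      subject.proposalId ?? null,
      subject.ruleId ?? null,
      JSON.stringify(detail),
    ]
  );
}
```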
8. The Reverse Path: Reverting Approved Actions
Sometimes the right answer is "I approved that and I shouldn't have." Build the path.
Build the reverse / undo flow for actions executed via approval.
Per-tool reversibility classification:
- **Reversible**: 1-click undo is possible (e.g., update_record can be
reverted by writing the old payload back)
- **Soft-reversible**: requires manual cleanup (e.g., sent_email — can
send a follow-up correction; the original send is permanent)
- **Irreversible**: cannot be undone (e.g., charge_card — refund is a
separate transaction; some records once published are public)
For reversible actions:
- "Undo" button on the executed proposal in the audit log (within 24
hours of execution; longer for soft-reversible)
- Undo is itself an audit event with its own approval flow
(irony noted; the undo is also a sensitive action)
For soft-reversible:
- "Issue correction" flow: proposes a corrective action through the same
approval queue
- Original action is annotated as "corrected by [proposal X]"
For irreversible:
- "Acknowledge irreversible action and escalate" flow: routes to a senior
team member; logs the regret + the corrective steps taken outside the
product
UI:
- Audit log entry has a "Reverse" / "Issue correction" / "Mark for
manual cleanup" button per the action's reversibility class
- Reversal flow shows: what was done, what undoing means, what cannot
be undone
Discipline:
- Approving Tier 3+ actions is consequential; reviewers need a clear
mental model of what's reversible
- Surface reversibility in the proposal review (before the approval) so
reviewers can be appropriately careful
Build me:
- The reversibility metadata on each tool definition
- The "Reverse" UI flow in the audit log
- The corrective-action proposal flow for soft-reversible
- The irreversible-action acknowledgment flow
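A sketch of the reversibility metadata on tool definitions, with placeholder tool names; `buildUndo` constructs the inverse action for the 1-click undo path, and its absence is what tells the UI to offer the correction or acknowledgment flow instead:

```typescript
type Reversibility = "reversible" | "soft_reversible" | "irreversible";

interface ToolDefinition {
  name: string;
  reversibility: Reversibility;
  undoWindowHours?: number; // only meaningful for "reversible"
  // Builds the inverse action for 1-click undo; undefined for the other classes.
  buildUndo?: (executedPayload: Record<string, unknown>) => {
    toolName: string;
    payload: Record<string, unknown>;
  };
}

// Illustrative registry entries (tool names are placeholders):
const TOOLS: ToolDefinition[] = [
  {
    name: "update_record",
    reversibility: "reversible",
    undoWindowHours: 24,
    // Undo = write the previously captured field values back
    buildUndo: (p) => ({
      toolName: "update_record",
      payload: { id: p.id, fields: p.previousFields },
    }),
  },
  { name: "send_email", reversibility: "soft_reversible" }, // correction flow only
  { name: "charge_card", reversibility: "irreversible" },   // refund is a separate transaction
];
```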
9. Anti-Patterns and Failure Modes
- One generic "approve action?" modal for everything. Reviewers stop reading; rubber-stamp; first incident is a question of when.
- No tier classification. Every action treated the same; either over-friction or under-protection.
- Routing all actions through the queue. Latency kills agent value; users disable the agent.
- Routing nothing through the queue. Trust incident is a question of when.
- Auto-approval rules with no audit. Cannot answer "why did the agent send that email?" → "a rule auto-approved it" → "what rule? created when? by whom?" → silence.
- Rubber-stamping detection ignored. Reviewer who approves 100% in <5 seconds = nobody is reviewing.
- No SLA tracking. Proposals sit; expire; nothing happens; users frustrated.
- No reverse path. "I approved that and I shouldn't have" leads to support escalation that takes hours.
- No tenant isolation. Workspace A sees workspace B's queue → existential bug.
- Approver's reasoning not captured. When the eventual incident happens, you can't answer "what did the reviewer think they were approving?"
- No keyboard shortcuts. Reviewers handling dozens of approvals will avoid the queue if it's mouse-only.
- Mobile experience an afterthought. Approvers will be on the go; if mobile is broken, the queue stalls.
- No backup approver routing. Single approver out sick → backlog.
- Two-person rule for Tier 4 enforced inconsistently. "Just this once" exceptions defeat the rule.
- Over-collecting agent reasoning. Including chain-of-thought verbatim can leak sensitive context; redact appropriately.
- No spot-checks of auto-approved actions. Auto-approval is the highest-risk surface; sampled human review on a cadence prevents drift.
10. What Done Looks Like
You have shipped a real approval queue when:
- Every action your agent can take is classified into a tier with documented rationale
- Tier 0/1 actions auto-execute and are visible in a "what your agent did" feed
- Tier 2 actions resolve in-conversation
- Tier 3 actions queue for human review with rich context (preview, reasoning, related context)
- Tier 4 actions require two approvers with optional cool-off
- Reviewers can resolve a proposal in <30 seconds (well-designed UI; keyboard shortcuts; rich preview)
- SLA tracking surfaces stale proposals; backup routing kicks in
- Auto-approval rules opt-in per workspace; visible; auditable; revocable
- Reversibility classes documented per tool; reverse flow available where applicable
- Tenant isolation is verified by tests
- Audit log captures every event; queryable; exportable
- A new engineer can read this doc + your tool tier table and predict what each new tool's approval flow will look like
Mistakes to Avoid
- Skipping the tier classification. Without it, every later choice is wrong.
- Generic approval modals. Reviewer fatigue → rubber-stamping → trust incident.
- No preview / no diff. Reviewers can't make real decisions; they approve based on summary alone.
- No audit log. Cannot reconstruct what happened when the incident occurs.
- No SLA / stale handling. Queue rots; users disable agent.
- Auto-approval without audit and rule visibility. Trust footprint expands invisibly.
- No reverse path. "I approved that wrongly" becomes an incident; should be a button.
- No tenant isolation tests. Eventual leak = existential.
- Over-collecting reasoning that leaks context. Redact thoughtfully.
- No keyboard shortcuts / no mobile. Reviewer experience degrades; queue stalls.
- Tier 4 two-person rule with single-approver shortcut. Defeats the rule; provides false sense of safety.
- No reviewer accountability surface. Rubber-stampers go undetected.
- No spot-check on auto-approved actions. Auto-approval drift accumulates silently.
- Forgetting non-customer reviewers. The reviewer is sometimes the customer's user, sometimes the customer's admin, sometimes your support team — design for all three.
See Also
- In-Product AI Agent Implementation — the broader agent build pattern
- AI Memory & Context Retention — memory layer that approval decisions may reference
- Approval Workflows / Multi-Step Routing — non-AI approval routing pattern; shares mechanics
- Audit Logs — broader audit pattern
- Customer-Facing Audit Logs — surface to customer admins
- Long-Running Operations / Job Status UI — status UX patterns
- Background Jobs / Queue Management — queue infrastructure
- Multi-Tenancy — tenant isolation discipline
- Roles & Permissions — who can approve what
- Notification Preferences / Unsubscribe — reviewer notification preferences
- In-App Notifications — notification surface
- Diff Views / Change Tracking UI — proposal-preview component pattern
- Idempotency Patterns — execution must be idempotent (approval may double-fire)
- VibeReference: Agent Reliability & Production Operations — operational discipline that this UX sits inside
- VibeReference: AI Guardrails & LLM Application Security — runtime defense layer
- VibeReference: AI Agent Frameworks — frameworks underneath
- VibeReference: Human in the Loop — adjacent treatment
- LaunchWeek: Crisis Communication Playbook — when an action goes wrong publicly