Builder endpoint

Give agents scoped permission, not a bigger prompt.

Trust Graduation is a small runtime pattern for agent products: classify the action, check whether it can execute, prepare approval when needed, and record the receipt.

Use it around MCP tools, coding agents, browser agents, eval stacks, workflow agents, or any tool-call surface where reading is safe but sending, spending, deleting, or mutating is not.

Demo path

One agent action, five visible states.

The first useful demo is not a broad autonomous workflow. It is a single consequential action moving through classification, permission, approval, receipt, and future trust.

Run the live Trust Graduation demo.

1. Proposed action: An agent wants to send an external email, push code, post publicly, create a calendar event, or spend money.
2. Action class: Mission maps the proposed action to the consequence: draft, send, mutate, publish, spend, delete, or commit.
3. Permission state: The action is allowed, constrained, review-required, deferred, blocked, or human-only based on trust state and evidence.
4. Approval packet: If review is required, the agent can prepare a local packet with external_actions: 0.
5. Receipt: The human decision becomes a durable receipt that can improve or reduce future permission.

Where it fits

The trust unit is the action class.

Do not start with the tool name. Start with the consequence. The same tool can be safe for one action class and unsafe for another.
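
A minimal sketch of classifying by consequence might look like the following. The `email` tool, the `operation` field, and the body of `classifyAction` are assumptions made for illustration; only the action class names come from this page.

```javascript
// Hypothetical sketch: classify a tool call by its consequence, not its tool name.
// The toolCall shape ({ tool, operation, args }) is an assumption for illustration.
function classifyAction(toolCall) {
  const { tool, operation, args = {} } = toolCall

  if (tool === "email") {
    // Same tool, two different consequences.
    if (operation === "draft") return { class: "draft.compose", target: null }
    if (operation === "send") {
      return { class: "email.send.external", target: args.to ?? null }
    }
  }

  // Unknown calls get no class, so the caller can fail closed.
  return { class: null, target: null }
}

const draft = classifyAction({ tool: "email", operation: "draft" })
const send = classifyAction({
  tool: "email",
  operation: "send",
  args: { to: "a@example.com" }
})

console.log(draft.class) // draft.compose
console.log(send.class) // email.send.external
```

Here the same tool yields two classes, so drafting can graduate without sending ever becoming automatic.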

MCP apps: Expose allowed tools only when the action class has evidence; block invocation when approval is missing.
Coding agents: Let read and patch graduate separately from shell commands, commits, pushes, and pull requests.
Browser agents: Navigate and extract can be low risk; form submit, purchase, send, and account mutation stay gated.
Evals / observability: Use receipts from supervised work as permission evidence, not only retrospective quality data.
Enterprise workflows: Keep every approval, refusal, and executed external action attributable and reviewable.

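
As one illustration of the MCP-style case above, tool exposure and invocation can be gated separately. This is a sketch under stated assumptions: the tool names, the `trust` map, and both helper functions are hypothetical and are not part of any MCP API.

```javascript
// Hypothetical sketch: expose tools per action class and gate invocation.
// Tool names, the trust map, and both helpers are assumptions, not an MCP API.
const tools = [
  { name: "repo_read", actionClass: "read.context" },
  { name: "repo_push", actionClass: "repo.push" },
  { name: "pay", actionClass: "payment.initiate" }
]

// Per-action-class state: is there evidence, and does a call need approval?
const trust = {
  "read.context": { hasEvidence: true, needsApproval: false },
  "repo.push": { hasEvidence: true, needsApproval: true },
  "payment.initiate": { hasEvidence: false, needsApproval: true }
}

// Discovery: only list tools whose action class has evidence behind it.
function exposedTools() {
  return tools.filter((t) => trust[t.actionClass]?.hasEvidence === true)
}

// Invocation: refuse unless the tool is exposed and any required approval exists.
function mayInvoke(toolName, approvedClasses) {
  const tool = exposedTools().find((t) => t.name === toolName)
  if (!tool) return false
  const state = trust[tool.actionClass]
  return !state.needsApproval || approvedClasses.has(tool.actionClass)
}

console.log(exposedTools().map((t) => t.name)) // [ 'repo_read', 'repo_push' ]
console.log(mayInvoke("repo_push", new Set())) // false
console.log(mayInvoke("repo_push", new Set(["repo.push"]))) // true
```
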
Minimal integration

Wrap the boundary, not the whole product.

01. Define action classes

Examples: read.context, draft.compose, tool.call.local, email.send.external, repo.push, payment.initiate.

02. Call canExecute

Return allowed, allowed_with_constraints, review_required, deferred, blocked, or human_only from evidence, user approval, reversibility, and prior receipts.
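
One way such a check could be written is sketched below. The `POLICY` table, the `GRADUATION_THRESHOLD` value, the `approved` flag, and the receipt `outcome` field are all assumptions for illustration; only the six status strings are taken from this page.

```javascript
// Hypothetical policy sketch. The policy table, threshold, and field names
// are assumptions; only the six status strings come from the text above.
const POLICY = {
  "read.context": { reversible: true },
  "draft.compose": { reversible: true },
  "email.send.external": { reversible: false },
  "payment.initiate": { reversible: false, humanOnly: true }
}

// Assumed number of clean prior approvals before a class graduates.
const GRADUATION_THRESHOLD = 3

function canExecute({ actionClass, target, evidence = [], approved = false }) {
  const policy = POLICY[actionClass]
  if (!policy) return { status: "blocked", reason: "unknown action class" }
  if (policy.humanOnly) return { status: "human_only", reason: "class reserved for humans" }
  if (policy.reversible) return { status: "allowed", reason: "reversible action" }
  if (!target) return { status: "deferred", reason: "target not yet known" }
  if (approved) return { status: "allowed", reason: "explicit user approval" }

  // Prior receipts act as evidence for irreversible classes.
  const approvals = evidence.filter((r) => r.outcome === "approved").length
  const refusals = evidence.filter((r) => r.outcome === "refused").length
  if (refusals === 0 && approvals >= GRADUATION_THRESHOLD) {
    return {
      status: "allowed_with_constraints",
      reason: "graduated by prior approvals",
      constraints: ["previously_approved_targets_only"]
    }
  }
  return { status: "review_required", reason: "not enough evidence" }
}

const fresh = canExecute({ actionClass: "email.send.external", target: "a@example.com" })
console.log(fresh.status) // review_required
```

The ordering is the design choice worth debating: hard limits (unknown, human-only) are checked before any evidence, so receipts can never graduate a class past them.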

03. Prepare before execute

If review is required, create a local approval packet with external_actions: 0. The agent can prepare; the human decides.

04. Record the receipt

Log what was approved, refused, revised, or executed. Use that receipt as future permission evidence.
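
A receipt log that doubles as permission evidence could be as small as the sketch below. It is an in-memory illustration only: the receipt fields, the `outcome` and `decidedBy` parameters, and the matching rule in `relevant` are assumptions, and a real log would need to be durable.

```javascript
// Hypothetical in-memory receipt log. A real one would be durable and append-only.
const log = []

function recordReceipt({ action, decision, outcome, decidedBy }) {
  const receipt = Object.freeze({
    id: log.length + 1,
    actionClass: action.class,
    target: action.target,
    decision: decision.status,
    outcome, // "approved" | "refused" | "revised" | "executed"
    decidedBy,
    at: new Date().toISOString()
  })
  log.push(receipt)
  return receipt
}

const receipts = {
  // Evidence for a new action: prior receipts for the same action class.
  relevant(action) {
    return log.filter((r) => r.actionClass === action.class)
  }
}

recordReceipt({
  action: { class: "repo.push", target: "origin/main" },
  decision: { status: "review_required" },
  outcome: "approved",
  decidedBy: "human"
})
recordReceipt({
  action: { class: "email.send.external", target: "a@example.com" },
  decision: { status: "review_required" },
  outcome: "refused",
  decidedBy: "human"
})

console.log(receipts.relevant({ class: "repo.push" }).length) // 1
```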

Endpoint shape

The interface should be boring.

Trust Graduation should be easy to add around an existing agent loop. The protocol boundary is four calls (classifyAction, canExecute, prepareApprovalPacket, and recordReceipt), with no requirement to replace the model, framework, MCP server, or eval stack.

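// Classify by consequence, then ask whether this action class may run.
const action = classifyAction(toolCall)
const decision = canExecute({
  actionClass: action.class,
  actor: agent.id,
  target: action.target,
  evidence: receipts.relevant(action)
})

if (decision.status === "allowed") {
  // Execute, then record what happened as future permission evidence.
  const result = await execute(toolCall)
  recordReceipt({ action, decision, result })
} else if (decision.status === "review_required") {
  // Prepare only: no external action until a human decides.
  prepareApprovalPacket({ action, decision, external_actions: 0 })
}
// Any other status falls through here without executing.
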
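
The snippet above branches on two of the six statuses. A sketch of an explicit mapping for all six, failing closed on anything unrecognized, could look like this. The step names and the `nextStep` helper are assumptions for illustration.

```javascript
// Hypothetical dispatch table: one explicit next step per decision status.
// Step names are assumptions; the status strings come from this page.
const NEXT_STEP = new Map([
  ["allowed", "execute"],
  ["allowed_with_constraints", "execute_with_constraints"],
  ["review_required", "prepare_approval_packet"],
  ["deferred", "retry_later"],
  ["blocked", "refuse"],
  ["human_only", "hand_off_to_human"]
])

function nextStep(decision) {
  // Fail closed: a missing or unrecognized status never leads to execution.
  return NEXT_STEP.get(decision?.status) ?? "refuse"
}

console.log(nextStep({ status: "review_required" })) // prepare_approval_packet
console.log(nextStep({ status: "something_new" })) // refuse
```
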
Concrete packet

What an approval packet should contain.

A useful packet makes the proposed external consequence reviewable before the agent creates it.

Action class: email.send.external, repo.push, calendar.create, payment.initiate, etc.
Proposed action: The exact message, command, mutation, destination, or state change.
Evidence: Relevant context, prior receipts, user intent, risk, reversibility, and open questions.
Decision: Allowed, constrained, review-required, deferred, blocked, or human-only.
Receipt preview: What will be recorded after approve, reject, revise, or execute.

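
The fields above could be assembled as in the sketch below. The property names, the `payload` field, and the receipt preview wording are assumptions; unlike the call in the endpoint snippet, this version sets external_actions itself so a caller cannot pass a different value.

```javascript
// Hypothetical packet builder. Field names mirror the list above; the exact
// shape is an assumption. Building a packet performs no external action.
function prepareApprovalPacket({ action, decision, evidence = [] }) {
  return Object.freeze({
    actionClass: action.class,
    proposedAction: action.payload, // exact message, command, or mutation
    evidence,
    decision: decision.status,
    receiptPreview: {
      onApprove: `record "approved", then execute ${action.class}`,
      onReject: `record "refused", do not execute`,
      onRevise: `record "revised", prepare a new packet`
    },
    external_actions: 0 // preparing a packet must not touch the outside world
  })
}

const packet = prepareApprovalPacket({
  action: {
    class: "email.send.external",
    payload: { to: "a@example.com", subject: "Q3 report", body: "Draft attached." }
  },
  decision: { status: "review_required" },
  evidence: [{ kind: "user_intent", note: "User asked to send the Q3 report." }]
})

console.log(packet.external_actions) // 0
```
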
The critique we want

Where should this live?

We are asking builders a narrow question: should action-class permission live in each app, in MCP, in evals/observability, or as a small receipt-based layer across them?

Builder ask: Which action classes in your product should graduate by evidence instead of a binary allow/block permission?
MCP ask: Should tool discovery and invocation include action-class permission metadata?
Evals ask: Should supervised outcomes become permission evidence for future execution?
Security ask: What must be enforced architecturally instead of trusted to prompts?

Open loop

We want blunt builder feedback.

If you build agents, MCP servers, evals, coding tools, workflow automation, or enterprise AI controls, tell us where this pattern is wrong, redundant, or useful.