Sandboxed Tasks
Spawn isolated Firecracker microVMs running a Pi coding agent with authenticated CLI access to the platform. The agent can do anything the caller's API key permits — scoped, metered, auto-killed. Tasks are how this platform turns "natural language instruction" into "executed multi-step work" without each caller having to write the orchestration.
Default stance
Pass an instruction, not a recipe. The whole point of the task system is that the caller doesn't know how, only what. The Pi agent has the CLI tools — let it figure out the steps. If you find yourself encoding step-by-step logic in the instruction, you probably want a regular API endpoint, not a task.
Always scope the sub-key to the minimum the task needs. The sub-agent gets the intersection of the caller's scopes and the scopes parameter, so favor smaller intersections. A task that only needs to read keys should not be passed keys:write "just in case."
Fire-and-forget by default. Subscribe to task.completed / task.failed webhooks instead of polling. The whole runtime is built around webhook notification.
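As a rough illustration of the webhook-first stance, a receiver can branch on the event name instead of polling. Only the two event names come from this guide. The payload fields (taskId, exitCode, error) and the function name are assumptions made for the sketch; see webhooks-and-events for the actual shape.

```typescript
// Hypothetical payload shape: only the two event names come from this guide.
type TaskEvent =
  | { type: "task.completed"; taskId: string; exitCode: number }
  | { type: "task.failed"; taskId: string; error: string };

// Turn a parsed webhook body into a short summary the caller can log or act on.
function describeTaskEvent(event: TaskEvent): string {
  switch (event.type) {
    case "task.completed":
      return `task ${event.taskId} completed with exit code ${event.exitCode}`;
    case "task.failed":
      return `task ${event.taskId} failed: ${event.error}`;
  }
}
```

The discriminated union means the compiler flags any new event type that the switch does not handle.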
Use this skill when
Implementing agent workflows, designing the task instruction surface for a product, debugging sandbox execution, changing Pi/sandbox snapshot behavior, or scoping sub-keys for delegated work.
Lifecycle
runTask()
1. Insert task row (status: pending)
2. Mint task-scoped sub-key (TTL = timeout + 30s buffer)
3. Update to running
4. createAndRunSandbox(instruction, seedApiKey, seedApiUrl, timeoutMs)
→ Pi agent runs in Firecracker VM with the CLI authenticated
5. Update to completed/failed with stdout, stderr, exit code, tokens
6. Record usage event
7. Fire webhook: task.completed or task.failed
8. (finally) Revoke the sub-key — cleanup always runs
The sub-key is the security boundary. It's revoked in finally, so even a thrown error or timeout cleans up.
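The cleanup guarantee can be sketched as a try/finally around the sandbox call. This is a minimal illustration, not the code in lib/tasks.ts. The Deps interface, its method names, and the rule that exit code 0 means completed are all assumptions, and the usage-event and webhook steps are left out for brevity.

```typescript
// Hypothetical stand-ins for the real database, key, and sandbox helpers.
type TaskStatus = "pending" | "running" | "completed" | "failed";

interface Deps {
  mintSubKey(ttlMs: number): Promise<{ id: string; secret: string }>;
  revokeSubKey(id: string): Promise<void>;
  runSandbox(instruction: string, secret: string, timeoutMs: number): Promise<{ exitCode: number }>;
  setStatus(status: TaskStatus): Promise<void>;
}

const TTL_BUFFER_MS = 30_000; // sub-key TTL = timeout + 30 s buffer

async function runTaskSketch(deps: Deps, instruction: string, timeoutMs: number): Promise<TaskStatus> {
  await deps.setStatus("pending");
  const subKey = await deps.mintSubKey(timeoutMs + TTL_BUFFER_MS);
  try {
    await deps.setStatus("running");
    const { exitCode } = await deps.runSandbox(instruction, subKey.secret, timeoutMs);
    // Assumption: a zero exit code maps to "completed", anything else to "failed".
    const status: TaskStatus = exitCode === 0 ? "completed" : "failed";
    await deps.setStatus(status);
    return status;
  } catch (err) {
    await deps.setStatus("failed");
    throw err;
  } finally {
    // Runs on success and on a thrown error alike, so the key never outlives the task.
    await deps.revokeSubKey(subKey.id);
  }
}
```

Minting the key before the try block means the finally clause always has a key id to revoke.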
Create and run a task
# REST
curl -X POST https://your-app.com/api/v1/tasks/run \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"instruction": "Audit our API key inventory and revoke any expired ones",
"scopes": ["keys:read", "keys:write"],
"timeout": 60000
}'
// Server-side
import { runTask } from "@/lib/tasks"
const result = await runTask({
callerKeyId: apiKey.id,
callerUserId: user.id,
callerOrganizationId: organizationId,
callerScopes: apiKey.scopes,
instruction: "Audit our API key inventory and revoke any expired ones",
scopes: ["keys:read", "keys:write"],
timeoutMs: 60_000,
})
Scoping
The scopes parameter is intersected with the caller's scopes — you can only grant permissions you have. Omit scopes to inherit all caller permissions:
// Read-only audit — can't mutate anything
{ scopes: ["obs:read"] }
// Read/write on keys, read-only on obs
{ scopes: ["keys:read", "keys:write", "obs:read"] }
// Inherit everything the caller has
{ /* no scopes */ }
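The intersection rule can be sketched in a few lines. The helper name intersectScopes is hypothetical; the real logic lives in lib/tasks.ts and may treat edge cases (such as an empty array) differently.

```typescript
// Hypothetical helper illustrating the rule: requested scopes are capped by the caller's scopes.
function intersectScopes(callerScopes: string[], requested?: string[]): string[] {
  // Omitting `scopes` inherits everything the caller has.
  if (requested === undefined) return [...callerScopes];
  const allowed = new Set(callerScopes);
  // Keep only scopes the caller actually holds, dropping duplicates.
  return requested.filter(
    (scope, index) => allowed.has(scope) && requested.indexOf(scope) === index,
  );
}

const callerScopes = ["keys:read", "obs:read"];
// keys:write is dropped because the caller does not hold it.
console.log(intersectScopes(callerScopes, ["keys:read", "keys:write"]));
// No scopes argument: the sub-key inherits both caller scopes.
console.log(intersectScopes(callerScopes));
```

Because the result is always a subset of the caller's scopes, a request for a scope the caller lacks is silently narrowed in this sketch rather than rejected.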
Timeouts
| Task type | Recommended |
|---|---|
| Simple lookup | 30 s |
| Multi-step workflow | 60 s |
| Data processing / generation | 180 s |
| Complex multi-tool agent work | 300 s (default) |
| Maximum allowed | 600 s |
clampTimeout() enforces [10s, 600s]. The sub-key TTL is timeout + 30s, so the key stays valid until the task is done and then expires on its own shortly afterwards; the agent cannot finish and then keep making follow-up calls with it.
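A minimal sketch of the clamp and the TTL arithmetic, using the bounds and default from the table above. The function names, the optional-argument signature, and the choice to compute the TTL from the clamped value are assumptions; the real clampTimeout() may differ.

```typescript
const MIN_TIMEOUT_MS = 10_000;      // 10 s lower bound
const MAX_TIMEOUT_MS = 600_000;     // 600 s upper bound
const DEFAULT_TIMEOUT_MS = 300_000; // 300 s default
const SUB_KEY_BUFFER_MS = 30_000;   // 30 s buffer on the sub-key TTL

// Sketch of a clamp; a missing or NaN value falls back to the default.
function clampTimeoutSketch(timeoutMs?: number): number {
  if (timeoutMs === undefined || isNaN(timeoutMs)) return DEFAULT_TIMEOUT_MS;
  return Math.min(MAX_TIMEOUT_MS, Math.max(MIN_TIMEOUT_MS, timeoutMs));
}

// Assumption: the TTL is derived from the clamped timeout, not the raw request.
function subKeyTtlMs(timeoutMs?: number): number {
  return clampTimeoutSketch(timeoutMs) + SUB_KEY_BUFFER_MS;
}
```

Under these assumptions a 60 s task gets a sub-key that lives for 90 s, and a request for 15 minutes is capped at 600 s with a 630 s key.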
Snapshot management
SANDBOX_SNAPSHOT_ID points to a pre-built Firecracker snapshot containing Pi and its dependencies. It does NOT contain the seed CLI — the CLI is downloaded fresh per task so it's always the current version. Recreate the snapshot only when upgrading Pi or changing the default LLM model:
npx tsx scripts/setup-sandbox-snapshot.ts
# Then update SANDBOX_SNAPSHOT_ID in Vercel env vars
Hard rules
- Don't pass full secrets in instruction. It ends up in the task record and webhook payload. Pass references; let the agent fetch via CLI.
- Don't widen scopes to debug a failure. If the agent can't do something, that's the security boundary working. Change the scope intentionally, not as a workaround.
- Don't poll for status when you can subscribe. task.completed and task.failed webhooks fire automatically. See webhooks-and-events.
- Don't bypass runTask to call sandbox directly. The wrapper handles sub-key minting, status transitions, usage recording, and revocation. Bypassing means you'll forget cleanup and leak access.
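One way to back up the first rule is a pre-flight check on the instruction text. The sketch below is hypothetical and deliberately narrow: it assumes live keys look like sk_live_ followed by alphanumerics (as in the curl example's Authorization header) and does not detect any other kind of secret.

```typescript
// Assumption: live keys match "sk_live_<alphanumerics>"; other secret formats are not covered.
const INLINE_KEY_PATTERN = /sk_live_[A-Za-z0-9]+/;

// Hypothetical guard to call before submitting an instruction to the task system.
function assertNoInlineSecrets(instruction: string): void {
  if (INLINE_KEY_PATTERN.test(instruction)) {
    throw new Error("instruction appears to contain an API key; pass a reference instead");
  }
}

// Passes: the instruction names the key instead of embedding it.
assertNoInlineSecrets("Rotate the key named billing-prod");
```

A check like this catches accidents only; the rule itself still depends on callers passing references.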
Cost model
- Sandbox compute: ~$0.01-0.03 per 5-min task
- LLM tokens: $0 with OpenRouter free tier (default)
- CLI install: 2-3s per task
- Vercel Pro $20/mo credit covers ~650 five-min tasks
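The credit figure can be sanity-checked with simple division. The sketch below counts sandbox compute only and works in integer cents; the function name is illustrative. At $0.03 per task the credit covers 666 tasks, in line with the ~650 estimate, and at $0.01 per task it covers 2,000.

```typescript
// Integer cents avoid floating-point rounding in the division.
function tasksCoveredByCredit(creditCents: number, costPerTaskCents: number): number {
  if (costPerTaskCents <= 0) throw new RangeError("cost per task must be positive");
  return Math.floor(creditCents / costPerTaskCents);
}

const monthlyCreditCents = 2_000; // $20 monthly credit
const atUpperBound = tasksCoveredByCredit(monthlyCreditCents, 3); // $0.03 per five-minute task
const atLowerBound = tasksCoveredByCredit(monthlyCreditCents, 1); // $0.01 per five-minute task
console.log({ atUpperBound, atLowerBound });
```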
For forkers: adding domain-specific tasks
You don't write task-specific code. Pi figures out which CLI commands to call from the natural language instruction. Your job:
- Make the CLI comprehensive: every domain action should have a seed <resource> <verb> command.
- Define correct scopes: contacts:read, inventory:write, etc. (see add-api-endpoint).
- That's it. Pi discovers commands at runtime.
Where things live
| File | Purpose |
|---|---|
| lib/tasks.ts | runTask, getTask, stopTask, scope intersection logic |
| lib/sandbox.ts | createAndRunSandbox, stopSandboxById |
| lib/api-keys.ts | createSubKey, revokeKeyWithCascade |
| app/api/v1/tasks/ | REST endpoints |
| pages/api/mcp.ts | MCP exposure (Claude Desktop, Cursor) |
| scripts/setup-sandbox-snapshot.ts | Build a new sandbox snapshot |
| scripts/test-sandbox.ts | Local sandbox smoke test |
Auxiliary content
- references/original-guide.md: full canonical guide with industry instruction examples (CRM, e-commerce, DevOps, finance, etc.) and response shape
- references/graph.md: handoff to webhooks-and-events, add-api-endpoint, cli-development
- scripts/list-task-surfaces.sh: enumerates current task endpoints, scope coverage, and CLI commands available to Pi
- assets/task-contract-template.json: template for declaring a task contract (instruction shape, expected scopes, expected outputs)