You want to delegate a coding task to an AI agent: open a Git repo, hand it a brief, let it run, and pick up the result. Brimble ships dedicated templates with the major agents preinstalled, so you don’t have to build the runtime yourself.

When to use this recipe

  • A “fix-this-ticket” workflow where you wire an agent to your issue tracker.
  • Batch refactors across many repos: the same agent task, fanned out.
  • Evaluation harnesses that compare different agents on the same task.
  • A scheduled background worker that picks up tasks from a queue and runs them.

Prerequisites

  • A Brimble account with API access enabled.
  • The SDK (@brimble/sandbox) installed and BRIMBLE_SANDBOX_KEY set in your environment.
  • An API key for whichever AI agent you’re running (Anthropic, OpenAI, etc.), passed to the sandbox as an env var when you exec the agent.
  • Optional but recommended: a Git provider token if the agent needs to clone private repos.
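
A small preflight check can catch a missing key before a sandbox is provisioned. The sketch below is one way to do it, not part of the SDK: BRIMBLE_SANDBOX_KEY and ANTHROPIC_API_KEY come from this page, while GIT_TOKEN and the helper names missingEnv and preflight are hypothetical.

```typescript
// Hypothetical helpers; not part of @brimble/sandbox.
type Env = Record<string, string | undefined>;

// Return the names that are unset or blank in `env`.
function missingEnv(required: string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Required keys block the run; the Git token (assumed name) only warns.
function preflight(env: Env): { ok: boolean; missing: string[]; warnings: string[] } {
  const missing = missingEnv(["BRIMBLE_SANDBOX_KEY", "ANTHROPIC_API_KEY"], env);
  const warnings = missingEnv(["GIT_TOKEN"], env).map(
    (name) => `${name} is not set; cloning private repos will fail`,
  );
  return { ok: missing.length === 0, missing, warnings };
}
```

Call preflight(process.env) at startup and exit early if ok is false.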

Recipe

Available agent templates (see the Templates list for the live catalog):
  • claude-code: Anthropic’s Claude Code CLI
  • codex: OpenAI’s Codex CLI
  • opencode: OpenCode, an open-source agent
  • droid: Factory’s Droid agent
The shape below uses claude-code. Swap the template name and the invocation command for any of the others.
import { Sandbox } from "@brimble/sandbox";

const client = new Sandbox();

async function runAgent({
  repoUrl,
  task,
  anthropicKey,
}: { repoUrl: string; task: string; anthropicKey: string }) {
  const handle = await client.sandboxes.createReady({
    region: "auto",
    template: "claude-code",
    persistent: true,
    persistentDiskGB: 20,         // room for the checkout + node_modules / venvs
    autoDestroy: true,
    destroyTimeout: "3h",         // generous; agent runs can take a while
    blockOutbound: false,         // agent needs the internet
  });

  try {
    // Seed the task description on disk
    await handle.putFile("/work/task.md", task);

    // Clone and run the agent. Pass the agent's API key via `env`, never bake
    // a secret into the cmd string.
    const run = await handle.exec({
      cmd: [
        "set -e",
        "mkdir -p /work && cd /work",
        `git clone --depth 1 ${repoUrl} repo`,
        "cd repo",
        "claude --print --dangerously-skip-permissions < /work/task.md > /work/agent.log 2>&1",
      ].join(" && "),
      env: { ANTHROPIC_API_KEY: anthropicKey },
      timeout_seconds: 1800,       // 30 min cap
    });

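    if (run.exit_code !== 0) {
      // The agent's stdout and stderr were redirected to /work/agent.log, so
      // run.stderr is likely empty. Read the tail of the log before the
      // sandbox is destroyed in `finally`.
      const log = await handle.exec({ cmd: "tail -n 50 /work/agent.log" });
      throw new Error(`agent failed (${run.exit_code}): ${log.stdout}`);
    }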

    // Collect the resulting diff
    const diff = await handle.exec({
      cmd: "cd /work/repo && git add -A && git diff --staged",
    });

    // Snapshot before we destroy, in case we want to resume / inspect later
    const snap = await handle.snapshots.create({ name: "post-run" });

    return { diff: diff.stdout, snapshotId: snap.id };
  } finally {
    await handle.destroy().catch(() => {});
  }
}

const result = await runAgent({
  repoUrl: "https://github.com/myorg/widgets.git",
  task: "Refactor src/utils to use async/await. Add tests.",
  anthropicKey: process.env.ANTHROPIC_API_KEY!,
});

console.log(result.diff);
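
One thing to watch in the recipe above: repoUrl is interpolated directly into the shell command. If the URL comes from an issue tracker or a queue rather than from your own code, quoting it is a cheap safeguard. The sketch below uses standard POSIX single-quote escaping; shellQuote and buildAgentCommand are hypothetical helper names, not SDK functions.

```typescript
// Wrap a value in single quotes for POSIX sh. An embedded single quote is
// written as '\'' (close quote, escaped quote, reopen quote).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the same command chain as the recipe, with the URL quoted and a
// `--` separator so a URL starting with "-" is not parsed as a git option.
function buildAgentCommand(repoUrl: string): string {
  return [
    "mkdir -p /work",
    "cd /work",
    `git clone --depth 1 -- ${shellQuote(repoUrl)} repo`,
    "cd repo",
    "claude --print --dangerously-skip-permissions < /work/task.md > /work/agent.log 2>&1",
  ].join(" && ");
}
```

The result can be passed as cmd to handle.exec in place of the inline array.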

What’s happening

  1. Pick the agent template. claude-code ships with the Claude Code CLI preinstalled and configured for unattended runs. Swap to codex, droid, or opencode and replace the claude invocation in the exec command with that CLI’s non-interactive mode (codex exec ... for the Codex CLI, for example). Flags differ between agents, so check each CLI’s documentation.
  2. Persistent disk on, 20 GB. The agent’s checkout, node_modules, model caches, and intermediate files all live on the workspace volume. A snapshot at the end captures that state for replay or audit.
  3. autoDestroy with a 3-hour ceiling. Agent runs can be long, so the sandbox-level ceiling is deliberately generous: three hours is a reasonable backstop for the sandbox as a whole. The agent run itself is bounded by timeout_seconds on the exec call (30 minutes in the example); raise that value if your tasks routinely take longer. Set oneShot: true instead if you want the sandbox to terminate the moment claude --print exits.
  4. Outbound network is on. The agent needs to reach the model API (api.anthropic.com, api.openai.com, etc.). If you want stricter isolation, pre-bake any tool the agent needs into the snapshot and set blockOutbound: true for the actual run.
  5. API key as a shell env var, not a build-time secret. Sandboxes don’t bake env vars into the template; you set them per-exec.
  6. Diff capture after the run. git add -A && git diff --staged shows everything the agent touched, including new files. Pipe it to your own review tool or PR-creation flow.
  7. Snapshot before destroy. Cheap insurance: if the diff looks wrong or the agent did something unexpected, you can restore the snapshot into a fresh sandbox and inspect.
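
The diff captured in step 6 is raw text. Before handing it to a review tool, a per-file summary can be useful for logging or for rejecting runs that touched too much. The following is a rough sketch that parses standard git diff output; summarizeDiff and FileChange are hypothetical names, and paths containing " b/" would confuse the header regex.

```typescript
interface FileChange {
  path: string;
  added: number;
  removed: number;
}

// Count added and removed lines per file in unified `git diff` output.
// Lines are only counted inside hunks (after an "@@" header), so the
// "---" and "+++" file headers are not mistaken for content.
function summarizeDiff(diff: string): FileChange[] {
  const changes: FileChange[] = [];
  let current: FileChange | undefined;
  let inHunk = false;
  for (const line of diff.split("\n")) {
    const header = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
    if (header) {
      current = { path: header[2], added: 0, removed: 0 };
      changes.push(current);
      inHunk = false;
    } else if (line.startsWith("@@")) {
      inHunk = true;
    } else if (inHunk && current) {
      if (line.startsWith("+")) current.added += 1;
      else if (line.startsWith("-")) current.removed += 1;
    }
  }
  return changes;
}
```

Calling summarizeDiff(result.diff) on the recipe’s output gives one entry per touched file, including newly created ones.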

Variations

  • Resume an interrupted run. Pass fromSnapshot on createReady to start a new sandbox from the agent’s last state. Useful when an agent hits the timeout mid-task and you want to continue.
  • Reuse a checkout across runs. Create a sandbox-type volume, attach it on first run, and pass the same volumeId on subsequent runs. The repo is already cloned, dependencies are already installed.
  • Fan out across many tasks. Loop runAgent over a task list. Each call provisions an isolated sandbox; the platform’s concurrency cap is your only ceiling.
  • Different agents, same task. Swap the template (claude-code, codex, droid) on the same input and compare the resulting diffs.
  • Background mode with logs. Replace the inline exec with two calls: one to start the agent in the background (nohup claude ... &), one to tail /work/agent.log while it runs. You’ll need getFile to stream the log.
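
For the fan-out variation, a plain loop runs tasks one at a time and an unbounded Promise.all may exceed whatever concurrency cap applies to your account. A bounded worker pool sits in between. This is a generic sketch, not an SDK feature: mapWithLimit and Outcome are hypothetical names, and the right limit depends on your plan.

```typescript
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Run `worker` over `items` with at most `limit` calls in flight.
// A failure is recorded in its slot instead of aborting the whole batch.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With the recipe’s function, a call such as mapWithLimit(tasks, 4, (t) => runAgent(t)) keeps four sandboxes busy at a time and returns results in input order.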


Last modified on May 23, 2026