Secure E2B sandboxes

Guard AI‑generated code before executing it inside E2B sandboxes using the Superagent Guard SDK.

Superagent Guard lets you screen model‑generated code, tool calls, and file operations before they reach an E2B sandbox. This example shows a minimal “code runner” service that accepts model output, vets it with the guard, and only then launches a disposable E2B sandbox to execute it with strict limits.

Why combine Superagent + E2B

  • Guard rejects malicious code (exfiltration, rm -rf, crypto miners) before any process starts.
  • E2B isolates execution with per‑run sandboxes, timeouts, CPU/memory caps, and ephemeral storage.
  • Together you get defense‑in‑depth: pre‑execution safety + runtime isolation.

TypeScript

Install dependencies (replace the E2B package with the one you use):

npm install superagent-ai @e2b/sdk

Create a small helper that gates execution through the guard and a sandbox. The E2B calls below are illustrative; adapt to your E2B SDK’s methods (createSandbox, exec, close, etc.).

import { createGuard } from "superagent-ai";
// Example E2B SDK import. Adjust to your actual package / APIs.
import { Sandbox } from "@e2b/sdk";

type RunResult = {
  status: "pass" | "block" | "error";
  reasoning?: string;
  stdout?: string;
  stderr?: string;
};

const guard = createGuard({
  apiBaseUrl: process.env.SUPERAGENT_API_BASE_URL || "https://app.superagent.sh/api/guard",
  apiKey: process.env.SUPERAGENT_API_KEY!,
});

// Hard limits you enforce per run
const RUN_LIMITS = {
  timeoutMs: 20_000,
  cpuSeconds: 5,
  memoryMb: 512,
};

export async function runCodeSafely(params: {
  language: "python" | "node";
  code: string;
}): Promise<RunResult> {
  const { code, language } = params;

  // 1) Pre‑execution guard
  const { reasoning, rejected } = await guard(code, {
    onBlock: (reason) => console.warn("Blocked generated code", reason),
  });

  if (rejected) {
    return { status: "block", reasoning };
  }

  // 2) Launch sandbox (disposable, with strict limits)
  let sandbox: Sandbox | undefined;
  try {
    sandbox = await Sandbox.create({
      runtime: language === "python" ? "python3" : "nodejs18",
      timeoutMs: RUN_LIMITS.timeoutMs,
      cpuSeconds: RUN_LIMITS.cpuSeconds,
      memoryMb: RUN_LIMITS.memoryMb,
      // Consider disabling outbound network or allowlisting hosts here
      network: { outbound: "blocked" },
    });

    // 3) Execute code
    const result = await sandbox.exec({ code });

    return {
      status: "pass",
      stdout: result.stdout ?? "",
      stderr: result.stderr ?? "",
      reasoning,
    };
  } catch (err: any) {
    return { status: "error", reasoning: String(err?.message || err) };
  } finally {
    // 4) Always destroy the sandbox
    await sandbox?.close().catch(() => {});
  }
}

// Example usage
// const res = await runCodeSafely({ language: "python", code: "print('hello')" });
// console.log(res);
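
To expose this as the “code runner” service described above, you can wrap runCodeSafely in a small HTTP handler. The sketch below uses Express purely for illustration; the route shape, module path, and port are assumptions, and any HTTP framework works.

import express from "express";
// runCodeSafely is the helper defined above; adjust the module path to your project layout.
import { runCodeSafely } from "./run-code-safely";

const app = express();
app.use(express.json());

// POST /run  { language: "python" | "node", code: string }
app.post("/run", async (req, res) => {
  const { language, code } = req.body ?? {};
  if ((language !== "python" && language !== "node") || typeof code !== "string") {
    return res.status(400).json({ error: "language and code are required" });
  }

  const result = await runCodeSafely({ language, code });

  // Map guard/sandbox outcomes to HTTP responses
  if (result.status === "block") return res.status(403).json(result);
  if (result.status === "error") return res.status(500).json(result);
  return res.json(result);
});

app.listen(3000);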

Guarding tool calls used by the code

If your runner exposes helper tools (file reads, HTTP fetches, shell), run the guard on the tool inputs as well before performing the action. For example:

async function guardedFetch(url: string) {
  const { rejected, reasoning } = await guard(`FETCH ${url}`);
  if (rejected) return { ok: false, error: `Blocked: ${reasoning}` };
  // perform fetch inside the sandbox or via a broker with allowlists
}
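
The same pattern extends to other tools such as shell commands or file reads. The sketch below combines a host allowlist with the guard check for the fetch broker; the allowlist contents, the FETCH prompt format, and the broker shape are illustrative assumptions.

// Example allowlist; replace with the hosts your runner is actually permitted to reach.
const ALLOWED_HOSTS = new Set(["api.example.com"]);

async function brokeredFetch(url: string): Promise<{ ok: boolean; body?: string; error?: string }> {
  // 1) Static allowlist check before involving the guard or the network
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return { ok: false, error: "Invalid URL" };
  }
  if (!ALLOWED_HOSTS.has(host)) {
    return { ok: false, error: `Host not allowlisted: ${host}` };
  }

  // 2) Guard the tool input, exactly like the code path
  const { rejected, reasoning } = await guard(`FETCH ${url}`);
  if (rejected) {
    return { ok: false, error: `Blocked: ${reasoning}` };
  }

  // 3) Perform the request outside the sandbox and hand only the body back in
  const res = await fetch(url);
  return { ok: res.ok, body: await res.text() };
}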

Python

Install dependencies (adjust the E2B package name for your environment):

uv add superagent-ai e2b

The structure mirrors the TypeScript example: guard first, then execute in a time‑boxed sandbox.

import asyncio
import os
from typing import Literal, TypedDict

from superagent_ai import create_guard
# Example import; replace with your actual E2B client
from e2b import Sandbox

class RunResult(TypedDict, total=False):
    status: Literal["pass", "block", "error"]
    reasoning: str
    stdout: str
    stderr: str


guard = create_guard(
    api_base_url=os.environ.get("SUPERAGENT_API_BASE_URL", "https://app.superagent.sh/api/guard"),
    api_key=os.environ["SUPERAGENT_API_KEY"],
)


async def run_code_safely(language: str, code: str) -> RunResult:
    # 1) Guard the code before any execution
    result = await guard(code)
    if result.rejected:
        return {"status": "block", "reasoning": result.reasoning}

    sandbox: Sandbox | None = None
    try:
        # 2) Create a sandbox with strict resource limits
        sandbox = await Sandbox.create(
            runtime="python3" if language == "python" else "nodejs18",
            timeout_ms=20_000,
            cpu_seconds=5,
            memory_mb=512,
            network={"outbound": "blocked"},
        )

        # 3) Execute code
        exec_result = await sandbox.exec({"code": code})
        return {
            "status": "pass",
            "stdout": exec_result.get("stdout", ""),
            "stderr": exec_result.get("stderr", ""),
            "reasoning": result.reasoning,
        }
    except Exception as e:
        return {"status": "error", "reasoning": str(e)}
    finally:
        # 4) Tear down sandbox
        if sandbox is not None:
            try:
                await sandbox.close()
            except Exception:
                pass


async def main():
    out = await run_code_safely("python", "print('hello from sandbox')")
    print(out)


if __name__ == "__main__":
    asyncio.run(main())

Hardening checklist

  • Enforce guard checks on both model‑generated code and any tool I/O inside the sandbox.
  • Disable or strictly allowlist outbound network from the sandbox.
  • Set tight timeouts, CPU, and memory limits per run; prefer ephemeral sandboxes.
  • Keep audit logs: persist the guard decision and reasoning alongside the run outputs (see the sketch after this list).
  • Redact secrets in logs; never echo credentials back from the sandbox.
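
For the audit-log item above, here is a minimal sketch in TypeScript, assuming the RunResult type from the earlier example; the record shape and file name are illustrative, and a database or log pipeline works equally well.

import { appendFile } from "node:fs/promises";

type AuditRecord = {
  timestamp: string;
  language: string;
  decision: "pass" | "block" | "error";
  reasoning?: string;
  stdoutBytes?: number; // store sizes, not raw output, so secrets are not echoed into logs
};

async function recordRun(language: string, result: RunResult): Promise<void> {
  const record: AuditRecord = {
    timestamp: new Date().toISOString(),
    language,
    decision: result.status,
    reasoning: result.reasoning,
    stdoutBytes: result.stdout ? Buffer.byteLength(result.stdout) : 0,
  };
  // One JSON line per run; ship or rotate this file with your usual log pipeline
  await appendFile("guard-audit.log", JSON.stringify(record) + "\n");
}

// const res = await runCodeSafely({ language: "python", code });
// await recordRun("python", res);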

This pattern gives you a simple, repeatable way to execute AI‑generated code with layered protections.