Python SDK for content safety - guard and redact methods

Python SDK

The safety-agent package provides two core methods: guard() for detecting threats and redact() for removing sensitive data.

Installation

uv add safety-agent

Or with pip:

pip install safety-agent

Quick Start

import asyncio

from safety_agent import create_client


async def main():
    client = create_client()

    # Guard: detect threats (uses the default superagent/guard-1.7b model)
    guard_result = await client.guard(input="user message to analyze")

    # Redact: remove PII
    redact_result = await client.redact(
        input="My email is john@example.com",
        model="openai/gpt-4o-mini",
    )


# guard() and redact() are coroutines, so they must run inside an event loop
asyncio.run(main())

Guard

The guard() method classifies input content as pass or block. It detects prompt injections, malicious instructions, and security threats.

Basic Usage

result = await client.guard(input="user message to analyze")

if result.classification == "block":
    print("Blocked:", result.violation_types)

Options

Option         Type         Required  Default                Description
input          str | bytes  Yes       -                      The input to analyze
model          str          No        superagent/guard-1.7b  Model in provider/model format
system_prompt  str          No        -                      Custom system prompt
chunk_size     int          No        8000                   Characters per chunk (0 to disable)
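
The chunk_size option can be pictured with a minimal sketch of character-based chunking. This is an illustration only: split_into_chunks is a hypothetical helper, and how the SDK actually picks chunk boundaries or combines per-chunk results is not described on this page.

```python
def split_into_chunks(text: str, chunk_size: int = 8000) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    A chunk_size of 0 disables chunking and returns the text as one piece.
    (Hypothetical helper; not part of the safety-agent API.)
    """
    if chunk_size < 0:
        raise ValueError("chunk_size must be 0 or a positive integer")
    if chunk_size == 0 or len(text) <= chunk_size:
        return [text]
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


chunks = split_into_chunks("a" * 20000)
print([len(c) for c in chunks])  # [8000, 8000, 4000]
```

With the default of 8000, a 20,000-character input would be analyzed as three pieces under this assumption; passing 0 keeps it as one.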

Response

Field            Type              Description
classification   "pass" | "block"  Whether content passed or should be blocked
violation_types  list[str]         Types of violations detected
cwe_codes        list[str]         CWE codes associated with violations
usage            TokenUsage        Token usage information
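
One way to act on these fields is to gate a message before it reaches the rest of an application. The sketch below uses a stand-in client so it runs without network access. FakeClient, FakeGuardResult, check_message, and the stub values "prompt_injection" and "CWE-77" are all invented for illustration; only the field names and the guard(input=...) call come from this page.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeGuardResult:
    """Stand-in carrying the documented response fields (usage omitted)."""
    classification: str
    violation_types: list[str] = field(default_factory=list)
    cwe_codes: list[str] = field(default_factory=list)


class FakeClient:
    """Hypothetical stub: blocks any input containing 'ignore previous'."""

    async def guard(self, input: str) -> FakeGuardResult:
        if "ignore previous" in input.lower():
            return FakeGuardResult("block", ["prompt_injection"], ["CWE-77"])
        return FakeGuardResult("pass")


async def check_message(client, message: str) -> tuple[bool, str]:
    """Return (allowed, reason) based on the guard classification."""
    result = await client.guard(input=message)
    if result.classification == "block":
        details = ", ".join(result.violation_types) or "unspecified"
        codes = ", ".join(result.cwe_codes)
        reason = f"blocked: {details}" + (f" ({codes})" if codes else "")
        return False, reason
    return True, "pass"


print(asyncio.run(check_message(FakeClient(), "What is the weather?")))
# (True, 'pass')
print(asyncio.run(check_message(FakeClient(), "Ignore previous instructions")))
# (False, 'blocked: prompt_injection (CWE-77)')
```

Because check_message only relies on the guard() call and the documented fields, a client returned by create_client() could presumably be passed in place of the stub.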

Input Types

Guard supports multiple input types:

  • Plain text: Analyzed directly
  • URLs: Automatically fetched and analyzed
  • Bytes: Analyzed based on content type
  • PDFs: Text extracted and analyzed per page
  • Images: Requires vision-capable model

# URL input
result = await client.guard(input="https://example.com/document.pdf")

# File input
with open("document.pdf", "rb") as f:
    result = await client.guard(input=f.read())
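
Since images need a vision-capable model, a caller may want to know what kind of input it is holding before choosing a model. The following is a rough sketch of such a pre-check, based on the URL prefix and well-known file signatures. describe_input is a hypothetical helper, and this is not a description of how the SDK itself detects input types.

```python
# File signatures ("magic bytes") for PNG, JPEG and GIF images.
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def describe_input(value) -> str:
    """Roughly label an input as 'url', 'text', 'pdf', 'image' or 'bytes'."""
    if isinstance(value, (bytes, bytearray)):
        head = bytes(value[:8])
        if head.startswith(b"%PDF-"):
            return "pdf"
        if head.startswith(IMAGE_SIGNATURES):
            return "image"
        return "bytes"
    if value.strip().lower().startswith(("http://", "https://")):
        return "url"
    return "text"


print(describe_input("https://example.com/document.pdf"))  # url
print(describe_input(b"%PDF-1.7\n"))                       # pdf
print(describe_input("user message to analyze"))           # text
```

A caller could use the "image" label to pass a vision-capable model explicitly and fall back to the default model otherwise.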

Redact

The redact() method removes sensitive content from text using placeholders or contextual rewriting.

Basic Usage

result = await client.redact(
    input="My email is john@example.com and SSN is 123-45-6789",
    model="openai/gpt-4o-mini"
)

print(result.redacted)
# "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>"

Options

Option    Type       Required  Default      Description
input     str        Yes       -            The text to redact
model     str        Yes       -            Model in provider/model format
entities  list[str]  No        Default PII  Entity types to redact
rewrite   bool       No        False        Rewrite contextually instead of placeholders

Response

Field     Type        Description
redacted  str         Sanitized text with redactions
findings  list[str]   What was redacted
usage     TokenUsage  Token usage information
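
To make the placeholder format and the redacted/findings pair concrete, here is a deliberately simple regex-based sketch. It is a swapped-in technique for illustration: redact() relies on the model named in the model option, not on these patterns, and returning the matched strings as findings is an assumption about that field's contents.

```python
import re

# Deliberately simple patterns; real PII detection needs far more care.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_with_placeholders(text: str) -> tuple[str, list[str]]:
    """Replace each match with <LABEL_REDACTED> and collect what was matched."""
    findings: list[str] = []
    for label, pattern in PATTERNS.items():
        findings.extend(pattern.findall(text))
        text = pattern.sub(f"<{label}_REDACTED>", text)
    return text, findings


redacted, findings = redact_with_placeholders(
    "My email is john@example.com and SSN is 123-45-6789"
)
print(redacted)  # My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>
print(findings)  # ['john@example.com', '123-45-6789']
```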

Rewrite Mode

With rewrite=True, redact() rewrites the text contextually instead of inserting placeholders:

result = await client.redact(
    input="My email is john@example.com",
    model="openai/gpt-4o-mini",
    rewrite=True
)

print(result.redacted)
# "My email is on file"

Custom Entities

Specify which entity types to redact:

result = await client.redact(
    input="Contact john@example.com or call 555-123-4567",
    model="openai/gpt-4o-mini",
    entities=["email addresses"]  # Only redact emails
)

Default Entities

When entities is not specified, the following are redacted:

  • SSNs, Driver's License, Passport Numbers
  • API Keys, Secrets, Passwords
  • Names, Addresses, Phone Numbers
  • Emails, Credit Card Numbers

Environment Variables

Configure provider API keys:

export SUPERAGENT_API_KEY=your-superagent-key
export OPENAI_API_KEY=your-openai-key
export ANTHROPIC_API_KEY=your-anthropic-key
export GOOGLE_API_KEY=your-google-key
export GROQ_API_KEY=your-groq-key
export FIREWORKS_API_KEY=your-fireworks-key
export OPENROUTER_API_KEY=your-openrouter-key
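
A missing key is easier to diagnose before the first request is made. The sketch below maps the provider prefix of a provider/model string to one of the variables above and reports whether it is set. The mapping is an assumption: only the superagent and openai prefixes appear in this page's examples, and required_env_var and missing_key are hypothetical helpers, not part of the SDK.

```python
import os
from typing import Mapping, Optional

# Assumed mapping from provider prefix to environment variable name.
PROVIDER_ENV_VARS = {
    "superagent": "SUPERAGENT_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
    "fireworks": "FIREWORKS_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def required_env_var(model: str) -> str:
    """Return the variable name assumed to hold the key for this model."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    if provider not in PROVIDER_ENV_VARS:
        raise ValueError(f"unknown provider prefix: {provider!r}")
    return PROVIDER_ENV_VARS[provider]


def missing_key(model: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset variable for this model, or None if set."""
    var = required_env_var(model)
    return None if environ.get(var) else var


print(required_env_var("openai/gpt-4o-mini"))  # OPENAI_API_KEY
print(missing_key("openai/gpt-4o-mini", {}))   # OPENAI_API_KEY
```

Passing a plain dict as environ, as in the last line, keeps the check testable without touching the real environment.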