Redact PDFs in seconds
Remove sensitive information from PDF documents using the Superagent Redact API via the SDK, CLI, or REST
Protect sensitive information in your PDF documents by automatically redacting PII/PHI such as SSNs, emails, phone numbers, credit cards, and more. Superagent makes it simple to sanitize documents while preserving their original formatting.
Why redact PDFs?
- Compliance: Meet GDPR, HIPAA, and SOC 2 requirements by removing PII/PHI before sharing
- Security: Prevent data leaks when distributing contracts, invoices, or reports
- Privacy: Protect customer information in legal documents and financial records
- Automation: Process documents at scale without manual review
Quick Start
Choose your preferred method to redact PDFs:
import { createClient } from "superagent-ai";
import { readFileSync, writeFileSync } from "fs";

const client = createClient({
  apiKey: process.env.SUPERAGENT_API_KEY!,
});

// Read PDF file
const pdfBuffer = readFileSync("sensitive-document.pdf");
const pdfBlob = new Blob([pdfBuffer], { type: "application/pdf" });

// Redact PDF and get redacted file
const result = await client.redact(pdfBlob, {
  format: "pdf", // Returns redacted PDF
  entities: ["SSN", "credit card numbers", "email addresses", "phone numbers"],
});

// Save the redacted PDF
if (result.pdf) {
  const arrayBuffer = await result.pdf.arrayBuffer();
  writeFileSync("redacted-output.pdf", Buffer.from(arrayBuffer));
  console.log("✓ Redacted PDF saved to redacted-output.pdf");
}

// Installation: npm install superagent-ai
What gets redacted?
Superagent automatically detects and redacts:
- Email addresses → <REDACTED_EMAIL>
- Social Security Numbers → <REDACTED_SSN>
- Credit cards (Visa, Mastercard, Amex) → <REDACTED_CC>
- Phone numbers (US format) → <REDACTED_PHONE>
- IP addresses (IPv4/IPv6) → <REDACTED_IP>
- API keys & tokens → <REDACTED_API_KEY>
- AWS access keys → <REDACTED_AWS_KEY>
- Medical record numbers → <REDACTED_MRN>
- Passport numbers → <REDACTED_PASSPORT>
- IBAN → <REDACTED_IBAN>
- ZIP codes → <REDACTED_ZIP>
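If you extract text from a redacted document, a small local check can flag values that still look like raw emails or SSNs. The sketch below is a hypothetical helper, not part of the Superagent SDK: the name `findResidualPII` and both patterns are assumptions, and the patterns are deliberately simple, so an empty result does not prove a document is clean.

```typescript
// Hypothetical post-check, not part of the Superagent SDK.
// The patterns are intentionally simple and will miss many real-world formats.
const RESIDUAL_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Returns every substring that still looks like an email or SSN, grouped by label.
// Labels with no matches are omitted from the result.
function findResidualPII(text: string): Record<string, string[]> {
  const hits: Record<string, string[]> = {};
  for (const { label, pattern } of RESIDUAL_PATTERNS) {
    const matches = text.match(pattern);
    if (matches) {
      hits[label] = matches;
    }
  }
  return hits;
}
```

For example, `findResidualPII("Contact <REDACTED_EMAIL>")` returns an empty object, while text containing `jane@example.com` would be reported under the `email` label.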
Custom entity redaction
Define your own entity types using natural language:
const result = await client.redact(pdfBlob, {
  format: "pdf",
  entities: [
    "employee IDs",
    "project codenames",
    "salary information",
    "bank account numbers"
  ]
});
The AI model interprets each natural-language description and redacts the content that matches it.
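When entity descriptions come from several places (a shared baseline plus per-document additions, say), it can help to combine them into one clean list before calling the API. The sketch below is a hypothetical helper: `mergeEntities` is not an SDK function, and whether the API itself tolerates duplicates or stray whitespace is not stated here, so this simply avoids the question.

```typescript
// Hypothetical helper, not part of the Superagent SDK.
// Merges several entity lists, trimming whitespace, dropping empty entries,
// and removing case-insensitive duplicates while keeping first-seen order.
function mergeEntities(...lists: string[][]): string[] {
  const seenKeys: string[] = [];
  const merged: string[] = [];
  for (const list of lists) {
    for (const raw of list) {
      const entity = raw.trim();
      const key = entity.toLowerCase();
      if (entity.length > 0 && seenKeys.indexOf(key) === -1) {
        seenKeys.push(key);
        merged.push(entity);
      }
    }
  }
  return merged;
}

// Example: a baseline list plus custom, company-specific descriptions.
const baseline = ["SSN", "phone numbers"];
const custom = ["employee IDs", "ssn", " project codenames "];
const entities = mergeEntities(baseline, custom);
```

The merged array can be passed as the `entities` option, and `entities.join(",")` produces a comma-separated form like the one the CLI example uses for its `--entities` flag.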
Use cases
Legal documents
Redact client information from contracts before sharing with third parties:
const result = await client.redact(contractBlob, {
  format: "pdf",
  entities: ["client names", "addresses", "phone numbers", "SSN"]
});
Medical records
Maintain HIPAA compliance by removing PHI from patient records:
result = await client.redact(
    medical_record_file,
    format="pdf",
    entities=["patient names", "MRN", "SSN", "addresses", "phone numbers"]
)
Financial documents
Sanitize invoices and statements before archiving:
superagent redact --file invoice.pdf --entities "credit card numbers,bank accounts,SSN" "Redact financial data"
Output options
Option 1: Redacted PDF file (format="pdf")
Returns a PDF with redactions applied directly to the document. The original formatting and layout are preserved.
Option 2: Redacted text (format="json")
Extracts text from the PDF, redacts it, and returns as JSON. Useful for text analysis or indexing.
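For text analysis or indexing, one simple step is to tally the placeholders that appear in the redacted text. The sketch below assumes placeholders follow the `<REDACTED_...>` form listed earlier; custom entities may produce other forms. `countPlaceholders` is a hypothetical helper that takes a plain string, because the field that carries the redacted text in the JSON response is not specified here.

```typescript
// Hypothetical helper, not part of the Superagent SDK.
// Counts how often each <REDACTED_...> placeholder occurs in a redacted string.
function countPlaceholders(redactedText: string): Record<string, number> {
  const counts: Record<string, number> = {};
  // Matches tokens such as <REDACTED_EMAIL> or <REDACTED_API_KEY>.
  const tokens = redactedText.match(/<REDACTED_[A-Z_]+>/g) || [];
  for (const token of tokens) {
    counts[token] = (counts[token] || 0) + 1;
  }
  return counts;
}

const sample = "Email <REDACTED_EMAIL> or <REDACTED_EMAIL>, call <REDACTED_PHONE>";
const summary = countPlaceholders(sample);
// summary["<REDACTED_EMAIL>"] is 2 and summary["<REDACTED_PHONE>"] is 1
```

A per-document summary like this can be stored alongside the indexed text to show which kinds of data were removed without keeping the original values.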
const result = await client.redact(pdfBlob, {
  format: "pdf" // Returns Blob with redacted PDF
});

if (result.pdf) {
  // Save or process the redacted PDF
  const arrayBuffer = await result.pdf.arrayBuffer();
  writeFileSync("redacted.pdf", Buffer.from(arrayBuffer));
}
What you've protected
- Compliance audits go more smoothly when shared documents contain no PII
- Customer trust increases with proper data handling
- Legal risk decreases by sanitizing documents before distribution
- Workflow efficiency improves with automated redaction at scale
Next steps
- Integrate redaction into your document processing pipeline
- Set up automated workflows with the n8n integration
- Explore Guard API for content validation
- Join our Discord community
Ready to protect your documents? Get your API key at app.superagent.sh