Redaction rules
Beyond choosing PII categories, you can pin the outcome for specific terms on each job:
- Always redact (pii_included_terms): terms that must always be masked, even when the AI would not classify them as PII. Useful for project codenames, internal IDs, or proprietary terms the model has no concept of.
- Always keep visible (pii_excluded_terms): terms that must never be masked, even when the AI detects them as PII. Useful for your own company name, public support addresses, or anything that is safe to leave in the clear.
Both are sent on POST /v1/jobs as multipart form fields, each holding a JSON-encoded array of strings.
Matching
- Case-insensitive, whole-word matching.
- Multi-word phrases are matched as a phrase (e.g. "Project Titan" matches project titan but not titan alone).
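To preview locally which terms would hit a given text, the matching rules above can be approximated with a regular expression. This is only a sketch: the function name term_matches is hypothetical, and the service's exact definition of a word boundary is not documented here, so the lookarounds below are an assumption.

```python
import re


def term_matches(term: str, text: str) -> bool:
    """Return True if `term` occurs in `text` as a whole word or phrase, ignoring case."""
    words = term.split()
    if not words:
        return False
    # Join the escaped words with \s+ so a phrase matches across any whitespace run.
    phrase = r"\s+".join(re.escape(word) for word in words)
    # Assumption: "whole word" means no word character directly before or after.
    pattern = r"(?<!\w)" + phrase + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(term_matches("Project Titan", "Kickoff for project titan is Monday"))  # True
print(term_matches("Project Titan", "titan alone"))  # False
```

A check like this can help catch terms that are too broad or too narrow before a job is submitted, but the redacted output remains the only authoritative result.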
Precedence
If the same term appears in both lists, it is redacted — pii_included_terms
wins over pii_excluded_terms.
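The precedence rule can be written as a small decision function. This is an illustrative sketch, not the service's implementation: resolve_action and its return labels are hypothetical, and comparing terms with casefold() is an assumption based on the case-insensitive matching described above.

```python
from typing import List


def resolve_action(term: str, included: List[str], excluded: List[str]) -> str:
    """Decide what happens to `term` given both lists."""
    key = term.casefold()
    # pii_included_terms is checked first, so it wins when a term is on both lists.
    if key in {t.casefold() for t in included}:
        return "redact"
    if key in {t.casefold() for t in excluded}:
        return "keep"
    # On neither list: the outcome is left to the model's PII detection.
    return "model"


print(resolve_action("Acme Corp", ["acme corp"], ["Acme Corp"]))  # redact
```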
Defaults
If you omit a field, the API key owner's saved defaults (from dashboard
preferences) apply. Send an empty array ([]) to explicitly use no terms for
that list on this job.
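The difference between omitting a field and sending an empty array is easy to lose in client code. The sketch below keeps the two cases apart by using None for "omit, fall back to saved defaults" and an empty list for "no terms on this job". The helper name build_term_fields is hypothetical; only the two field names come from this page.

```python
import json
from typing import Dict, List, Optional


def build_term_fields(
    included: Optional[List[str]] = None,
    excluded: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Build the term-related form fields for POST /v1/jobs."""
    fields: Dict[str, str] = {}
    # None means "omit the field", so the key owner's saved defaults apply.
    if included is not None:
        fields["pii_included_terms"] = json.dumps(included)
    # An empty list is still sent, as "[]", to override the defaults with no terms.
    if excluded is not None:
        fields["pii_excluded_terms"] = json.dumps(excluded)
    return fields


print(build_term_fields(included=[]))  # {'pii_included_terms': '[]'}
```

The returned dictionary can be passed as ordinary form fields alongside the file upload in whichever HTTP client you use.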
Limits
- Max 500 terms per list.
- Max 120 characters per term.
- Whitespace is normalized and duplicates are removed.
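A client can apply the same limits before sending a job, so oversized lists fail early. The following is a minimal sketch under several assumptions that this page does not confirm: empty terms are dropped, duplicates are compared case-insensitively, and the 500-term limit is checked after normalization. The name normalize_terms is hypothetical.

```python
from typing import Iterable, List

MAX_TERMS = 500        # max terms per list
MAX_TERM_LENGTH = 120  # max characters per term


def normalize_terms(terms: Iterable[str]) -> List[str]:
    """Collapse whitespace, drop duplicates, and enforce the documented limits."""
    seen = set()
    result: List[str] = []
    for raw in terms:
        # Trim and collapse internal whitespace runs to single spaces.
        term = " ".join(raw.split())
        if not term:
            continue  # assumption: empty terms are ignored
        if len(term) > MAX_TERM_LENGTH:
            raise ValueError(f"term exceeds {MAX_TERM_LENGTH} characters: {term[:20]}...")
        key = term.casefold()  # assumption: duplicates are case-insensitive
        if key in seen:
            continue
        seen.add(key)
        result.append(term)
    if len(result) > MAX_TERMS:
        raise ValueError(f"list has {len(result)} terms; the maximum is {MAX_TERMS}")
    return result


print(normalize_terms(["  Project   Titan ", "project titan", "ACME-1234"]))
# ['Project Titan', 'ACME-1234']
```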
Example
Redact everything the model finds, plus the codename Project Titan and the
identifier ACME-1234, but keep your own company name Acme Corp and the address
support@acme.com visible:
curl -sS -X POST "https://www.redact-pdf.ai/v1/jobs" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F 'files=@/absolute/path/contract.pdf;type=application/pdf' \
  -F 'pii_included_terms=["Project Titan","ACME-1234"]' \
  -F 'pii_excluded_terms=["Acme Corp","support@acme.com"]'
Because redaction is burned into rasterized pages, the redacted output is an image-based PDF: masked text cannot be recovered, and kept-visible terms stay readable simply because they were never covered.