Redaction rules

Always-redact and always-keep-visible term lists per job.

Beyond choosing PII categories, you can pin the outcome for specific terms on each job:

  • Always redact (pii_included_terms) — terms that must always be masked, even when the AI would not classify them as PII. Useful for project codenames, internal IDs, or proprietary terms the model has no concept of.
  • Always keep visible (pii_excluded_terms) — terms that must never be masked, even when the AI detects them as PII. Useful for your own company name, public support addresses, or anything that is safe to leave in the clear.

Both are sent on POST /v1/jobs as multipart form fields, each holding a JSON-encoded array of strings.

Matching

  • Case-insensitive, whole-word matching.
  • Multi-word phrases are matched as a phrase (e.g. "Project Titan" matches project titan but not titan alone).
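
The matching rules above can be approximated client-side, for example to preview which terms would hit a given piece of text. The following is a minimal sketch, not the service's actual matcher: the helper names compile_term and term_matches are hypothetical, and the exact definition of a word boundary (approximated here with "not adjacent to a letter, digit, or underscore") is an assumption.

```python
import re

def compile_term(term: str) -> re.Pattern:
    """Build a case-insensitive, whole-word pattern for one term (assumed semantics)."""
    words = [re.escape(w) for w in term.split()]
    if not words:
        raise ValueError("term must contain at least one non-whitespace character")
    # Lookarounds are used instead of \b so that terms beginning or ending
    # with punctuation are still anchored to non-word neighbours.
    # Words of a phrase may be separated by any run of whitespace.
    return re.compile(r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)", re.IGNORECASE)

def term_matches(term: str, text: str) -> bool:
    """Return True if the term occurs in text as a whole word or whole phrase."""
    return compile_term(term).search(text) is not None
```

Under these assumptions, term_matches("Project Titan", "the project titan launch") is true, while term_matches("Project Titan", "titan alone") and term_matches("Titan", "Titanic") are both false.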

Precedence

If the same term appears in both lists, it is redacted: pii_included_terms wins over pii_excluded_terms.
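
The precedence rule can be expressed as a small decision function. This is an illustrative sketch only: the function name resolve_term and the return labels are hypothetical, and comparing list entries case-insensitively is an assumption carried over from the matching rules rather than something the API documents for conflict detection.

```python
def resolve_term(term: str, included: list[str], excluded: list[str]) -> str:
    """Decide the outcome for a term given both lists (assumed semantics).

    Returns "redact", "keep", or "model" (defer to the model's own classification).
    """
    key = term.casefold()
    # Assumption: entries are compared case-insensitively, like matching.
    if key in {t.casefold() for t in included}:
        return "redact"  # pii_included_terms is checked first, so it wins on conflict
    if key in {t.casefold() for t in excluded}:
        return "keep"
    return "model"
```

Checking the always-redact list first is what makes a conflicting term fail safe: it ends up masked rather than exposed.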

Defaults

If you omit a field, the API key owner's saved defaults (from dashboard preferences) apply. Send an empty array ([]) to explicitly use no terms for that list on this job.
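
The difference between omitting a field and sending an empty array is easy to lose in client code. A minimal sketch of one way to keep it explicit, assuming a hypothetical helper term_fields that builds the form fields and uses None to mean "omit the field":

```python
import json
from typing import Optional, Sequence

def term_fields(
    included: Optional[Sequence[str]] = None,
    excluded: Optional[Sequence[str]] = None,
) -> dict[str, str]:
    """Build the term-list form fields for POST /v1/jobs.

    None   -> field omitted, so the saved dashboard defaults apply.
    []     -> field sent as "[]", so no terms are used for that list on this job.
    """
    fields: dict[str, str] = {}
    if included is not None:
        fields["pii_included_terms"] = json.dumps(list(included))
    if excluded is not None:
        fields["pii_excluded_terms"] = json.dumps(list(excluded))
    return fields
```

For example, term_fields(included=[]) yields only pii_included_terms set to "[]", leaving pii_excluded_terms to fall back to the saved defaults. The resulting dict can be passed as the non-file form fields of whatever HTTP client you use.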

Limits

  • Max 500 terms per list.
  • Max 120 characters per term.
  • Whitespace is normalized and duplicates are removed.
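
It can be convenient to apply the same limits before sending a request. The sketch below is a client-side approximation, not the server's implementation: the helper name prepare_terms is hypothetical, and several details are assumptions, namely that duplicates are compared case-insensitively, that the limits are checked after normalization, and that over-limit input should be rejected rather than truncated.

```python
MAX_TERMS = 500        # documented per-list limit
MAX_TERM_CHARS = 120   # documented per-term limit

def prepare_terms(terms: list[str]) -> list[str]:
    """Normalize whitespace, drop duplicates, and enforce the documented limits."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in terms:
        # Collapse runs of whitespace and trim the ends.
        term = " ".join(raw.split())
        if not term:
            continue  # assumption: blank entries are simply dropped
        if len(term) > MAX_TERM_CHARS:
            raise ValueError(f"term exceeds {MAX_TERM_CHARS} characters: {term[:20]!r}...")
        key = term.casefold()  # assumption: duplicates differ at most by case
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(term)
    if len(cleaned) > MAX_TERMS:
        raise ValueError(f"list exceeds {MAX_TERMS} terms")
    return cleaned
```

With these assumptions, prepare_terms(["  Project   Titan ", "project titan", "ACME-1234"]) returns ["Project Titan", "ACME-1234"].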

Example

Redact everything the model finds, plus the codename Project Titan and the identifier ACME-1234, while keeping your own company name Acme Corp and the address support@acme.com visible:

curl -sS -X POST "https://www.redact-pdf.ai/v1/jobs" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F 'files=@/absolute/path/contract.pdf;type=application/pdf' \
  -F 'pii_included_terms=["Project Titan","ACME-1234"]' \
  -F 'pii_excluded_terms=["Acme Corp","support@acme.com"]'

Because redaction is burned into rasterized pages, the redacted output is an image-based PDF. Masked text cannot be recovered, and kept-visible terms stay readable only because they were never covered, not because they survive as a separate text layer.