What Counts as PII in a PDF? A Complete 2026 Checklist
"PII" is one of those terms everyone uses but few people define the same way. Under GDPR, it's "personal data." Under HIPAA, the equivalent is PHI ("protected health information"). Under California's CCPA, it's "personal information." Under the Swiss FADP, it's "personal data" (Personendaten / données personnelles).
The categories overlap heavily — but not perfectly. Here's a practical 2026 checklist for what to redact in any document destined for a third party.
The universal "always redact" list
Regardless of jurisdiction, these items count as personal data and should be redacted before a document is shared externally, unless the recipient genuinely needs them:
- Full name (first + last)
- Postal address
- Email address
- Phone number (mobile or landline)
- National ID number (SSN, AVS, NIR, NHS, etc.)
- Passport / driver's license number
- Bank account / IBAN / credit card number
- Date of birth
- Photograph of a face
- Biometric data (fingerprint, retina, voice signature)
- Signature image
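Several items on this list have formats regular enough to be flagged by pattern matching. The following is a minimal sketch, not a complete detector: the two regular expressions and the function names (`luhn_valid`, `find_candidates`) are illustrative assumptions, and names, addresses, and images need other techniques. The Luhn checksum is the check-digit scheme used by payment card numbers; here it filters out digit runs that only look like cards.

```python
import re

# Illustrative patterns only; real-world formats vary more than these cover.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_candidates(text: str) -> list[tuple[str, str]]:
    """Return (category, matched_text) pairs for emails and card-like numbers."""
    hits = [("email", m.group()) for m in EMAIL_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            hits.append(("card", m.group()))
    return hits
```

Called on text extracted from a PDF, `find_candidates` returns candidates for review. A match is a hint that something may need redacting, not proof that it does, and a miss is not proof that the text is clean.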
GDPR-specific (EU/EEA)
GDPR takes a broad view: any information relating to an identified or identifiable person. This includes:
- Online identifiers — IP address, cookie ID, device fingerprint
- Location data — GPS coordinates, even approximate
- Pseudonymized data that can be re-identified
- Genetic data
- Trade union membership
- Religious or philosophical beliefs
- Political opinions
- Sexual orientation
- Criminal convictions (subject to special protection)
- Health data (special category — see HIPAA section)
When in doubt, the question is: could a recipient, using this data plus other available data, identify a specific individual? If yes, it's personal data.
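One rough way to put a number on that question is to count how many records share the same combination of indirect identifiers. The sketch below borrows the idea behind k-anonymity from the privacy literature; it is not a test that GDPR itself prescribes, and the function name and field names are hypothetical.

```python
from collections import Counter


def smallest_group(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the rarest combination of quasi-identifier values.

    A result of 1 means at least one record is unique on those fields.
    """
    if not records:
        return 0
    combos = Counter(
        tuple(r.get(q) for q in quasi_identifiers) for r in records
    )
    return min(combos.values())
```

If `smallest_group(rows, ["zip", "birth_year", "sex"])` returns 1, at least one person in the document is unique on those three fields, so a recipient holding another dataset with the same fields could plausibly single that person out. A larger value lowers the risk but does not by itself make the data anonymous.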
Swiss FADP-specific
The revised Swiss FADP, in force since September 2023, aligns closely with GDPR but adds some Swiss specifics:
- AVS / NAVS13 numbers — Swiss social security identifiers
- Religious affiliations (especially relevant for civil-status documents)
- A "sensitive personal data" category that largely mirrors GDPR's special categories
HIPAA-specific (US healthcare)
HIPAA's Safe Harbor method lists 18 specific identifier categories. Beyond the universal list, HIPAA also covers:
- Medical record numbers (MRN)
- Health plan beneficiary numbers
- Account numbers
- Certificate / license numbers
- Vehicle identifiers (VIN, license plate)
- Device identifiers and serial numbers
- URLs and IP addresses (when linked to an individual)
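- Dates more specific than the year when they relate to the individual (birth, admission, discharge, death)
- Geographic subdivisions smaller than a state (street, city, county, ZIP code; the first three ZIP digits may be kept only where that prefix covers more than 20,000 people)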
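The last two items are generalization rules rather than plain deletions: dates are reduced to the year, and a ZIP code may be reduced to its first three digits when that prefix covers more than 20,000 people, otherwise it becomes "000". A minimal sketch of both rules follows. The function names are hypothetical, and `low_population_prefixes` is an input you would have to build from current census data; the value used in the usage note is a placeholder, not an authoritative list.

```python
from datetime import date


def generalize_date(d: date) -> str:
    """Keep only the year of a date tied to an individual."""
    return str(d.year)


def generalize_zip(zip_code: str, low_population_prefixes: set[str]) -> str:
    """Reduce a US ZIP code to its first three digits.

    Prefixes listed in `low_population_prefixes` (assumed to cover
    20,000 people or fewer) are replaced with "000".
    """
    prefix = zip_code.strip()[:3]
    if len(prefix) < 3 or not prefix.isdigit():
        raise ValueError(f"not a US ZIP code: {zip_code!r}")
    return "000" if prefix in low_population_prefixes else prefix
```

For example, `generalize_zip("94107", {"036"})` gives `"941"`, while any ZIP starting with the placeholder prefix `036` gives `"000"`. This sketch leaves out other parts of Safe Harbor, such as grouping ages over 89 into a single category.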
CCPA / US state laws
California (CCPA), Virginia, Colorado, Utah, and a growing list of other US states have privacy laws that include categories not always covered elsewhere:
- Browsing history
- Search history
- Purchase history and preferences
- Education records
- Employment information
- Inferences drawn from other data (e.g., "interested in cars")
Industry-specific checklists
For legal documents
- All party names (plaintiffs, defendants, witnesses, third parties)
- All addresses
- Bank/IBAN references in financial disputes
- Minor children's identities (often subject to special protection)
For medical records
- All HIPAA Safe Harbor identifiers (18 categories)
- Provider names (sometimes — depends on use case)
- Institution names (sometimes)
For financial documents
- IBAN, account numbers, SWIFT/BIC
- Credit card numbers (PCI-DSS adds extra requirements)
- Salary / income figures
- Investment portfolio details
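IBANs carry two check digits that can be verified with the standard mod-97 calculation, which helps separate real account numbers from reference codes that merely look similar. The sketch below is illustrative: the function name is hypothetical, and it checks only the checksum and overall length, not country-specific formats.

```python
def iban_checksum_ok(candidate: str) -> bool:
    """Return True if `candidate` passes the IBAN mod-97 check."""
    iban = "".join(candidate.split()).upper()
    if not (15 <= len(iban) <= 34) or not iban.isascii() or not iban.isalnum():
        return False
    # Move the country code and check digits to the end.
    rearranged = iban[4:] + iban[:4]
    # Replace letters with numbers: A=10, B=11, ..., Z=35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A string that passes this check is very likely an IBAN and worth redacting. A string that fails may still be a mistyped account number, so a failed check is a reason to review the value, not to leave it visible.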
For real-estate documents
- Buyer / seller / tenant identities
- Property addresses (if identifying a private home)
- Mortgage and financing details
- Reference contacts (prior landlords, employers)
How to redact at scale
For documents with many PII categories — court filings, medical records, mortgage applications — manual highlighting is too slow and error-prone. AI tools like Redact PDF AI handle the standard categories (names, addresses, phone, email, IBAN, credit cards, dates) automatically across the entire document, in 100+ languages, including OCR for scanned PDFs.
For institution-specific identifiers (MRN, case numbers, internal IDs), add them to an "Always Redact" terms list for deterministic coverage.
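The idea of a fixed terms list can be sketched as exact matching over extracted text. This is an illustration under stated assumptions, not a description of how any particular tool implements its list: the function names and the `[REDACTED]` mask are hypothetical, and the code works on plain text only. Real PDF redaction must also remove the underlying content from the file itself.

```python
import re


def build_term_pattern(terms: list[str]) -> re.Pattern[str]:
    """Compile one case-insensitive pattern matching any listed term."""
    # Longest first so "MRN-0042-A" wins over "MRN-0042".
    ordered = sorted({t for t in terms if t}, key=lambda t: (-len(t), t))
    if not ordered:
        raise ValueError("terms list is empty")
    alternatives = "|".join(re.escape(t) for t in ordered)
    # Lookarounds stop a term from matching inside a longer token.
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


def redact_terms(text: str, terms: list[str], mask: str = "[REDACTED]") -> str:
    """Replace every occurrence of any term with `mask`."""
    return build_term_pattern(terms).sub(lambda _: mask, text)
```

Because the match is exact, the same input always produces the same output, which is what makes a terms list a useful complement to probabilistic detection:

```python
terms = ["MRN-0042", "MRN-0042-A", "2024/117"]
print(redact_terms("Patient MRN-0042-A seen; see also MRN-0042 and case 2024/117.", terms))
# Patient [REDACTED] seen; see also [REDACTED] and case [REDACTED].
```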
The "minimum necessary" principle
A principle shared by GDPR (where it is called data minimisation), HIPAA (the "minimum necessary" standard), and CCPA: share only the data needed for the specific purpose. This means your redaction policy isn't always "redact everything"; sometimes it's "keep clauses 3 and 7 visible, redact the rest." Modern Studio editors let you make these decisions per document.
Get started
Try the free demo on redact-pdf.ai or browse our PII-type guides for category-specific redaction tips.