In this tutorial, we build a complete, production-style pipeline for detecting and redacting personally identifiable information using the OpenAI Privacy Filter. We begin by setting up the environment and loading a token classification model that identifies multiple categories of sensitive data, including names, emails, phone numbers, addresses, and secrets. We then design helper functions to normalize labels, extract structured spans, and transform raw model outputs into usable formats. From there, we implement a configurable redaction system that allows us to replace sensitive entities with meaningful placeholders, preserving privacy and providing contextual clarity. Throughout the process, we test the pipeline on curated examples, convert outputs into structured dataframes, and prepare the system for batch processing and real-world usage.
!pip install -q -U transformers accelerate torch pandas matplotlib huggingface_hub
import os, re, json, time, textwrap, warnings
from pathlib import Path
from collections import Counter
import pandas as pd
import matplotlib.pyplot as plt
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
warnings.filterwarnings("ignore")
MODEL_ID = "openai/privacy-filter"
OUT_DIR = Path("/content/privacy_filter_outputs")
OUT_DIR.mkdir(parents=True, exist_ok=True)
device = 0 if torch.cuda.is_available() else -1
torch_dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
print("Device:", "GPU" if torch.cuda.is_available() else "CPU")
print("Torch dtype:", torch_dtype)
print("Model:", MODEL_ID)
We install all required libraries and set up the pipeline’s runtime environment. We configure device selection and initialize paths for storing outputs. We also print system details to confirm that everything is ready before loading the model.
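If the model repository happens to be gated, it may be necessary to authenticate with the Hugging Face Hub before loading it. The snippet below is an optional sketch that assumes a token is stored in an HF_TOKEN environment variable (a hypothetical setup, not part of the tutorial's requirements); skip it if the model downloads without credentials.
# Optional: authenticate with the Hugging Face Hub in case the model repo is gated.
# Assumes a token is available in the HF_TOKEN environment variable (hypothetical setup).
from huggingface_hub import login

hf_token = os.environ.get("HF_TOKEN")
if hf_token:
    login(token=hf_token)
    print("Logged in to the Hugging Face Hub.")
else:
    print("No HF_TOKEN found; proceeding without authentication.")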
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID,
    torch_dtype=torch_dtype,
    device_map="auto" if torch.cuda.is_available() else None
)
classifier = pipeline(
    task="token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
    device=device if not torch.cuda.is_available() else None
)
LABEL_MASKS = {
    "account_number": "[ACCOUNT_NUMBER]",
    "private_address": "[PRIVATE_ADDRESS]",
    "private_email": "[PRIVATE_EMAIL]",
    "private_person": "[PRIVATE_PERSON]",
    "private_phone": "[PRIVATE_PHONE]",
    "private_url": "[PRIVATE_URL]",
    "private_date": "[PRIVATE_DATE]",
    "secret": "[SECRET]"
}
We load the tokenizer and the token classification model, then wrap them in a Hugging Face pipeline with simple aggregation so predictions come back as entity-level spans rather than individual tokens. We also define a mapping from each PII label to a typed placeholder that the redaction step will substitute into the text.
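Before wiring up the helpers, it can be worth running the pipeline once on a throwaway sentence to confirm the shape of its raw output. The small sanity check below uses an invented sentence (not part of the tutorial data) and relies on the standard output format of aggregated token-classification pipelines, where each prediction carries entity_group, score, word, start, and end.
# Quick sanity check: inspect the raw pipeline output on a throwaway sentence.
sanity_output = classifier("Contact Maria Lopez at 555-0100.")
for entity in sanity_output:
    # Each aggregated prediction carries the entity group, confidence, surface text, and character offsets.
    print(entity["entity_group"], round(float(entity["score"]), 3), repr(entity["word"]), entity["start"], entity["end"])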
def normalize_label(label):
    label = label.replace("B-", "").replace("I-", "").replace("E-", "").replace("S-", "")
    return label.strip()

def detect_pii(text):
    raw = classifier(text)
    spans = []
    for item in raw:
        label = normalize_label(item.get("entity_group", item.get("entity", "")))
        if label == "O" or not label:
            continue
        spans.append({
            "label": label,
            "score": float(item["score"]),
            "text": item["word"],
            "start": int(item["start"]),
            "end": int(item["end"])
        })
    spans = sorted(spans, key=lambda x: (x["start"], x["end"]))
    return spans

def redact_text(text, spans, min_score=0.50, mode="typed"):
    filtered = [s for s in spans if s["score"] >= min_score]
    filtered = sorted(filtered, key=lambda x: x["start"], reverse=True)
    redacted = text
    for span in filtered:
        replacement = LABEL_MASKS.get(span["label"], "[PII]") if mode == "typed" else "[REDACTED]"
        redacted = redacted[:span["start"]] + replacement + redacted[span["end"]:]
    return redacted

def privacy_report(text, min_score=0.50):
    spans = detect_pii(text)
    redacted = redact_text(text, spans, min_score=min_score)
    return {
        "original_text": text,
        "redacted_text": redacted,
        "span_count": len([s for s in spans if s["score"] >= min_score]),
        "spans": [s for s in spans if s["score"] >= min_score]
    }
We define helper functions to normalize labels and extract PII spans from model predictions. We implement a redaction function that replaces sensitive segments based on confidence thresholds. We combine everything into a single reporting function that returns structured outputs.
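As a quick illustration of how the helpers fit together, the sketch below runs privacy_report on a one-line string invented for this example (it is not part of the sample set defined next) and contrasts typed placeholders with the generic redaction mode.
# Minimal usage example on an invented one-liner (not part of the sample set below).
demo = privacy_report("Reach Priya Nair at [email protected] before Friday.", min_score=0.50)
print("Typed   :", demo["redacted_text"])
# The same spans can also be masked generically instead of with typed placeholders.
print("Generic :", redact_text(demo["original_text"], demo["spans"], mode="generic"))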
sample_texts = [
    "My name is Alice Smith and my email is [email protected]. Call me at +1 415 555 0189.",
    "Patient Rohan Mehta visited on 2025-04-11 and lives at 221B Baker Street, London.",
    "Use API key sk-test-51HxYzDemoSecret987 and send the invoice to [email protected].",
    "The public website is https://example.com, but Jane Doe's private portal is https://jane-private.example.net.",
    "Account number 123456789012 was linked to Ahmed Khan on 12 March 2024.",
    "This sentence has no private information and should mostly remain unchanged."
]

reports = []
for i, text in enumerate(sample_texts, 1):
    report = privacy_report(text, min_score=0.50)
    report["example_id"] = i
    reports.append(report)

for r in reports:
    print("\n" + "=" * 100)
    print("Example:", r["example_id"])
    print("Original:", r["original_text"])
    print("Redacted:", r["redacted_text"])
    print("Detected spans:")
    print(json.dumps(r["spans"], indent=2, ensure_ascii=False))

rows = []
for r in reports:
    for s in r["spans"]:
        rows.append({
            "example_id": r["example_id"],
            "label": s["label"],
            "score": s["score"],
            "detected_text": s["text"],
            "start": s["start"],
            "end": s["end"],
            "original_text": r["original_text"],
            "redacted_text": r["redacted_text"]
        })
df = pd.DataFrame(rows)
display(df)
We create sample inputs and run them through the pipeline to test detection and redaction. We collect structured results and print both original and redacted text for comparison. We also convert the outputs into a dataframe for easier analysis.
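With the spans collected in a dataframe, a small optional aggregation gives a per-label view of detection counts and average confidence. The sketch below simply builds on the df created above.
# Optional: summarize detections per label (count and mean confidence) from the dataframe above.
if len(df):
    label_summary = (
        df.groupby("label")["score"]
          .agg(["count", "mean"])
          .rename(columns={"count": "detections", "mean": "avg_confidence"})
          .sort_values("detections", ascending=False)
    )
    display(label_summary)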
json_path = OUT_DIR / "privacy_filter_reports.json"
csv_path = OUT_DIR / "privacy_filter_spans.csv"
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(reports, f, indent=2, ensure_ascii=False)
df.to_csv(csv_path, index=False)
print("\nSaved JSON:", json_path)
print("Saved CSV:", csv_path)

if len(df):
    label_counts = df["label"].value_counts()
    plt.figure(figsize=(10, 5))
    label_counts.plot(kind="bar")
    plt.title("Detected PII Categories")
    plt.xlabel("PII Category")
    plt.ylabel("Detected Span Count")
    plt.xticks(rotation=35, ha="right")
    plt.tight_layout()
    plt.show()

    plt.figure(figsize=(10, 5))
    df["score"].plot(kind="hist", bins=10)
    plt.title("Detection Confidence Distribution")
    plt.xlabel("Confidence Score")
    plt.ylabel("Frequency")
    plt.tight_layout()
    plt.show()
def compare_thresholds(text, thresholds=(0.30, 0.50, 0.70, 0.90)):
    spans = detect_pii(text)
    results = []
    for threshold in thresholds:
        kept = [s for s in spans if s["score"] >= threshold]
        results.append({
            "threshold": threshold,
            "span_count": len(kept),
            "redacted_text": redact_text(text, spans, min_score=threshold)
        })
    return pd.DataFrame(results)

threshold_demo = compare_thresholds(sample_texts[0])
display(threshold_demo)
We save the processed outputs into JSON and CSV formats for persistence and reuse. We visualize detected PII categories and confidence distributions using plots. We also analyze how changing thresholds impacts detection and redaction behavior.
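To confirm the saved artifacts are actually reusable downstream, a quick round-trip check is handy. The sketch below just reads back the JSON reports and the CSV of spans written above.
# Round-trip check: reload the saved reports and spans to make sure they are usable downstream.
with open(json_path, "r", encoding="utf-8") as f:
    reloaded_reports = json.load(f)
reloaded_spans = pd.read_csv(csv_path)
print("Reloaded reports:", len(reloaded_reports))
print("Reloaded span rows:", len(reloaded_spans))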
long_document = """
Customer Support Transcript:
Agent: Hello, may I confirm your name?
Customer: My name is PSP.
Agent: Thanks. Could you confirm your email?
Customer: [email protected].
Agent: And your phone number?
Customer: +91 xxxxx xxxxx.
Agent: Your service address is 45 MG Road, Bengaluru, Karnataka.
Customer: Yes. Also, my backup email is [email protected].
Agent: Please do not share passwords or OTPs.
Customer: The temporary token I received is ghp_demoSecretToken123456.
"""

long_report = privacy_report(long_document, min_score=0.50)
print("\nLONG DOCUMENT REDACTION")
print("=" * 100)
print(long_report["redacted_text"])
print("\nStructured spans:")
print(json.dumps(long_report["spans"], indent=2, ensure_ascii=False))
def pii_audit_table(texts, min_score=0.50):
    audit_rows = []
    for idx, text in enumerate(texts, 1):
        result = privacy_report(text, min_score=min_score)
        labels = Counter([s["label"] for s in result["spans"]])
        audit_rows.append({
            "id": idx,
            "original_chars": len(text),
            "redacted_chars": len(result["redacted_text"]),
            "span_count": result["span_count"],
            "labels_found": dict(labels),
            "redacted_text": result["redacted_text"]
        })
    return pd.DataFrame(audit_rows)

audit_df = pii_audit_table(sample_texts + [long_document], min_score=0.50)
display(audit_df)

audit_path = OUT_DIR / "privacy_filter_audit.csv"
audit_df.to_csv(audit_path, index=False)
print("Saved audit CSV:", audit_path)
custom_text = input("\nEnter your own text for PII redaction, or press Enter to skip:\n")
if custom_text.strip():
    custom_report = privacy_report(custom_text, min_score=0.50)
    print("\nOriginal:")
    print(custom_report["original_text"])
    print("\nRedacted:")
    print(custom_report["redacted_text"])
    print("\nSpans:")
    print(json.dumps(custom_report["spans"], indent=2, ensure_ascii=False))
else:
    print("Skipped custom input.")

print("\nTutorial complete.")
We test the pipeline on a longer, realistic document to evaluate robustness. We generate an audit-style summary showing counts and categories of detected PII. We also allow custom user input so we can run the privacy filter interactively.
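One practical caveat for batch processing: token classification models have a maximum input length, so very long documents may need to be split before detection. The helper below is a rough sketch rather than part of the original pipeline; it splits text on line boundaries under a character budget (a simple stand-in for a proper token-length check) and redacts each chunk independently with the helpers defined earlier.
# Hypothetical helper for long inputs: split on line boundaries, keep chunks under a character budget,
# and redact each chunk independently. The character budget is a rough proxy for the model's token limit.
def redact_long_text(text, max_chars=1000, min_score=0.50):
    lines = text.split("\n")
    chunks, current = [], ""
    for line in lines:
        if current and len(current) + len(line) + 1 > max_chars:
            chunks.append(current)
            current = line
        else:
            current = current + "\n" + line if current else line
    if current:
        chunks.append(current)
    # Redact each chunk separately and stitch the results back together.
    return "\n".join(redact_text(chunk, detect_pii(chunk), min_score=min_score) for chunk in chunks)

print(redact_long_text(long_document, max_chars=400))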
In conclusion, we developed a robust and extensible privacy filtering workflow that goes beyond simple detection. We systematically evaluated model predictions, applied confidence thresholds, and compared different redaction strategies to understand their impact. We also generated structured reports, visualized detection patterns, and exported results in JSON and CSV formats for auditing and downstream integration. This approach allows us to build reliable privacy safeguards into data pipelines, ensuring that sensitive information is consistently identified and handled responsibly while maintaining the usability of the underlying data.