Create a Custom Classifier
This guide walks you through defining a custom YAML classifier to detect organization-specific sensitive data patterns.
Time required: 5–10 minutes
Prerequisites:
- Editor or Admin role in the Customer Dashboard
- Knowledge of the data pattern you want to detect
Step 1: Navigate to Classifiers
- In the Customer Dashboard, navigate to Classifiers in the sidebar.
- Click Create Classifier.
- Select YAML Editor as the creation method.
Step 2: Write the Classifier YAML
Here is an example classifier that detects an internal employee ID format:
apiVersion: slim.io/v1
kind: Classifier
metadata:
  name: internal-employee-id
  description: "Internal employee ID in format EMP-XXXXXXXX"
  category: Employee ID
  tags:
    - internal
    - hr-data
spec:
  type: proximity
  pattern: '\bEMP-[A-Z0-9]{8}\b'
  keywords:
    - "employee"
    - "emp id"
    - "staff number"
    - "personnel"
  window: 80
  confidence: high  # high | medium | low, relative to your tuning
  enabled: true

Choosing the Right Type
| Pattern Characteristic | Recommended Type |
|---|---|
| Fixed format with reliable regex | regex |
| Format needs contextual keywords | proximity |
| Known list of values | dictionary |
| Format includes check digits | checksum |
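To make the proximity type concrete, the sketch below reproduces the idea locally in Python: a regex match only counts if one of the keywords appears nearby. It is an illustration, not Slim.io's implementation. In particular, it assumes that `window` means characters on either side of the match and that keyword matching is case-insensitive; the function name `proximity_matches` is hypothetical.

```python
import re

# Values copied from the example classifier above.
PATTERN = re.compile(r"\bEMP-[A-Z0-9]{8}\b")
KEYWORDS = ["employee", "emp id", "staff number", "personnel"]
WINDOW = 80  # assumption: characters on either side of the match


def proximity_matches(text: str) -> list[str]:
    """Return pattern matches that have a keyword within WINDOW characters."""
    hits = []
    lowered = text.lower()
    for m in PATTERN.finditer(text):
        start = max(0, m.start() - WINDOW)
        end = min(len(text), m.end() + WINDOW)
        # Leave the match itself out of the context so it cannot
        # satisfy a keyword on its own.
        context = lowered[start:m.start()] + " " + lowered[m.end():end]
        if any(keyword in context for keyword in KEYWORDS):
            hits.append(m.group())
    return hits


with_context = "Employee Record\nName: Jane Doe\nEmp ID: EMP-A1B2C3D4"
without_context = "Build artifact EMP-A1B2C3D4 uploaded to the cache."

print(proximity_matches(with_context))     # ['EMP-A1B2C3D4']
print(proximity_matches(without_context))  # []
```

The second string contains a value in the right format but no supporting keyword, so a plain regex classifier would report it while a proximity classifier would not.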
Step 3: Validate the Classifier
- Click Validate in the YAML editor.
- Slim.io checks that:
  - YAML syntax is valid
  - All required fields are present
  - The regex pattern compiles without errors
  - No duplicate classifier name exists
- Fix any reported errors before proceeding.
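If you want to catch obvious problems before pasting YAML into the editor, checks similar to the ones listed above can be approximated locally. The sketch below works on a classifier that has already been parsed into a Python dict. The required-field list is an assumption inferred from the example classifier, not Slim.io's actual schema, and Python's `re` module may accept or reject patterns differently from the regex engine Slim.io uses. The name `precheck` is hypothetical.

```python
import re

# Hypothetical required fields, inferred from the example classifier;
# the real schema may require more or fewer.
REQUIRED = {
    "metadata": ["name", "description", "category"],
    "spec": ["type", "pattern"],
}


def precheck(classifier: dict, existing_names: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for section, fields in REQUIRED.items():
        body = classifier.get(section) or {}
        for field in fields:
            if field not in body:
                problems.append(f"missing {section}.{field}")
    pattern = (classifier.get("spec") or {}).get("pattern")
    if pattern is not None:
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"pattern does not compile: {exc}")
    name = (classifier.get("metadata") or {}).get("name")
    if name in existing_names:
        problems.append(f"duplicate classifier name: {name}")
    return problems


# A deliberately broken candidate: no description, an unbalanced
# parenthesis in the pattern, and a name that is already taken.
candidate = {
    "metadata": {"name": "internal-employee-id", "category": "Employee ID"},
    "spec": {"type": "proximity", "pattern": r"\b(EMP-[A-Z0-9]{8}\b"},
}
for problem in precheck(candidate, existing_names={"internal-employee-id"}):
    print(problem)
```

The Validate button remains the authority; a local pre-check like this only shortens the feedback loop.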
Step 4: Test Against Sample Data
- Click Test to open the validation console.
- Paste sample text containing the pattern you want to detect:
Employee Record
Name: Jane Doe
Emp ID: EMP-A1B2C3D4
Department: Engineering
- Click Run Test.
- Verify the classifier matches the expected values with the correct confidence score.
Test with both positive examples (text that should match) and negative examples (text that should not match) to validate precision and recall before deploying.
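One way to keep track of positive and negative examples is to label a small sample set and compute precision and recall for the regex before you deploy. The sketch below does that in plain Python for the pattern from Step 2. The sample strings and their labels are illustrative assumptions, including the choice to treat a lowercase ID as something that should match; substitute examples from your own data.

```python
import re

PATTERN = re.compile(r"\bEMP-[A-Z0-9]{8}\b")

# (sample text, should the classifier match?) -- illustrative samples only.
SAMPLES = [
    ("Emp ID: EMP-A1B2C3D4", True),
    ("Staff number EMP-00012345 on file", True),
    ("Emp ID: emp-a1b2c3d4", True),           # assumed lowercase variant
    ("Ticket TEMP-A1B2C3D4 closed", False),   # different prefix
    ("Ref EMP-A1B2C3D4E5 is too long", False),
    ("EMP-12345 is too short", False),
]


def score(pattern: re.Pattern, samples: list[tuple[str, bool]]) -> tuple[float, float]:
    """Return (precision, recall) of the pattern over labelled samples."""
    tp = fp = fn = 0
    for text, expected in samples:
        matched = pattern.search(text) is not None
        if matched and expected:
            tp += 1
        elif matched and not expected:
            fp += 1
        elif expected:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


precision, recall = score(PATTERN, SAMPLES)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=1.00 recall=0.67
```

Here the word boundaries reject all three negatives, so precision is perfect, but the lowercase ID is missed. Whether that miss matters depends on whether lowercase IDs actually occur in your data.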
Step 5: Deploy the Classifier
- Click Deploy to activate the classifier.
- The classifier is immediately active and will be used in all subsequent scans.
- Existing scan results are not retroactively updated — run a new scan to apply the classifier.
Step 6: Verify in a Scan
- Trigger a scan on a connector that contains data matching your pattern.
- After the scan completes, navigate to the Data Catalog.
- Filter findings by your custom category (e.g., “Employee ID”).
- Verify the matches are correct and the confidence scores are appropriate.
Advanced: Detection-as-Code
For teams that manage classifiers through Git, add the YAML file to your repository:
slim-io-config/
  classifiers/
    custom/
      internal-employee-id.yaml

Enable Git sync under Settings > Integrations to automatically deploy classifier changes on merge. See Detection-as-Code for details.
Writing Effective Classifiers
- Be specific: Narrow regex patterns reduce false positives
- Use proximity: Requiring contextual keywords near a match filters out look-alike strings that a regex alone would report
- Set appropriate confidence: Higher for validated formats, lower for broad patterns
- Write suppressions: Add suppression rules for known false positive patterns
- Document thoroughly: The description field should explain what the classifier detects and why
Next Steps
- Detection-as-Code — Manage classifiers through version-controlled YAML
- Set up governance policies — Automate actions on findings from your custom classifier