Cloud DLP
Cloud DLP extends Slim.io’s data loss prevention to data at rest in cloud storage. It integrates with native cloud DLP services — Google Cloud DLP, AWS Macie, and Azure Purview — while supplementing their detections with Slim.io’s own classification engine.
How Cloud DLP Works
Cloud DLP operates through the same connector infrastructure used for scanning. When enabled, it combines two detection sources:
- Slim.io Detection Engine — Slim.io’s multi-layered classifier pipeline (regex, dictionary, proximity, checksum, ML) runs against file contents.
- Native Cloud DLP — Findings from the cloud provider’s built-in DLP service are ingested and correlated with Slim.io’s detections.
The result is a unified findings view that merges both sources, deduplicates overlapping detections, and assigns each merged finding a single confidence score (the higher of the two source scores, as described under Unified Findings).
Provider Setup
Google Cloud DLP
Google Cloud DLP provides deep content inspection for data stored in GCS, BigQuery, and Datastore.
Prerequisites:
- Google Cloud DLP API enabled in your project
- A service account with roles/dlp.user and roles/dlp.reader
- The Slim.io connector’s service account must also have DLP permissions
Configuration:
connector:
  provider: gcp
  dlp:
    enabled: true
    inspection_template: projects/your-project/inspectionTemplates/slim-io-template
    info_types:
      - CREDIT_CARD_NUMBER
      - US_SOCIAL_SECURITY_NUMBER
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - PERSON_NAME
    min_likelihood: LIKELY
    max_findings_per_item: 100

Supported info types: Google Cloud DLP supports 150+ built-in info types spanning financial data, government IDs, health data, credentials, and demographic information. Slim.io imports all findings regardless of info type and maps them to its unified category taxonomy.
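To illustrate the min_likelihood setting in the configuration above: Google Cloud DLP reports likelihood on an ordered scale from VERY_UNLIKELY to VERY_LIKELY, so a threshold of LIKELY keeps only LIKELY and VERY_LIKELY findings. The following is a minimal Python sketch of that threshold. The `meets_min_likelihood` helper and the finding dictionaries are hypothetical, written only for this illustration; they are not part of Slim.io or the Google client library.

```python
# Google Cloud DLP likelihood levels, ordered from lowest to highest.
LIKELIHOOD_ORDER = [
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def meets_min_likelihood(likelihood: str, min_likelihood: str) -> bool:
    """Return True if `likelihood` is at or above `min_likelihood`."""
    return LIKELIHOOD_ORDER.index(likelihood) >= LIKELIHOOD_ORDER.index(min_likelihood)


# Hypothetical findings, shaped only for this example.
findings = [
    {"info_type": "EMAIL_ADDRESS", "likelihood": "POSSIBLE"},
    {"info_type": "CREDIT_CARD_NUMBER", "likelihood": "LIKELY"},
    {"info_type": "US_SOCIAL_SECURITY_NUMBER", "likelihood": "VERY_LIKELY"},
]

# Keep only findings that meet the configured threshold.
kept = [f for f in findings if meets_min_likelihood(f["likelihood"], "LIKELY")]
print([f["info_type"] for f in kept])
```

Lowering the threshold to POSSIBLE would also keep the EMAIL_ADDRESS finding in this example, at the cost of more false positives.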
AWS Macie
Amazon Macie provides automated data discovery and classification for S3 buckets.
Prerequisites:
- Amazon Macie enabled in the target AWS region
- The Slim.io cross-account IAM role must include macie2:GetFindings and macie2:ListFindings permissions
Configuration:
connector:
  provider: aws
  dlp:
    enabled: true
    macie:
      classification_job_schedule: daily
      managed_data_identifiers:
        - CREDIT_CARD_NUMBER
        - AWS_CREDENTIALS
        - SSH_PRIVATE_KEY
        - USA_SOCIAL_SECURITY_NUMBER
      custom_data_identifiers: []
      severity_filter: MEDIUM  # Minimum severity: LOW, MEDIUM, HIGH

Finding sync: Slim.io polls Macie findings on a configurable schedule (default: every 6 hours) and merges them with its own scan results. Findings that share a file path and data category are treated as duplicates and collapsed into one.
Azure Purview
Microsoft Purview (formerly Azure Purview) provides unified data governance including sensitive data classification.
Prerequisites:
- Microsoft Purview account provisioned in your Azure subscription
- The Slim.io Service Principal must have the Purview Data Reader role
- Sensitivity labels configured in the Microsoft Purview compliance portal
Configuration:
connector:
  provider: azure
  dlp:
    enabled: true
    purview:
      account_name: your-purview-account
      scan_rule_set: default
      sensitivity_labels:
        - Confidential
        - Highly Confidential
        - Internal
      classification_rules:
        - Credit Card Number
        - U.S. Social Security Number (SSN)
        - Email Address

Label mapping: Slim.io maps Purview sensitivity labels to its own severity system. Custom label mappings can be configured in the connector settings.
Unified Findings
When both Slim.io and a native cloud DLP service detect sensitive data in the same file, the findings are correlated:
| Scenario | Behavior |
|---|---|
| Both detect same data | Findings are merged; confidence score uses the higher of the two |
| Only Slim.io detects | Finding is stored with Slim.io as the source |
| Only native DLP detects | Finding is imported and mapped to Slim.io categories |
| Category mismatch | Both findings are preserved with their respective categories |
Native cloud DLP findings are imported as read-only. Slim.io does not modify or delete findings in the source DLP service. All policy actions (tokenize, mask, quarantine) operate on Slim.io’s own finding records.
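The correlation rules in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Slim.io's implementation: the `Finding` type, the `correlate` function, the 0.0 to 1.0 confidence scale, and the source labels are all hypothetical, and native findings are assumed to have already been mapped to Slim.io categories. Findings are keyed on file path and category, so a category mismatch naturally leaves both records in place.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file_path: str
    category: str      # Slim.io category (native findings assumed already mapped)
    confidence: float  # assumed scale: 0.0 to 1.0
    source: str        # "slim.io", "native", or "merged"


def correlate(slim: list[Finding], native: list[Finding]) -> list[Finding]:
    """Merge two finding lists, keyed on (file_path, category)."""
    merged: dict[tuple[str, str], Finding] = {}
    for finding in slim + native:
        key = (finding.file_path, finding.category)
        existing = merged.get(key)
        if existing is None:
            # First time this file/category pair is seen: keep the finding as-is.
            merged[key] = finding
        else:
            # Same file and category already recorded: keep the higher confidence.
            same_source = existing.source == finding.source
            merged[key] = Finding(
                file_path=finding.file_path,
                category=finding.category,
                confidence=max(existing.confidence, finding.confidence),
                source=existing.source if same_source else "merged",
            )
    return list(merged.values())


slim_findings = [
    Finding("s3://bucket/a.csv", "SSN", 0.80, "slim.io"),
    Finding("s3://bucket/b.csv", "Email", 0.70, "slim.io"),
]
native_findings = [
    Finding("s3://bucket/a.csv", "SSN", 0.95, "native"),
    Finding("s3://bucket/b.csv", "Phone", 0.60, "native"),
]

results = correlate(slim_findings, native_findings)
for finding in results:
    print(finding)
```

In this example the two SSN findings on a.csv collapse into one merged record with the higher confidence, while the Email and Phone findings on b.csv are both preserved because their categories differ.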
Info Type Mapping
Slim.io maintains a mapping table between native cloud DLP info types and its own category taxonomy:
| Slim.io Category | Google Cloud DLP | AWS Macie | Azure Purview |
|---|---|---|---|
| SSN | US_SOCIAL_SECURITY_NUMBER | USA_SOCIAL_SECURITY_NUMBER | U.S. Social Security Number |
| Credit Card | CREDIT_CARD_NUMBER | CREDIT_CARD_NUMBER | Credit Card Number |
| Email | EMAIL_ADDRESS | EMAIL_ADDRESS | Email Address |
| Phone | PHONE_NUMBER | USA_PHONE_NUMBER | U.S. Phone Number |
| AWS Credentials | N/A | AWS_CREDENTIALS | N/A |
| Private Key | ENCRYPTION_KEY | SSH_PRIVATE_KEY | N/A |
Custom info types from any provider are mapped to a Custom category unless a specific mapping is configured.
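The lookup with a Custom fallback might look like the following Python sketch. The dictionary holds a subset of the rows from the mapping table; the `to_slim_category` function, the `overrides` parameter, and the use of the connector's provider values (gcp, aws, azure) as keys are assumptions made for illustration, not Slim.io's actual API. The custom info type name in the usage lines is likewise made up.

```python
from __future__ import annotations

# (provider, native info type) -> Slim.io category; a subset of the table rows.
INFO_TYPE_MAP = {
    ("gcp", "US_SOCIAL_SECURITY_NUMBER"): "SSN",
    ("aws", "USA_SOCIAL_SECURITY_NUMBER"): "SSN",
    ("azure", "U.S. Social Security Number"): "SSN",
    ("gcp", "CREDIT_CARD_NUMBER"): "Credit Card",
    ("aws", "CREDIT_CARD_NUMBER"): "Credit Card",
    ("azure", "Credit Card Number"): "Credit Card",
    ("aws", "AWS_CREDENTIALS"): "AWS Credentials",
    ("gcp", "ENCRYPTION_KEY"): "Private Key",
    ("aws", "SSH_PRIVATE_KEY"): "Private Key",
}


def to_slim_category(
    provider: str,
    info_type: str,
    overrides: dict[tuple[str, str], str] | None = None,
) -> str:
    """Map a native info type to a Slim.io category, defaulting to "Custom"."""
    key = (provider, info_type)
    # A configured mapping takes precedence over the built-in table.
    if overrides and key in overrides:
        return overrides[key]
    return INFO_TYPE_MAP.get(key, "Custom")


print(to_slim_category("aws", "USA_SOCIAL_SECURITY_NUMBER"))
print(to_slim_category("gcp", "MY_EMPLOYEE_ID"))
print(to_slim_category("gcp", "MY_EMPLOYEE_ID", {("gcp", "MY_EMPLOYEE_ID"): "Employee ID"}))
```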
Scheduling
Cloud DLP integrations follow the same scheduling model as standard scans:
- Full sync — Import all findings from the native DLP service (run periodically or after initial setup)
- Incremental sync — Import only new findings since the last sync (default behavior on schedule)
- Event-driven sync — Triggered when the native service reports new findings (supported for Google Cloud DLP via Pub/Sub)
The default sync interval is 6 hours. Configure this under Connectors > [Connector Name] > DLP Settings.
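The difference between a full and an incremental sync can be sketched as follows. This is a simplified Python illustration, assuming each imported finding carries a creation timestamp; the `is_sync_due` and `select_findings` helpers and the `created_at` field are hypothetical names, not part of Slim.io.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Optional

SYNC_INTERVAL = timedelta(hours=6)  # default interval stated above


def is_sync_due(last_sync: Optional[datetime], now: datetime) -> bool:
    """A sync is due if none has run yet or the interval has elapsed."""
    return last_sync is None or now - last_sync >= SYNC_INTERVAL


def select_findings(findings: list[dict], last_sync: Optional[datetime]) -> list[dict]:
    """Full sync when last_sync is None; otherwise only findings created after it."""
    if last_sync is None:
        return list(findings)
    return [f for f in findings if f["created_at"] > last_sync]


last_sync = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)

# Hypothetical native findings with creation timestamps.
native_findings = [
    {"id": "f1", "created_at": datetime(2023, 12, 31, 23, 0, tzinfo=timezone.utc)},
    {"id": "f2", "created_at": datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)},
]

if is_sync_due(last_sync, now):
    incremental = select_findings(native_findings, last_sync)
    print([f["id"] for f in incremental])
```

Passing `None` as `last_sync` models the full sync that runs after initial setup, where every finding is imported.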
Best Practices
- Enable native DLP alongside Slim.io — Native services have access to provider-specific data types (e.g., AWS credentials, Azure sensitivity labels) that complement Slim.io’s general-purpose classifiers.
- Use Slim.io as the single pane — Even with native DLP enabled, review all findings in the Slim.io dashboard for a unified view across providers.
- Align info types — Configure native DLP services to detect the same categories as your Slim.io classifiers to maximize deduplication accuracy.
- Monitor sync health — Check the connector health page for sync failures or delays. Native DLP API rate limits can cause temporary sync interruptions.