Architecture
Slim.io is a cloud-native data security posture management (DSPM) platform that discovers, classifies, and protects sensitive data across your organization’s cloud storage. This page describes the high-level product architecture and how data flows through the platform.
Core Flow
The platform operates through four stages:
Connect → Scan → Protect → Govern
1. Connect
Slim.io establishes secure, read-only connections to your cloud storage providers using cross-account IAM roles (AWS), Workload Identity Federation (GCP), or Service Principals (Azure). No data leaves your environment during connection setup.
- Supports AWS S3, Google Cloud Storage, and Azure Blob Storage
- Credentials are encrypted at rest and never stored in plaintext
- Each connector is scoped to specific buckets or prefixes
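As a rough illustration of connector scoping, the sketch below checks whether an object falls inside a connector's allowed bucket and prefixes. The `ConnectorScope` class, its fields, and the bucket names are hypothetical and not part of the Slim.io API; they only show the idea of limiting a connector to specific buckets or prefixes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorScope:
    """Hypothetical scope definition: one bucket plus allowed key prefixes."""

    bucket: str
    prefixes: tuple[str, ...] = ()  # assumption: an empty tuple means the whole bucket

    def covers(self, bucket: str, key: str) -> bool:
        """Return True if the object (bucket, key) is inside this scope."""
        if bucket != self.bucket:
            return False
        if not self.prefixes:
            return True
        return any(key.startswith(prefix) for prefix in self.prefixes)


scope = ConnectorScope(bucket="finance-exports", prefixes=("reports/2024/",))
print(scope.covers("finance-exports", "reports/2024/q1.csv"))  # True
print(scope.covers("finance-exports", "hr/payroll.csv"))       # False
```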
2. Scan
The scanning engine processes files in parallel through a multi-layered detection pipeline:
- Pre-screening filters out files unlikely to contain sensitive data
- Classifiers (regex, dictionary, proximity, checksum, ML) analyze content
- Confidence scoring assigns certainty levels to each finding
- LLM Assist uses a language model to disambiguate borderline detections
Scans can run as full scans, incremental scans (only changed files), or event-driven scans triggered by new uploads.
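The layered pipeline above can be sketched for a single data type. The example below is a minimal illustration for payment card numbers, assuming a digit-count pre-screen, a regex classifier, a Luhn checksum, and a keyword proximity check that each contribute to a confidence score. The weights, keyword list, and function names are assumptions made for illustration and do not reflect the actual Slim.io classifiers.

```python
import re
from dataclasses import dataclass

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
KEYWORDS = ("card", "visa", "mastercard", "credit")  # hypothetical proximity terms


@dataclass
class Finding:
    value: str
    confidence: float


def luhn_valid(digits: str) -> bool:
    """Checksum layer: standard Luhn check over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def prescreen(text: str) -> bool:
    """Pre-screening layer: skip text with too few digits to hold a card number."""
    return sum(c.isdigit() for c in text) >= 13


def scan_text(text: str, window: int = 40) -> list[Finding]:
    if not prescreen(text):
        return []
    findings = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        confidence = 0.4  # assumed base score for a regex hit
        if luhn_valid(digits):
            confidence += 0.4  # assumed checksum bonus
        context = text[max(0, match.start() - window): match.end() + window].lower()
        if any(keyword in context for keyword in KEYWORDS):
            confidence += 0.2  # assumed proximity bonus
        findings.append(Finding(digits, round(confidence, 2)))
    return findings


for finding in scan_text("card: 4111 1111 1111 1111 on file"):
    print(finding)
```

In this sketch a borderline score (for example, a regex hit that fails the checksum) is the kind of finding a later disambiguation step would review.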
3. Protect
When sensitive data is found, Slim.io applies protective actions defined in your policies:
- Tokenization — Replaces PII with AES-256 encrypted tokens (reversible)
- Masking — Permanently redacts sensitive values (irreversible)
- Quarantine — Moves files to an isolation bucket
- Alerting — Notifies your team via Slack, email, or webhooks
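The difference between reversible tokenization and irreversible masking can be sketched as follows. This is not the Slim.io implementation: the `TokenVault` class is a stand-in that uses random tokens and an in-memory lookup table instead of AES-256 encryption, purely to show that a token can be resolved back to the original value while a masked value cannot. The `mask` function and its `keep_last` parameter are likewise hypothetical.

```python
import secrets


class TokenVault:
    """Stand-in for reversible tokenization (lookup table only, NOT AES-256)."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)  # random, carries no information
        self._store[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._store[token]


def mask(value: str, keep_last: int = 4) -> str:
    """Irreversible: overwrite all but the last few characters with '*'."""
    hidden = max(0, len(value) - keep_last)
    return "*" * hidden + value[hidden:]


vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(vault.detokenize(token))  # 123-45-6789
print(mask("123-45-6789"))      # *******6789
```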
4. Govern
Governance policies written in YAML define automated responses to findings. The policy engine evaluates the conditions of each policy (PII category, confidence, file location) against a finding and executes the actions that the policy declares.
- Policy-as-Code — Version-controlled YAML definitions
- Drift Detection — Monitors for policy violations over time
- Risk Scoring — Aggregates findings into file-level and organization-level risk scores
- Compliance Mapping — Maps findings to regulatory frameworks (HIPAA, PCI-DSS, SOC 2, GDPR)
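To make the condition-and-action model concrete, the sketch below evaluates a finding against a policy. The policy is shown as a Python dictionary, as it might look after a YAML file has been parsed. The field names (`when`, `then`, `min_confidence`), the glob-style location pattern, and the action names are assumptions for illustration and are not the Slim.io policy schema.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Finding:
    category: str      # PII category, e.g. "SSN"
    confidence: float  # 0.0 to 1.0
    location: str      # object URI


# Hypothetical policy, as it might look once loaded from YAML.
POLICY = {
    "name": "quarantine-high-confidence-ssn",
    "when": {
        "category": "SSN",
        "min_confidence": 0.9,
        "location": "s3://prod-*/*",
    },
    "then": ["quarantine", "alert"],
}


def matches(policy: dict, finding: Finding) -> bool:
    """A policy matches only if every condition holds for the finding."""
    cond = policy["when"]
    return (
        finding.category == cond["category"]
        and finding.confidence >= cond["min_confidence"]
        and fnmatchcase(finding.location, cond["location"])
    )


def evaluate(policies: list[dict], finding: Finding) -> list[str]:
    """Collect the declared actions of every matching policy, in order."""
    actions: list[str] = []
    for policy in policies:
        if matches(policy, finding):
            actions.extend(policy["then"])
    return actions


finding = Finding("SSN", 0.95, "s3://prod-data/exports/users.csv")
print(evaluate([POLICY], finding))  # ['quarantine', 'alert']
```

Keeping the policy as plain data means the same definition can be version-controlled and reviewed like any other code change.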
AI Classification Pipeline
Slim.io uses a multi-provider AI pipeline for classification and false positive reduction:
| Provider | Role |
|---|---|
| Primary | Default classification engine for high-throughput scanning |
| Fallback | Automatic failover when the primary provider is unavailable |
| Secondary Fallback | Tertiary option ensuring classification availability during outages |
The orchestrator tries providers in the order shown and automatically fails over to the next one when a provider is unavailable, so classification stays available during provider outages.
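A minimal sketch of ordered failover is shown below. The `ProviderUnavailable` exception, the provider functions, and the labels they return are hypothetical; the sketch only illustrates trying each provider in priority order and moving on when one is unavailable.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a provider that cannot serve the request (assumed error type)."""


Classifier = Callable[[str], str]


def classify_with_failover(
    text: str, providers: Sequence[tuple[str, Classifier]]
) -> tuple[str, str]:
    """Try providers in priority order; return (provider name, label)."""
    errors = []
    for name, classify in providers:
        try:
            return name, classify(text)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this provider was skipped
    raise ProviderUnavailable("all providers failed: " + "; ".join(errors))


def primary(text: str) -> str:
    raise ProviderUnavailable("timeout")  # simulate an outage


def fallback(text: str) -> str:
    return "PII" if "@" in text else "NONE"  # toy classifier


result = classify_with_failover(
    "contact: jane@example.com",
    [("primary", primary), ("fallback", fallback)],
)
print(result)  # ('fallback', 'PII')
```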
Security Architecture
- Authenticated AES-256 Encryption — PII tokenization uses authenticated encryption with per-tenant key isolation
- HMAC-Signed API Calls — Inter-service communication is signed to prevent tampering
- mTLS — Mutual TLS between internal services
- Audit Logging — All sensitive operations (scans, policy changes, connector modifications, decryption requests) are recorded with actor identity and timestamps
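HMAC request signing in general can be sketched with the Python standard library. The message layout (method, path, timestamp, and body joined by newlines), the SHA-256 digest, and the five-minute clock-skew window below are assumptions chosen for illustration, not the scheme Slim.io uses internally.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Compute a hex HMAC-SHA256 signature over the request fields."""
    message = b"\n".join(
        [method.encode(), path.encode(), str(timestamp).encode(), body]
    )
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    method: str,
    path: str,
    body: bytes,
    timestamp: int,
    signature: str,
    max_skew: int = 300,
    now: Optional[int] = None,
) -> bool:
    """Reject stale requests, then compare signatures in constant time."""
    current = int(time.time()) if now is None else now
    if abs(current - timestamp) > max_skew:
        return False
    expected = sign(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)


secret = b"shared-service-secret"  # hypothetical shared key
ts = 1_700_000_000
sig = sign(secret, "POST", "/internal/scan", b'{"file":"a.csv"}', ts)

print(verify(secret, "POST", "/internal/scan", b'{"file":"a.csv"}', ts, sig, now=ts))  # True
print(verify(secret, "POST", "/internal/scan", b'{"file":"b.csv"}', ts, sig, now=ts))  # False
```

Including the timestamp in the signed message lets the receiver reject replayed requests as well as modified ones.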
BYOC Deployment
For organizations requiring data residency within their own infrastructure, Slim.io supports Bring Your Own Cloud (BYOC) deployments where the scanning engine runs inside your VPC. See BYOC for details.
BYOC deployments maintain feature parity with the hosted platform. The scanning agent communicates with the Slim.io control plane for orchestration while all data processing remains within your network boundary.