Architecture

Slim.io is a cloud-native Data Security Posture Management (DSPM) platform that discovers, classifies, and protects sensitive data across your organization’s cloud storage. This page describes the high-level product architecture and how data flows through the platform.

Core Flow

The platform operates through four stages:


Connect → Scan → Protect → Govern

1. Connect

Slim.io establishes secure, read-only connections to your cloud storage providers using cross-account IAM roles (AWS), Workload Identity Federation (GCP), or Service Principals (Azure). No data leaves your environment during connection setup.

Supports AWS S3, Google Cloud Storage, and Azure Blob Storage
Credentials are encrypted at rest and never stored in plaintext
Each connector is scoped to specific buckets or prefixes
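As an illustration of read-only, prefix-scoped access on AWS, the sketch below builds an IAM policy document that allows only listing and reading objects under chosen prefixes. The bucket name, prefixes, and the `read_only_policy` helper are hypothetical; the policy Slim.io actually asks you to attach may differ.

```python
import json


def read_only_policy(bucket: str, prefixes: list[str]) -> dict:
    """Build an IAM policy that allows list/get only under the given prefixes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted by prefix condition.
                "Sid": "ListScopedPrefixes",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {
                    "StringLike": {"s3:prefix": [f"{p}*" for p in prefixes]}
                },
            },
            {
                # Object reads are granted only on keys under each prefix.
                "Sid": "ReadScopedObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{p}*" for p in prefixes],
            },
        ],
    }


# Hypothetical bucket and prefixes, for illustration only.
policy = read_only_policy("example-data-lake", ["exports/", "uploads/"])
print(json.dumps(policy, indent=2))
```

Because the policy contains no write or delete actions, a role carrying only this policy cannot modify the scoped data.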

2. Scan

The scanning engine processes files in parallel through a multi-layered detection pipeline:

Pre-screening filters out files unlikely to contain sensitive data
Classifiers (regex, dictionary, proximity, checksum, ML) analyze content
Confidence scoring assigns certainty levels to each finding
LLM Assist uses a large language model to disambiguate borderline detections
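The layers above can be sketched for a single category, payment card numbers. This is a minimal illustration, not Slim.io's engine: the skipped extensions, keyword list, confidence weights, and review thresholds are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass

# All constants below are illustrative assumptions, not product defaults.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits
CONTEXT_KEYWORDS = ("card", "visa", "mastercard", "payment")
SKIPPED_EXTENSIONS = (".png", ".jpg", ".zip", ".mp4")


@dataclass
class Finding:
    category: str
    value: str
    confidence: float


def prescreen(filename: str) -> bool:
    """Pre-screening: skip file types unlikely to hold text PII."""
    return not filename.lower().endswith(SKIPPED_EXTENSIONS)


def luhn_valid(digits: str) -> bool:
    """Checksum classifier: the Luhn check used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(filename: str, text: str, window: int = 40) -> list[Finding]:
    if not prescreen(filename):
        return []
    findings = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        confidence = 0.4  # regex match alone
        if luhn_valid(digits):
            confidence += 0.4  # checksum passes
        context = text[max(0, match.start() - window): match.end() + window]
        if any(k in context.lower() for k in CONTEXT_KEYWORDS):
            confidence += 0.15  # proximity to a related keyword
        findings.append(Finding("credit_card", digits, round(confidence, 2)))
    return findings


def needs_llm_review(finding: Finding) -> bool:
    """Borderline findings would be handed to the LLM Assist stage."""
    return 0.5 <= finding.confidence < 0.9


high = scan_text("notes.txt", "Payment card: 4111 1111 1111 1111 on file")
weak = scan_text("notes.txt", "order ref 1234567890123456")
borderline = scan_text("notes.txt", "4111111111111111")
```

Each layer only adds evidence to the score, so a finding that passes the checksum but has no supporting context lands in the borderline band and is the kind of detection an LLM pass would be asked to resolve.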

Scans can run as full scans, incremental scans (only changed files), or event-driven scans triggered by new uploads.
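One way to picture an incremental scan is as a comparison between the current object listing and the state recorded by the previous scan. The sketch below assumes change detection by ETag; the `ObjectVersion` type and the stored-state format are hypothetical, and the product may track changes differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectVersion:
    key: str
    etag: str  # content fingerprint reported by the storage provider


def select_for_incremental_scan(
    listing: list[ObjectVersion], previous_etags: dict[str, str]
) -> list[str]:
    """Return keys that are new or whose ETag changed since the last scan."""
    return [o.key for o in listing if previous_etags.get(o.key) != o.etag]


previous = {"a.csv": "e1", "b.csv": "e2"}
current = [
    ObjectVersion("a.csv", "e1"),  # unchanged, skipped
    ObjectVersion("b.csv", "e9"),  # modified, rescanned
    ObjectVersion("c.csv", "e3"),  # new, scanned
]
to_scan = select_for_incremental_scan(current, previous)
```

A full scan corresponds to passing an empty `previous_etags`, which selects every object.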

3. Protect

When sensitive data is found, Slim.io applies protective actions defined in your policies:

Tokenization — Replaces PII with AES-256 encrypted tokens (reversible)
Masking — Permanently redacts sensitive values (irreversible)
Quarantine — Moves files to an isolation bucket
Alerting — Notifies your team via Slack, email, or webhooks
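The difference between reversible tokenization and irreversible masking can be shown with a small sketch. Note the substitution: this example does not perform AES-256 encryption. It uses a random token and an in-memory lookup table as a stand-in, purely to show that a token can be resolved back to its value while a masked value cannot. The `TokenVault` class and `mask` function are hypothetical names.

```python
import secrets


class TokenVault:
    """Stand-in for reversible tokenization (a lookup table, NOT AES-256)."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)
        self._values[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # An authorized caller can recover the original value.
        return self._values[token]


def mask(value: str) -> str:
    """Irreversible redaction: the original characters are discarded."""
    return "*" * len(value)


vault = TokenVault()
token = vault.tokenize("123-45-6789")
restored = vault.detokenize(token)
masked = mask("123-45-6789")
```

After masking, nothing in the output allows the original value to be reconstructed, which is why masking suits data that never needs to be read again.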

4. Govern

Governance policies, written in YAML, declare automated responses to findings. The policy engine evaluates each policy's conditions (PII category, confidence, file location) and executes the actions of every policy that matches.
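To make condition evaluation concrete, the sketch below represents one policy as the dictionary a YAML parser might produce and checks a finding against it. The schema (`when`, `then`, `min_confidence`, `location_prefix`) is invented for illustration and is not Slim.io's policy format.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    category: str
    confidence: float
    location: str


# Hypothetical policy, shown as the dict a YAML loader would return.
POLICY = {
    "name": "quarantine-high-confidence-ssn",
    "when": {
        "category": "ssn",
        "min_confidence": 0.9,
        "location_prefix": "s3://example-exports/",
    },
    "then": ["quarantine", "alert"],
}


def matches(policy: dict, finding: Finding) -> bool:
    """A policy matches only if every condition holds."""
    cond = policy["when"]
    return (
        finding.category == cond["category"]
        and finding.confidence >= cond["min_confidence"]
        and finding.location.startswith(cond["location_prefix"])
    )


def actions_for(policies: list[dict], finding: Finding) -> list[str]:
    """Collect the actions of every matching policy, in policy order."""
    actions: list[str] = []
    for policy in policies:
        if matches(policy, finding):
            actions.extend(policy["then"])
    return actions


hit = Finding("ssn", 0.97, "s3://example-exports/2024/users.csv")
miss = Finding("ssn", 0.60, "s3://example-exports/2024/users.csv")
```

Here `actions_for([POLICY], hit)` yields the quarantine and alert actions, while `miss` falls below the confidence threshold and triggers nothing.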

Policy-as-Code — Version-controlled YAML definitions
Drift Detection — Monitors for policy violations over time
Risk Scoring — Aggregates findings into file-level and organization-level risk scores
Compliance Mapping — Maps findings to regulatory frameworks (HIPAA, PCI-DSS, SOC 2, GDPR)
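Risk aggregation can be pictured as a two-level roll-up. The sketch below is one possible scheme under stated assumptions: the category weights, the weighted-sum file score, and the mean-based organization score are all hypothetical and are not the product's scoring formula.

```python
# Hypothetical severity weights per PII category.
CATEGORY_WEIGHTS = {"ssn": 10, "credit_card": 8, "email": 2}


def file_risk(findings: list[tuple[str, float]]) -> float:
    """File-level score: sum of category weight times finding confidence."""
    return sum(CATEGORY_WEIGHTS.get(cat, 1) * conf for cat, conf in findings)


def org_risk(files: dict[str, list[tuple[str, float]]]) -> float:
    """Organization-level score: mean of the file-level scores."""
    scores = [file_risk(findings) for findings in files.values()]
    return sum(scores) / len(scores) if scores else 0.0


files = {
    "users.csv": [("ssn", 0.5), ("email", 1.0)],
    "readme.txt": [],
}
users_score = file_risk(files["users.csv"])
overall = org_risk(files)
```

Other aggregations, such as taking the maximum file score, would make the organization-level number more sensitive to a single high-risk file.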

AI Classification Pipeline

Slim.io uses a multi-provider AI pipeline for classification and false positive reduction:

Provider	Role
Primary	Default classification engine for high-throughput scanning
Fallback	Automatic failover when the primary provider is unavailable
Secondary Fallback	Third-tier option used when both the primary and fallback providers are unavailable

The orchestrator tries providers in the order listed and fails over automatically, so classification stays available during a provider outage.
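The failover behavior can be sketched as an ordered list of providers tried in turn. This is a minimal illustration: the `ProviderUnavailable` exception, the provider callables, and the return shape are assumptions, not the orchestrator's real interface.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a provider that cannot serve the request right now."""


Provider = tuple[str, Callable[[str], str]]


def classify_with_failover(text: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider name, label) from the first success."""
    errors = []
    for name, classify in providers:
        try:
            return name, classify(text)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this tier failed
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Fake providers standing in for real classification services.
def primary(text: str) -> str:
    raise ProviderUnavailable("timeout")


def fallback(text: str) -> str:
    return "credit_card"


used, label = classify_with_failover(
    "4111 1111 1111 1111", [("primary", primary), ("fallback", fallback)]
)
```

Only availability errors trigger failover in this sketch; any other exception propagates, so a genuine bug in one provider is not silently hidden by the next tier.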

Security Architecture

Authenticated AES-256 Encryption — PII tokenization uses authenticated encryption with per-tenant key isolation
HMAC-Signed API Calls — Inter-service communication is signed to prevent tampering
mTLS — Mutual TLS between internal services
Audit Logging — All sensitive operations (scans, policy changes, connector modifications, decryption requests) are recorded with actor identity and timestamps
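HMAC request signing can be illustrated with Python's standard library. The canonical message layout, the timestamp freshness window, and the function names below are assumptions for the sketch; the actual inter-service signing scheme is not documented here.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Sign method, path, timestamp, and a body digest with HMAC-SHA256."""
    message = "\n".join(
        [method, path, str(timestamp), hashlib.sha256(body).hexdigest()]
    ).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(
    secret: bytes,
    method: str,
    path: str,
    body: bytes,
    timestamp: int,
    signature: str,
    now: int,
    max_skew: int = 300,
) -> bool:
    """Reject stale requests, then compare signatures in constant time."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)


secret = b"example-shared-secret"  # hypothetical key, for illustration only
sent_at = 1_700_000_000
sig = sign_request(secret, "POST", "/v1/scans", b'{"bucket": "x"}', sent_at)
ok = verify_request(secret, "POST", "/v1/scans", b'{"bucket": "x"}', sent_at, sig, now=sent_at + 5)
tampered = verify_request(secret, "POST", "/v1/scans", b'{"bucket": "y"}', sent_at, sig, now=sent_at + 5)
```

Including the timestamp in the signed message lets the receiver reject replayed requests, and `hmac.compare_digest` avoids leaking information through comparison timing.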

BYOC Deployment

For organizations requiring data residency within their own infrastructure, Slim.io supports Bring Your Own Cloud (BYOC) deployments where the scanning engine runs inside your VPC. See BYOC for details.

BYOC deployments maintain feature parity with the hosted platform. The scanning agent communicates with the Slim.io control plane for orchestration while all data processing remains within your network boundary.