Architecture

Slim.io is a cloud-native data security posture management (DSPM) platform that discovers, classifies, and protects sensitive data across your organization’s cloud storage. This page describes the high-level product architecture and how data flows through the platform.

Core Flow

The platform operates through four stages:

Connect → Scan → Protect → Govern

1. Connect

Slim.io establishes secure, read-only connections to your cloud storage providers using cross-account IAM roles (AWS), Workload Identity Federation (GCP), or Service Principals (Azure). No data leaves your environment during connection setup.

  • Supports AWS S3, Google Cloud Storage, and Azure Blob Storage
  • Credentials are encrypted at rest and never stored in plaintext
  • Each connector is scoped to specific buckets or prefixes
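As an illustration of bucket and prefix scoping, the following minimal sketch shows one way such a check could look. The `ConnectorScope` class, its fields, and the bucket name are hypothetical and are not part of the Slim.io API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorScope:
    """Hypothetical scope: one bucket plus the key prefixes a connector may read."""

    bucket: str
    prefixes: tuple[str, ...] = ()  # an empty tuple means the whole bucket is in scope

    def allows(self, bucket: str, key: str) -> bool:
        """Return True when the object lies inside this connector's scope."""
        if bucket != self.bucket:
            return False
        if not self.prefixes:
            return True
        return any(key.startswith(prefix) for prefix in self.prefixes)


# Example scope limited to a single prefix (names are made up for illustration).
scope = ConnectorScope(bucket="finance-exports", prefixes=("reports/2024/",))
```

With a check like this, objects outside the configured bucket or prefixes are never listed or read by the connector.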

2. Scan

The scanning engine processes files in parallel through a multi-layered detection pipeline:

  • Pre-screening filters out files unlikely to contain sensitive data
  • Classifiers (regex, dictionary, proximity, checksum, ML) analyze content
  • Confidence scoring assigns certainty levels to each finding
  • LLM Assist uses a language model to disambiguate borderline detections

Scans can run as full scans, incremental scans (only changed files), or event-driven scans triggered by new uploads.
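The layered approach can be sketched for a single data type. The example below is an assumption-laden illustration, not Slim.io's implementation: it combines a regex classifier, a Luhn checksum, and a keyword proximity check for card numbers, and the confidence weights, keyword list, and names (`scan_text`, `Finding`) are invented for the example.

```python
import re
from dataclasses import dataclass

# Regex layer: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
# Proximity layer: hypothetical keywords that raise confidence when nearby.
KEYWORDS = ("card", "visa", "mastercard", "credit")


def luhn_valid(digits: str) -> bool:
    """Checksum layer: standard Luhn check over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


@dataclass
class Finding:
    category: str
    value: str
    confidence: float


def scan_text(text: str) -> list[Finding]:
    """Run the three layers and attach an illustrative confidence score."""
    findings = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        confidence = 0.4  # a regex match alone is weak evidence
        if luhn_valid(digits):
            confidence += 0.4
        window = text[max(0, match.start() - 40): match.end() + 40].lower()
        if any(keyword in window for keyword in KEYWORDS):
            confidence += 0.15
        findings.append(Finding("credit_card", digits, round(confidence, 2)))
    return findings
```

In a pipeline of this shape, findings that land in a middle confidence band would be the natural candidates for a further disambiguation step such as LLM Assist.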

3. Protect

When sensitive data is found, Slim.io applies protective actions defined in your policies:

  • Tokenization — Replaces PII with AES-256 encrypted tokens (reversible)
  • Masking — Permanently redacts sensitive values (irreversible)
  • Quarantine — Moves files to an isolation bucket
  • Alerting — Notifies your team via Slack, email, or webhooks
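Masking is the simplest of these actions to illustrate. The sketch below assumes findings arrive as non-overlapping `(start, end)` character offsets and replaces each span with a fixed marker; the function name and the `[REDACTED]` marker are hypothetical. Reversible tokenization would additionally need an authenticated-encryption library and key management, which are not shown here.

```python
def mask_spans(text: str, spans: list[tuple[int, int]], fill: str = "[REDACTED]") -> str:
    """Irreversibly replace each (start, end) span with a fixed marker.

    Assumes spans do not overlap. Spans are applied from the end of the
    text backwards so that earlier offsets remain valid after each edit.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + fill + text[end:]
    return text
```

For example, `mask_spans("ssn 123-45-6789 ok", [(4, 15)])` yields `"ssn [REDACTED] ok"`, and the original value cannot be recovered from the output.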

4. Govern

Governance policies written in YAML define automated responses to findings. The policy engine evaluates each policy's conditions (PII category, confidence, file location) against a finding and executes the actions that the matching policies declare.

  • Policy-as-Code — Version-controlled YAML definitions
  • Drift Detection — Monitors for policy violations over time
  • Risk Scoring — Aggregates findings into file-level and organization-level risk scores
  • Compliance Mapping — Maps findings to regulatory frameworks (HIPAA, PCI-DSS, SOC 2, GDPR)
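To make the condition-and-action model concrete, here is a minimal sketch of policy evaluation. The policies are written as Python dictionaries standing in for parsed YAML, and every key (`when`, `min_confidence`, `path_prefix`, `actions`) and policy name is a hypothetical schema chosen for illustration, not Slim.io's actual policy format.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    category: str
    confidence: float
    path: str


# Stand-in for parsed YAML policies; the schema is an assumption.
POLICIES = [
    {
        "name": "quarantine-high-confidence-ssn",
        "when": {"category": "ssn", "min_confidence": 0.9, "path_prefix": "s3://hr-exports/"},
        "actions": ["quarantine", "alert"],
    },
    {
        "name": "mask-any-credit-card",
        "when": {"category": "credit_card", "min_confidence": 0.7},
        "actions": ["mask"],
    },
]


def matches(when: dict, finding: Finding) -> bool:
    """A policy matches when every condition it declares holds for the finding."""
    if when.get("category") not in (None, finding.category):
        return False
    if finding.confidence < when.get("min_confidence", 0.0):
        return False
    return finding.path.startswith(when.get("path_prefix", ""))


def actions_for(finding: Finding, policies: list[dict] = POLICIES) -> list[str]:
    """Collect the actions of all matching policies, in order, without duplicates."""
    actions: list[str] = []
    for policy in policies:
        if matches(policy["when"], finding):
            for action in policy["actions"]:
                if action not in actions:
                    actions.append(action)
    return actions
```

Because the policies are plain data, they can live in version control and be reviewed like any other code change, which is the point of the policy-as-code approach.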

AI Classification Pipeline

Slim.io uses a multi-provider AI pipeline for classification and false-positive reduction:

| Provider | Role |
| --- | --- |
| Primary | Default classification engine for high-throughput scanning |
| Fallback | Automatic failover when the primary provider is unavailable |
| Secondary Fallback | Tertiary option ensuring classification availability during outages |

The orchestrator tries providers in order (primary, fallback, secondary fallback) and moves to the next one automatically when a provider is unavailable, so classification keeps working during provider outages.
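The failover behavior can be sketched as an ordered loop over providers. This is an illustrative pattern only: `ProviderUnavailable`, `classify_with_failover`, and the provider callables are hypothetical names, and a real orchestrator would likely add timeouts, retries, and health checks.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a provider callable when it cannot serve the request."""


def classify_with_failover(
    text: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, classify) pair in order and return the first success.

    Returns the name of the provider that answered together with its label.
    Raises ProviderUnavailable only when every provider has failed.
    """
    errors = []
    for name, classify in providers:
        try:
            return name, classify(text)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderUnavailable("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the label makes it possible to record which tier served each classification.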

Security Architecture

  • Authenticated AES-256 Encryption — PII tokenization uses authenticated encryption with per-tenant key isolation
  • HMAC-Signed API Calls — Inter-service communication is signed to prevent tampering
  • mTLS — Mutual TLS between internal services
  • Audit Logging — All sensitive operations (scans, policy changes, connector modifications, decryption requests) are recorded with actor identity and timestamps

BYOC Deployment

For organizations requiring data residency within their own infrastructure, Slim.io supports Bring Your Own Cloud (BYOC) deployments where the scanning engine runs inside your VPC. See BYOC for details.

BYOC deployments maintain feature parity with the hosted platform. The scanning agent communicates with the Slim.io control plane for orchestration while all data processing remains within your network boundary.
