BYOC (Bring Your Own Cloud)
BYOC deployments allow you to run Slim.io’s scanning engine inside your own cloud infrastructure. Data never leaves your VPC — only scan metadata and findings are communicated back to the Slim.io control plane.
Choosing between BYOC and In-Customer-Cloud Agentless — both run the scanner in your cloud. BYOC is fully customer-managed: you operate the deployment lifecycle (updates, scaling, monitoring). In-Customer-Cloud Agentless is slim.io-managed inside your cloud: you grant access via a Terraform module, and slim.io handles the scanner lifecycle. Pick BYOC for air-gapped or fully customer-operated environments; pick In-Customer-Cloud Agentless when you need data residency without operating the scanner yourself.
Why BYOC
BYOC addresses requirements that the SaaS model cannot satisfy:
- Data residency — Regulated industries that prohibit data transfer outside the organization’s boundary
- Network restrictions — Environments where storage buckets are not accessible from the public internet
- Compliance mandates — Frameworks that require all data processing to occur within the organization’s infrastructure
- Latency optimization — Scanning large volumes of data co-located with the compute reduces transfer time
Architecture
Customer VPC                       Slim.io Control Plane
┌─────────────────────┐            ┌──────────────────────┐
│ Storage Buckets     │            │ Coordinator API      │
│        ↓            │            │        ↓             │
│ Scanner Workers     │ ──────→    │ Findings Database    │
│ (containers)        │ metadata   │        ↓             │
│        ↓            │ only       │ Dashboard / API      │
│ Local Findings DB   │            │                      │
│ (optional)          │            │                      │
└─────────────────────┘            └──────────────────────┘
What Stays in Your VPC
- All file content — files are read and processed locally
- Scanning containers — workers run on your compute
- LLM inference (if client-hosted LLM Assist is enabled)
- Optionally, a local findings database for air-gapped deployments
What Leaves Your VPC
- Scan metadata — File paths, finding categories, confidence scores, risk scores
- Worker heartbeats — Progress reports from workers to the coordinator
- No file content — Original data values are never transmitted unless tokenization is configured to store tokens in Slim.io
In default BYOC mode, finding metadata (including PII category and location but NOT the PII value itself) is sent to the Slim.io control plane for dashboard display. For fully air-gapped deployments, see the “Local-Only Mode” section below.
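To make the metadata-only boundary concrete, here is a minimal sketch of how a detection could be reduced to its outbound metadata. The class, the field names, and the allow-list are hypothetical illustrations; the actual wire format used by the scanner is not documented in this section.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LocalDetection:
    """What a scanner worker sees inside the VPC (hypothetical shape)."""
    file_path: str
    category: str        # PII category, e.g. "EMAIL"
    confidence: float    # detector confidence
    risk_score: int      # assumed to be an integer score
    line: int            # location of the finding within the file
    matched_value: str   # the sensitive value itself; must stay local


# Allow-list of fields permitted to cross the VPC boundary in this sketch.
OUTBOUND_FIELDS = ("file_path", "category", "confidence", "risk_score", "line")


def to_outbound_metadata(detection: LocalDetection) -> dict:
    """Keep only allow-listed fields, so the matched value is never serialized."""
    record = asdict(detection)
    return {name: record[name] for name in OUTBOUND_FIELDS}


detection = LocalDetection(
    file_path="s3://example-bucket/customers.csv",
    category="EMAIL",
    confidence=0.97,
    risk_score=80,
    line=12,
    matched_value="jane@example.com",
)
print(json.dumps(to_outbound_metadata(detection), sort_keys=True))
```

An allow-list is used here rather than deleting the sensitive field, so a field added to the local record later does not leave the VPC by accident.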
Deployment Options
Docker Compose
For development and small-scale deployments. Choose the scanner profile that matches your data sources:
services:
  slim-scanner:
    image: slimio/scanner:1.0.0-full  # or 1.0.0-cloud-storage, 1.0.0-database, 1.0.0-saas
    env_file: .env
    volumes:
      - scanner-wal:/var/lib/slim-wal
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '2'
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import scanner.config; print('ok')"]
      interval: 30s
      timeout: 5s
      retries: 3
volumes:
  scanner-wal:
Create a .env file alongside the compose file (add to .gitignore):
SLIM_REGISTRATION_TOKEN=slim_reg_xxxxxxxxxxxx
SLIM_CONTROL_PLANE_URL=https://api.slim.io
SLIM_SCANNER_GROUPS=default
SLIM_SCANNER_NAME=my-scanner
SLIM_MAX_CONCURRENT_JOBS=4
Always use a pinned version tag (e.g., slimio/scanner:1.0.0-full). We do not publish a :latest tag, which prevents accidental upgrades that could introduce breaking changes.
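If you want to enforce the pinned-tag rule in CI, a small check along these lines could reject unpinned image references before they reach a compose file or manifest. This is a sketch: the profile names come from the tags listed above, but the exact tag grammar (three numeric version parts followed by a profile suffix) is an assumption.

```python
import re

# Profiles taken from the example tags in this page.
PROFILES = ("full", "cloud-storage", "database", "saas")

# Assumed tag grammar: slimio/scanner:<major>.<minor>.<patch>-<profile>
PINNED_IMAGE = re.compile(
    r"^slimio/scanner:\d+\.\d+\.\d+-(?:" + "|".join(PROFILES) + r")$"
)


def is_pinned(image: str) -> bool:
    """Return True only for a fully pinned scanner image reference."""
    return PINNED_IMAGE.match(image) is not None


for image in ("slimio/scanner:1.0.0-full", "slimio/scanner:latest", "slimio/scanner"):
    print(image, "->", "ok" if is_pinned(image) else "not pinned")
```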
Kubernetes
For production deployments on EKS, GKE, or AKS. Apply this single manifest:
apiVersion: v1
kind: Secret
metadata:
  name: slim-scanner-auth
type: Opaque
stringData:
  SLIM_REGISTRATION_TOKEN: "slim_reg_xxxxxxxxxxxx"
  SLIM_CONTROL_PLANE_URL: "https://api.slim.io"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: slim-scanner-wal
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: slim-scanner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: slim-scanner
  template:
    metadata:
      labels:
        app: slim-scanner
    spec:
      containers:
        - name: scanner
          image: slimio/scanner:1.0.0-full
          envFrom:
            - secretRef:
                name: slim-scanner-auth
          env:
            - name: SLIM_SCANNER_GROUPS
              value: "default"
            - name: SLIM_MAX_CONCURRENT_JOBS
              value: "4"
          resources:
            limits:
              memory: "4Gi"
              cpu: "2"
          volumeMounts:
            - name: wal
              mountPath: /var/lib/slim-wal
      volumes:
        - name: wal
          persistentVolumeClaim:
            claimName: slim-scanner-wal
kubectl apply -f scanner.yaml
AWS Lambda
For serverless scanning in AWS environments:
- Deploy the Slim.io scanner Lambda function from the provided SAM template.
- Configure the function with your connector credentials and Slim.io API key.
- Set up S3 event notifications to trigger the Lambda on file uploads.
- The Lambda processes each file and reports findings to the Slim.io control plane.
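The event-driven flow above can be sketched as a handler that pulls bucket and key pairs out of an S3 event notification. The function shipped in the SAM template has its own handler; this sketch only illustrates the trigger shape. `scan_object` is a hypothetical placeholder, not a Slim.io API.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event: dict) -> list:
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if not s3:
            continue  # skip records that are not S3 notifications
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def scan_object(bucket: str, key: str) -> int:
    """Hypothetical placeholder for the scanner call; returns a finding count."""
    return 0


def handler(event, context):
    """Lambda entry point: scan every object referenced by the event."""
    scanned = 0
    for bucket, key in objects_from_s3_event(event):
        scan_object(bucket, key)
        scanned += 1
    return {"scanned": scanned}
```

Decoding the key matters in practice: an object named `q1 summary(1).csv` is delivered as `q1+summary%281%29.csv`, and reading the undecoded key would fail.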
Google Cloud Run
For serverless scanning in GCP environments:
gcloud run deploy slim-scanner \
  --image slimio/scanner:1.0.0-full \
  --set-env-vars SLIM_REGISTRATION_TOKEN=slim_reg_xxxx,SLIM_CONTROL_PLANE_URL=https://api.slim.io \
  --memory 4Gi \
  --cpu 2 \
  --region us-central1 \
  --min-instances 1
Azure Container Apps
For serverless scanning in Azure environments:
az containerapp create \
  --name slim-scanner \
  --resource-group slim-io-rg \
  --image slimio/scanner:1.0.0-full \
  --env-vars SLIM_REGISTRATION_TOKEN=slim_reg_xxxx SLIM_CONTROL_PLANE_URL=https://api.slim.io \
  --cpu 2 --memory 4Gi
Agent Configuration
The BYOC scanner agent supports the following environment variables:
| Variable | Required | Description |
|---|---|---|
| SLIM_REGISTRATION_TOKEN | Yes | One-time registration token from the Add Scanner flow |
| SLIM_CONTROL_PLANE_URL | Yes | Control plane endpoint (https://api.slim.io) |
| SLIM_SCANNER_GROUPS | No | Scanner group assignment (default: default) |
| SLIM_SCANNER_NAME | No | Human-readable name for this scanner |
| SLIM_MAX_CONCURRENT_JOBS | No | Max parallel scan jobs (default: 4) |
| SLIM_LOG_LEVEL | No | Logging verbosity (debug, info, warn, error) |
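A pre-flight check for these variables might look like the sketch below, which could run in a deployment pipeline before the container starts. The variable names and the stated defaults come from the table above. The `ScannerConfig` class, the `info` default for `SLIM_LOG_LEVEL`, and the validation rules are assumptions, not the scanner's actual behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SLIM_REGISTRATION_TOKEN", "SLIM_CONTROL_PLANE_URL")
VALID_LOG_LEVELS = ("debug", "info", "warn", "error")


@dataclass(frozen=True)
class ScannerConfig:
    registration_token: str
    control_plane_url: str
    scanner_groups: str
    scanner_name: str
    max_concurrent_jobs: int
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> ScannerConfig:
    """Read the documented variables, applying defaults and basic validation."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))

    jobs = int(env.get("SLIM_MAX_CONCURRENT_JOBS", "4"))
    if jobs < 1:
        raise ValueError("SLIM_MAX_CONCURRENT_JOBS must be at least 1")

    # Assumption: the table lists valid levels but no default; "info" is a guess.
    level = env.get("SLIM_LOG_LEVEL", "info").lower()
    if level not in VALID_LOG_LEVELS:
        raise ValueError("SLIM_LOG_LEVEL must be one of " + ", ".join(VALID_LOG_LEVELS))

    return ScannerConfig(
        registration_token=env["SLIM_REGISTRATION_TOKEN"],
        control_plane_url=env["SLIM_CONTROL_PLANE_URL"],
        scanner_groups=env.get("SLIM_SCANNER_GROUPS", "default"),
        scanner_name=env.get("SLIM_SCANNER_NAME", ""),
        max_concurrent_jobs=jobs,
        log_level=level,
    )
```

Calling `load_config()` with no argument reads the process environment; passing a plain dict makes the check easy to exercise in tests.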
Verifying the Agent
After deployment, verify the agent is communicating with the control plane:
- Navigate to the connector in the Customer Dashboard.
- Check for a Heartbeat indicator showing the agent’s last check-in time.
- Trigger a test scan and verify findings appear in the Data Catalog.
How Scan Work Reaches Your Scanner
Scanner v0.4.0 and later receive scan tasks over a long-lived connection — new scans dispatch in seconds rather than waiting for a poll cycle. If the connection drops (transient network blip, slim.io platform maintenance), the scanner reconnects automatically with exponential backoff and resumes work. Duplicate-task delivery during a reconnect window is filtered on the scanner side, so the same task never executes twice.
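One common way to filter duplicate deliveries is to remember recently seen task IDs and drop repeats. The sketch below illustrates that idea with a bounded, insertion-ordered set. It is not the scanner's actual implementation; the class name, the capacity, and the assumption that each task carries a stable ID are all hypothetical.

```python
from collections import OrderedDict


class SeenTasks:
    """Bounded memory of recently seen task IDs (oldest entries are evicted)."""

    def __init__(self, capacity: int = 1024):
        self._capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, task_id: str) -> bool:
        """Return True the first time a task ID is seen, False on a repeat."""
        if task_id in self._seen:
            return False
        self._seen[task_id] = None
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


seen = SeenTasks()
# "task-1" is delivered twice, as could happen across a reconnect.
delivered = ["task-1", "task-2", "task-1"]
executed = [task_id for task_id in delivered if seen.first_time(task_id)]
print(executed)
```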
During slim.io platform updates, the scanner receives a clean reconnect signal with random jitter (1–5 seconds) so a fleet of scanners doesn’t reconnect at the same instant. Active scans continue uninterrupted; the scanner reconnects on the new platform revision and resumes.
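The two reconnect timings described above can be sketched as follows. The 1–5 second jitter window comes from the text. The backoff base of 1 second, the doubling factor, and the 60 second cap are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before reconnect attempt `attempt` (0-indexed): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def maintenance_jitter(rng: random.Random, low: float = 1.0, high: float = 5.0) -> float:
    """Random wait before reconnecting after a platform-update signal."""
    return rng.uniform(low, high)


# Unexpected drop: waits grow with each failed attempt until they hit the cap.
print([backoff_delay(attempt) for attempt in range(8)])

# Platform update: each scanner in a fleet draws its own wait, spreading reconnects out.
rng = random.Random()
print([round(maintenance_jitter(rng), 2) for _ in range(5)])
```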
v0.3.0 scanners continue to work via the legacy poll-based path with no behavior change. v0.4.0 is the recommended upgrade once available — check the Scanner Releases page for image availability.
In-Customer-Cloud Agentless
For environments that need scanner-in-customer-cloud without the operational burden of BYOC, slim.io offers In-Customer-Cloud Agentless. The scanner runs in your AWS, GCP, or Azure account, but the lifecycle (deployment, updates, scaling, monitoring) is managed by slim.io.
How It Works
- Apply slim.io’s Terraform module — provisioned in your account, the module creates the scanner role/identity with read-only permissions on your storage and a registration callback to the slim.io control plane.
- Verify the connection — slim.io tests the cross-account assume-role / Workload Identity Federation / managed-identity flow before activating the scanner.
- Scan — bytes are read by the scanner inside your cloud; only finding metadata is sent to the slim.io control plane (same metadata-only contract as BYOC).
- Air-gapped fallback — if outbound automation from your account is restricted, slim.io supports a manual completion flow. Apply the Terraform module by hand, share the generated identifiers with your slim.io customer success contact, and slim.io completes registration on your behalf.
Comparison with BYOC
| | BYOC | In-Customer-Cloud Agentless |
|---|---|---|
| Scanner runs in | Your cloud | Your cloud |
| Bytes read by | Your scanner | A scanner inside your cloud, slim.io-managed |
| Scanner lifecycle | You operate (updates, scaling) | slim.io operates |
| Onboarding | Deploy from Docker Hub image manually | terraform apply from a slim.io module |
| Air-gapped | Supported via Local-Only Mode | Supported via paste-form fallback |
| When to choose | Custom orchestration, fully customer-operated, regulated environments where slim.io cannot have any operational role | Data residency without the operational burden of operating the scanner yourself |
Both options share the same metadata-only contract: file content never leaves your tenancy. See Hosting Topologies for the platform-level decision matrix.
Local-Only Mode
For fully air-gapped deployments where no data can leave the customer’s network:
- Deploy the scanner agent with SLIM_MODE=local.
- Findings are written to a local PostgreSQL or SQLite database.
- Dashboards and APIs are not available (no control plane connection).
- Export findings as JSON or CSV for offline analysis.
Local-only mode is available as a custom deployment option for Enterprise customers. Contact your account manager for setup assistance.
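For offline analysis, an export from a local SQLite findings database could look like the sketch below. The `findings` table and its columns are hypothetical; the actual local schema is not documented here, so adapt the query to the schema your deployment uses.

```python
import csv
import io
import json
import sqlite3

# Hypothetical column set; the real local schema may differ.
COLUMNS = ("file_path", "category", "confidence", "risk_score")


def fetch_findings(conn: sqlite3.Connection) -> list:
    """Read all findings as a list of dicts, ordered by file path."""
    cursor = conn.execute(
        "SELECT file_path, category, confidence, risk_score "
        "FROM findings ORDER BY file_path"
    )
    return [dict(zip(COLUMNS, row)) for row in cursor]


def to_json(rows: list) -> str:
    return json.dumps(rows, indent=2)


def to_csv(rows: list) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Demo against an in-memory database standing in for the local findings store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings (file_path TEXT, category TEXT, confidence REAL, risk_score INTEGER)"
)
conn.execute(
    "INSERT INTO findings VALUES (?, ?, ?, ?)",
    ("s3://example-bucket/a.csv", "EMAIL", 0.97, 80),
)
rows = fetch_findings(conn)
print(to_json(rows))
print(to_csv(rows))
```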
Updating the Scanner
Scanner images are published to Docker Hub at slimio/scanner. To update:
- Check the Scanner Fleet page for an Update Available badge showing the new version.
- Pull the new image: docker pull slimio/scanner:1.1.0-full (replace with your profile and the target version).
- Update your docker-compose.yml or Kubernetes manifest with the new image tag.
- Restart the scanner. It finishes any active jobs before shutting down, then starts the new version.
- The scanner re-registers with the control plane automatically using its existing identity.
Scanner updates are non-disruptive. The scanner gracefully completes in-progress work before restarting. No scan data is lost during an update.
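The drain-then-exit behavior described above follows a common pattern: a stop signal sets a flag, the job in progress runs to completion, and no further jobs are started. The sketch below shows that pattern in general terms. It is not the scanner's actual code; `GracefulWorker` and `install_handlers` are hypothetical names.

```python
import signal
import threading


class GracefulWorker:
    """Runs jobs one at a time and stops between jobs once a stop is requested."""

    def __init__(self):
        self._stop = threading.Event()
        self.completed = []

    def request_stop(self, signum=None, frame=None):
        """Signal-handler compatible: only sets a flag, never interrupts a job."""
        self._stop.set()

    def run(self, jobs):
        for job in jobs:
            if self._stop.is_set():
                break  # do not start new work after a stop request
            # A job that has started always runs to completion.
            self.completed.append(job())
        return self.completed


def install_handlers(worker: GracefulWorker) -> None:
    """Route SIGTERM and SIGINT (sent on container stop) to the worker."""
    signal.signal(signal.SIGTERM, worker.request_stop)
    signal.signal(signal.SIGINT, worker.request_stop)
```

For this pattern to work in a container, the orchestrator's stop timeout (for example `stop_grace_period` in Compose or `terminationGracePeriodSeconds` in Kubernetes) needs to be longer than the longest expected job, otherwise the process is killed before it finishes draining.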