Detection-as-Code (YAML)

Detection-as-Code allows you to define classifiers and detection rules as YAML files stored in a Git repository. This approach brings software engineering practices — version control, code review, CI/CD — to your data security detection rules.

Concept

Instead of configuring classifiers through the web UI, you define them as YAML files in a Git repository. Changes to these files are synced to Slim.io automatically, providing:

Version History — Full audit trail of every classifier change
Code Review — Detection rule changes go through pull request review
Rollback — Revert to any previous classifier configuration instantly
Environment Promotion — Test rules in dev before deploying to production
Collaboration — Security engineers can propose rules and review each other’s work

File Structure

Organize your classifiers in a directory structure within your repository:


slim-io-config/
  classifiers/
    personal-id/
      us-ssn.yaml
      passport.yaml
      drivers-license.yaml
    financial/
      credit-card.yaml
      bank-account.yaml
    custom/
      internal-employee-id.yaml
      project-codename.yaml
  suppressions/
    test-data.yaml
    known-false-positives.yaml
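
As an illustration of how a script might walk this layout, the sketch below groups classifier files by their parent directory. It assumes the `classifiers/<group>/<name>.yaml` layout shown above; the function name `group_classifier_files` is hypothetical and not part of any Slim.io tooling.

```python
from collections import defaultdict
from pathlib import Path


def group_classifier_files(config_root):
    """Map each directory under classifiers/ to its sorted YAML file names."""
    groups = defaultdict(list)
    classifiers_dir = Path(config_root) / "classifiers"
    # rglob walks nested directories; sorting keeps the output deterministic.
    for path in sorted(classifiers_dir.rglob("*.yaml")):
        groups[path.parent.name].append(path.name)
    return dict(groups)
```

Calling `group_classifier_files("slim-io-config")` on the tree above would return one entry per group directory, such as `personal-id`, `financial`, and `custom`.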

Classifier YAML Schema

Each classifier file follows this schema:


apiVersion: slim.io/v1
kind: Classifier
metadata:
  name: us-ssn-proximity
  description: "US Social Security Number with contextual validation"
  category: SSN
  tags:
    - personal-id
    - compliance-required
spec:
  type: proximity
  pattern: '\b\d{3}-\d{2}-\d{4}\b'
  keywords:
    - "social security"
    - "ssn"
    - "taxpayer id"
  window: 100
  confidence: high  # high | medium | low — relative to your tuning
  enabled: true
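
One way to read the `proximity` type is that a regex match only counts when one of the keywords appears near it. The sketch below illustrates that idea; it is not Slim.io's actual matching logic. It assumes `window` is measured in characters on either side of the match and that keywords are compared case-insensitively. The function name `proximity_matches` is hypothetical.

```python
import re


def proximity_matches(text, pattern, keywords, window):
    """Return regex matches that have a keyword within `window` characters."""
    lowered_keywords = [k.lower() for k in keywords]
    hits = []
    for match in re.finditer(pattern, text):
        # ASSUMPTION: the window extends `window` characters before and after the match.
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        context = text[start:end].lower()
        if any(k in context for k in lowered_keywords):
            hits.append(match.group())
    return hits


SSN_PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"
SSN_KEYWORDS = ["social security", "ssn", "taxpayer id"]

with_context = "Employee SSN: 123-45-6789"
without_context = "Order reference 123-45-6789 shipped"
```

Under these assumptions, `with_context` yields a match because "SSN" sits next to the number, while `without_context` yields none even though the regex alone would match.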

Required Fields

| Field | Description |
|---|---|
| `apiVersion` | Always `slim.io/v1` |
| `kind` | `Classifier` or `Suppression` |
| `metadata.name` | Unique identifier (kebab-case) |
| `metadata.category` | PII category this classifier detects |
| `spec.type` | Classifier type (`regex`, `dictionary`, `proximity`, `checksum`, `ml`) |
| `spec.confidence` | Base confidence tier (`high` \| `medium` \| `low`) |

Git Sync Configuration

Enable Detection-as-Code in the Customer Dashboard under Settings > Integrations:

1. Connect your Git repository (GitHub, GitLab, or Bitbucket).
2. Specify the directory path containing your classifier YAML files.
3. Select the branch to sync from (typically `main` or `production`).
4. Configure sync frequency (on push via webhook, or polling at a configurable interval).

When Git sync is enabled, the synced classifiers are merged with any classifiers configured through the web UI. If a classifier with the same `metadata.name` exists in both, the Git version takes precedence.
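
The precedence rule can be pictured as a dictionary merge keyed by classifier name. The sketch below models it on already-parsed classifier documents; the function name `merge_classifiers` and the `ui-only-rule` entry are hypothetical, and the real merge may consider more than the name.

```python
def merge_classifiers(ui_classifiers, git_classifiers):
    """Merge two classifier lists by metadata.name; Git entries win on conflict."""
    merged = {c["metadata"]["name"]: c for c in ui_classifiers}
    # update() overwrites existing keys, so Git definitions replace UI ones.
    merged.update({c["metadata"]["name"]: c for c in git_classifiers})
    return merged


ui = [
    {"metadata": {"name": "us-ssn-proximity"}, "spec": {"confidence": "low"}},
    {"metadata": {"name": "ui-only-rule"}, "spec": {"confidence": "medium"}},
]
git = [
    {"metadata": {"name": "us-ssn-proximity"}, "spec": {"confidence": "high"}},
]
merged = merge_classifiers(ui, git)
```

Here `merged` keeps `ui-only-rule` untouched, while `us-ssn-proximity` carries the Git definition with `confidence: high`.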

Validation

Before syncing, Slim.io validates all YAML files:

Schema Validation — Ensures all required fields are present and correctly typed
Regex Compilation — Verifies that regex patterns are valid and compile without errors
Duplicate Detection — Flags classifiers with duplicate names
Conflict Detection — Identifies overlapping patterns that may cause redundant findings
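
The first three checks are easy to approximate locally before pushing. The sketch below runs on classifier documents that have already been parsed into dictionaries (for example with a YAML library of your choice). It is a rough stand-in, not Slim.io's validator: it uses Python's `re` engine, which may accept or reject patterns differently, and it does not attempt conflict detection. The names `lookup` and `validate_classifiers` are hypothetical.

```python
import re

# Required fields from the table above, written as key paths.
REQUIRED_FIELDS = [
    ("apiVersion",),
    ("kind",),
    ("metadata", "name"),
    ("metadata", "category"),
    ("spec", "type"),
    ("spec", "confidence"),
]


def lookup(doc, path):
    """Follow a key path through nested dicts; return None if any key is missing."""
    for key in path:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def validate_classifiers(docs):
    """Return a list of human-readable error strings for parsed classifier documents."""
    errors = []
    seen_names = set()
    for index, doc in enumerate(docs):
        name = lookup(doc, ("metadata", "name"))
        label = name or f"document #{index}"
        # Schema check: every required field must be present.
        for path in REQUIRED_FIELDS:
            if lookup(doc, path) is None:
                errors.append(f"{label}: missing required field {'.'.join(path)}")
        # Regex check: the pattern must compile (with Python's engine, as an approximation).
        pattern = lookup(doc, ("spec", "pattern"))
        if isinstance(pattern, str):
            try:
                re.compile(pattern)
            except re.error as exc:
                errors.append(f"{label}: invalid regex ({exc})")
        # Duplicate check: names must be unique across all documents.
        if name is not None:
            if name in seen_names:
                errors.append(f"{label}: duplicate classifier name")
            seen_names.add(name)
    return errors
```

An empty list means the documents passed these local checks; the server-side validation remains the authority.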

CI/CD Validation

Add Slim.io validation to your CI pipeline:


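# Validate all classifier files before merge.
# --fail-with-body (curl 7.76+) makes curl exit non-zero on an HTTP error status,
# so an HTTP error fails the CI step instead of passing silently.
curl --fail-with-body -X POST https://api.slim.io/api/v1/classifiers/validate \
  -H "Authorization: Bearer $SLIM_IO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"repository": "org/repo", "path": "slim-io-config/classifiers/"}'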

The validation endpoint returns detailed errors and warnings without deploying any changes.
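
If your pipeline runs Python rather than shell, the same request can be built with the standard library. The sketch below mirrors the curl command above. The response format is not documented here, so `has_blocking_errors` assumes the JSON body contains an `errors` list; treat that field name, and the function names, as hypothetical and adjust them to the actual response.

```python
import json
import os
import urllib.request

VALIDATE_URL = "https://api.slim.io/api/v1/classifiers/validate"


def build_validate_request(token, repository, path):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"repository": repository, "path": path}).encode("utf-8")
    return urllib.request.Request(
        VALIDATE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def has_blocking_errors(report):
    """ASSUMPTION: the response JSON has an 'errors' list; the real shape may differ."""
    return bool(report.get("errors"))


def main():
    """Send the request and exit non-zero if the report contains errors."""
    request = build_validate_request(
        os.environ["SLIM_IO_TOKEN"], "org/repo", "slim-io-config/classifiers/"
    )
    # urlopen raises HTTPError on 4xx/5xx, which also fails the CI step.
    with urllib.request.urlopen(request) as response:
        report = json.load(response)
    raise SystemExit(1 if has_blocking_errors(report) else 0)
```

A CI job would call `main()` with `SLIM_IO_TOKEN` set in the environment.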

Diffing and Audit

Every sync operation produces a diff that shows:

New classifiers added
Existing classifiers modified (field-level diff)
Classifiers removed
Net impact on detection coverage

This diff is visible in the Customer Dashboard under Classifiers > Sync History and is also available via the API.
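
To make the first three items concrete, the sketch below computes added, removed, and modified classifiers between two snapshots, with a field-level diff for the modified ones. It operates on `{name: classifier}` dictionaries and is only an illustration; the function names are hypothetical, and it does not estimate the net impact on detection coverage.

```python
def diff_fields(old, new, prefix=""):
    """Return {dotted.field: (old_value, new_value)} for every differing leaf."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        old_value, new_value = old.get(key), new.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Recurse into nested mappings such as metadata and spec.
            changes.update(diff_fields(old_value, new_value, prefix=f"{path}."))
        elif old_value != new_value:
            changes[path] = (old_value, new_value)
    return changes


def diff_classifiers(before, after):
    """Compare two {name: classifier} mappings and report adds, removals, and field changes."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = {}
    for name in sorted(set(before) & set(after)):
        changes = diff_fields(before[name], after[name])
        if changes:
            modified[name] = changes
    return {"added": added, "removed": removed, "modified": modified}
```

For example, changing `spec.window` from 100 to 50 on one classifier would appear under `modified` as `{"spec.window": (100, 50)}`.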

Best Practices

Use descriptive names — us-ssn-proximity is better than classifier-1
Set appropriate confidence — Higher confidence for validated patterns (checksum, proximity), lower for raw regex
Tag classifiers — Use tags for filtering and organization (compliance-required, high-priority)
Write suppressions — Proactively suppress known false positive patterns
Test before merge — Use the validation API in CI and test against sample data
Review changes — Require at least one approving review before merging classifier changes
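
For the "test against sample data" practice, a small table of positive and negative samples checked in alongside the classifier can catch regressions early. The sketch below uses the SSN pattern from the schema example; the sample strings and the function name `failing_samples` are hypothetical, and Python's `re` engine stands in for whatever engine evaluates the pattern in production.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical sample data: (text, whether the pattern is expected to match).
SAMPLES = [
    ("SSN on file: 123-45-6789", True),
    ("Phone: 555-867-5309", False),
    ("Invoice 2024-01-15 paid", False),
]


def failing_samples(pattern, samples):
    """Return the sample texts whose match result differs from the expectation."""
    return [
        text
        for text, should_match in samples
        if bool(pattern.search(text)) != should_match
    ]
```

A CI step can fail the build whenever `failing_samples` returns a non-empty list.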