Run Your First Scan
This guide walks you through triggering a scan on an active connector and reviewing the results.
Time required: 5–15 minutes (depending on data volume)
Prerequisites:
- At least one active connector (see connector setup guides)
Step 1: Select a Connector
- Navigate to Connectors in the Customer Dashboard sidebar.
- Locate the connector you want to scan — it should show Active status.
- Click the Scan button on the connector row, or navigate to Scans in the sidebar.
Step 2: Configure the Scan
In the scan configuration dialog:
- Scan Type: Select Full Scan for the initial run.
  - Full Scan processes all files in the connector scope
  - Incremental Scan (for subsequent runs) only processes new or modified files
- File Filters (optional): Restrict to specific file types (e.g., .csv,.json).
- Prefix Filters (optional): Limit to specific paths (e.g., data/exports/).
- LLM Assist (optional): Enable for AI-powered false positive reduction on borderline findings.
- Click Start Scan.
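To make the filter options concrete, here is a minimal Python sketch of how file and prefix filters might narrow the scan scope. `ScanConfig` and `in_scope` are hypothetical names used for illustration only, not part of the product's API. The sketch assumes that a file must pass both filters and that an empty filter means no restriction.

```python
from dataclasses import dataclass, field


@dataclass
class ScanConfig:
    """Hypothetical mirror of the dialog fields (not the product's API)."""
    scan_type: str = "full"  # "full" or "incremental"
    file_filters: list[str] = field(default_factory=list)    # e.g. [".csv", ".json"]
    prefix_filters: list[str] = field(default_factory=list)  # e.g. ["data/exports/"]
    llm_assist: bool = False


def in_scope(key: str, config: ScanConfig) -> bool:
    """Return True if an object key passes both optional filters.

    Assumption: an empty filter list means "no restriction".
    """
    ext_ok = not config.file_filters or key.lower().endswith(
        tuple(f.lower() for f in config.file_filters)
    )
    prefix_ok = not config.prefix_filters or key.startswith(
        tuple(config.prefix_filters)
    )
    return ext_ok and prefix_ok


config = ScanConfig(file_filters=[".csv", ".json"], prefix_filters=["data/exports/"])
keys = ["data/exports/users.csv", "data/exports/logo.png", "logs/app.json"]
selected = [k for k in keys if in_scope(k, config)]
print(selected)  # ['data/exports/users.csv']
```

Only the first key matches both the file type and the path prefix, so narrowing either filter reduces the number of files the scan has to process.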
Step 3: Monitor Progress
The Scan Monitor displays real-time progress:
- Files Total — Total files discovered in scope
- Files Processed — Files completed by workers
- Findings — Sensitive data matches detected so far
- Workers — Number of active parallel workers
- Elapsed Time — How long the scan has been running
- Estimated Remaining — Projected time to completion
Small scans (under 1,000 files) typically complete in under a minute. Large scans (100,000+ files) may take 10–30 minutes, depending on file sizes and your tier’s worker count.
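As a rough illustration of how a figure like Estimated Remaining can be derived from the other counters, the sketch below uses simple linear extrapolation. This is an assumption: the guide does not say how the Scan Monitor computes its estimate, and `estimate_remaining_seconds` is a hypothetical helper.

```python
from typing import Optional


def estimate_remaining_seconds(
    files_total: int, files_processed: int, elapsed_seconds: float
) -> Optional[float]:
    """Linearly extrapolate the time left from progress so far.

    Returns None when no file has finished yet, because no rate is known.
    """
    if files_processed <= 0 or elapsed_seconds <= 0:
        return None
    remaining_files = max(files_total - files_processed, 0)
    # Seconds per processed file so far, applied to the files still pending.
    return remaining_files * elapsed_seconds / files_processed


# 25,000 of 100,000 files done after 5 minutes
print(estimate_remaining_seconds(100_000, 25_000, 300.0))  # 900.0
```

An estimate of this kind drifts when file sizes vary widely, which is one reason the projected time can jump during a scan.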
Step 4: Review Findings
When the scan completes (status changes to Completed):
- Navigate to Data Catalog to see the full inventory of files with findings.
- Sort by Risk Score to prioritize the highest-risk files.
- Click any file to view its findings:
- PII Category — What type of sensitive data was detected (SSN, email, etc.)
- Confidence — How certain the detection is (0.0 to 1.0)
- Classifier — Which detection method identified it (regex, ML, proximity)
- Location — Where in the file the data was found
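The four fields above map naturally onto a small record type. The following sketch shows one way to triage a file's findings by confidence. The `Finding` class, the sample values, and the 0.8 threshold are all illustrative assumptions, not the product's schema or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record with the four fields shown in the file view."""
    pii_category: str   # e.g. "SSN", "email"
    confidence: float   # 0.0 to 1.0
    classifier: str     # e.g. "regex", "ML", "proximity"
    location: str       # where in the file the match was found


def high_confidence(findings: list[Finding], threshold: float = 0.8) -> list[Finding]:
    """Keep findings at or above the threshold, most certain first."""
    kept = [f for f in findings if f.confidence >= threshold]
    return sorted(kept, key=lambda f: f.confidence, reverse=True)


findings = [
    Finding("email", 0.91, "regex", "row 12, column 3"),
    Finding("phone", 0.55, "proximity", "row 40, column 1"),
    Finding("SSN", 0.97, "regex", "row 7, column 5"),
]
top = high_confidence(findings)
print([f.pii_category for f in top])  # ['SSN', 'email']
```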
Step 5: Take Action
Based on your findings, you can:
- Investigate — Navigate to the Investigation page for deeper analysis by category or severity
- Create Policies — Set up governance rules to automatically handle future findings
- Schedule Scans — Configure recurring scans to maintain ongoing visibility
- Export Results — Download findings as CSV or JSON for external reporting
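Once findings are exported, a few lines of Python are enough for a quick summary. The sketch below counts findings per category in a CSV export. The column names (`file_path`, `pii_category`, `confidence`) and the sample rows are assumptions for illustration; check the header row of your own export before adapting it.

```python
import csv
import io
from collections import Counter

# Stand-in for an exported file; the column names are assumed, not documented.
SAMPLE_EXPORT = """file_path,pii_category,confidence
data/exports/users.csv,SSN,0.97
data/exports/users.csv,email,0.91
data/exports/orders.json,email,0.62
"""


def count_by_category(csv_text: str) -> Counter:
    """Count exported findings per PII category."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["pii_category"] for row in reader)


counts = count_by_category(SAMPLE_EXPORT)
print(counts.most_common())  # [('email', 2), ('SSN', 1)]
```

To read a real export, replace `io.StringIO(csv_text)` with a file opened via `open(path, newline="")`.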
What Happened During the Scan
The scan executed the following pipeline for each file:
- Pre-Screen — A probabilistic check determined whether the file was likely to contain sensitive data
- Download — File content was streamed from cloud storage
- Parse — File format was detected and content extracted
- Classify — All active classifiers ran against the content
- Score — Each match received a confidence score
- AI Disambiguation — (If enabled) Findings in the ambiguous range were escalated to AI for adjudication
- Store — Final findings were written to the Data Catalog
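The control flow of these stages can be sketched in a few lines of Python. This is a simplified illustration, not the product's implementation: Download and Parse are skipped (the content arrives as text), the pre-screen is a trivial deterministic stand-in for the probabilistic check, the single regex classifier assigns a fixed placeholder score, and the ambiguous range of 0.4 to 0.7 is a hypothetical value.

```python
import re
from typing import Callable, Optional

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AMBIGUOUS_RANGE = (0.4, 0.7)  # hypothetical bounds for AI escalation


def pre_screen(content: str) -> bool:
    """Cheap stand-in check: skip content that has no digits at all."""
    return any(ch.isdigit() for ch in content)


def classify(content: str) -> list[dict]:
    """Run one toy regex classifier and attach a placeholder score."""
    return [
        {"pii_category": "SSN", "match": m.group(), "confidence": 0.6}
        for m in SSN_PATTERN.finditer(content)
    ]


def scan_file(
    content: str, adjudicate: Optional[Callable[[dict], bool]] = None
) -> list[dict]:
    """Pre-screen, classify and score, optionally adjudicate, then return."""
    if not pre_screen(content):
        return []  # file skipped before any classifier runs
    low, high = AMBIGUOUS_RANGE
    kept = []
    for finding in classify(content):
        ambiguous = low <= finding["confidence"] <= high
        # Escalate only borderline findings, and only when a reviewer is given.
        if adjudicate is not None and ambiguous and not adjudicate(finding):
            continue
        kept.append(finding)
    return kept  # a real pipeline would write these to the catalog


print(len(scan_file("ssn: 123-45-6789")))                     # 1
print(len(scan_file("no numbers here")))                      # 0
print(len(scan_file("ssn: 123-45-6789", lambda f: False)))    # 0
```

The third call shows the effect of the optional adjudication step: a borderline match is dropped when the reviewer callback rejects it.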
Troubleshooting
| Issue | Solution |
|---|---|
| Scan stuck at “Queued” | Check your tier’s scan quota — you may have reached the monthly limit |
| 0 findings on a known-PII bucket | Verify file types are included in the scan scope; check classifier configuration |
| Scan failed | Check the error log in Scan Monitor — usually a credential or permission issue |
Next Steps
- Create custom classifiers for organization-specific data patterns
- Set up governance policies to automate remediation
- Configure event-driven scanning for real-time detection