Skip to main content
Version: 1.0.0-beta

Data Agent

AI-powered natural language to SQL query engine with RBAC-filtered schema access.

Architecture

Intent Classification

TypeExampleHandler
Data"How many open findings?"Data Agent (SQL)
Document"Summarize last year's IT audit"RAG Pipeline
Hybrid"Show findings similar to FR-042"Both
Conversational"What does CCCER stand for?"Direct LLM

Schema Registry

The Schema Registry provides RBAC-filtered DDL:

  • Each role sees only authorized tables
  • Column-level filtering for sensitive data
  • Schema cached in Redis (5-min TTL) per role+org

SQL Security (5 Layers)

LayerCheck
1. Syntax validationValid SQL syntax
2. Whitelist checkOnly SELECT statements
3. Table accessOnly authorized tables
4. Injection preventionNo semicolons, comments, or DDL
5. Read-only enforcementSET default_transaction_read_only = ON

Caching

  • Results cached in Redis
  • 5-minute TTL
  • Scoped by role + org_id
  • Cache invalidated on data changes

Example

User: "Show me the top 5 business units by number of open findings"

Generated SQL:

SELECT bu.name, COUNT(f.id) as finding_count
FROM finding f
JOIN engagement e ON f.engagement_id = e.id
JOIN audit_universe_item bu ON e.universe_item_id = bu.id
WHERE f.status != 'closed'
GROUP BY bu.name
ORDER BY finding_count DESC
LIMIT 5;

Result: Chart + table showing the data.