Version: 1.0.0-beta

Historical Retrieval (RAG)

AIIA uses Retrieval-Augmented Generation (RAG) to search and retrieve relevant historical audit data. When generating suggestions, the AI accesses past engagements, findings, and workpapers to provide context-aware recommendations.

What RAG Does

RAG enhances AI responses by:

Embedding your audit data into a vector database (pgvector)
Searching for semantically similar historical records when a query is made
Augmenting the AI prompt with relevant context
Generating more accurate, context-aware responses
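
The four steps above can be illustrated with a small, self-contained sketch. It is not AIIA's implementation: the bag-of-words `embed` function stands in for a real embedding model, an in-memory list stands in for pgvector, and the function names and sample records are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, records: list, k: int = 2) -> list:
    """Search step: rank historical records by similarity to the query."""
    q = embed(query)
    ranked = sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list) -> str:
    """Augment step: prepend the retrieved records to the user's question."""
    lines = ["Relevant historical records:"]
    lines += [f"- {c}" for c in context]
    lines.append(f"Question: {query}")
    return "\n".join(lines)


# Hypothetical historical findings.
records = [
    "Access reviews for the payroll system were not performed quarterly",
    "Vendor invoices were approved without a purchase order",
    "Terminated users retained access to the payroll system",
]

query = "payroll system access"
top = retrieve(query, records, k=2)
prompt = build_prompt(query, top)  # this text would be sent to the generation model
```

The generation step is omitted because it depends on the configured model. The point of the sketch is the order of operations: embed, search, augment, then generate.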

Use Cases

Feature	RAG Contribution
Recurring Finding Detection	Searches past findings for similar patterns
Test Procedure Suggestions	Retrieves test procedures used in similar engagements
Risk Assessment	Finds historical risk assessments for comparable areas
Narrative Drafting	References past report language for consistency
KRI Insights	Correlates current KRI values with historical trends

Permission-Aware Retrieval

Security

Retrieval respects role-based access control (RBAC) at all times:

The AI only retrieves records the requesting user has permission to view
User roles (user_roles) are passed to the RAG pipeline
Cross-organization data is never accessible
Engagement-level access controls are enforced
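
One way to picture these rules is as a filter applied to candidate records before anything reaches the prompt. The sketch below is an assumption about how such a check could look, not a description of AIIA's internals: only `user_roles` comes from this page, and the `Record` and `Requester` types, their fields, and the role names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    org_id: str
    engagement_id: str
    required_role: str  # hypothetical: role needed to view this record


@dataclass(frozen=True)
class Requester:
    org_id: str
    user_roles: frozenset      # roles passed to the RAG pipeline
    engagement_ids: frozenset  # engagements the user is assigned to


def visible(record: Record, user: Requester) -> bool:
    """A record is visible only if all three checks pass."""
    return (
        record.org_id == user.org_id                     # no cross-organization data
        and record.engagement_id in user.engagement_ids  # engagement-level access
        and record.required_role in user.user_roles      # role-based access
    )


def filter_candidates(candidates: list, user: Requester) -> list:
    """Drop every candidate the requesting user may not view."""
    return [r for r in candidates if visible(r, user)]


user = Requester(
    org_id="org-1",
    user_roles=frozenset({"auditor"}),
    engagement_ids=frozenset({"eng-7"}),
)
candidates = [
    Record("F-1", "org-1", "eng-7", "auditor"),  # allowed
    Record("F-2", "org-2", "eng-7", "auditor"),  # other organization
    Record("F-3", "org-1", "eng-9", "auditor"),  # engagement not assigned
    Record("F-4", "org-1", "eng-7", "admin"),    # role missing
]
allowed = filter_candidates(candidates, user)
```

In a database-backed system the same conditions would more likely be part of the search query itself, so that restricted records are never loaded.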

How Data Is Indexed

The RAG system indexes:

Finding descriptions and CCCER components
Workpaper narratives and conclusions
Test procedure descriptions and results
Engagement objectives and scope
Risk and control descriptions

Data is embedded using the configured embedding model (set in Administration → AI Models).
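
As a rough illustration of what "indexing" a record can mean, the sketch below flattens the text fields of one finding into a single labelled string that an embedding model could consume. The function name and field labels are hypothetical, and the expansion of CCCER as condition, criteria, cause, effect, and recommendation is an assumption rather than something stated on this page.

```python
def build_index_text(record_type: str, fields: dict) -> str:
    """Join the non-empty fields of one record into a single labelled text."""
    parts = [
        f"{label}: {value.strip()}"
        for label, value in fields.items()
        if value and value.strip()  # skip fields that are empty or whitespace
    ]
    return f"[{record_type}]\n" + "\n".join(parts)


# Hypothetical finding; field names assume the CCCER expansion noted above.
finding = {
    "Description": "Quarterly access reviews were skipped for the payroll system.",
    "Condition": "No review evidence exists for Q2 or Q3.",
    "Criteria": "Policy requires quarterly access reviews.",
    "Cause": "",  # not yet documented, so it is left out of the indexed text
    "Effect": "Terminated users may retain access.",
    "Recommendation": "Automate review reminders.",
}

index_text = build_index_text("finding", finding)
```

The resulting text would then be passed to whichever embedding model is configured, and the vector stored alongside the record identifier.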

Citations

Every RAG-enhanced response includes citations:

References to specific findings, workpapers, or engagements
Source document identifiers
Relevance scores
Links to the original records (respecting permissions)
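
The citation elements listed above could be modelled as a small record type. This is a sketch under assumptions: the `Citation` class, its field names, the identifier format, and the link paths are all hypothetical and are not taken from AIIA's API.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    source_type: str  # "finding", "workpaper", or "engagement"
    source_id: str    # source document identifier
    relevance: float  # relevance score; higher means more relevant
    link: str         # link to the original record


def format_citations(citations: list) -> list:
    """Render citations as numbered lines, most relevant first."""
    ordered = sorted(citations, key=lambda c: c.relevance, reverse=True)
    return [
        f"[{i}] {c.source_type} {c.source_id} (relevance {c.relevance:.2f}) {c.link}"
        for i, c in enumerate(ordered, start=1)
    ]


lines = format_citations([
    Citation("workpaper", "WP-88", 0.71, "/workpapers/WP-88"),
    Citation("finding", "F-102", 0.93, "/findings/F-102"),
])
```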

Configuration

RAG is configured through:

Embedding Model: set in Administration → AI Models (type: embedding)
Vector Database: PostgreSQL with the pgvector extension
Indexing Schedule: data is re-indexed periodically by a background worker
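
For readers unfamiliar with pgvector, the sketch below shows roughly what a nearest-neighbour lookup against it can look like. pgvector's `<=>` operator returns cosine distance, and vectors are written as bracketed literals such as `[0.1,0.2]`. The table name `rag_chunks`, its columns, and the helper functions are hypothetical; AIIA's actual schema is not documented here.

```python
def to_vector_literal(values) -> str:
    """Format a sequence of numbers as a pgvector text literal."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical table and column names. The %s placeholders follow the
# psycopg parameter style, so values are never spliced into the SQL text.
NEAREST_SQL = """
SELECT record_id, content, embedding <=> %s::vector AS distance
FROM rag_chunks
WHERE org_id = %s
ORDER BY distance
LIMIT %s
"""


def nearest_query(query_embedding, org_id: str, k: int = 5):
    """Return the SQL text and parameters for a top-k cosine-distance search."""
    return NEAREST_SQL, (to_vector_literal(query_embedding), org_id, k)


sql, params = nearest_query([0.1, 2, -3.5], "org-1", k=3)
# With a psycopg connection, this could be run as cursor.execute(sql, params).
```

Filtering on the organization inside the query, rather than after it, keeps other organizations' rows out of the result set entirely.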

On-Premises Support

For air-gapped deployments:

Use Ollama with a local embedding model
All data stays within your infrastructure
No external API calls required
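
A local embedding call might look like the sketch below, which targets Ollama's `/api/embeddings` endpoint on its default port, 11434. The model name `nomic-embed-text` is only an example of a locally pulled embedding model, and the helper names are hypothetical. How AIIA itself calls Ollama is not specified on this page.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama address


def build_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build the HTTP request for one embedding without sending it."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )


def embed_locally(text: str, model: str = "nomic-embed-text") -> list:
    """Send the request to the local Ollama server and return the vector."""
    with urllib.request.urlopen(build_request(text, model)) as response:
        return json.loads(response.read())["embedding"]


request = build_request("Access reviews were not performed")
```

Because the request goes to `localhost`, the text being embedded never leaves the host. Calling `embed_locally` requires a running Ollama instance with the chosen model already pulled.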
