Version: 1.0.0-beta

Historical Retrieval (RAG)

AIIA uses Retrieval-Augmented Generation (RAG) to search and retrieve relevant historical audit data. When generating suggestions, the AI accesses past engagements, findings, and workpapers to provide context-aware recommendations.

What RAG Does

RAG enhances AI responses by:

  1. Embedding your audit data and storing the vectors in PostgreSQL (pgvector extension)
  2. Searching for semantically similar historical records when a query is made
  3. Augmenting the AI prompt with relevant context
  4. Generating more accurate, context-aware responses
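The first three steps can be illustrated with a small, self-contained sketch. It is not AIIA's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for pgvector, and every name below is hypothetical. Step 4 (sending the prompt to the language model) is omitted.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase term counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query: str, index: list, k: int = 2) -> list:
    """Step 2: return up to k (score, text) pairs with non-zero similarity, best first."""
    q = embed(query)
    scored = sorted(((cosine(q, vec), text) for text, vec in index),
                    key=lambda pair: pair[0], reverse=True)
    return [(score, text) for score, text in scored[:k] if score > 0]

def build_prompt(query: str, hits: list) -> str:
    """Step 3: prepend the retrieved records to the user's question."""
    context = "\n".join(f"- {text}" for _, text in hits)
    return f"Context from past audits:\n{context}\n\nQuestion: {query}"

# Step 1: embed historical records (hypothetical sample data).
records = [
    "Access reviews for the payroll system were not performed quarterly",
    "Vendor invoices lacked documented approval",
    "Backup restoration tests were skipped",
]
index = [(text, embed(text)) for text in records]

hits = search("payroll access reviews", index)
prompt = build_prompt("payroll access reviews", hits)
```

In a real deployment the similarity search would run inside the database rather than in application code; the sketch only shows how the stages feed into each other.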

Use Cases

| Feature | RAG Contribution |
| --- | --- |
| Recurring Finding Detection | Searches past findings for similar patterns |
| Test Procedure Suggestions | Retrieves test procedures used in similar engagements |
| Risk Assessment | Finds historical risk assessments for comparable areas |
| Narrative Drafting | References past report language for consistency |
| KRI Insights | Correlates current KRI values with historical trends |

Permission-Aware Retrieval

Security

RAG retrieval enforces role-based access control (RBAC) on every query:

  • The AI only retrieves records the requesting user has permission to view
  • User roles (user_roles) are passed to the RAG pipeline
  • Cross-organization data is never accessible
  • Engagement-level access controls are enforced
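One way to picture these rules is as a filter applied to candidate records before any of them reach the prompt. The sketch below is an assumption about how such a check could look, not AIIA's actual code: apart from `user_roles`, every class, field, and function name is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    record_id: str
    org_id: str
    engagement_id: str
    required_role: str  # hypothetical: role needed to view this record
    text: str

@dataclass(frozen=True)
class User:
    org_id: str
    user_roles: frozenset      # roles passed to the RAG pipeline
    engagement_ids: frozenset  # engagements the user is assigned to

def can_view(user: User, record: Record) -> bool:
    """A record is visible only if the organization, engagement, and role all match."""
    return (
        record.org_id == user.org_id
        and record.engagement_id in user.engagement_ids
        and record.required_role in user.user_roles
    )

def filter_candidates(user: User, candidates: list) -> list:
    """Drop every candidate the requesting user is not allowed to see."""
    return [r for r in candidates if can_view(user, r)]

auditor = User(org_id="org-1",
               user_roles=frozenset({"auditor"}),
               engagement_ids=frozenset({"eng-7"}))
candidates = [
    Record("F-1", "org-1", "eng-7", "auditor", "Visible finding"),
    Record("F-2", "org-2", "eng-7", "auditor", "Other organization"),
    Record("F-3", "org-1", "eng-9", "auditor", "Unassigned engagement"),
    Record("F-4", "org-1", "eng-7", "admin", "Requires a role the user lacks"),
]
visible = filter_candidates(auditor, candidates)
```

Applying the same conditions inside the database query, rather than after retrieval, avoids loading restricted rows at all; the in-memory version is used here only to keep the example runnable.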

How Data Is Indexed

The RAG system indexes:

  • Finding descriptions and CCCER components
  • Workpaper narratives and conclusions
  • Test procedure descriptions and results
  • Engagement objectives and scope
  • Risk and control descriptions

Data is embedded using the configured embedding model (set in Administration → AI Models).

Citations

Every RAG-enhanced response includes citations:

  • References to specific findings, workpapers, or engagements
  • Source document identifiers
  • Relevance scores
  • Links to the original records (respecting permissions)
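As an illustration of the four elements listed above, a citation could be modeled as a small record and rendered as a numbered list. The structure, field names, and URL paths here are hypothetical and are not taken from AIIA's response schema.

```python
from dataclasses import dataclass

@dataclass
class Citation:
    source_type: str  # "finding", "workpaper", or "engagement"
    source_id: str    # source document identifier
    relevance: float  # relevance score, assumed here to lie in [0, 1]
    url: str          # link to the original record

def format_citations(citations: list, viewable_ids: set) -> list:
    """Render citations by descending relevance; show a link only if the user may open it."""
    ordered = sorted(citations, key=lambda c: c.relevance, reverse=True)
    lines = []
    for number, c in enumerate(ordered, start=1):
        line = f"[{number}] {c.source_type} {c.source_id} (relevance {c.relevance:.2f})"
        if c.source_id in viewable_ids:
            line += f" {c.url}"
        lines.append(line)
    return lines

citations = [
    Citation("workpaper", "WP-33", 0.74, "/workpapers/WP-33"),
    Citation("finding", "F-102", 0.91, "/findings/F-102"),
]
rendered = format_citations(citations, viewable_ids={"F-102"})
```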

Configuration

RAG is configured through:

  • Embedding Model: set in Administration → AI Models (type: embedding)
  • Vector Database: PostgreSQL with the pgvector extension
  • Indexing Schedule: data is re-indexed periodically by a background worker
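For readers unfamiliar with pgvector, the sketch below shows what a similarity query against it can look like. The `<=>` operator is pgvector's cosine-distance operator, so `1 - distance` gives cosine similarity. The table and column names (`audit_embeddings`, `record_id`, `content`, `embedding`, `org_id`) are assumptions for illustration and do not describe AIIA's schema. The function only builds the SQL and parameters, using psycopg-style named placeholders, and does not connect to a database.

```python
def to_vector_literal(vec: list) -> str:
    """Format a list of floats as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(x) for x in vec) + "]"

def build_similarity_query(query_vec: list, org_id: str, k: int = 5) -> tuple:
    """Return (sql, params) for a top-k cosine-similarity search scoped to one organization."""
    sql = (
        "SELECT record_id, content, "
        "1 - (embedding <=> %(query_vec)s::vector) AS relevance "
        "FROM audit_embeddings "            # hypothetical table name
        "WHERE org_id = %(org_id)s "        # keeps results within one organization
        "ORDER BY embedding <=> %(query_vec)s::vector "
        "LIMIT %(k)s"
    )
    params = {"query_vec": to_vector_literal(query_vec), "org_id": org_id, "k": k}
    return sql, params

sql, params = build_similarity_query([0.1, 0.2, 0.3], org_id="org-1", k=3)
```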

On-Premises Support

For air-gapped deployments:

  • Use Ollama with a local embedding model
  • All data stays within your infrastructure
  • No external API calls required
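A minimal sketch of requesting an embedding from a local Ollama server is shown below, using only the Python standard library. It assumes Ollama is listening on its default port (11434) and that an embedding model has already been pulled; `nomic-embed-text` is used purely as an example model name, and the helper function names are hypothetical. How AIIA itself calls Ollama is not described here.

```python
import json
import urllib.request

def build_embedding_request(text: str,
                            model: str = "nomic-embed-text",
                            host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text: str, **kwargs) -> list:
    """Send the request to the local server and return the embedding vector."""
    with urllib.request.urlopen(build_embedding_request(text, **kwargs)) as resp:
        return json.loads(resp.read())["embedding"]

# Building the request needs no network access; calling embed() does.
request = build_embedding_request("Access reviews were not performed")
```

Because the host is a local address, the text being embedded never leaves the machine or network where Ollama runs.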