FinDocs RAG
In Progress
Retrieval-augmented Q&A over financial documents, combined with live Companies House data.
Tech
- Python
- FastAPI
- RAG
- Vector DB
- LLM APIs
Problem
Financial analysts and accountants spend hours manually reading Companies House filings, annual reports, and regulatory documents to answer questions that an LLM could answer in seconds, given the right context. There is no good tool that combines live Companies House data with your own document corpus.
Approach
FinDocs RAG chunks and embeds financial documents (PDFs, filings) into a vector store, then combines retrieval with live Companies House API lookups. At query time, the system retrieves the most relevant document chunks, augments them with real-time company data, and routes both through an LLM to generate a grounded, cited answer.
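The flow described above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the project's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and `lookup_company` is a hypothetical stub that returns fixed data instead of calling the Companies House API. All function names, the chunk sizes, and the prompt wording are illustrative.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str  # source document, used when citing
    index: int   # position of the chunk within the document
    text: str


def chunk_text(doc_id: str, text: str, size: int = 40, overlap: int = 10) -> list[Chunk]:
    """Split text into overlapping windows of `size` words (assumes size > overlap)."""
    words = text.split()
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(words), step)):
        chunks.append(Chunk(doc_id, i, " ".join(words[start:start + size])))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return the k chunks most similar to the query (linear scan, no index)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]


def lookup_company(company_number: str) -> dict:
    """Hypothetical stand-in for a live Companies House lookup; returns fixed data."""
    return {"company_number": company_number, "company_status": "active"}


def build_prompt(question: str, chunks: list[Chunk], company: dict) -> str:
    """Combine retrieved chunks and company data into one prompt with numbered sources."""
    sources = "\n".join(
        f"[{n}] ({c.doc_id}#{c.index}) {c.text}" for n, c in enumerate(chunks, 1)
    )
    facts = "\n".join(f"- {key}: {value}" for key, value in company.items())
    return (
        "Answer using only the sources and company data below. "
        "Cite sources as [n].\n\n"
        f"Sources:\n{sources}\n\nCompany data:\n{facts}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    # Placeholder document text and company number, for illustration only.
    corpus = chunk_text("example-filing", "The directors propose a final dividend. " * 20)
    question = "What dividend do the directors propose?"
    print(build_prompt(question, retrieve(question, corpus), lookup_company("00000000")))
```

The prompt returned by `build_prompt` would then be sent to whichever LLM API the project uses. Numbering each source with its document and chunk position is one way to let the model's citations be traced back to the retrieved text.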
Key technical decisions
Chose RAG over fine-tuning for two reasons: freshness (company data changes weekly) and explainability (retrieved context makes answers auditable, which matters in a regulated domain). Chose FastAPI as the backend to keep the API layer thin and testable. Vector store selection is pending an evaluation of the tradeoff between cost and self-hosting.
Outcome
In progress. Success metrics: query accuracy on a benchmark of real financial questions, latency under 3 seconds per query, and citation fidelity. Targeting a working prototype by August 2026.