Document Chat
Upload documents and ask questions — powered by RAG.
Overview
Document Chat lets users upload files and have a conversation with their contents. Upload a contract, a policy document, or a technical spec, and the assistant can answer questions about it, summarise sections, and extract specific information.
This is powered by Retrieval-Augmented Generation (RAG). When you upload a document, Ajutant extracts the text, splits it into meaningful chunks, generates vector embeddings, and stores them in PostgreSQL with pgvector. When you ask a question, the system retrieves the most relevant chunks and includes them in the prompt context.
Supported File Types
| Format | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text-based and scanned (with OCR) |
| Word | .docx | Microsoft Word documents |
| PowerPoint | .pptx | Extracts text from slides and notes |
| Excel | .xlsx | Extracts data from worksheets |
| Plain text | .txt | Direct text ingestion |
| Markdown | .md | Preserves structure |
Text extraction is handled by Apache Tika, which runs within your tenant alongside the rest of the platform.
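If you upload files from a script, it can help to check the extension first. The following is a minimal sketch that mirrors the table above; the names `SUPPORTED_EXTENSIONS` and `is_supported` are illustrative and not part of any Ajutant API.

```python
from pathlib import Path

# Extensions copied from the Supported File Types table.
# The constant and function names are hypothetical, for illustration only.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".txt", ".md"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only looks at the file name; it does not confirm that the file contents match the extension.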
How It Works
1. Upload
Drag and drop a file (or click to browse) in the chat interface. Files are uploaded directly to your tenant’s storage — they never transit through any external service.
2. Processing
The document is processed automatically:
- Text extraction — Tika extracts readable text from the file
- Chunking — Text is split into overlapping segments for better retrieval
- Embedding — Each chunk is converted to a vector embedding using the configured embedding model
- Storage — Embeddings are stored in PostgreSQL with pgvector
Processing time depends on document size. A typical 20-page PDF takes 10–30 seconds.
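To illustrate the chunking step, here is a minimal sketch of overlapping, character-based splitting. It is a simplification under stated assumptions: the function name `chunk_text` and the default `chunk_size` and `overlap` values are hypothetical, and the platform's own chunker may split on sentence or section boundaries instead.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is why overlapping segments tend to retrieve better than disjoint ones.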
3. Conversation
Once processing completes, you can ask questions about the document. The assistant will:
- Search for relevant passages in the document
- Include those passages as context in the prompt
- Generate an answer grounded in the actual document content
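The retrieval and prompt-building steps above can be sketched in a few lines. This is an in-memory illustration only: in the platform the similarity search runs in PostgreSQL with pgvector, and the function names, the choice of cosine similarity, and the prompt wording below are all assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float],
                 chunks: list[tuple[str, list[float]]],
                 k: int = 3) -> list[str]:
    """Return the text of the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(chunks,
                    key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that places the retrieved passages before the question.
    The template text is hypothetical."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the passages below.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the passages in the prompt is one simple way to let a model point back to the passage it relied on.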
Best Practices
Be specific in your questions. Instead of “summarise this document”, try “what are the key obligations for the contractor in section 4?”
Upload one document per conversation so that answers draw on a single source. To compare documents, start a separate conversation for each one, then compare and summarise the findings yourself.
Check the source. The assistant indicates which parts of the document informed its response. Always verify critical information against the original document.
Limitations
- Maximum file size: 50 MB per document
- Very large documents (500+ pages) may have slower retrieval times
- Scanned PDFs depend on OCR quality — clean scans work best
- Tables and complex formatting may not extract perfectly
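For scripted uploads, the size limit can be checked before sending a file. This is a small sketch, assuming that 1 MB means 1024 × 1024 bytes (the documentation does not say which definition applies); `MAX_FILE_SIZE_BYTES` and `within_size_limit` are hypothetical names.

```python
# Assumption: 1 MB = 1024 * 1024 bytes. Adjust if the platform uses 1,000,000.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def within_size_limit(size_bytes: int) -> bool:
    """Return True for a non-empty file no larger than the 50 MB limit."""
    return 0 < size_bytes <= MAX_FILE_SIZE_BYTES
```

The size of a local file can be obtained with `os.path.getsize(path)` and passed to this function.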