Document Chat

Upload documents and ask questions — powered by RAG.

Overview

Document Chat lets users upload files and have a conversation with their contents. Upload a contract, a policy document, or a technical spec, and the assistant can answer questions about it, summarise sections, and extract specific information.

This is powered by Retrieval-Augmented Generation (RAG). When you upload a document, Ajutant extracts the text, splits it into overlapping chunks, generates a vector embedding for each chunk, and stores the embeddings in PostgreSQL with pgvector. When you ask a question, the system retrieves the chunks most relevant to it and includes them in the prompt as context for the model.
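The retrieval step can be illustrated with a small, self-contained sketch. It is not Ajutant's implementation: the `embed` function below is a toy bag-of-words stand-in for a real embedding model, and the names `embed`, `cosine_similarity`, and `top_chunks` are hypothetical. The point is only to show how chunks are ranked by similarity to the question.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question, best match first."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "The contractor must deliver monthly progress reports.",
    "Payment is due within 30 days of invoice.",
    "Either party may terminate with 60 days written notice.",
]
best = top_chunks("When is payment due?", chunks, k=1)
print(best)
```

A real embedding model captures meaning rather than word overlap, so it would also match a question phrased with different words. The ranking logic stays the same.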

Supported File Types

Format | Extension | Notes
PDF | .pdf | Text-based and scanned (with OCR)
Word | .docx | Microsoft Word documents
PowerPoint | .pptx | Extracts text from slides and notes
Excel | .xlsx | Extracts data from worksheets
Plain text | .txt | Direct text ingestion
Markdown | .md | Preserves structure

Text extraction is handled by Apache Tika, which runs inside your tenant alongside the rest of the platform.
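If you want to check a file before uploading it, a client-side test against the extensions listed above might look like the following sketch. The names `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical, and the platform may apply its own validation (for example, by inspecting file contents rather than names).

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".txt", ".md"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension is supported (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["Contract.PDF", "notes.md", "archive.zip"]:
    print(name, is_supported(name))
```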

How It Works

1. Upload

Drag and drop a file (or click to browse) in the chat interface. Files are uploaded directly to your tenant’s storage — they never transit through any external service.

2. Processing

The document is processed automatically:

  • Text extraction — Tika extracts readable text from the file
  • Chunking — Text is split into overlapping segments for better retrieval
  • Embedding — Each chunk is converted to a vector embedding using the configured embedding model
  • Storage — Embeddings are stored in PostgreSQL with pgvector
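The chunking step can be sketched as a sliding window over the extracted text. This is a minimal illustration, assuming word-based windows; the function name `chunk_text` and the `chunk_size` and `overlap` defaults are hypothetical, and the platform's actual chunk boundaries and sizes are not documented here.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=2):
    print(chunk)
```

Because consecutive chunks share words, a sentence that straddles a boundary still appears whole in at least one chunk, which is why overlap tends to help retrieval.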

Processing time depends on document size. A typical 20-page PDF takes 10–30 seconds.
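For readers curious about the storage step, the sketch below shows what a pgvector-backed table and similarity query could look like. Everything specific here is an assumption: the table name `document_chunks`, its columns, and the 3-dimensional vector are hypothetical and chosen for illustration, not taken from Ajutant's schema. The snippet only builds SQL strings and parameters and does not open a database connection.

```python
# Hypothetical schema: the vector dimension must match the embedding model.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS document_chunks (
    id          bigserial PRIMARY KEY,
    document_id bigint    NOT NULL,
    content     text      NOT NULL,
    embedding   vector(3) NOT NULL
);
"""

INSERT_SQL = (
    "INSERT INTO document_chunks (document_id, content, embedding) "
    "VALUES (%s, %s, %s::vector)"
)

# `<=>` is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT content
FROM document_chunks
WHERE document_id = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list as pgvector's text representation, e.g. [0.1,0.2]."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def insert_params(document_id: int, content: str, embedding: list[float]) -> tuple:
    """Build the parameter tuple that would accompany INSERT_SQL."""
    return (document_id, content, to_vector_literal(embedding))


print(insert_params(1, "Payment is due within 30 days.", [0.1, 0.2, 1]))
```

The `%s` placeholders follow the style used by common PostgreSQL drivers for Python; passing values as parameters rather than formatting them into the SQL string avoids injection problems.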

3. Conversation

Once processing completes, you can ask questions about the document. The assistant will:

  • Search for relevant passages in the document
  • Include those passages as context in the prompt
  • Generate an answer grounded in the actual document content

Grounded answers: Document Chat responses are grounded in the uploaded content. The assistant will indicate when it’s drawing from the document versus its general knowledge.
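To make the "include those passages as context" step concrete, here is a minimal sketch of how retrieved passages might be assembled into a prompt. The function name `build_prompt`, the passage labels, and the instruction wording are all hypothetical; the prompt template Ajutant actually uses is not described in this document.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt."""
    # Number each passage so an answer can point back to its source.
    context = "\n\n".join(
        f"[Passage {i}]\n{passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using the document passages below. "
        "If the passages do not contain the answer, say so and label any "
        "general knowledge as such.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "When is payment due?",
    ["Payment is due within 30 days of invoice."],
)
print(prompt)
```

Labelling the passages is one simple way to let the model separate document-backed statements from general knowledge.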

Best Practices

Be specific in your questions. Instead of “summarise this document”, try “what are the key obligations for the contractor in section 4?”

Upload one document per conversation for clarity. To compare documents, start a separate conversation for each one, then compare the answers side by side.

Check the source. The assistant indicates which parts of the document informed its response. Always verify critical information against the original document.

Limitations

  • Maximum file size: 50 MB per document
  • Very large documents (500+ pages) may have slower retrieval times
  • Scanned PDFs depend on OCR quality — clean scans work best
  • Tables and complex formatting may not extract perfectly
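A client could check the file size limit before attempting an upload. The sketch below assumes that 1 MB means 1024 × 1024 bytes, which this document does not specify; the names `MAX_FILE_SIZE_BYTES`, `within_size_limit`, and `can_upload` are hypothetical.

```python
import os

# Assumption: the 50 MB limit is counted in binary megabytes (1024 * 1024 bytes).
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def within_size_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is at or below the upload limit."""
    return 0 <= size_bytes <= MAX_FILE_SIZE_BYTES


def can_upload(path: str) -> bool:
    """Check the size of a file on disk against the upload limit."""
    return within_size_limit(os.path.getsize(path))


print(within_size_limit(10 * 1024 * 1024))
print(within_size_limit(60 * 1024 * 1024))
```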