About this demo
RAGatha Ch'Bot is a small browser example of a RAG-style chatbot: upload your own documents, ask questions, and see answers grounded in what you added. It is meant as a feature showcase, not a production system.
What you can do
- Upload PDF, DOCX, TXT, or Markdown from the sidebar (drag-and-drop or browse).
- Documents are parsed in your browser (pdf.js + mammoth) and split into overlapping chunks for search.
- Questions are matched against your corpus with lightweight TF-IDF retrieval (top 5 chunks by default). The query is expanded with chat history and a HyDE-style synthetic paragraph to improve matching.
- Answers stream back as Markdown (sanitized), with source chips showing which document chunks were used.
- After a document loads, you may get suggested questions as chips above the input.
- Copy any assistant reply with the Copy button under the bubble.
- Your corpus lives in IndexedDB; chat and settings (API URL, key, model) persist in localStorage in this browser only.
- Pick any OpenAI-compatible endpoint and model in the top bar—OpenAI, Groq, Ollama, LM Studio, and others work by changing the base URL.
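Since these providers accept the same chat-completions request shape, switching between them mostly comes down to the base URL. The sketch below is only an illustration of that idea, not the demo's actual code: `buildChatRequest`, `ChatMessage`, and `EXAMPLE_BASE_URLS` are hypothetical names, and the listed base URLs are the commonly documented defaults, which you should check against your own setup.

```typescript
// Hypothetical helper; names and shapes are illustrative, not the demo's code.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Commonly documented base URLs (assumption: verify against each provider).
const EXAMPLE_BASE_URLS: Record<string, string> = {
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  ollama: "http://localhost:11434/v1",
  lmstudio: "http://localhost:1234/v1",
};

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Local servers often need no key, so the header is only set when one is given.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return {
    // Strip trailing slashes so "…/v1/" and "…/v1" produce the same URL.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers,
      // stream: true asks the server to send the reply incrementally.
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}
```

In a browser, the result could be passed straight to `fetch(url, init)`; only the `baseUrl`, `apiKey`, and `model` arguments change between providers.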
How it works (short version)
- Ingest — Text is extracted, chunked (~1200 chars with overlap), and each chunk gets a simple word-frequency profile.
- Retrieve — Your question (expanded) is scored against chunks; the best matches are bundled as context for the model.
- Answer — The API receives a system prompt with those excerpts plus recent chat; the reply streams into the chat.
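The ingest and retrieve steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the demo's implementation: all function names are hypothetical, the 200-character overlap is a guess (the text only says "with overlap"), and the smoothed IDF formula is one common variant among several.

```typescript
// Hypothetical names throughout; the demo's real code may differ.
interface Chunk {
  id: number;
  text: string;
  tf: Map<string, number>; // word-frequency profile for this chunk
}

// Split text into overlapping character windows.
// size mirrors the ~1200 chars mentioned above; overlap = 200 is an assumption.
function chunkText(text: string, size = 1200, overlap = 200): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function termFrequencies(text: string): Map<string, number> {
  const tf = new Map<string, number>();
  for (const token of tokenize(text)) tf.set(token, (tf.get(token) ?? 0) + 1);
  return tf;
}

function buildIndex(texts: string[]): Chunk[] {
  return texts.map((text, id) => ({ id, text, tf: termFrequencies(text) }));
}

// Score each chunk as the sum over query terms of tf * idf; return the top k.
function retrieve(query: string, index: Chunk[], k = 5): Chunk[] {
  const terms = Array.from(new Set(tokenize(query)));
  const idf = new Map<string, number>();
  for (const term of terms) {
    const df = index.filter((c) => c.tf.has(term)).length;
    idf.set(term, Math.log((index.length + 1) / (df + 1)) + 1); // smoothed IDF
  }
  return index
    .map((chunk) => ({
      chunk,
      score: terms.reduce((s, t) => s + (chunk.tf.get(t) ?? 0) * (idf.get(t) ?? 0), 0),
    }))
    .filter((r) => r.score > 0) // drop chunks sharing no words with the query
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.chunk);
}
```

Because scoring relies on exact word overlap, a query for "cat" would not match a chunk that only contains "cats"; that is the kind of gap the query expansion step is meant to narrow.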
Worth knowing
- Retrieval is TF-IDF, not embedding / vector search. It matches on shared words rather than meaning, which is good for a demo but not cutting-edge semantic search.
- Scanned PDFs without real text will not work well; image-only pages are out of scope.
- API keys are stored in the browser (localStorage) as plain text; do not use this demo with a real key on a shared machine.
- Browser storage has size limits; huge libraries may hit quota.
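For the storage-quota caveat, a rough check could look like the sketch below. It assumes an object shaped like the result of the browser's `navigator.storage.estimate()` call (`usage` and `quota` in bytes); `quotaFraction`, `shouldWarnAboutQuota`, and the 80% threshold are hypothetical choices for illustration, not something the demo is known to do.

```typescript
// Mirrors the fields that navigator.storage.estimate() resolves with in browsers.
interface StorageEstimateLike {
  usage?: number; // bytes currently used by this origin
  quota?: number; // bytes the browser is willing to grant this origin
}

// Returns the used fraction (0..1), or null when the browser gave no numbers.
function quotaFraction(estimate: StorageEstimateLike): number | null {
  if (!estimate.quota || estimate.usage === undefined) return null;
  return estimate.usage / estimate.quota;
}

// threshold = 0.8 is an arbitrary illustrative default.
function shouldWarnAboutQuota(estimate: StorageEstimateLike, threshold = 0.8): boolean {
  const fraction = quotaFraction(estimate);
  return fraction !== null && fraction >= threshold;
}
```

In a browser that supports the Storage API, the argument would come from `await navigator.storage.estimate()`, checked before adding another large document.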