RAG Knowledge Assistant

Ask a question about a fictional enterprise's order-to-cash policy. Answers come from 12 indexed documents covering credit limits, dunning cadences, dispute resolution, and bad-debt provisioning. Every claim is cited back to the source chunk it came from.

Embeddings: Voyage AI · voyage-3
Retriever: pluggable (in-memory + Pinecone)
Answer model: Claude Haiku 4.5
Runtime: Next.js · Vercel

How it works

  1. Build-time indexing
    Twelve markdown policy documents are chunked by section, embedded with Voyage's voyage-3 model into 1024-dimensional vectors, and written to a compact JSON index shipped with the build.
  2. Query-time retrieval
    Your question is embedded with the same model, then the retriever ranks chunks by cosine similarity. The top four go into the prompt. A toggle swaps the in-memory backend for Pinecone: same interface, different store.
  3. Grounded generation
    Claude Haiku 4.5 answers using only the retrieved chunks, with an explicit instruction to refuse when the corpus doesn't cover the question. Responses stream token-by-token.
  4. Inline citations
    Every source chunk surfaces as a chip under the answer. Hovering shows the similarity score; clicking opens the full text so you can verify the model isn't making things up.
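
The retrieval and grounding steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Chunk`, `ScoredChunk`, and `Retriever` shapes, the function names, and the prompt wording are all assumptions made for the example. A Pinecone-backed class would implement the same `Retriever` interface and delegate ranking to the hosted index.

```typescript
// Hypothetical shapes for an indexed chunk and a ranked result.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  chunk: Chunk;
  score: number;
}

// Assumed shared contract: any backend that can return the top-K chunks.
interface Retriever {
  retrieve(queryEmbedding: number[], topK: number): Promise<ScoredChunk[]>;
}

// Cosine similarity: dot product divided by the product of vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query, sort descending, keep the best topK.
function rankChunks(queryEmbedding: number[], chunks: Chunk[], topK: number): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// In-memory backend: ranks chunks loaded from the JSON index.
class InMemoryRetriever implements Retriever {
  private readonly chunks: Chunk[];

  constructor(chunks: Chunk[]) {
    this.chunks = chunks;
  }

  async retrieve(queryEmbedding: number[], topK: number): Promise<ScoredChunk[]> {
    return rankChunks(queryEmbedding, this.chunks, topK);
  }
}

// Assemble a prompt that restricts the model to the retrieved sources.
// The instruction wording here is illustrative only.
function buildGroundedPrompt(question: string, sources: ScoredChunk[]): string {
  const context = sources
    .map((s, i) => `[${i + 1}] (${s.chunk.id})\n${s.chunk.text}`)
    .join("\n\n");
  return [
    "Answer using only the numbered sources below and cite them as [n].",
    "If the sources do not cover the question, say so instead of guessing.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

With a retriever built from the index, calling `retrieve(queryEmbedding, 4)` would yield the four chunks that go into the prompt, and each result's `score` is the kind of value a citation chip could show on hover.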