Next-Gen Q&A: Retrieval-Augmented AI with Chroma Vector Store
Chandrani Mukherjee


A Retrieval-Augmented Generation (RAG) agent combines document retrieval with LLM-based response generation to provide intelligent, context-aware answers. In this guide, you’ll build a RAG system using LangChain, ChromaDB, and an OpenAI or HuggingFace LLM.

🛠️ Tech Stack
Python

LangChain

ChromaDB

OpenAI or HuggingFace LLMs

SentenceTransformers (all-MiniLM-L6-v2)

📦 Install Dependencies

pip install langchain chromadb sentence-transformers openai

🧱 Folder Structure

.
├── rag_chroma_db/        # Chroma vector store
├── docs/
│   └── my_corpus.txt     # Your source document
└── rag_agent.py          # Main script


📄 Code: RAG Agent with ChromaDB

# Classic LangChain import paths; in newer releases (0.2+) these live in
# langchain_community / langchain_openai, so adjust to your installed version.
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub

# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=chunks, embedding=embedding, persist_directory="rag_chroma_db")
vectordb.persist()

# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})

# 5. Set up LLM
llm = OpenAI(temperature=0)

# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)

# 7. Ask questions
query = "What is the main topic of the document?"
result = qa({"query": query})

print("Answer:", result["result"])
print("Sources:", result["source_documents"])
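
Because the index is persisted to rag_chroma_db, a later run can reload it instead of re-embedding the corpus. A minimal sketch, assuming the same embedding model as above:

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Reload the persisted store; the embedding model must match the one
# the index was built with, or similarity search results will be meaningless.
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="rag_chroma_db", embedding_function=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})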

🔐 Set Your API Key
If you’re using the OpenAI LLM, make sure the API key is set in your environment:

export OPENAI_API_KEY="your-api-key"

Or in Python:

import os
os.environ["OPENAI_API_KEY"] = "your-api-key"

🔄 What’s Next?
📄 Add PDF loader with PyMuPDF or pdfminer.six
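
For instance, LangChain's PyMuPDFLoader can stand in for TextLoader (it needs pip install pymupdf). A sketch, assuming a hypothetical docs/my_corpus.pdf:

from langchain.document_loaders import PyMuPDFLoader

# Each PDF page becomes a Document with page-number metadata attached.
loader = PyMuPDFLoader("docs/my_corpus.pdf")
documents = loader.load()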

🖥️ Add a UI with Streamlit or FastAPI
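
A minimal Streamlit sketch (run with streamlit run app.py), assuming rag_agent.py exposes the qa chain built above:

# app.py: hypothetical thin UI over the existing chain
import streamlit as st

from rag_agent import qa  # importing also rebuilds the index at startup

st.title("Document Q&A")
query = st.text_input("Ask a question about the corpus")
if query:
    result = qa({"query": query})
    st.write(result["result"])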

🤖 Wrap the retriever as a LangChain Tool + Agent
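
One possible shape, using the classic initialize_agent API and the qa and llm objects from the script above:

from langchain.agents import AgentType, Tool, initialize_agent

# Expose the RAG chain as a tool the agent can choose to call.
rag_tool = Tool(
    name="corpus_qa",
    func=lambda q: qa({"query": q})["result"],
    description="Answers questions about the local document corpus.",
)

agent = initialize_agent([rag_tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What is the main topic of the document?")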

🔌 Run offline using HuggingFace LLMs
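
Swapping the LLM is a one-line change. A sketch using HuggingFacePipeline to run a local model (google/flan-t5-base is just an example choice; it needs pip install transformers torch):

from langchain.llms import HuggingFacePipeline

# Downloads and runs the model locally; no API key required.
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",
    task="text2text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)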

💡 Summary
You now have a working Retrieval-Augmented Generation (RAG) agent:

A local document, chunked and embedded with SentenceTransformers

Stored in a ChromaDB vector store

Queried through LangChain's RetrievalQA chain

Answered by an OpenAI GPT model

Comments (2)

Aiden Benjamin (Jul 23, 2025): Super clean integration of Chroma! This makes RAG pipelines much more manageable and fast to deploy.

Lucas Henry (Jul 23, 2025): The real-world use case you mentioned gave me some great ideas for internal enterprise tools.
