A Retrieval-Augmented Generation (RAG) agent retrieves the passages most relevant to a question and passes them to an LLM as context, so answers are grounded in your own documents rather than only in the model's training data. In this guide, you’ll build a RAG system using LangChain, ChromaDB, and an LLM from OpenAI or HuggingFace.
🛠️ Tech Stack
Python
LangChain
ChromaDB
OpenAI or HuggingFace LLMs
SentenceTransformers (all-MiniLM-L6-v2)
📦 Install Dependencies
pip install langchain chromadb sentence-transformers openai
🧱 Folder Structure
.
├── rag_chroma_db/ # Chroma vector store
├── docs/
│ └── my_corpus.txt # Your source document
└── rag_agent.py # Main script
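If you are starting from an empty directory, the layout above can be created with a few lines of standard-library Python. This is only a minimal sketch: the helper name `scaffold_project` and the placeholder text are assumptions for illustration, and `rag_chroma_db/` is left out on the assumption that Chroma creates it when the store is first written.

```python
from pathlib import Path


def scaffold_project(root: str = ".") -> Path:
    """Create docs/my_corpus.txt under root if it does not exist yet."""
    corpus = Path(root) / "docs" / "my_corpus.txt"
    corpus.parent.mkdir(parents=True, exist_ok=True)
    if not corpus.exists():
        # Placeholder text only (hypothetical); replace it with your real document.
        corpus.write_text("Replace this file with your own corpus.\n", encoding="utf-8")
    return corpus
```

Calling `scaffold_project()` from the project root returns the path to the corpus file and never overwrites a file that is already there.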
📄 Code: RAG Agent with ChromaDB
# Import paths below match older LangChain releases; newer releases expose
# these classes from langchain_community (and langchain_text_splitters).
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub
# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()
# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
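vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embedding,
    persist_directory="rag_chroma_db",
)
# Explicit flush for older Chroma versions; Chroma 0.4+ persists automatically
# when persist_directory is set, so this call can be dropped there.
vectordb.persist()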
# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})
# 5. Set up LLM
llm = OpenAI(temperature=0)
# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)
# 7. Ask questions
query = "What is the main topic of the document?"
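# Newer LangChain releases deprecate calling the chain directly;
# there, use qa.invoke({"query": query}) instead.
result = qa({"query": query})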
print("Answer:", result["result"])
print("Sources:", result["source_documents"])
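Printing `result["source_documents"]` directly dumps the raw Document objects, which is hard to read. A small helper can turn them into one line per retrieved chunk. This is a sketch: `format_sources` and its `max_chars` parameter are hypothetical names, and it relies only on the `page_content` and `metadata` attributes that LangChain documents carry (TextLoader stores the file path under the `"source"` metadata key).

```python
def format_sources(source_documents, max_chars: int = 80) -> list[str]:
    """Return one short, readable line per retrieved chunk."""
    lines = []
    for i, doc in enumerate(source_documents, start=1):
        # Fall back to "unknown" when a loader did not record a source.
        source = doc.metadata.get("source", "unknown")
        snippet = " ".join(doc.page_content.split())  # collapse whitespace
        if len(snippet) > max_chars:
            snippet = snippet[:max_chars].rstrip() + "..."
        lines.append(f"[{i}] {source}: {snippet}")
    return lines
```

In the script above it could replace the last print call with `for line in format_sources(result["source_documents"]): print(line)`.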
🔐 Set Your API Key
Set your OpenAI API key as an environment variable before running the script:
export OPENAI_API_KEY="your-api-key"
Or set it in Python, before the OpenAI LLM is created:
import os
os.environ["OPENAI_API_KEY"] = "your-api-key"
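Hardcoding a real key in the script makes it easy to commit by accident, so one option is to read it from the environment and fail early with a clear message when it is missing. The following is a minimal sketch; `require_api_key` is a hypothetical helper, not part of LangChain or the OpenAI client.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key from the environment, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running rag_agent.py")
    return key
```

Calling `require_api_key()` at the top of `rag_agent.py` surfaces a missing key before any documents are loaded or embedded.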
🔄 What’s Next?
📄 Add a PDF loader with PyMuPDF or pdfminer.six
🖥️ Add a UI with Streamlit or FastAPI
🤖 Wrap the retriever as a LangChain Tool + Agent
🔌 Run offline using a locally hosted HuggingFace model
💡 Summary
You now have a working Retrieval-Augmented Generation (RAG) agent in which:
A local document is chunked and embedded with SentenceTransformers
The embeddings are stored in a ChromaDB vector store
The store is queried through LangChain's RetrievalQA
Answers are generated by an OpenAI LLM
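To make the chunk, embed, and retrieve steps summarized above concrete, here is a dependency-free sketch of the same flow. It uses deliberately simple stand-ins: fixed-size character windows instead of RecursiveCharacterTextSplitter's separator-aware splitting, word-count vectors instead of SentenceTransformers embeddings, and a linear scan instead of ChromaDB's index. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In the real pipeline, the chunks returned by the retriever are what RetrievalQA places into the prompt alongside the question; `k` here plays the same role as `search_kwargs={"k": 3}`.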