This is a submission for the Redis AI Challenge: Beyond the Cache.
## What I Built
This project is a CLI tool built with Go (Cobra) that leverages Redis Stack for vector storage and the Gemini API (Google's LLM) for both embedding and text generation.
This CLI tool demonstrates how Redis Stack can serve as a vector database, powering an intelligent question-answering system, all from the terminal.
Instead of treating Redis as just a cache or pub/sub tool, I used it to store high-dimensional embeddings and retrieve semantically relevant results using vector similarity search (KNN + cosine).
Tech Stack:
- Go (Cobra) - CLI interface
- Gemini API - for both generating embeddings and producing LLM responses
- Redis Stack 8.x - as a vector store, using `FT.CREATE`, HNSW indexing, and KNN retrieval
## Redis as a Vector Store
Traditionally, Redis is known as a caching layer - that is also how I started my own Go journey with it. But in this project, Redis Stack is used to:

### Store vector embeddings with metadata

Each chunk of input text is embedded using Gemini and stored as a document with the following schema:
| Field | Type | Purpose |
|---|---|---|
| `command` | TEXT | Original CLI command name |
| `os` | TEXT | OS type or system |
| `text_chunk` | TEXT | The actual context (explanation of the command) |
| `embedding` | VECTOR | Embedded vector (FLOAT32, 3072 dimensions) |
This is the basic schema, created with the following command:

```
FT.CREATE pesudo_index ON HASH PREFIX 1 doc: SCHEMA \
  command TEXT \
  os TEXT \
  text_chunk TEXT \
  embedding VECTOR HNSW 6 TYPE FLOAT32 DIM 3072 DISTANCE_METRIC COSINE
```
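For reference, the same index creation can be expressed as the raw argument list a Go Redis client would send. This is a minimal sketch, not the project's actual code: `indexArgs` is a hypothetical helper, and the exact client call (e.g. `Do(ctx, args...)`) depends on the library you use.

```go
package main

import "fmt"

// indexArgs builds the raw FT.CREATE arguments for the schema above.
// A Redis client would send these as a single command.
func indexArgs() []interface{} {
	return []interface{}{
		"FT.CREATE", "pesudo_index", "ON", "HASH", "PREFIX", "1", "doc:",
		"SCHEMA",
		"command", "TEXT",
		"os", "TEXT",
		"text_chunk", "TEXT",
		// "6" is the count of attribute args that follow the HNSW keyword.
		"embedding", "VECTOR", "HNSW", "6",
		"TYPE", "FLOAT32", "DIM", "3072", "DISTANCE_METRIC", "COSINE",
	}
}

func main() {
	fmt.Println(len(indexArgs()))
}
```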
This enables semantic search directly in Redis - no separate vector DB required.
### Retrieve relevant data via vector search
When the user runs:

```
$ pesudocli ask "How to install packages on Arch?"
```

the CLI flow is:
- Embed the question using Gemini.
- Perform a KNN vector search using `FT.SEARCH` with `KNN 3`.
- Rank the results by cosine similarity.
- Send the top 3 context chunks to the Gemini chat model as context.
Redis handles the entire retrieval step with low latency and no external systems.
During testing, vector search queries consistently returned results in around 150ms, demonstrating Redis’s real-time capabilities even for high-dimensional vector data.
## ⚙️ Behind the scenes: Vector search
Here's an example of a vector search query used in this project:

```
FT.SEARCH pesudo_index "*=>[KNN 3 @embedding $vec]" \
  PARAMS 2 vec <binary_value_of_query> \
  DIALECT 2
```
- `KNN 3`: returns the top 3 closest vectors.
- `COSINE`: the distance metric used to compare two vectors.
- `DIALECT 2`: required by Redis for vector search queries.
Under the hood, Redis uses the HNSW (Hierarchical Navigable Small World) algorithm, which avoids brute-force comparison, reducing latency and enabling efficient approximate nearest-neighbor search in high-dimensional spaces.
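The `$vec` parameter in the query must be the query embedding packed as raw little-endian FLOAT32 bytes. Here's a minimal sketch of that step in Go; `encodeVector` and `searchArgs` are hypothetical helpers written for illustration, not the project's actual code:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeVector packs a float32 embedding into the little-endian byte
// layout RediSearch expects for FLOAT32 vector fields.
func encodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// searchArgs builds the raw FT.SEARCH command for a KNN 3 query;
// a Redis client would send it e.g. via client.Do(ctx, args...).
func searchArgs(queryEmbedding []float32) []interface{} {
	return []interface{}{
		"FT.SEARCH", "pesudo_index",
		"*=>[KNN 3 @embedding $vec]",
		"PARAMS", "2", "vec", encodeVector(queryEmbedding),
		"DIALECT", "2",
	}
}

func main() {
	args := searchArgs([]float32{0.1, 0.2, 0.3})
	fmt.Println(len(args)) // 9 arguments in total
}
```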
## How a user query is processed 🧠
## Demo

Since this is a CLI, I have attached an image of a sample `ask` command here.

You can download and try the CLI from the GitHub repo: PesudoCLI Release
## How I Used Redis Stack
Redis Stack was crucial for enabling vector search in this CLI. Here's how it fits into the pipeline:
- Used Redis 8 (via Redis Stack) to store vector embeddings created from chunks of input data.
- The `ingest` command:
  - retrieves data from the embed CSV file,
  - uses Gemini to generate embeddings,
  - stores each chunk and its corresponding embedding vector in Redis using RediSearch, along with metadata.
- The `ask` command:
  - converts user input into a vector using Gemini,
  - performs a KNN vector similarity search in Redis,
  - sends the top 3 matching contexts to the Gemini chat model for response generation.
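As a rough sketch of the ingest step, each chunk maps onto one Redis HASH under the `doc:` prefix that the index watches. `hsetArgs` and `encodeVector` below are hypothetical helpers for illustration, not the project's actual code, and the write itself would go through a Redis client:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeVector packs a float32 embedding into little-endian bytes,
// the on-disk form RediSearch expects for FLOAT32 vector fields.
func encodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// hsetArgs builds the HSET arguments for one ingested chunk. The
// "doc:" prefix matches the index's PREFIX clause, so the document
// is indexed automatically on write.
func hsetArgs(id, command, os, chunk string, emb []float32) []interface{} {
	return []interface{}{
		"HSET", "doc:" + id,
		"command", command,
		"os", os,
		"text_chunk", chunk,
		"embedding", encodeVector(emb),
	}
}

func main() {
	args := hsetArgs("1", "pacman", "arch",
		"pacman -S installs packages on Arch", []float32{0.1, 0.2})
	fmt.Println(args[1]) // the document key, doc:1
}
```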
This combination of semantic vector search + LLM made it possible to build an intelligent CLI assistant.
## 🔧 CLI flow to run

This gives a complete run-through of the CLI and shows how the entire program should be run:
```
# Step 1: Set config
$ pesudocli config --gemini-api-key <your-key>

# Step 2: Init the index
$ pesudocli init

# Step 3: Ingest data
$ pesudocli ingest

# Step 4: Ask a question
$ pesudocli ask "Explain about podman?"
```
After the first three steps, you can ask any number of questions using the `ask` command.
## Conclusion 💻
This project started with a simple idea: build a smart CLI assistant. But along the way, Redis Stack proved to be far more than a cache - it became the core of a semantic search engine.
By combining vector embeddings, KNN search with cosine similarity, and the Gemini API, I was able to build a fast, local-first, entirely terminal-based AI assistant - without needing a separate vector database or LLM backend.
Redis Stack handled:
- Real-time vector ingestion and search
- Metadata filtering and schema management
- Seamless performance even with high-dimensional (3072-dim) vectors
All in all, this project shows how Redis Stack can unlock a new class of AI-powered applications - right from your terminal.
Thanks for reading - and thanks to the Redis team and Dev.to for hosting this challenge! 🚀
Check out the PesudoCLI project on GitHub to try it out or contribute.
The cover image was generated with Gemini, the other images were created with the help of napkin.ai, and the text was polished with AI for grammar corrections.