Building an AI-powered Semantic Email Search in Your Terminal
So, a new quest came out on Quira in which we had to build something using MindsDB Knowledge Bases. As I read through the description, it struck me that a semantic email client would be a great thing to work on. It would tick all the required boxes while also having an actual real-world use case.
And so I made grepmail: a terminal-based CLI app that lets you semantically search your inbox using natural language queries. It uses MindsDB, PGVector, Gemini, and Ollama to turn your mailbox into an AI-searchable database.
It’s like grep for your email, but smart.
Here is the grepmail GitHub repo in case you want to go through the code. The codebase is fairly large, so I won't be able to explain everything in the article itself.
What grepmail Can Do
- Search your email inbox semantically using plain English
- Sync with IMAP mailboxes (Gmail, Outlook, etc.)
- Store vectorized email content in Postgres via PGVector
- Use LLMs (Gemini, Ollama) for embedding and querying
- Display results in a clean, rich terminal UI
⚙️ Tech Stack Overview
Part | Tech |
---|---|
CLI | Typer, Rich |
LLM | Ollama, nomic-embed-text, Gemini |
Email Access | IMAP via MindsDB Email Engine |
Vector DB | PGVector |
Orchestration | MindsDB Knowledge Base |
Packaging | Poetry |
File structure
├── grepmail
│ ├── __init__.py
│ ├── logger.py
│ ├── main.py
│ ├── mindsdb
│ ├── handlers
│ │ ├── common.py
│ │ ├── email.py
│ │ └── __init__.py
│ └── main.py
├── grepmail.log
├── handlers.txt
├── LICENSE
├── poetry.lock
├── pyproject.toml
└── README.md
The main parts to take note of are:
- grepmail/main.py - the CLI
- grepmail/handlers/common.py - functions for some common tasks
- grepmail/handlers/email.py - functions for email-related tasks
Major Components
As I said, I won't be going through the entire code, only the main parts of the system. Connecting them together is fairly straightforward.
1. Email Access with MindsDB Email Engine
You can access your email account with the MindsDB Email integration, which connects to the mailbox over IMAP.
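To make that concrete, here is a minimal sketch of the SQL that registers a mailbox as a MindsDB data source, built from Python so the credentials can come from the environment. The data source name `email_datasource`, the helper name, and the `GREPMAIL_*` variable names are assumptions for illustration, not grepmail's actual code. The `email` and `password` parameters follow the MindsDB Email integration docs as I understand them, so check them against your MindsDB version.

```python
import json
import os


def build_email_datasource_sql(name: str, email: str, password: str) -> str:
    """Return a CREATE DATABASE statement for the MindsDB email engine."""
    # PARAMETERS takes a JSON-style object, so json.dumps handles the quoting.
    params = json.dumps({"email": email, "password": password}, indent=2)
    return (
        f"CREATE DATABASE {name}\n"
        f"WITH ENGINE = 'email',\n"
        f"PARAMETERS = {params};"
    )


# Hypothetical environment variable names; placeholders are used if they are unset.
sql = build_email_datasource_sql(
    "email_datasource",
    os.environ.get("GREPMAIL_EMAIL", "me@example.com"),
    os.environ.get("GREPMAIL_PASSWORD", "app-password"),
)
print(sql)
```

The printed statement can be run in the MindsDB SQL editor. For Gmail accounts with two-factor authentication, the password is typically an app password rather than the account password.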
2. Vectorizing Emails with Ollama
I used nomic-embed-text for generating vector embeddings locally. With Ollama, it was as easy as:
ollama pull nomic-embed-text
And then, when loading emails for the first time, grepmail chunks and vectorizes each one, storing them in the PGVector store.
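As a rough illustration of the chunk-and-vectorize step, the sketch below splits an email body into overlapping character windows and sends each one to a local Ollama server for embedding. The chunk size, overlap, and function names are assumptions, and grepmail may well leave chunking to the knowledge base instead. The endpoint is Ollama's default local `/api/embeddings` route.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama address


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str, model: str = "nomic-embed-text") -> List[float]:
    """Request one embedding vector from a running Ollama server."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]
```

With Ollama running, `[embed(chunk) for chunk in chunk_text(body)]` would produce one vector per chunk, ready to be stored.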
3. Setting Up PGVector for Storage
Emails are heavy, so I wanted a fast and efficient vector store. MindsDB has an integration with PGVector, and I used that to store and search through embeddings efficiently.
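To show what "searching through embeddings" means, here is a brute-force, pure-Python illustration that ranks stored chunks by cosine similarity to a query vector. This is only a conceptual sketch with made-up names and toy vectors. PGVector does the equivalent inside Postgres, using its own distance operators and indexes, which is where the speed comes from.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], stored: Dict[str, Sequence[float]], k: int = 5) -> List[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id) for chunk_id, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy two-dimensional "embeddings"; real ones have hundreds of dimensions.
store = {"flight": [1.0, 0.0], "invoice": [0.0, 1.0], "itinerary": [1.0, 1.0]}
print(top_k([1.0, 0.0], store, k=2))
```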
4. MindsDB Knowledge Base for Semantic Search
Here’s where the magic happens:
- Store email content as vector chunks
- Index them via CREATE INDEX
- Query using natural language like this:
SELECT * FROM email_kb
WHERE content = 'flight confirmation from last week'
LIMIT 5;
That’s literally it.
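Since the search text comes straight from the user in a CLI, it has to be placed into that query safely. The helper below is a hypothetical sketch, not grepmail's code: it builds the same statement as above and doubles any single quotes, assuming MindsDB's SQL parser accepts standard quote doubling inside string literals.

```python
def build_search_query(text: str, limit: int = 5, kb: str = "email_kb") -> str:
    """Build a semantic search statement against a knowledge base."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # Double single quotes so user input cannot terminate the string literal.
    escaped = text.replace("'", "''")
    return f"SELECT * FROM {kb} WHERE content = '{escaped}' LIMIT {limit};"


print(build_search_query("flight confirmation from last week"))
```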
5. Local Caching for Speed
The first time you run grepmail, it loads all your emails into:
- MindsDB’s knowledge base
- A local Postgres DB (so future queries are instant)
This took some engineering because:
- IMAP servers are slow
- Ollama runs one model instance at a time
So I had to skip concurrency for now (sorry 🫠), but the system is built to handle these steps incrementally and linearly.
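One way to picture "incrementally and linearly" is a loop that pulls small batches from the mailbox, ingests one email at a time, and remembers the last id it processed so a later run can resume from there. The sketch below assumes emails carry increasing numeric ids. `fetch_batch` and `ingest` are hypothetical callables standing in for the IMAP fetch and the embed-and-store step.

```python
from typing import Callable, Dict, List

Email = Dict[str, object]


def sync_incrementally(
    fetch_batch: Callable[[int, int], List[Email]],
    ingest: Callable[[Email], None],
    last_seen_id: int = 0,
    batch_size: int = 50,
) -> int:
    """Ingest emails in order, one at a time, and return the new checkpoint id."""
    while True:
        # Ask only for emails newer than the checkpoint, a small batch at a time.
        batch = fetch_batch(last_seen_id, batch_size)
        if not batch:
            return last_seen_id
        for email in batch:
            ingest(email)  # sequential on purpose: one embedding call at a time
            last_seen_id = int(email["id"])
```

Persisting the returned checkpoint between runs means an interrupted first load does not have to start over.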
6. CLI Experience with Typer + Rich
I wanted this tool to feel smooth in the terminal. Typer gave me auto-generated help menus and clean argument parsing. Rich handled beautiful tables, syntax highlighting, and progress bars.
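For a feel of how the two libraries fit together, here is a small sketch of a Typer app with a `search` command that prints a Rich table. The command names, options, and the hard-coded `FAKE_RESULTS` are placeholders of mine, not grepmail's actual interface. A real command would query the knowledge base instead.

```python
import typer
from rich.console import Console
from rich.table import Table

app = typer.Typer(help="Semantic search over your inbox (illustrative sketch).")

# Stand-in results; a real command would fetch these from the knowledge base.
FAKE_RESULTS = [
    {"subject": "Flight confirmation", "sender": "airline@example.com", "score": 0.91},
    {"subject": "Your itinerary", "sender": "travel@example.com", "score": 0.84},
]


@app.command()
def search(
    query: str = typer.Argument(..., help="Natural-language search query."),
    limit: int = typer.Option(5, "--limit", "-n", help="Maximum rows to show."),
) -> None:
    """Show the best-matching emails in a table."""
    table = Table(title=f"Results for: {query}")
    table.add_column("Subject")
    table.add_column("From")
    table.add_column("Score", justify="right")
    for row in FAKE_RESULTS[:limit]:
        table.add_row(row["subject"], row["sender"], f"{row['score']:.2f}")
    Console().print(table)


@app.command()
def sync() -> None:
    """Placeholder for the mailbox sync step."""
    typer.echo("sync is not implemented in this sketch")
```

Calling `app()` at the bottom of the file, or exposing `app` as a script entry point in pyproject.toml, makes this runnable. Typer then generates `--help` output for each command from the type hints and help strings.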
7. Syncing Emails
https://docs.mindsdb.com/generative-ai-tables#generative-ai-tables.
Another important task is syncing emails to the local DB and the knowledge base. MindsDB has a built-in Jobs feature, which let me fetch new email from the server every hour without any deep technical work.
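A job of that kind might look roughly like the statement built below. Treat it as a sketch under assumptions: the job name, the `email_datasource.emails` table, and the `id`, `subject`, and `body` columns are illustrative. The `LAST` keyword and the `EVERY` clause follow my reading of the MindsDB Jobs docs, so verify them against your version.

```python
def build_sync_job_sql(
    job: str = "sync_emails",
    kb: str = "email_kb",
    source: str = "email_datasource.emails",
    every_hours: int = 1,
) -> str:
    """Return a CREATE JOB statement that inserts new emails into a knowledge base."""
    if not isinstance(every_hours, int) or every_hours < 1:
        raise ValueError("every_hours must be a positive integer")
    unit = "hour" if every_hours == 1 else "hours"
    return (
        f"CREATE JOB {job} (\n"
        f"    INSERT INTO {kb}\n"
        f"    SELECT id, subject, body\n"  # assumed column names
        f"    FROM {source}\n"
        f"    WHERE id > LAST\n"  # LAST: only rows newer than the previous run
        f")\n"
        f"EVERY {every_hours} {unit};"
    )


print(build_sync_job_sql())
```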
8. Summarizing Emails
Another nice touch I added was an AI table / model that summarizes my email content into a few lines ;))
That's it from my side. The architecture is pretty simple, and by going through the MindsDB docs you will be able to replicate this kind of application yourself.