In recent months, I've been exploring ways to teach people how to build real products using artificial intelligence. Instead of talking about AI in theoretical terms, I decided to create a practical experience that shows how AI can enhance everyday tasks and turn simple ideas into powerful solutions.
That's how this project was born: a Chrome extension with a smart side panel, capable of understanding any web page, generating summaries, answering questions, and simplifying content — all in real-time, with the help of Bedrock on AWS using the Nova model.
In this article, I share the technical architecture, the decisions I made, the challenges I faced, and how you can replicate (or evolve) this idea.
The Problem: Too much information, too little time
The internet is full of useful content: technical articles, tutorials, papers, news. But the volume of information is massive, and time to digest it all is scarce.
Sometimes we just want to quickly grasp the main point of an article. Other times, we want to translate or simplify technical text. And almost always, we have questions about what we just read.
Despite the popularity of ChatGPT, using AI still requires stepping out of your reading flow, copying text, writing prompts, and pasting content back. It's an experience that interrupts more than it helps.
My hypothesis was simple: what if AI came to you? What if, while browsing any site, you could open a side panel and ask:
- "Summarize this text for me?"
- "Explain this part in simpler terms?"
- "Translate this content into my language?"
With this idea, I began sketching the project.
The Idea: a reading assistant in your browser
The product was born with a clear premise: to help anyone interact with the content of a web page directly in the browser, with the support of a powerful and easy-to-use AI.
To achieve this, I needed three key pillars:
- Context: The AI needs to know what’s on the current page.
- Command: The user needs to ask a question or trigger an action.
- Intelligent Response: The AI must reply in a helpful way, in the user's preferred language.
The result is a Chrome extension that, when activated, opens a side panel where users can ask free-form questions or use quick buttons like "Summarize page," "Extract data," or "Simplify language." The response appears right there, smoothly and in context.
Technical Architecture: simple, modular, and scalable
The project is divided into two repositories:
- Frontend: the extension and side panel code.
- Backend: a lightweight API that orchestrates prompts and communicates with the language model.
The extension is built with React and TailwindCSS, using Chrome's native APIs for side panels, content scripts, and message passing.
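For reference, the side-panel wiring in a Manifest V3 extension looks roughly like this (names and file paths here are illustrative, not the project's actual manifest):

```json
{
  "manifest_version": 3,
  "name": "AI Reading Assistant",
  "version": "1.0.0",
  "permissions": ["sidePanel", "activeTab", "scripting"],
  "side_panel": { "default_path": "sidepanel.html" },
  "action": { "default_title": "Open AI panel" },
  "background": { "service_worker": "background.js" }
}
```

In the service worker, calling `chrome.sidePanel.setPanelBehavior({ openPanelOnActionClick: true })` makes the panel open directly when the user clicks the extension icon.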
When the user clicks the extension icon, the side panel opens. At that moment, a script extracts the visible content from the current tab — mainly `<main>`, `<article>`, or `body.innerText` — filtering out menus, sidebars, and other noise.
That text is then sent to the backend, which builds a custom prompt based on the selected action and the user's language, and sends it to the model (in this case, Amazon’s Nova via Bedrock). The response is returned and displayed directly in the panel.
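A content-script sketch of that extraction step could look like the following. The endpoint URL and payload shape are assumptions for illustration, not the project's actual API:

```javascript
// Pure helper: collapse whitespace and drop very short lines, which tend to be
// menu items, buttons, and other navigation noise rather than article text.
function cleanExtractedText(raw) {
  return raw
    .split('\n')
    .map((line) => line.replace(/\s+/g, ' ').trim())
    .filter((line) => line.length > 30) // heuristic threshold, tune as needed
    .join('\n');
}

// Browser-only part: prefer <main> or <article>, fall back to the whole body.
function extractPageText(doc) {
  const root =
    doc.querySelector('main') || doc.querySelector('article') || doc.body;
  return cleanExtractedText(root.innerText);
}

// Send the text plus the chosen action and language to the backend.
// The URL is a placeholder for wherever the ALB endpoint lives.
async function askAssistant(action, language, pageText) {
  const res = await fetch('https://api.example.com/ask', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ action, language, pageText }),
  });
  return res.text();
}
```

The length-based filter is deliberately crude; it trades a little recall for a much cleaner prompt.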
The backend was implemented with Node.js and Express.js, deployed with Serverless Framework v3 to AWS ECS Fargate, and exposed through an Application Load Balancer. I chose a container-based deployment rather than Lambda behind API Gateway because the Bedrock interaction returns a data stream and can exceed API Gateway's 30-second timeout.
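The core of that streaming path can be sketched as follows: each chunk from the model is flushed to the client as soon as it arrives, which is exactly what API Gateway's 30-second limit makes awkward. The generator here is a stand-in for the Bedrock runtime client's streaming API, and all names are illustrative:

```javascript
// Pipe an async iterable of text chunks into an HTTP response object,
// flushing each chunk immediately instead of buffering the full answer.
async function streamToResponse(chunks, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });
  for await (const chunk of chunks) {
    res.write(chunk); // forward each model chunk as soon as it arrives
  }
  res.end();
}

// Stand-in for the Bedrock stream, purely for illustration. In the real
// backend this would iterate over the model's streaming response events.
async function* fakeModelStream() {
  yield 'Here is ';
  yield 'a streamed ';
  yield 'summary.';
}
```

Because the container stays alive behind the ALB, a response can take as long as the model needs without hitting a gateway timeout.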
Why AI makes sense here
The choice to use a generative model like Nova was not random. Here are three main reasons:
AI understands language better than any handcrafted logic — Summarizing a technical text, rewriting it informally, or answering open-ended questions requires context and nuance, something that regex or traditional scripts can't handle well. LLMs solve this brilliantly.
Generalization is key — This product isn't made for one type of content. It needs to work with blogs, news, documentation, scientific articles, content in English, Portuguese, Spanish... A general-purpose LLM like Nova generalizes very well without manual tuning.
Real-time with clear value — The experience is immediate. You see content, click "Summarize," and in seconds you have a helpful response. It's not a generic AI — it's useful because it's contextual.
Prompting and Language Control
On the backend, each predefined action is transformed into a custom prompt. The user's selected language is always included explicitly at the beginning of the prompt to ensure the model responds correctly.
This approach provides coherent and localized answers without needing to train additional models or implement separate translation logic.
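A minimal sketch of that prompt builder is below. The action names and template wording are assumptions for illustration; the real prompts live in the backend repo:

```javascript
// Each quick action maps to an instruction template. The user's language is
// stated first, as described above, so the model answers in it.
const ACTION_TEMPLATES = {
  summarize: 'Summarize the following page content in a few short paragraphs.',
  simplify: 'Rewrite the following page content in plain, simple language.',
  extract: 'List the key facts and data points from the following page content.',
};

function buildPrompt(action, language, pageText) {
  const instruction = ACTION_TEMPLATES[action];
  if (!instruction) {
    throw new Error(`Unknown action: ${action}`);
  }
  return [
    `Respond only in ${language}.`, // language constraint goes first
    instruction,
    '--- PAGE CONTENT ---',
    pageText,
  ].join('\n');
}
```

Keeping the templates in one map also makes it trivial to add a new quick button: one entry here, one button in the panel.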
Optimizing token usage
Sending an entire page to an LLM that charges per token is impractical — both technically and financially. So I implemented a smart extraction and truncation strategy:
- Filter out irrelevant blocks (menus, footers, scripts).
- Prioritize `<article>`, `<main>`, and visible content.
- Limit the text to a maximum of 4,000 tokens per request (this limit is self-imposed, but customizable).
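The truncation step can be sketched like this, assuming the rough heuristic of about 4 characters per token (the real tokenizer is model-specific, so treat this as an approximation rather than Nova's actual token count):

```javascript
const MAX_TOKENS = 4000;       // self-imposed budget per request
const CHARS_PER_TOKEN = 4;     // crude average for English-like text

function truncateToTokenBudget(text, maxTokens = MAX_TOKENS) {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  // Cut at the last whitespace before the limit so we don't split a word.
  const slice = text.slice(0, maxChars);
  const lastSpace = slice.lastIndexOf(' ');
  return lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
}
```

A character-based estimate over-truncates slightly on some pages, but it keeps the backend free of a tokenizer dependency, which felt like the right trade-off for an MVP.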
In future versions, I plan to use embeddings to semantically compress content before sending it to the model (you can contribute to this — see the GitHub repo at the end).
Results and next steps
With this simple structure, I validated a functional MVP that meets the goal: make AI useful within the web reading flow. Users can get answers and content transformations with a click, in real time, without leaving the page.
Next possible steps:
- Interaction history per page
- PDF and Google Docs analysis
- User-customized prompts
- Shareable summary links
What I learned building this
I developed this project using a mindset I've been refining to build AI-powered products in a robust and productive way. I systematize each stage with heavy use of AI combined with my knowledge of product and development. With this, I can create solutions like this in minutes — maybe hours, but never days.
This project also taught me a lot about applying AI in a practical and user-centered way. Here are three important lessons:
- AI doesn’t need to be complex to be useful — The value lies in the integrated and fluid experience, not in technical sophistication.
- Prompt engineering is as important as code — A good prompt is worth more than a thousand rules.
- AI products need to be opinionated — Too many options can confuse users. Creating clear buttons like "Summarize" and "Simplify" guides the experience.
Conclusion
This project is a great example of how generative AI can be applied with purpose in simple and useful products. It shows that with a good idea, some code, and thoughtful decisions, it's possible to transform web browsing into a smarter, more accessible experience.
If you want to use this as the basis for a workshop, hackathon, or your own product, the repositories are open. Happy building — and happy prompting!
- Frontend & Extension: https://github.com/epiresdasilva/ai-chrome-sidepanel-frontend
- Backend: https://github.com/epiresdasilva/ai-chrome-sidepanel-backend
About Me
This is Evandro Pires. I'm a husband and father of two, as well as an AWS Serverless Hero, Serverless Guru Ambassador, entrepreneur, CTO, podcaster, and speaker.
Cut costs and boost innovation by building a serverless-first mindset with sls.guru
Join our team and help transform the digital landscape for companies worldwide!