This is a submission for the Bright Data AI Web Access Hackathon
What I Built
An AI agent that turns breaking news + live Reddit reactions into snackable audio summaries.
Problem Solved: Staying informed requires juggling news sites and social pulse-checks. NewsNinja automates this: give it topics, and it quietly scrapes headlines and Reddit threads in real time (yes, it's unbelievable, and it's thanks to Bright Data's MCP), then uses AI to craft a 2-minute audio briefing. No more tab overload.
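To make that flow concrete, here is a minimal sketch of the topic-to-briefing loop. It is not the code from the NewsNinja repo: `Briefing`, `build_briefings`, and the three callables passed in are hypothetical names chosen for illustration, and the text-to-speech step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Briefing:
    topic: str
    script: str  # text that a text-to-speech step would later read aloud


def build_briefings(
    topics: List[str],
    fetch_headlines: Callable[[str], List[str]],
    fetch_reddit_reactions: Callable[[str], List[str]],
    summarize: Callable[[str, List[str], List[str]], str],
) -> List[Briefing]:
    """Run the scrape -> summarize loop once per topic.

    All three callables are placeholders (assumptions): in a real agent they
    would wrap the news scraper, the Reddit scraper, and an LLM call.
    """
    briefings = []
    for topic in topics:
        headlines = fetch_headlines(topic)
        reactions = fetch_reddit_reactions(topic)
        script = summarize(topic, headlines, reactions)
        briefings.append(Briefing(topic=topic, script=script))
    return briefings


if __name__ == "__main__":
    # Stub implementations so the sketch runs without any network access.
    demo = build_briefings(
        ["ai"],
        fetch_headlines=lambda t: [f"Headline about {t}"],
        fetch_reddit_reactions=lambda t: [f"Reddit comment on {t}"],
        summarize=lambda t, h, r: f"{t}: {len(h)} headline(s), {len(r)} reaction(s)",
    )
    print(demo[0].script)
```

Passing the scrapers and the summarizer in as functions keeps the loop easy to test with stubs before wiring in live data.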
Demo
- GitHub Repo: https://github.com/AIwithhassan/newsninja
- My YouTube Channel (I create AI projects): https://www.youtube.com/@AI.with.Hassan/featured
How I Used Bright Data's Infrastructure
- Web Unlocker for News Scraping
- Bright Data MCP Server for Reddit Scraping
Reddit’s anti-bot measures usually make scraping feel like this:
❌ "Are you human?" CAPTCHAs
❌ Shadow-banned IPs
❌ Empty JSON responses
With MCP Server:
✅ Discover: Tracked trending subreddits in real-time
✅ Access: Rotated residential proxies to mimic human behavior
✅ Extract: Parsed awards/upvotes from dynamically loaded comments
✅ Interact: Auto-scrolled infinite scroll pages
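For the news side, a Web Unlocker call can be sketched roughly as below. This is an illustration under assumptions, not the project's actual code: it assumes the Web Unlocker direct API accepts a POST to `https://api.brightdata.com/request` with a bearer token and a JSON body naming a zone and target URL, and the helper names `build_unlocker_request` and `fetch_page`, plus the token and zone values, are placeholders. Check the endpoint and fields against the current Bright Data documentation before relying on them.

```python
import json
import urllib.request

# Assumed endpoint for Bright Data's Web Unlocker direct API.
BRIGHTDATA_ENDPOINT = "https://api.brightdata.com/request"


def build_unlocker_request(target_url: str, api_token: str, zone: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the unlocker to fetch target_url."""
    payload = {"zone": zone, "url": target_url, "format": "raw"}
    return urllib.request.Request(
        BRIGHTDATA_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_page(target_url: str, api_token: str, zone: str, timeout: float = 60.0) -> str:
    """Send the request and return the response body as text (needs a real token and zone)."""
    request = build_unlocker_request(target_url, api_token, zone)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Building the request separately from sending it makes the payload easy to inspect or unit-test without spending API credits.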
Architecture Diagram
Performance Improvements
- Reddit’s Anti-Bot Shields Neutralized
  - Auto-scroll + human-like interaction patterns reduced CAPTCHAs
  - Residential proxies rotated 3000+ IPs during testing with zero bans
- News Paywalls Defeated
  - Web Unlocker bypassed 15+ premium sites at scale
  - Handled multiple concurrent users during load tests
- Most importantly, MCP provides single-prompt access to live data on the internet, which can be scraped at scale!