Building an AI Receipt Analyzer in 72 Hours: GPT-4 Vision + Streamlit
Ajaypartap Singh Maan

Publish Date: Jun 8

🎯 The Challenge

We've all been there – staring at a pile of receipts at month-end, manually typing each item into a spreadsheet. It's tedious, error-prone, and provides zero insights beyond basic math. I decided to solve this with AI.

Goal: Build a production-ready app that transforms receipt photos into structured data and spending insights.

Timeline: 72 hours (weekend hackathon style)

Result: Live application with 90%+ accuracy on clear images


🏗️ Architecture Overview

📱 Streamlit Frontend → 🖼️ Image Processing → 🤖 GPT-4 Vision → 📊 Data Analytics

Tech Stack:

  • Frontend: Streamlit (rapid prototyping champion)
  • AI: OpenAI GPT-4 Vision API
  • Processing: PIL for image handling
  • Deployment: Streamlit Cloud
  • Data: JSON + Session storage

📅 Day 1: Foundation & Core Logic

Setting Up the Project Structure

smart-receipt-analyzer/
├── streamlit_app.py      # Main UI
├── processor.py          # AI logic
├── requirements.txt      # Dependencies
├── .env                 # API keys
└── README.md            # Documentation

The Core Processor Module

The heart of the application lives in processor.py. Here's the key function:

import openai, base64, json, os
from PIL import Image

def analyze_receipt(image_file):
    try:
        # Check image quality first
        quality_info = check_image_quality(image_file)
        image_file.seek(0)

        # Convert to base64 for API
        image_data = base64.b64encode(image_file.read()).decode()
        image_file.seek(0)

        # Crafted prompt for structured output
        prompt = f"""Analyze this receipt (Quality: {quality_info['quality_score']}).
Return JSON: {{"items": [{{"name": "item", "price": 1.99}}], 
"total": 15.99, 
"insights": ["insight1", "tip2", "observation3"], 
"confidence": "high/medium/low"}}"""

        response = openai.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}}
            ]}],
            max_tokens=800
        )

        result = json.loads(response.choices[0].message.content)
        # Add quality metadata
        result.update({
            "quality_score": quality_info['quality_score'], 
            "quality_issues": quality_info['issues']
        })

        return result

    except json.JSONDecodeError:
        return {"items": [{"name": "Parse error", "price": 0}], 
                "total": 0, "insights": ["Invalid AI response"], 
                "confidence": "low"}

Key insights from Day 1:

  • GPT-4 Vision needs explicit JSON format instructions
  • Image quality directly impacts accuracy
  • Error handling is crucial for production apps
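The processor encodes every upload as `image/jpeg` in the data URL, while the uploader also accepts PNG files. A minimal sketch of choosing the MIME type from the file's leading bytes is shown here. `to_data_url` is a hypothetical helper, not part of the original `processor.py`, and it assumes only JPEG and PNG need to be supported.

```python
import base64

# Leading "magic" bytes for the two formats the uploader accepts.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def to_data_url(image_bytes: bytes) -> str:
    """Build a base64 data URL whose MIME type matches the image bytes.

    Hypothetical helper for illustration; not in the original processor.py.
    """
    for signature, mime in _SIGNATURES.items():
        if image_bytes.startswith(signature):
            encoded = base64.b64encode(image_bytes).decode("ascii")
            return f"data:{mime};base64,{encoded}"
    # Anything else is rejected before it reaches the API.
    raise ValueError("Unsupported image format: expected JPEG or PNG")
```

Under these assumptions, the `url` value in the request could be replaced by `to_data_url(image_file.read())`, which also rejects non-image uploads before any API call is made.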

📅 Day 2: UI/UX & Image Quality Assessment

Building the Streamlit Interface

Streamlit's simplicity was perfect for rapid prototyping:

import streamlit as st
from processor import analyze_receipt
from datetime import datetime

# Page setup
st.set_page_config(page_title="AI Receipt Analyzer", page_icon="🧾", layout="wide")

# Main upload interface
uploaded_file = st.file_uploader("Choose receipt image", 
                                type=['jpg', 'png', 'jpeg'])

if uploaded_file:
    col1, col2 = st.columns(2)

    with col1:
        st.subheader("📸 Your Receipt")
        st.image(uploaded_file, use_column_width=True)

    with col2:
        st.subheader("📋 Results")
        with st.spinner("🤖 Analyzing..."):
            results = analyze_receipt(uploaded_file)

        # Display extracted items
        for i, item in enumerate(results['items'], 1):
            st.write(f"{i}. **{item['name']}** - ${item['price']:.2f}")

        st.metric("💰 Total", f"${results['total']:.2f}")

Image Quality Assessment

One major insight: garbage in, garbage out. I built a quality checker:

def check_image_quality(image_file):
    try:
        image = Image.open(image_file)
        width, height = image.size
        file_size = len(image_file.getvalue())

        issues = []
        if width < 800 or height < 600: 
            issues.append("Low resolution")
        if file_size < 100000: 
            issues.append("Small file size")
        if height / width < 1.2: 
            issues.append("Use portrait mode")

        quality = ["Excellent", "Good", "Fair", "Poor"][min(len(issues), 3)]
        return {"quality_score": quality, "issues": issues}
    except Exception:  # unreadable or non-image upload
        return {"quality_score": "Unknown", "issues": ["Analysis failed"]}

This gives users immediate feedback on whether their photo will work well.
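To make the scoring rule concrete, the same thresholds can be written as a pure function that takes the dimensions and file size directly. This is only a sketch: `score_quality` is a hypothetical name, and the thresholds are copied from `check_image_quality` above.

```python
def score_quality(width: int, height: int, file_size: int) -> dict:
    """Apply the same three checks as check_image_quality to raw numbers.

    Hypothetical helper for illustration; each failed check adds one issue,
    and the number of issues selects the label.
    """
    issues = []
    if width < 800 or height < 600:
        issues.append("Low resolution")
    if file_size < 100_000:
        issues.append("Small file size")
    if height / width < 1.2:
        issues.append("Use portrait mode")

    labels = ["Excellent", "Good", "Fair", "Poor"]
    return {"quality_score": labels[min(len(issues), 3)], "issues": issues}


# A tall, large photo passes every check; a small landscape one fails all three.
print(score_quality(1200, 1600, 450_000))  # quality_score: "Excellent"
print(score_quality(640, 480, 50_000))     # quality_score: "Poor"
```

Keeping the rule separate from file handling makes it easy to unit test. Note that a width of 0 would raise `ZeroDivisionError` here, whereas the original function catches such failures and reports "Unknown".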


📅 Day 3: Analytics, Deployment & Polish

Adding Session-Based Analytics

def save_receipt(results):
    if 'receipts' not in st.session_state: 
        st.session_state.receipts = []

    st.session_state.receipts.append({
        "date": datetime.now().strftime("%Y-%m-%d"), 
        **results
    })
    return len(st.session_state.receipts)

def get_stats():
    receipts = st.session_state.get('receipts', [])
    total = sum(r['total'] for r in receipts)
    return len(receipts), total, (total / len(receipts) if receipts else 0)
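Because `save_receipt` stamps each entry with a `date`, the same session list could also feed a per-day breakdown. The sketch below is an assumption about how that might look; `spend_by_day` is a hypothetical function and the sample receipts are made up.

```python
from collections import defaultdict


def spend_by_day(receipts):
    """Sum receipt totals per date string, returned in date order.

    Hypothetical helper; expects dicts shaped like those stored by save_receipt.
    """
    totals = defaultdict(float)
    for receipt in receipts:
        # Treat a missing or null total as zero rather than failing.
        totals[receipt["date"]] += float(receipt.get("total") or 0)
    return dict(sorted(totals.items()))


sample = [
    {"date": "2025-06-07", "total": 10.5},
    {"date": "2025-06-06", "total": 4.0},
    {"date": "2025-06-07", "total": 2.5},
]
print(spend_by_day(sample))  # {'2025-06-06': 4.0, '2025-06-07': 13.0}
```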

Sidebar Dashboard

count, total, avg = get_stats()
with st.sidebar:
    st.header("📊 Your Stats")
    st.metric("Receipts", count)
    st.metric("Total Spent", f"${total:.2f}")
    st.metric("Average", f"${avg:.2f}")

Production Deployment

Streamlit Cloud made deployment trivial:

  1. Push to GitHub
  2. Connect Streamlit Cloud to repo
  3. Add OPENAI_API_KEY to secrets
  4. Deploy! 🚀
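For step 3, one way to keep the same code working locally and on Streamlit Cloud is to look the key up in a secrets mapping first and fall back to the environment. This is a sketch under assumptions: `resolve_api_key` is a hypothetical helper, and the caller decides which mapping to pass in (for example `st.secrets` in the deployed app).

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secrets: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the API key from `secrets`, falling back to environment variables.

    Hypothetical helper; raises instead of letting the app start without a key.
    """
    environ = os.environ if environ is None else environ
    key = secrets.get(name) or environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not configured")
    return key
```

Failing fast at startup gives a clearer error than a failed API call on the first upload. When running locally without a secrets file, passing an empty dict as `secrets` keeps the lookup on the environment variable alone.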

🧠 The AI Prompt Engineering Journey

Getting consistent JSON output from GPT-4 Vision took several iterations:

❌ First attempt (too vague):

"Extract items and prices from this receipt"

Result: Inconsistent formats, narrative responses

❌ Second attempt (better but still inconsistent):

"Return a JSON object with items array containing name and price fields"

Result: Sometimes worked, sometimes returned markdown

✅ Final approach (explicit and structured):

prompt = f"""Analyze this receipt (Quality: {quality_info['quality_score']}).
Return JSON: {{"items": [{{"name": "item", "price": 1.99}}], 
"total": 15.99, 
"insights": ["insight1", "tip2", "observation3"], 
"confidence": "high/medium/low"}}"""

Result: 90%+ consistent JSON output

Key learnings:

  • Provide exact JSON schema examples
  • Include quality context in prompts
  • Add confidence scoring for reliability
  • Error handling is essential
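Since some replies came back wrapped in markdown, a small parsing step can strip a surrounding code fence before calling `json.loads` and check that the expected keys are present. The sketch below is one possible approach, not the original implementation; `parse_receipt_reply` is a hypothetical name, and treating `items` and `total` as the required keys is an assumption.

```python
import json
import re

TICKS = "`" * 3  # markdown code-fence marker
# Matches a reply that is entirely wrapped in one fence, with optional "json" tag.
_FENCED = re.compile(rf"^{TICKS}(?:json)?\s*(.*?)\s*{TICKS}$", re.DOTALL)


def parse_receipt_reply(text: str) -> dict:
    """Parse a model reply into a dict, tolerating a markdown code fence.

    Hypothetical helper; raises ValueError on invalid JSON or missing keys.
    """
    cleaned = text.strip()
    match = _FENCED.match(cleaned)
    if match:
        cleaned = match.group(1)

    data = json.loads(cleaned)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")

    missing = {"items", "total"} - data.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data
```

A caller that catches `ValueError` handles both malformed JSON and incomplete objects, so the same fallback result can be returned in either case.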

📊 Performance Results

After testing 50+ receipts:

Metric          | Result
----------------|----------------------------------
Accuracy        | 90%+ on clear images
Processing Time | 3-7 seconds
Supported Types | Grocery, restaurants, gas, retail
Quality Impact  | Poor images: 60% accuracy

Common failure modes:

  • Handwritten receipts
  • Very faded thermal paper
  • Extreme angles or lighting
  • Screenshots of receipts

💡 Key Technical Insights

1. Image Quality Matters More Than Model Choice

Spending time on quality assessment improved results more than prompt tweaking.

2. Streamlit is Perfect for AI Prototypes

  • Built-in file upload handling
  • Session state management
  • Easy deployment
  • Mobile-responsive by default

3. Error Handling is Critical

Users will upload anything. Plan for:

  • JSON parsing failures
  • API timeouts
  • Invalid image formats
  • Network issues
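For the transient failures in that list (timeouts and network issues), a generic retry wrapper with exponential backoff is one common pattern. The sketch below is illustrative only: `call_with_retries` and its parameters are hypothetical names, and which exception types are worth retrying depends on the client library in use.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on failure with delays of 1x, 2x, 4x base_delay.

    Hypothetical helper; `sleep` is injectable so the wait can be skipped in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
```

In the app, this might wrap the analysis call, for example `call_with_retries(lambda: analyze_receipt(uploaded_file))`, with `retry_on` narrowed to the timeout and connection errors that are actually transient.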

4. User Feedback Drives Accuracy

The quality checker gives users actionable feedback, leading to better inputs.


🚀 Production Considerations

Security

# Environment variables for API keys
openai.api_key = os.getenv("OPENAI_API_KEY")

# Input validation
if uploaded_file.size > 5_000_000:  # 5MB limit
    st.error("File too large")

Performance

# Image compression for faster uploads
if image.size[0] > 1200:
    image.thumbnail((1200, 1200), Image.Resampling.LANCZOS)

User Experience

  • Loading spinners for AI processing
  • Clear error messages
  • Progressive disclosure of features
  • Mobile-first design

🔮 What's Next?

Immediate improvements:

  • [ ] Bulk upload processing
  • [ ] Export to CSV/Excel
  • [ ] Category classification
  • [ ] Budget tracking alerts

Future features:

  • [ ] Mobile app (React Native)
  • [ ] Database persistence
  • [ ] Multi-user support
  • [ ] Integration with accounting software

💼 Business Impact

This 72-hour project demonstrates:

  • Rapid AI prototyping - Idea to production in 3 days
  • Real-world problem solving - 95% faster than manual entry
  • Production deployment - Live app handling real users
  • Scalable architecture - Foundation for enterprise features


🔗 Try It Yourself

🚀 Live Demo: Receipt Analyzer

💻 Source Code: GitHub Repository

🛠️ Run Locally:

git clone https://github.com/AjayMaan13/smart-script-analyzer.git
cd smart-script-analyzer
pip install -r requirements.txt
export OPENAI_API_KEY="your-key"
streamlit run streamlit_app.py

🎯 Key Takeaways

  1. Start with the core problem - Focus on the main use case first
  2. Quality over quantity - Better image quality beats complex prompts
  3. User feedback is crucial - Guide users toward success
  4. Deploy early and often - Get real user feedback quickly
  5. Error handling is not optional - Plan for every failure mode

Building this receipt analyzer was a masterclass in rapid AI development. The combination of GPT-4 Vision's power and Streamlit's simplicity made it possible to go from idea to production in just 72 hours.

What would you build with GPT-4 Vision? Share your ideas in the comments! 👇


🤝 Let's Connect!

Found this helpful? I'd love to connect and hear about your AI projects!

Try the Receipt Analyzer yourself!
