Is Llama 4 really a failure or disaster?
Ahamed Musthafa R S (amrs-tech)

Publish Date: Apr 19

A few days ago, Meta released Llama 4 in three variants: Behemoth, Maverick, and Scout. Groq has Llama 4 Scout in its model hub, and it can be easily accessed through the playground as well as the API service. There are several discussions around the model's metrics. Some say Llama 4 failed to deliver what people expected (that it did not get any better than Llama 3); the good thing is I did not set any expectations :)

Let us look at a basic tutorial on creating a web application that takes an image as input from the user and tries to generate table data from it (if the image is a chart, a graph, or anything equivalent to tabular data).

Image generated by AI (Flux.1-Schnell)

Application Code

I liked the name TabWiz :), but it is not a necessity. The app code is explained briefly in the sections below.

app.py

import os
from flask import Flask, render_template, request, jsonify, url_for, send_file
from werkzeug.utils import secure_filename
from groq import Groq
import base64
import pandas as pd
from io import StringIO

First, the libraries required for the web server, tabular data processing, and access to the Llama 4 model via the Groq library are imported.

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

The function above encodes an image file as base64 so the image can be embedded in the request sent to our language model.
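As a quick sanity check, the encoded string can be wrapped in a data URL, which is the same format the /chat endpoint builds later. A minimal sketch, assuming a local file named sample.png exists:

# Hypothetical check: build the data URL the model will receive
encoded = encode_image("sample.png")  # sample.png is an assumed local file
data_url = f"data:image/png;base64,{encoded}"
print(data_url[:40])  # prints something like: data:image/png;base64,iVBORw0K...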

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'static/uploads'
os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)

The static/uploads folder is created to store files uploaded through the web form, and its path is added to the application config.

# Read the key from the environment instead of hard-coding it
client = Groq(api_key=os.environ.get('GROQ_API_KEY'))

Then, the Groq API client object is created with the api_key (generated from the Groq console). Rather than hard-coding the key, it is read from the GROQ_API_KEY environment variable. You can create a developer account on the Groq platform and explore the console (https://console.groq.com) to create a free API key.

@app.route('/')
def index():
    return render_template('index.html')

Then, the home/root endpoint is created, rendering the HTML template that displays the file upload form.
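The template itself is not included in this post. A minimal sketch of templates/index.html, assuming a plain form that posts the image to the /chat endpoint, could look like this:

<!-- templates/index.html: assumed minimal upload form, not from the original app -->
<!DOCTYPE html>
<html>
  <head><title>TabWiz</title></head>
  <body>
    <h1>TabWiz</h1>
    <!-- enctype multipart/form-data is required for Flask to see the file in request.files -->
    <form action="/chat" method="post" enctype="multipart/form-data">
      <input type="file" name="image" accept="image/*" required>
      <button type="submit">Generate table</button>
    </form>
  </body>
</html>

The field name image matches what the /chat endpoint reads with request.files.get('image').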

@app.route('/chat', methods=['POST'])
def chat():
    user_message = request.form.get('message')  # optional text, unused for now
    image_file = request.files.get('image')
    if not image_file:
        return jsonify({'error': 'No image uploaded'}), 400

    # Save the upload and remember its extension for the data URL
    filename = secure_filename(image_file.filename)
    filepath = os.path.join(app.config['UPLOAD_FOLDER'], filename)
    image_file.save(filepath)
    # Public URL of the saved image (handy for debugging or display)
    image_url = url_for('static', filename=f'uploads/{filename}', _external=True)
    file_ext = filename.split('.')[-1]

    messages = [
        {
            "role": "system",
            "content": "You are an expert at parsing the given image. Extract all details very accurately and return CSV data sufficient to build a table. Do not return any text other than the CSV."
        },
        {
            "role": "user",
            "content": [
                # An optional text prompt could be sent alongside the image:
                # {"type": "text", "text": user_message},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/{file_ext};base64,{encode_image(filepath)}"
                    }
                }
            ]
        }
    ]

    completion = client.chat.completions.create(
        model="meta-llama/llama-4-scout-17b-16e-instruct",
        messages=messages,
        temperature=0.7,
        max_completion_tokens=1024,
        top_p=1,
        stream=False,
        stop=None,
    )

    response_text = completion.choices[0].message.content

    # Create a pandas DataFrame from the raw CSV content
    df = pd.read_csv(StringIO(response_text))
    # Save the DataFrame to a CSV file and return it as a download
    df.to_csv('output.csv', index=False)
    return send_file('output.csv', as_attachment=True)

Then, the /chat endpoint is created with only the POST method allowed. It is called when the file upload form is submitted: the image file is encoded as base64 data, which in turn is passed as a message to the llama-4-scout model. The output from the model is expected to contain only CSV data, which is written to a file and returned as a download.
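With the server running locally (Flask's development server defaults to http://127.0.0.1:5000), the endpoint can also be tested outside the browser. A minimal sketch using the requests library, assuming a local chart image named chart.png:

import requests

# Hypothetical test client for the /chat endpoint; chart.png is an assumed local file
with open('chart.png', 'rb') as f:
    resp = requests.post('http://127.0.0.1:5000/chat', files={'image': f})

# The endpoint returns the generated CSV as a file attachment
with open('downloaded.csv', 'wb') as out:
    out.write(resp.content)
print(resp.status_code)  # 200 on success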

if __name__ == '__main__':
    app.run(debug=True)

That’s it. The code above is added at the end so the script runs a development server when executed in a shell. Make sure the required libraries (flask, groq, and pandas) are installed in the environment. The server can then be started from the shell with python app.py.
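For reference, a typical setup in a Unix-like shell might look like the following (package names are the standard PyPI ones, and GROQ_API_KEY is the environment variable the client reads):

pip install flask groq pandas
export GROQ_API_KEY="your-key-from-the-groq-console"
python app.py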

Image by Author (Input upload for app)

The image above shows the sample input file that I uploaded; the model successfully processed it and generated the CSV data as expected.

Image by Author (Output table)

The table above shows the CSV data generated by the model. It is pretty good and was generated without any mistakes. I am not sure whether the model really is a failure or a disaster to people, but for an open model, its multimodal capabilities work just fine.

Happy Learning!
