Running AI Models with Docker Compose
Pradumna Saraf (@pradumnasaraf)

Publish Date: Aug 19

Docker Compose has completely changed the game in how we run and connect multi-service applications. Execute a single command, and everything is up and running, with all the services interconnected.

When Docker introduced the Docker Model Runner (or DMR, as we call it internally at Docker), there was a missing piece (at least for me). To use an AI model with a Compose application, we had to run the model separately with DMR and then connect our Compose application's services by passing in the config of that running model.

But Docker knew this, and it sorted it out by adding the capability to describe an AI model directly in compose.yml, so the model is started and torn down on demand. Just as we write configuration for services, networks, and volumes, we can do the same for AI models with the models element.

Prerequisites

  • Docker and Docker Compose installed
  • A basic understanding of AI and LLMs

Getting Started

Let's get started. To build a better understanding of the concept and how it works, I have created a GitHub project: Pradumnasaraf/Saraf-AI (yes, it's my last name "Saraf" with "AI" added to it :)). It's a Next.js chat application that communicates with the Docker AI model with the help of the OpenAI SDK. You can clone it and keep it ready; we will be referencing it many times.
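To follow along, clone the repo and change into it (assuming the standard GitHub URL for the Pradumnasaraf/Saraf-AI project):

git clone https://github.com/Pradumnasaraf/Saraf-AI.git
cd Saraf-AI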

Docker Compose models element

First, let's have a look at the compose.yml. Alongside the services, volumes, and other elements we are familiar with, we have defined models as a top-level element. This is the new element for defining AI models.

So what we have done is define a service named saraf-ai that uses the model llm, and, under the top-level models element, a model definition for llm that references the ai/smollm2 model image.

The complete config can be found in compose.yml in the root of the repo.

# compose.yml
services:
  saraf-ai:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - 3000:3000
    # Models to run
    models:
      - llm

models:
  # Model Name
  llm:
    # Model Image
    model: ai/smollm2

Now we understand how the config looks, but how can our app connect to and communicate with this AI model? How do we set up environment variables like the model name, URL, and API key, given that we will be using the OpenAI specification?

This is where Docker shines!

When we configure a service to use a model, Docker auto-generates and injects two environment variables into our service based on the model name (in our case, llm). The two variables are:

  • LLM_MODEL: Contains the model name.

  • LLM_URL: Contains the model endpoint to communicate with.

Now we can reference these in our application and use them. If that sounds confusing, the Docker Compose documentation on models covers them in more detail.
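For instance, inside the service container we can read them like any other environment variables. A minimal sketch (the value in the comment is illustrative):

// Docker Compose derives these names from the model key "llm"
const modelName = process.env.LLM_MODEL; // e.g. "ai/smollm2"
const baseURL = process.env.LLM_URL; // OpenAI-compatible endpoint of the running model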

Also, if we are using multiple AI models, we can explicitly define how the variables should be named. For example, below we define two models and set custom variable names with endpoint_var and model_var.

services:
  app:
    image: my-app
    models:
      llm:
        endpoint_var: AI_MODEL_URL
        model_var: AI_MODEL_NAME
      embedding-model:
        endpoint_var: EMBEDDING_URL
        model_var: EMBEDDING_NAME

models:
  llm:
    model: ai/smollm2
  embedding-model:
    model: ai/all-minilm

Now, instead of the default LLM_MODEL and LLM_URL, the application will be injected with AI_MODEL_URL and AI_MODEL_NAME. And for embedding-model, it will inject EMBEDDING_URL and EMBEDDING_NAME.

Now, let’s look at our Next.js application.

Application config

We have created a Next.js application and are using the OpenAI SDK (whose API is the industry standard) to communicate with the Docker AI model. It will automatically pick up the environment variables that Docker injected into the application.

We don't need a real apiKey, as this is a local model rather than a cloud LLM with API keys and quotas; the SDK still requires the field, so we pass a placeholder value.

Below is the complete code. You will also find the complete code in the src/app/api/chat/route.ts file.

import OpenAI from 'openai';
import { NextResponse } from 'next/server';

const openai = new OpenAI({
  baseURL: process.env.LLM_URL || '',
  apiKey: "key-not-needed"
});

const model = process.env.LLM_MODEL || '';

export async function POST(req: Request) {
  try {
    const { message, messages } = await req.json();

    // Validate input
    if (!message || typeof message !== 'string') {
      return NextResponse.json(
        { error: 'Message is required and must be a string' },
        { status: 400 }
      );
    }

    if (!Array.isArray(messages)) {
      return NextResponse.json(
        { error: 'Messages must be an array' },
        { status: 400 }
      );
    }

    const stream = await openai.chat.completions.create({
      messages: [...messages, { role: 'user', content: message }],
      model,
      stream: true,
      temperature: 0.7,
      max_tokens: 2000,
    });

    return new Response(
      new ReadableStream({
        async start(controller) {
          try {
            for await (const chunk of stream) {
              const text = chunk.choices[0]?.delta?.content || '';
              if (text) {
                controller.enqueue(new TextEncoder().encode(text));
              }
            }
          } catch (streamError) {
            console.error('Streaming error:', streamError);
            controller.error(streamError);
          } finally {
            controller.close();
          }
        },
      }),
      {
        headers: {
          'Content-Type': 'text/plain; charset=utf-8',
          'Cache-Control': 'no-cache',
          'Connection': 'keep-alive',
        },
      }
    );
  } catch (error: unknown) {
    console.error('OpenAI API error:', error);

    const errorMessage = error instanceof Error ? error.message : 'Unknown error';
    const errorStatus = (error as { status?: number })?.status || 500;

    return NextResponse.json(
      { 
        error: 'Failed to get response from AI',
        details: errorMessage,
      },
      { status: errorStatus }
    );
  }
}
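For context, here is how a client might call this route and consume the streamed plain-text response. This is a hypothetical sketch, not the project's actual UI code:

// Hypothetical client-side helper for the /api/chat route above
async function askModel(message: string): Promise<string> {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message, messages: [] }),
  });
  if (!res.ok || !res.body) throw new Error('Request failed');

  // The route streams raw text chunks, so decode them as they arrive
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let reply = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    reply += decoder.decode(value, { stream: true });
  }
  return reply;
}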

Dockerizing the application

Now, let’s Dockerize our application. For that, we will create a Dockerfile.

You will find the Dockerfile in the root of the project.

# Build stage
FROM node:24-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./

# Install all dependencies (dev dependencies are needed for the Next.js build)
RUN npm ci

# Copy source code
COPY . .

# Build the application
RUN npm run build

# Production stage
FROM node:24-alpine AS runner

WORKDIR /app

# Create a non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs

# Copy built application from builder stage
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static

# Set ownership to nextjs user
RUN chown -R nextjs:nodejs /app

USER nextjs

EXPOSE 3000

ENV PORT=3000
ENV HOSTNAME="0.0.0.0"

CMD ["node", "server.js"]

We have implemented a couple of best practices, such as multi-stage builds and a non-root user, to make the container image smaller, faster, and more secure.
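One thing to note: the COPY --from=builder /app/.next/standalone line only works when Next.js's standalone output is enabled. The project presumably already configures this; here is a minimal sketch of what that looks like, assuming a next.config.ts file:

// next.config.ts (hypothetical file name; the project may use next.config.js)
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  // Produces the self-contained .next/standalone server the runner stage copies
  output: 'standalone',
};

export default nextConfig;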

Once we are done with that, let's run the Compose application by executing the docker compose up command in the terminal. You will see output similar to the screenshot below.
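For reference, here is the command; I am adding the optional --build flag so the image is (re)built on the first run:

docker compose up --build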

[Screenshot: terminal output of docker compose up]

Now, we can head over to localhost:3000 in our browser and test out the application. You will get a ChatGPT-like chat window; type your prompt and ask questions.
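You can also exercise the chat API route directly from the terminal. Something like this should stream the model's reply (the -N flag stops curl from buffering output, so chunks print as they arrive):

curl -N -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello! What can you do?", "messages": []}'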

Here is a short demo:

[Demo: chatting with the Saraf-AI application in the browser]

That was it. That's how you can run AI models with Docker Compose.

As always, I'm glad you made it to the end. Thank you for your support and for reading. I regularly share tips on Twitter (it will always be Twitter ;)). You can connect with me there.

Comments

  • Hritik Raj (Aug 19, 2025): Definitely planning to try this out

  • Parag Nandy Roy (Aug 19, 2025): This is dev productivity gold...

  • Roshan Sharma (Aug 19, 2025): Awesome article, excited to try this out

  • Juan Perez prueba (Aug 19, 2025): It's a super interesting post.

  • Ava Nichols (Aug 19, 2025): Thanks for this

    • Pradumna Saraf (Aug 19, 2025): Thank you, Ava


  • Yalda Khoshpey (Aug 19, 2025): it's amazing

    • Pradumna Saraf (Aug 20, 2025): Thank you, Yalda

      • Yalda Khoshpey (Aug 20, 2025): 🥰


  • Nube Colectiva (Aug 20, 2025): How interesting, thanks 👌🏼. A question: are large hardware resources such as RAM, GPU, etc. needed for local use on the PC?

  • David Thomas (Aug 21, 2025): Will try this out

  • jim ross (Aug 21, 2025): awesome

    • Pradumna Saraf (Aug 22, 2025): Thank you!


  • Onurcan1 (Aug 23, 2025): good
