🧠 How Docker Model Runner Can Accelerate Your Transition

Cláudio Filipe Lima Rapôso (@sertaoseracloud)

About: At NTT DATA Europe & Latam, my role as a Systems Architect harnesses the power of TypeScript, Java, and Python to create robust and scalable solutions.

Publish Date: May 31

🚀 What is Docker Model Runner?

Docker Model Runner is a feature introduced in Docker Desktop (starting with version 4.40) that allows you to run AI models locally using the docker model command. It integrates directly with the Docker ecosystem, offering a streamlined experience for developers who want to test and iterate AI models in their own environments.
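Day-to-day usage mirrors the familiar Docker image workflow. A quick sketch of the basic subcommands (the model name is illustrative, and exact subcommand availability depends on your Docker Desktop version):

```shell
# Pull a model from Docker Hub's ai/ namespace (name illustrative)
docker model pull ai/llama3.2:3B-Q4_K_M

# List the models available locally
docker model list

# Run a one-off prompt against the model from the terminal
docker model run ai/llama3.2:3B-Q4_K_M "What is the Doppler effect?"
```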


🛠️ How Docker Model Runner Can Help with Solution Decisions

1. Local Testing of AI Models

With Docker Model Runner, you can run language models locally, allowing developers to test and validate AI solutions without relying on external services. This is especially useful for evaluating model behavior in different scenarios and fine-tuning parameters as needed.

2. Integration with Existing Workflows

The tool integrates seamlessly with Docker Compose, enabling AI models to run as part of multi-container applications. This makes it easier to create consistent and reproducible development environments where different parts of the application can interact with AI models in a coordinated way.
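As an illustration, here is a minimal Compose sketch. It assumes a recent Docker Compose version that supports the `models` top-level element; the exact schema may differ in your version, and the service and model names are hypothetical:

```yaml
# docker-compose.yml — sketch, assuming Compose support for
# the `models` top-level element (schema may vary by version)
services:
  app:
    image: my-app:latest   # hypothetical application image
    models:
      - llm                # attaches the model; the endpoint is exposed to the service

models:
  llm:
    model: ai/llama3.2:3B-Q4_K_M
```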

3. OpenAI-Compatible API

Docker Model Runner exposes endpoints compatible with the OpenAI API, making it easy to adapt existing applications that use this API to interact with locally run models. This gives you the flexibility to choose between local execution and cloud services depending on your project’s specific needs.
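Concretely, the endpoint speaks the standard chat-completions wire format. A minimal sketch of the request and response shapes (the JSON values are illustrative; the endpoint path assumes host-side TCP support enabled on port 12434, as configured later in this post):

```python
# Illustrative request body for the OpenAI-compatible endpoint that
# Docker Model Runner exposes, assumed to live at
# http://localhost:12434/engines/v1/chat/completions
request_body = {
    "model": "ai/llama3.2:3B-Q4_K_M",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}

# An OpenAI-style response nests the generated text under
# choices[0].message.content; the values below are illustrative.
response_body = {
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}]
}

answer = response_body["choices"][0]["message"]["content"]
print(answer)  # Hello!
```

Because this shape matches the OpenAI API, existing client code usually only needs its base URL changed to point at the local runner.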

4. On-Demand Execution and Efficient Resource Management

Models are loaded into memory only when needed and unloaded after a period of inactivity, optimizing system resource usage. This allows developers to run AI models on local machines without significantly impacting overall system performance.

🧩 In Summary: When Does It Make Sense to Use It?

  • Testing and prototyping LLMs without paying for an external API
  • Working offline (great for events, workshops, security, etc.)
  • Integrating AI into your local workflow via Docker Compose
  • Exploring models as if they were Docker images—but for AI

🔧 Integrating Docker Model Runner with LangChain

1. Prerequisites

  • Docker Desktop 4.40+ (macOS, Windows, or Linux)
  • Python 3.8+
  • pip install langchain_openai

💡 On Windows, GPU acceleration requires an NVIDIA GPU with CUDA. On macOS, it requires Apple Silicon (M1, M2, etc.). On Linux, support is still in preview and is available through the docker model CLI only, not yet in the Docker Desktop interface.

2. Enable Docker Model Runner

In Docker Desktop:

  1. Open Docker Desktop.
  2. Go to Settings.
  3. Click the Features in development tab.
  4. Toggle on Enable Docker Model Runner.
  5. Also toggle on Enable host-side TCP support and set the port to 12434.
  6. Click Apply & Restart.

Or via command line:

docker desktop enable model-runner --tcp 12434

3. Pull a Model

Use the docker model pull command to download an available model, for example:

docker model pull ai/llama3.2:3B-Q4_K_M

4. Install LangChain

Install the langchain_openai package using pip:

pip install langchain_openai

5. Integrate LangChain with Docker Model Runner

Create a Python script to interact with the local model:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="ai/llama3.2:3B-Q4_K_M",
    base_url="http://localhost:12434/engines/v1",  # Model Runner's OpenAI-compatible endpoint
    api_key="ignored",  # required by the client, but not checked locally
)

# invoke() returns an AIMessage; .content holds the generated text
response = llm.invoke("Explain the Doppler effect in simple terms.")
print(response.content)

This script configures LangChain to communicate with Docker Model Runner through an OpenAI-compatible API.
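Since the endpoint is OpenAI-compatible, LangChain's streaming interface also works against the local model. A sketch (assumes the model has been pulled and Model Runner is listening on port 12434):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="ai/llama3.2:3B-Q4_K_M",
    base_url="http://localhost:12434/engines/v1",
    api_key="ignored",
)

# stream() yields message chunks as the local model generates them,
# which makes the latency of small local models easy to judge.
for chunk in llm.stream("Explain the Doppler effect in simple terms."):
    print(chunk.content, end="", flush=True)
print()
```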

6. Run the Script

Save the above script to a file (e.g., test_model.py) and run it:

python test_model.py

You’ll see the model’s response directly in the terminal.
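From here, the local model drops into larger LangChain pipelines just like a cloud-hosted one. A sketch of a simple prompt-template chain using LCEL pipe syntax (the prompt text and input are illustrative; requires Model Runner to be running as configured above):

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="ai/llama3.2:3B-Q4_K_M",
    base_url="http://localhost:12434/engines/v1",
    api_key="ignored",
)

prompt = ChatPromptTemplate.from_template(
    "Summarize the following in one sentence:\n\n{text}"
)

# LCEL pipe syntax: the rendered prompt feeds the local model
chain = prompt | llm
result = chain.invoke({"text": "Docker Model Runner serves AI models locally."})
print(result.content)
```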


Using Docker Model Runner together with LangChain enables developers to test and integrate language models locally, making it easier to prototype and build AI-based solutions without relying on cloud services.


Liked It?

Want to chat about AI, cloud, and architecture? Follow me on social media.

I post technical content straight from the trenches. And when I find a tool that saves time and gets the job done—like this one—you’ll hear about it too.
