Docker Model Runner: Simplifying Local LLM Model Execution
Originally published at ssojet
Currently in preview with Docker Desktop 4.40 for macOS on Apple Silicon, Docker Model Runner allows developers to run models locally and iterate on application code without disrupting their container-based workflows. Using local LLMs for development offers benefits such as lower costs, improved data privacy, reduced network latency, and greater control over the model. Docker Model Runner addresses several pain points for developers integrating LLMs into containerized applications, such as dealing with different tools and managing models outside their containers.
Key Features of Docker Model Runner
Docker Model Runner includes an inference engine built on top of llama.cpp and exposed through an OpenAI-compatible API. Rather than running models inside a virtual machine, Model Runner uses host-based execution: models run directly on Apple Silicon and take advantage of GPU acceleration, avoiding VM performance overhead. For model distribution, Docker uses the OCI standard, which aims to unify container and model workflows. Developers can pull models easily from Docker Hub, and will soon be able to push their own models and integrate with any container registry.
Developers can use the docker model command to manage models much as they manage containers. For example, docker model pull ai/smollm2:360M-Q4_K_M downloads a model, and docker model run ai/smollm2:360M-Q4_K_M "Give me a fact about whales." runs a one-off prompt against it, all without creating a container.
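For scripted workflows, the same CLI can be driven from Python. A minimal sketch, reusing the model tag and prompt from the example above; the model_run_cmd helper and the shutil.which guard are illustrative additions, not part of the docker model tooling:

```python
import shutil
import subprocess

def model_run_cmd(model: str, prompt: str) -> list[str]:
    # Build the argv for: docker model run <model> "<prompt>"
    return ["docker", "model", "run", model, prompt]

cmd = model_run_cmd("ai/smollm2:360M-Q4_K_M", "Give me a fact about whales.")

# Only invoke the CLI when the Docker binary is actually present,
# so the sketch stays harmless on machines without Docker Desktop.
if shutil.which("docker"):
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
```

Because subprocess.run is called without check=True, a missing model or a Model Runner that is not enabled produces a non-zero exit code rather than an exception.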
Integration with OpenAI-Compatible Clients
Model Runner works with any OpenAI-compatible client or framework via its endpoint at http://model-runner.docker.internal/engines/v1. The endpoint is reachable from within containers, and from the host when TCP host access is enabled. Docker Hub hosts a range of ready-to-use models for Model Runner, including smollm2, llama3.3, and gemma3.
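Calling the endpoint looks like any other OpenAI-style chat-completions request. A minimal sketch using only the standard library; the payload shape follows the common OpenAI chat-completions schema, and the actual network call is left commented out so the snippet runs even without a live Model Runner instance:

```python
import json
from urllib import request

# Endpoint from the article; reachable from inside containers, or from
# the host when TCP host access is enabled (your setup may differ).
BASE_URL = "http://model-runner.docker.internal/engines/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    # OpenAI-style chat-completions payload: one user message.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("ai/smollm2:360M-Q4_K_M",
                             "Give me a fact about whales.")

req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running Model Runner:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the API is OpenAI-compatible, existing OpenAI SDKs should also work by pointing their base URL at the endpoint above.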
Docker has also published tutorials on integrating models into applications, providing useful guidance for developers looking to make the most of these models.
Docker MCP Catalog and Toolkit
Docker recently announced the Docker MCP Catalog and Toolkit, providing a centralized way for developers to discover verified and curated Model Context Protocol (MCP) tools. This initiative aims to simplify AI agent development by integrating Docker's simplicity and security into the MCP ecosystem.
Features of MCP Catalog and Toolkit
The Docker MCP Catalog allows developers to discover and manage over 100 MCP servers from various providers directly from Docker Desktop. The MCP Toolkit enables developers to run, authenticate, and manage these tools with the ease of use expected from Docker.
Docker’s partnership with industry leaders helps shape a secure, developer-first ecosystem for MCP tools. This collaboration facilitates the integration of powerful AI agents into real workflows, allowing developers to build, test, and ship applications seamlessly.
Importance of Security and Compliance
As AI applications continue to grow, the need for secure software delivery becomes paramount. The Docker MCP Catalog addresses this by ensuring tools are verified, secure, and easy to run, allowing developers to focus on building applications instead of handling complex integrations. Future updates will introduce capabilities for teams to publish and manage their MCP servers, enhancing security controls and facilitating compliance.
Docker Desktop 4.40 Features
The latest release, Docker Desktop 4.40, includes new tools that simplify GenAI app development at scale. Among these features is the Docker Model Runner in beta, which provides local model execution, GPU acceleration on Apple Silicon, and standardized model packaging using OCI Artifacts.
Image courtesy of Docker
Enhancements in Docker AI Agent
The Docker AI Agent has been enhanced with MCP integration, enabling seamless communication with a range of tools and applications. The agent now supports common developer tasks such as running shell commands, managing files, and performing Git operations.
The AI Tool Catalog extension allows developers to explore different MCP servers and connect the Docker AI Agent to other tools or LLMs, simplifying the management of multiple server configurations and credentials.
Opportunities for Secure Authentication
As AI applications evolve, the integration of secure authentication methods becomes increasingly important. SSOJet offers API-first solutions for enterprises that include secure Single Sign-On (SSO), Multi-Factor Authentication (MFA), and Passkey management. With directory synchronization, SAML, and OIDC support, SSOJet ensures streamlined user management and enhanced security for enterprise clients.
Explore our services at ssojet.com to learn how we can support your authentication needs.