Deploy JetBrains Mellum Your Way: Now Available via NVIDIA NIM

Jun 11, 2025 - 19:00

Deploy Mellum as a production-grade LLM inside your own infrastructure – with NVIDIA.

JetBrains Mellum – our open, focused LLM specialized in code completion – is now available to run as a containerized microservice on NVIDIA AI Factories. Using the new NVIDIA universal LLM NIM container, Mellum can be deployed in minutes on any NVIDIA-accelerated infrastructure, whether in the cloud, on-premises, or across hybrid environments.

Mellum is part of the early launch cohort of models showcasing coding capabilities on AI Factories. We’re proud to be among the first teams contributing to this new enterprise ecosystem.

But wait – isn’t Mellum already in JetBrains IDEs and on Hugging Face?

Yes – and that’s not changing. Mellum is tightly integrated into our developer tools via JetBrains AI Assistant and is also available on Hugging Face. But some teams have very different requirements, such as:

  • Deployment on their own hardware, in environments they control
  • Integration into custom pipelines, CI/CD flows, observability platforms
  • Fine-tuning or customization for domain-specific use cases
  • Security, compliance, and performance guarantees

That’s where the NVIDIA Enterprise AI Factory validated design comes in – a reference platform for building full-stack enterprise AI systems. Available via NVIDIA NIM, Mellum becomes a plug-and-play model block that fits directly into those pipelines. In our testing, we wanted to make sure that, as the Mellum family grows, we can offer JetBrains models on a performant, enterprise-ready inference solution.

What are NVIDIA NIM microservices?

NVIDIA NIM microservices are part of NVIDIA AI Enterprise, and do something very straightforward but invaluable: wrap complex AI model infrastructure into simple, fast-deployable containers optimized for inference. With the new universal LLM NIM container designed to work with a broad range of open and specialized LLMs, Mellum can now be deployed securely on NVIDIA-accelerated computing – on-premises, in the cloud, or across hybrid environments.
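
To make this concrete, here’s a minimal sketch of what launching such a container could look like, using the Docker SDK for Python. The image tag, the NIM_MODEL_NAME variable, and the Hugging Face model ID are illustrative assumptions rather than confirmed values – check NVIDIA’s NIM documentation for the exact ones for your deployment.

import os
import docker

# Connect to the local Docker daemon.
client = docker.from_env()

container = client.containers.run(
    # Hypothetical tag for the universal LLM NIM container on NGC.
    image="nvcr.io/nim/nvidia/llm-nim:latest",
    detach=True,
    # Expose all local NVIDIA GPUs to the container.
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
    environment={
        # NGC credentials, required to pull the container's model assets.
        "NGC_API_KEY": os.environ["NGC_API_KEY"],
        # Assumed variable pointing the universal container at an open model.
        "NIM_MODEL_NAME": "JetBrains/Mellum-4b-base",
    },
    # NIM serves its HTTP API on port 8000 inside the container.
    ports={"8000/tcp": 8000},
)
print(f"Started NIM container {container.short_id}")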

From a technical standpoint, it means Mellum is now available through a single container interface that supports major backends like NVIDIA TensorRT-LLM, vLLM, and SGLang. This helps teams run inference efficiently and predictably using open-source models they can inspect, adapt, and improve.
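
Because NIM microservices expose an OpenAI-compatible HTTP API, a running Mellum endpoint can be queried with the standard openai Python client. The sketch below assumes a default local deployment on port 8000; the model ID shown is the open Mellum base model on Hugging Face and may differ in your setup.

from openai import OpenAI

# Point the client at the local NIM endpoint (assumed default port).
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-used",  # a self-hosted endpoint typically doesn't validate this
)

# Mellum is a code completion model, so we use the plain completions
# endpoint and hand it a code fragment to continue.
response = client.completions.create(
    model="JetBrains/Mellum-4b-base",  # assumed ID; match your deployment
    prompt="def fibonacci(n: int) -> int:\n",
    max_tokens=64,
    temperature=0.2,
)
print(response.choices[0].text)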

We’re particularly excited about how this ecosystem can help enterprise users evolve from basic chatbot integrations to deeply integrated AI assistants embedded across software engineering workflows. 

Do I still need JetBrains AI Assistant if Mellum runs on NIM?

Some users ask:
“Is the open-source Mellum (via NIM) the same as what’s in JetBrains AI Assistant?”

Not exactly.

The open-source Mellum, now deployable via NIM, is great for custom, self-hosted use cases. But JetBrains AI Assistant uses enhanced proprietary versions of Mellum, with deeper IDE integration and a more polished developer experience.

In short:

  • Use NIM and Mellum for flexible, custom deployment.
  • Use AI Assistant for the best out-of-the-box experience inside JetBrains tools.

Try it now

Deploying Mellum is now just one click away – check it out here.