Nvidia Introduces NIM to Streamline the Implementation of AI Models in Production

At its GTC conference today, Nvidia unveiled Nvidia NIM, a new software platform designed to speed up the deployment of custom and pre-trained AI models into production applications. NIM takes the software work Nvidia has done around inferencing and optimizing models and makes it easily accessible: it combines a given model with an optimized inferencing engine and packs the two into a container that is exposed as a microservice.

Nvidia says it would normally take developers weeks, if not months, to ship comparable containers, and that assumes the company has any AI talent on staff at all. With NIM, Nvidia is making it clear that it wants to build an ecosystem of AI-ready containers, with its hardware as the foundation and these curated microservices as the core software layer, for businesses looking to accelerate their AI roadmaps.

NIM currently supports models from Nvidia, AI21 Labs, Adept, Cohere, Getty Images, and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI. Nvidia is already working with Amazon, Google, and Microsoft to make these NIM microservices available on SageMaker, Kubernetes Engine, and Azure AI, respectively. They will also be integrated into frameworks such as Deepset, LangChain, and LlamaIndex.

“We believe that the Nvidia GPU is the best place to run inference of these models on […] and we believe that Nvidia NIM is the best software package, the best runtime, for developers to build on top of so that they can focus on the enterprise applications, and just let Nvidia do the work to produce these models for them in the most efficient, enterprise-grade manner, so that they can just do the rest of their work,” Manuvir Das, Nvidia’s head of enterprise computing, said at a press conference ahead of today’s announcements.

For the inference engine, Nvidia will use the Triton Inference Server, TensorRT, and TensorRT-LLM. The Nvidia microservices available through NIM will include Riva for customizing speech and translation models, cuOpt for routing optimizations, and the Earth-2 model for weather and climate simulations.

The company plans to add more capabilities over time. One of them is making the Nvidia RAG LLM operator available as a NIM, which should make it much easier to build generative AI chatbots that can pull in custom data.

This wouldn’t be a developer conference without a few customer and partner announcements. NIM’s current users include Box, Cloudera, Cohesity, Datastax, Dropbox, and NetApp.

“Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” said Nvidia founder and CEO Jensen Huang. “Created with our partner ecosystem, these containerized AI microservices are the building blocks for enterprises in every industry to become AI companies.”