
Powering Your Private AI: Seamlessly Importing LLMs from Hugging Face with Nutanix Enterprise AI

  • Writer: Taylor Norris
  • Oct 8
  • 3 min read
[Illustration: Importing LLMs from Hugging Face into a private AI environment — a globe bearing the Hugging Face logo connected by a data stream to a server cube representing model deployment.]

In the world of enterprise AI, rapid deployment and secure management of Large Language Models (LLMs) are critical priorities. Nutanix Enterprise AI is an inference endpoint management product designed to streamline AI model orchestration on a Kubernetes cluster.


A cornerstone of this capability is our deep model access integration with leading providers, including Hugging Face. By leveraging the vast ecosystem of models available on Hugging Face, Nutanix Enterprise AI allows you to select, deploy, and manage generative AI LLMs efficiently.


This post will guide you through the seamless process of importing Hugging Face models directly into your Nutanix Enterprise AI environment.


Essential Prerequisites: Securing Your Access


Before you can begin importing models from Hugging Face, you must first establish secure connectivity to the Model Hub.


The Hugging Face Access Token

To download an LLM from Hugging Face to Nutanix Enterprise AI, you must first add a Hugging Face access token.


You can add this token in the Nutanix Enterprise AI user interface by selecting Settings in the left navigation pane.

[Screenshot: Settings > Third Party Credentials page showing a configured Hugging Face Model Hub Token (Token Name hf_token, masked Access Token) and an option to create an NVIDIA NGC Personal Key.]

You can find detailed instructions for creating and adding a Hugging Face access token in the Nutanix Enterprise AI documentation.
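If you want to sanity-check a token before pasting it into the UI, here is a minimal sketch. It checks only the shape of the string — the `hf_` prefix is a Hugging Face token convention, not a guarantee of validity:

```python
def looks_like_hf_token(token: str) -> bool:
    """Cheap local sanity check before saving a token in the UI.

    Hugging Face user access tokens conventionally begin with the
    "hf_" prefix; this checks only the shape, not whether the token
    is actually valid on the Hub.
    """
    return token.startswith("hf_") and len(token) > len("hf_")


# A real validity check requires a round-trip to the Hub, e.g. with the
# huggingface_hub library: HfApi(token=...).whoami() (not shown here).
```

A prefix check like this catches the common mistake of pasting a key from the wrong provider (for example, a GitHub or NGC key) into the Hugging Face token field.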


Step-by-Step: Importing a Model


If this is your first time importing a model, the Import Model option appears on the main dashboard.


Click Import Model.


[Screenshot: Welcome dashboard with Step 1, "Import Model," highlighted, noting that validated models can be imported from the Hugging Face Model Hub, the NVIDIA NGC Catalog, or uploaded manually from a shared location.]

Otherwise, the import process is initiated from the Models page, which lists all LLMs imported to Nutanix Enterprise AI.

[Screenshot: Models page with the Import Models dropdown expanded, showing "From Hugging Face Model Hub," "From NVIDIA NGC Catalog," and "Using Manual Import," above a table listing an active llama32-1b model.]

Click Import From Hugging Face Model Hub.

[Screenshot: Empty Models page with the "Import from Hugging Face Model Hub" button highlighted.]

This page displays all LLMs validated to run on Nutanix Enterprise AI.


Select the radio button beside an LLM and click Import.


💡 Crucial Step: Ensure you have accepted the model's terms of use and license on Hugging Face and agreed to share your contact information with the repository author.
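Gated repositories (such as Meta's Llama models) reject API requests until you have accepted the license on huggingface.co. As a rough pre-flight sketch using only the public Hub HTTP API — the `/api/models/{repo_id}` metadata route is documented, while the status-code interpretation is an assumption on my part:

```python
from urllib import request, error
from urllib.parse import quote

HUB_API = "https://huggingface.co/api/models"


def hub_model_url(repo_id: str) -> str:
    # Metadata endpoint for a model repo,
    # e.g. meta-llama/Llama-3.2-1B-Instruct
    return f"{HUB_API}/{quote(repo_id, safe='/')}"


def can_access(repo_id: str, token: str) -> bool:
    """Return True if the token can read the repo's metadata.

    Gated repos typically answer 401/403 until the license has been
    accepted by the account that owns the token.
    """
    req = request.Request(
        hub_model_url(repo_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with request.urlopen(req) as resp:
            return resp.status == 200
    except error.HTTPError:
        return False
```

Running a check like `can_access("meta-llama/Llama-3.2-1B-Instruct", my_token)` before starting the import can save you a failed download later.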

[Screenshot: "Import Model – Hugging Face Model Hub" window listing 17 validated models with Model Name, Type, Developer, and Model Size (GiB) columns; meta-llama/Llama-3.2-1B-Instruct is selected and the Import button highlighted.]

In the Model Instance Name field, enter a name (Nutanix recommends using the actual LLM name suffixed with a meaningful identifier).


Click Import.

[Screenshot: "Import Model" pop-up warning that some models require accepting terms on Hugging Face before import; meta-llama/Llama-3.2-1B-Instruct is selected and the Model Instance Name is set to llama32-1b.]
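The naming recommendation above can be sketched as a tiny helper. The normalization rules here are purely illustrative — my own convention, not a Nutanix one:

```python
def instance_name(model_repo: str, suffix: str) -> str:
    """Derive a model instance name from a Hugging Face repo id.

    Follows the recommendation to suffix the LLM name with a
    meaningful identifier, e.g. a team or environment tag.
    """
    base = model_repo.split("/")[-1].lower()               # drop the org prefix
    base = base.replace(".", "").replace("-instruct", "")  # compact the name
    return f"{base}-{suffix}"
```

For example, `instance_name("meta-llama/Llama-3.2-1B-Instruct", "prod")` yields `llama-32-1b-prod`, which keeps the underlying model identifiable at a glance on the Models page.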

Monitoring the Import Status

After initiating the import, you can monitor its progress directly on the Models page. The status will display one of the following states:


Pending: The system is waiting for necessary resources, such as sufficient storage space, to become available.


[Screenshot: Models page listing llama32-1b (meta-llama/Llama-3.2-1B-Instruct) with a yellow "Pending" status indicator.]

Processing: The system is actively downloading the LLM from Hugging Face.


[Screenshot: Models page listing llama32-1b with a blue "Processing" status indicator.]

Active: The LLM is successfully imported and is ready to be used.

[Screenshot: Models page listing llama32-1b with a green "Active" status indicator.]

Once the model is marked as active, it is ready to use.
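If you are scripting around the import, the Pending → Processing → Active lifecycle above amounts to a simple polling loop. Here is a sketch with a stand-in `get_status` callable — how you actually read the status (the UI, or an API if your deployment exposes one) is not shown and is not a confirmed Nutanix interface:

```python
import time

# Terminal states for a model import, per the lifecycle described above.
TERMINAL_STATES = {"Active", "Failed"}


def wait_for_import(get_status, poll_seconds=30, timeout_seconds=3600):
    """Poll until the import reaches a terminal state.

    `get_status` is any zero-argument callable returning one of
    "Pending", "Processing", "Active", or "Failed".
    """
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"import still {status} after {waited}s")
        time.sleep(poll_seconds)
        waited += poll_seconds
```

A loop like this is handy in CI or provisioning scripts, where you want to block until the model is Active before creating an endpoint.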



Understanding Your Models: Key Attributes on the Models Page

After an LLM is successfully imported, the Models page provides a summary of all imported Large Language Models. The following attributes help you track and manage your models:

Model Instance Name: The unique, user-provided name you specify when importing a model to Nutanix Enterprise AI. Nutanix recommends suffixing the actual LLM name with a meaningful identifier.

Model: The actual name of the model as stored in the original model hub, such as the Hugging Face or NVIDIA catalog.

Developer: The company or entity that developed the model.

Import Mode: The method used to import the model — direct import from Hugging Face, import from NVIDIA NIM (NGC Catalog), or manual upload (often used for dark sites or custom models).

Type: The purpose or category of the imported model. Possible values include Text Generation, Embedding, and Vision.

Status: The current operational state of the model — Active (imported and ready to deploy), Processing (being downloaded), Pending (waiting for required resources to save the LLM), or Failed (the import encountered an error).



Next Steps: Deploying to an Endpoint


Once your LLM status shows as Active, the model is stored within Nutanix Enterprise AI and is available for deployment.

The next step in the Nutanix Enterprise AI workflow is to create an endpoint.


After successfully importing the LLM, you can deploy it to an AI inference endpoint for real-time applications. This inference endpoint accepts requests and sends back responses, enabling your AI applications to communicate with the model.
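To preview what talking to such an endpoint can look like, here is a hedged sketch of building a request. It assumes the endpoint speaks an OpenAI-style chat-completions API, which is common for LLM inference endpoints; the URL, API key, and path are placeholders, not confirmed Nutanix specifics:

```python
import json
from urllib import request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an HTTP request for an OpenAI-style chat-completions endpoint.

    The base URL, path, and bearer-token auth scheme here are
    illustrative placeholders, not a documented Nutanix API.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it would be: request.urlopen(build_chat_request(...))
```

Note that the `model` field carries the model instance name you chose at import time (for example, llama32-1b), which is how the endpoint routes the request to the right LLM.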


We will walk through this process in the next blog post of this series.


Conclusion

Importing Large Language Models (LLMs) from Hugging Face into Nutanix Enterprise AI is a straightforward yet powerful process that enables organizations to accelerate their generative AI initiatives with confidence. By combining secure access management, seamless integration, and comprehensive monitoring capabilities, Nutanix Enterprise AI simplifies model orchestration on Kubernetes, allowing teams to focus on innovation rather than infrastructure.


With your models now imported and ready for deployment, you’re one step closer to harnessing the full potential of AI inference endpoints. Stay tuned for the next post in this series, where we’ll guide you through deploying your imported models to an endpoint for real-time AI applications.




