
NAI Create API Endpoint

  • Writer: Taylor Norris
  • 2 days ago
  • 3 min read

[Image: Nutanix Enterprise AI create API endpoint]

Nutanix Enterprise AI (NAI) turns your AI infrastructure into a production-ready platform for serving models at scale. One of the most powerful features of the platform is the ability to create inference endpoints—secure, scalable API interfaces that allow your applications to communicate with your deployed large language models (LLMs).

Whether you're building chatbots, copilots, or search assistants, inference endpoints are the bridge between your LLMs and the applications that depend on them.


In this guide, we’ll walk through how to create an inference endpoint in Nutanix Enterprise AI, what each configuration setting means, and why these endpoints are essential for powering modern AI workflows.


Start by logging in to NAI and clicking on the "Models" tab. You need an imported model with an Active status that is available for deployment.

For detailed instructions on deploying a model, please reference my previous blog post: Powering Your Private AI: Seamlessly Importing LLMs from Hugging Face with Nutanix Enterprise AI.

[Screenshot: NAI import models]


The next step in the Nutanix Enterprise AI workflow is to create an endpoint. After successfully importing the LLM, you can deploy it to an AI inference endpoint for real-time applications. This inference endpoint accepts requests and sends back responses, enabling your AI applications to communicate with the model.


By streamlining the LLM import process and integrating closely with Hugging Face, Nutanix Enterprise AI ensures that you can rapidly bring state-of-the-art models into your secure, managed enterprise environment.


To create an API endpoint, click the "Endpoints" tab in the left-hand navigation menu.


[Screenshot: NAI create new endpoint]

Click "Create New Endpoint". (Note: your screen will look like this if this is the first time you are creating an endpoint.)


[Screenshot: NAI create new endpoint]

(Note: if you already have existing endpoints, you will see them here. In that case, click "Create Endpoint".) Next, give your API endpoint a name.

The name can include only lowercase letters, numbers, or '-'. It must start with a letter and end with either a letter or a number.
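That naming rule can be expressed as a short regular expression. The check below is my own sketch of the rule as described above, not an official NAI validator:

```python
import re

# Endpoint names: only lowercase letters, digits, or '-'; must start
# with a letter and end with a letter or a digit.
ENDPOINT_NAME_RE = re.compile(r"^[a-z]([a-z0-9-]*[a-z0-9])?$")

def is_valid_endpoint_name(name: str) -> bool:
    return ENDPOINT_NAME_RE.fullmatch(name) is not None

print(is_valid_endpoint_name("llama-3-2-1b"))  # True
print(is_valid_endpoint_name("Llama"))         # False: uppercase letter
print(is_valid_endpoint_name("3b-model"))      # False: starts with a digit
print(is_valid_endpoint_name("model-"))        # False: ends with '-'
```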


[Screenshot: create endpoint]


I will call this one "llama-3-2-1b".


Select the imported model you want to associate with this endpoint. I will uncheck the "Use GPU" box because my system doesn't have a GPU; leave it checked if your system has one. Next, click "Create a New API Key".

[Screenshot: create API key]

Give the key a name.

Click "Create".

[Screenshot: NAI create API key]

Click "Copy Key" and store this information somewhere safe (such as a notepad or, better, a password manager).
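Rather than pasting the copied key directly into application code, a common pattern is to read it from an environment variable. The variable name `NAI_API_KEY` below is just an example, not an NAI convention:

```python
import os

# Hypothetical variable name -- use whatever your team standardizes on.
api_key = os.environ.get("NAI_API_KEY", "")
if not api_key:
    print("Warning: NAI_API_KEY is not set; requests will be rejected.")

# The key is sent as a bearer token with each request to the endpoint.
headers = {"Authorization": f"Bearer {api_key}"}
```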

Click "Close"

[Screenshot: copy API key]

Click "Create" to create the endpoint.

[Screenshot: done with API key]

On the Endpoints tab, you will see your new endpoint's status change from "Pending" to "Processing" to "Active".


[Screenshots: endpoint status progressing from "Pending" to "Processing" to "Active"]

Once the endpoint status is active, click on the "Endpoint Name" hyperlink to test your endpoint.


[Screenshot: test API endpoint]


Then click on the "Test" button.

[Screenshot: test API endpoint]

Here you can use the built-in sample request or send a custom request.

[Screenshot: test API endpoint]

If I click the built-in request, the inference output appears in the right-hand pane.

[Screenshot: test API endpoint]

Alternatively, you can click the "Custom Request" radio button and run a use-case-specific test.

[Screenshot: test API endpoint]
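As an illustration, a use-case-specific request body might look like the sketch below. It assumes the endpoint accepts an OpenAI-compatible chat-completions payload; the prompts are placeholders for your own use case, and the model name matches the endpoint created earlier:

```python
import json

# Example custom-request payload (assumes an OpenAI-compatible
# chat-completions schema; adjust fields to your endpoint's API).
payload = {
    "model": "llama-3-2-1b",  # the endpoint name from earlier in this post
    "messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "Summarize what an inference endpoint does."},
    ],
    "max_tokens": 128,
    "temperature": 0.2,
}

print(json.dumps(payload, indent=2))
```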

When you are satisfied with your test, click "Done".


Click the "back arrow" to navigate back to the endpoint dashboard landing page.


[Screenshot: endpoint dashboard]

The last thing we will review is how to view the code for a sample API request.


Click "View Sample Request".



[Screenshot: view sample request]


[Screenshot: sample request]

The endpoint can be accessed using the URL and the API key. Share the sample code with your developers so they can access the endpoint and integrate it into their applications.
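As a sketch of what that integration could look like, the snippet below builds a request for a hypothetical endpoint URL, assuming an OpenAI-compatible /v1/chat/completions route; replace the placeholder URL and key with the values shown in your "View Sample Request" dialog:

```python
import json
import urllib.request

API_URL = "https://nai.example.com"  # placeholder -- use your endpoint's URL
CHAT_PATH = "/v1/chat/completions"   # assumes an OpenAI-compatible route

def build_request(base_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for an NAI endpoint."""
    body = json.dumps({
        "model": "llama-3-2-1b",  # the endpoint name created earlier
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url + CHAT_PATH,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # the key you copied earlier
        },
        method="POST",
    )

# To actually send the request (after replacing the placeholders):
# with urllib.request.urlopen(build_request(API_URL, "YOUR_API_KEY", "Hello!")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```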


Summary

Inference endpoints are the critical link that turns large language models into usable, production-ready services. With Nutanix Enterprise AI, creating these endpoints is a streamlined, secure process that enables real-time interaction between your applications and deployed LLMs. From selecting an imported model and configuring GPU usage to generating API keys, testing requests, and sharing sample code with developers, NAI provides everything needed to operationalize AI quickly and safely. By simplifying endpoint creation and tightly integrating with model sources like Hugging Face, Nutanix Enterprise AI empowers teams to move from experimentation to enterprise-grade AI applications with confidence.


