Integrate with InferenceService
This page shows how to leverage the Alauda Build of Kueue's scheduling and resource management capabilities when running an `InferenceService` in Alauda AI.
Prerequisites
- You have installed Alauda AI.
- You have installed the Alauda Build of Kueue.
- You have installed the Alauda Build of HAMi (for demonstrating vGPU).
- The Alauda Container Platform Web CLI can communicate with your cluster.
Procedure
- Create a project and namespace in Alauda Container Platform; for example, a project named `test` and a namespace named `test-1`.
- Switch to Alauda AI, click Namespace Manage under Admin > Management Namespace, and select the previously created namespace to bring it under management.
- Create the assets (the Kueue queue definitions) by running the following command:
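A minimal sketch of these assets, assuming a ResourceFlavor named `default-flavor` and a ClusterQueue named `cluster-queue` (both names are illustrative); the LocalQueue name `test` and the namespace `test-1` come from this procedure, and the `nvidia.com/total-gpucores` quota is deliberately set too low so that the demonstration pod stays gated until the quota is raised in the final step:

```shell
kubectl apply -f - <<'EOF'
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor              # illustrative flavor name
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue               # illustrative ClusterQueue name
spec:
  namespaceSelector: {}             # admit workloads from all managed namespaces
  resourceGroups:
    - coveredResources: ["cpu", "memory", "nvidia.com/total-gpucores"]
      flavors:
        - name: default-flavor
          resources:
            - name: "cpu"
              nominalQuota: 8
            - name: "memory"
              nominalQuota: 16Gi
            - name: "nvidia.com/total-gpucores"
              nominalQuota: 0       # too low on purpose; raised in the last step
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: test                        # must match the kueue.x-k8s.io/queue-name label
  namespace: test-1
spec:
  clusterQueue: cluster-queue
EOF
```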
- Create an `InferenceService` resource in the Alauda AI UI with the label `kueue.x-k8s.io/queue-name: test`, as in the sketch below.
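A minimal sketch of such a resource, assuming a KServe-style `v1beta1` `InferenceService` and HAMi's default vGPU resource names (`nvidia.com/gpu`, `nvidia.com/gpucores`); the service name, model URI, and resource amounts are illustrative, and only the namespace and the queue-name label come from this procedure:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: demo-isvc                       # illustrative name
  namespace: test-1
  labels:
    kueue.x-k8s.io/queue-name: test     # routes the workload to the "test" LocalQueue
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"  # illustrative model
      resources:
        limits:
          nvidia.com/gpu: "1"           # one vGPU (HAMi)
          nvidia.com/gpucores: "30"     # share of a physical GPU's cores (HAMi vGPU)
```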
- Observe the pods of the `InferenceService`; you will see that the pod is in a `SchedulingGated` state, as in the check below.
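One way to check, assuming the Web CLI is connected to the cluster (the pod name shown is illustrative):

```shell
kubectl get pods -n test-1
# NAME                                      READY   STATUS            RESTARTS   AGE
# demo-isvc-predictor-...                   0/1     SchedulingGated   0          30s
```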
- Update the `nvidia.com/total-gpucores` quota; once sufficient quota is available, the pod is admitted and you will see it in a `Running` state, as in the sketch below.
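One way to raise the quota, assuming the ClusterQueue layout from the earlier sketch (the resource index in the patch path and the new value are illustrative):

```shell
# Raise the nominal quota for nvidia.com/total-gpucores on the ClusterQueue;
# index 2 matches its position in the sketch above.
kubectl patch clusterqueue cluster-queue --type=json -p '[
  {"op": "replace",
   "path": "/spec/resourceGroups/0/flavors/0/resources/2/nominalQuota",
   "value": 100}
]'

# The gated pod should now be admitted and start running:
kubectl get pods -n test-1
# NAME                                      READY   STATUS    RESTARTS   AGE
# demo-isvc-predictor-...                   1/1     Running   0          2m
```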