The following licenses and keys are required for deploying AIDR in Hybrid mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Model Scanner Hybrid mode.
- AIDR License Key: HiddenLayer Support will provide you with a license key. The LLM proxy container will not start without a valid key. The license can be set as an environment variable, and the installer will not run unless the license value is set.
- Credentials to download AIDR container: Credentials for the HiddenLayer container repository are required to download the appropriate images. These can also be obtained from HiddenLayer Support or from your HiddenLayer technical contact.
- API Client ID and Client Secret: A HiddenLayer API Client ID and Client Secret are required to interact with the AISec Platform Console. Get these from the Console or your Console Admin.
  - API Permissions
    - Inferences: write
    - Model Inventory: write
  - Links to Console
    - Link to the US Console
    - Link to the EU Console
The following licenses and keys are required for deploying AIDR in Disconnected mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Model Scanner Disconnected mode.
- AIDR License Key: HiddenLayer Support will provide you with a license key. The LLM proxy container will not start without a valid key. The license can be set as an environment variable, and the installer will not run unless the license value is set.
- Credentials to download AIDR container: Credentials for the HiddenLayer container repository are required to download the appropriate images. These can also be obtained from HiddenLayer Support or from your HiddenLayer technical contact.
Disconnected mode does not require an API Client ID and Secret.
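Because the installer will not run without the license value, checking the environment before launching it can save a failed run. The following is a minimal sketch and not part of the HiddenLayer tooling: `require_env` is a hypothetical helper, and any variable names passed to it are placeholders. Use the names your installer expects.

```shell
# Hypothetical helper: return non-zero if any named environment variable
# is unset or empty. printenv only sees exported variables, which matches
# what a child process such as an installer would see.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env EXAMPLE_LICENSE_VAR || exit 1` at the top of a wrapper script stops before the installer is invoked (`EXAMPLE_LICENSE_VAR` is a placeholder name). In Hybrid mode, the same check could cover the API Client ID and Client Secret if you pass those through the environment.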
The following tools are required for deploying AIDR for GenAI.
- Docker Desktop: Docker Desktop is used to deploy the container to your Kubernetes cluster.
- kubectl: The official Kubernetes CLI tool, used to issue commands to your Kubernetes cluster.
- Kubernetes cluster
  - For cloud deployments, you need `kubectl` access to the cluster.
  - For local deployment, use Minikube (or something similar) that can run on your local system.
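A quick way to confirm the tools listed above are installed is to look for them on your `PATH`. This is a minimal sketch; `check_tools` is a hypothetical helper name, and the check only confirms that the binaries exist, not that they are configured.

```shell
# Hypothetical helper: report whether each named CLI tool is on PATH.
# Returns non-zero if any tool is missing.
check_tools() {
  status=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_tools docker kubectl` covers the two CLIs. To confirm that `kubectl` can reach a cluster, `kubectl cluster-info` is a reasonable follow-up; it fails if no cluster is accessible from the current context.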
The following are resource recommendations for the virtual machine running the LLM Proxy in a production environment. Azure AKS is used as an example. These resource recommendations can be applied to any virtual environment. For information about an Azure AKS Dv3 virtual machine, see the Azure documentation.
| Resource | Resource Type | Count |
|---|---|---|
| Azure AKS | Standard_D32_v3 | 2 |

| Resource Type | CPU (vCPU) | Memory (GB) | GPU |
|---|---|---|---|
| Standard_D32_v3 | 32 | 128 | 0 |
For testing AIDR for GenAI, the image can run with 8 CPU cores and 16 GB of memory, which most modern laptops provide. This configuration is not recommended for production environments because of its higher latency.
AIDR is horizontally scalable. The latency and throughput of each replica depend on many factors in the deployed environment, including underlying node type, network conditions, and resource contention.
To make the best use of your underlying hardware, we recommend the following:
Replica Count
- Allocate 8 Kubernetes CPU units for each replica
- Run as many 8-CPU replicas as will fit on the node
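The replica count rule above is a simple integer division. The sketch below illustrates it; `replicas_per_node` is a hypothetical helper, and it assumes the node's full CPU count is schedulable, which is a simplification.

```shell
# Number of whole replicas that fit on one node (integer division rounds down).
replicas_per_node() {
  node_cpus=$1
  cpus_per_replica=${2:-8}   # 8 CPU units per replica, per the guidance above
  echo $(( node_cpus / cpus_per_replica ))
}

replicas_per_node 32   # a 32-vCPU node; prints 4
```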
Example
If the underlying node type is Azure’s Standard_D32_v3, we recommend setting the following Kubernetes parameters:

    replicas:
      min: 4
      max: 4
    resources:
      requests:
        cpu: 8
        memory: 4096Mi

Thread Count per Replica
- Set the environment variable OMP_NUM_THREADS: 8
This value will only improve performance if the guidance in the previous step (Replica Count) is applied.
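One way to apply the thread count to a running deployment is `kubectl set env`. The sketch below only prints the command so it can be reviewed first. The deployment name `aidr-genai` and namespace `aidr` are hypothetical placeholders, and `omp_env_command` is an illustrative helper, not part of any HiddenLayer tooling.

```shell
# Hypothetical names: replace with your own deployment and namespace.
DEPLOYMENT="aidr-genai"
NAMESPACE="aidr"
CPUS_PER_REPLICA=8   # keep equal to the CPU request for each replica

# Print the kubectl command instead of running it, so it can be reviewed.
omp_env_command() {
  echo "kubectl set env deployment/$1 --namespace $2 OMP_NUM_THREADS=$3"
}

omp_env_command "$DEPLOYMENT" "$NAMESPACE" "$CPUS_PER_REPLICA"
```

If the deployment is managed through a chart or manifest, setting the variable there is likely the better choice, since a later redeploy would otherwise overwrite a change made with `kubectl set env`.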