The following licenses and keys are required for deploying AI Runtime Security in Hybrid mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Runtime Security Hybrid mode.
- Runtime Security License Key: HiddenLayer Support will provide you with a license key. The LLM proxy container will not start without a valid key. The license can be set as an environment variable, and the installer will not run unless a license value is set.
- Credentials to download Runtime Security container: Credentials for the HiddenLayer container repository are required to download the appropriate images. These can also be obtained from HiddenLayer Support or from your HiddenLayer technical contact.
- API Client ID and Client Secret: A HiddenLayer API Client ID and Client Secret are required to interact with the AISec Platform Console. Get these from the Console or from your Console Admin.
API Permissions
- Inferences: write
- Model Inventory: write
Links to Console
- Link to the US Console
- Link to the EU Console
The following licenses and keys are required for deploying Runtime Security in Disconnected mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Runtime Security Disconnected mode.
- Runtime Security License Key: HiddenLayer Support will provide you with a license key. The LLM proxy container will not start without a valid key. The license can be set as an environment variable, and the installer will not run unless a license value is set.
- Credentials to download Runtime Security container: Credentials for the HiddenLayer container repository are required to download the appropriate images. These can also be obtained from HiddenLayer Support or from your HiddenLayer technical contact.
Disconnected mode does not require an API Client ID and Secret.
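The per-mode requirements above can be expressed as a small preflight check. The following is a minimal Python sketch, not part of the HiddenLayer installer. The environment variable names (`RUNTIME_LICENSE_KEY`, `REGISTRY_USERNAME`, `REGISTRY_PASSWORD`, `API_CLIENT_ID`, `API_CLIENT_SECRET`) are placeholders chosen for illustration; substitute the names your deployment actually uses.

```python
import os
from typing import List, Mapping, Optional

# Placeholder names for illustration only; these are NOT confirmed
# HiddenLayer variable names.
COMMON = ["RUNTIME_LICENSE_KEY", "REGISTRY_USERNAME", "REGISTRY_PASSWORD"]

# Hybrid mode also needs API credentials; Disconnected mode does not.
REQUIRED_BY_MODE = {
    "hybrid": COMMON + ["API_CLIENT_ID", "API_CLIENT_SECRET"],
    "disconnected": COMMON,
}


def missing_settings(mode: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required settings that are unset or empty for the given mode."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    env = os.environ if env is None else env
    return [name for name in REQUIRED_BY_MODE[mode] if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings("hybrid")
    print("Missing settings:", ", ".join(missing) if missing else "none")
```

Running a check like this before the installer may surface a missing license or credential earlier than a failed container start would.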
The following tools are required for deploying Runtime Security for GenAI.
- Docker Desktop: Docker Desktop is used to deploy the container to your Kubernetes cluster.
- kubectl: The official Kubernetes CLI tool, used to issue commands to your Kubernetes cluster.
- Kubernetes cluster
  - For cloud deployments, you need kubectl access to the cluster.
  - For local deployment, use Minikube (or something similar) that can run on your local system.
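As a quick way to confirm the command-line tools listed above are installed, the following Python sketch checks whether the `docker` and `kubectl` executables are on the `PATH`. It assumes Docker Desktop has installed the `docker` CLI, and it does not verify that either tool can reach a cluster. The function name `missing_tools` is hypothetical.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str] = ("docker", "kubectl"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("docker and kubectl are both available.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.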
The following are resource recommendations for the virtual machine running the LLM Proxy in a production environment. Azure AKS is used as an example. These resource recommendations can be applied to any virtual environment. For information about an Azure AKS Dv3 virtual machine, see the Azure documentation.
| Resource | Resource Type | Count |
|---|---|---|
| Azure AKS | Standard_D32_v3 | 2 |

| Resource Type | CPU (vCPU) | Memory (GB) | GPU |
|---|---|---|---|
| Standard_D32_v3 | 32 | 128 | 0 |
For testing Runtime Security for GenAI, the image can run with 8 CPU cores and 16 GB of memory, which most modern laptops provide. This configuration is not recommended for production environments because of its higher latency.
Runtime Security is horizontally scalable. The latency and throughput of each replica depend on many factors in the deployed environment, including the underlying node type, network conditions, and resource contention.
To make the best use of your underlying hardware, we recommend the following:
Replica Count
- Allocate 8 Kubernetes CPU units to each replica.
- Run as many 8-CPU replicas as will fit on the node.
Example
If the underlying node type is Azure’s Standard_D32_v3, we recommend setting the following Kubernetes parameters:

    replicas:
      min: 4
      max: 4
    resources:
      requests:
        cpu: 8
        memory: 4096Mi

Thread Count per Replica
- Set the environment variable OMP_NUM_THREADS to 8.
This value will only improve performance if the guidance in the previous step (Replica Count) is applied.
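The sizing guidance in this section reduces to simple arithmetic: divide the node's vCPU count by 8 and round down. The Python sketch below works through that calculation and assembles the matching settings. It is an illustration under stated assumptions, not HiddenLayer tooling: the function names and the `env` key are hypothetical, and the calculation uses the node's nominal vCPU count as the recommendation above does.

```python
from typing import Any, Dict


def replicas_per_node(node_vcpus: int, cpus_per_replica: int = 8) -> int:
    """Number of whole replicas of `cpus_per_replica` CPUs that fit on one node."""
    if node_vcpus <= 0 or cpus_per_replica <= 0:
        raise ValueError("CPU counts must be positive")
    return node_vcpus // cpus_per_replica


def replica_settings(
    node_vcpus: int,
    cpus_per_replica: int = 8,
    memory_request: str = "4096Mi",
) -> Dict[str, Any]:
    """Build the recommended settings for one node as a plain dictionary."""
    count = replicas_per_node(node_vcpus, cpus_per_replica)
    return {
        "replicas": {"min": count, "max": count},
        "resources": {
            "requests": {"cpu": cpus_per_replica, "memory": memory_request}
        },
        # Hypothetical key: where the variable is set depends on your deployment.
        # The thread count matches the CPU units given to each replica.
        "env": {"OMP_NUM_THREADS": str(cpus_per_replica)},
    }


if __name__ == "__main__":
    # Standard_D32_v3 has 32 vCPUs, so 32 // 8 gives 4 replicas.
    print(replica_settings(32))
```

For a Standard_D32_v3 node this yields 4 replicas with 8 CPU units each, which matches the example above. Four replicas at 4096Mi each request 16 GiB in total, well within the node's 128 GB of memory.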