Resource Requirements

License Keys

Hybrid Mode

The following licenses and keys are required for deploying AI Runtime Security in Hybrid mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Runtime Security Hybrid mode.

- Runtime Security License Key: HiddenLayer Support provides the license key. The LLM proxy container will not start without a valid key. Set the license as an environment variable; the installer will not run unless it is set.
- Credentials to download the Runtime Security container: Credentials for the HiddenLayer container repository are required to download the appropriate images. Obtain them from HiddenLayer Support or from your HiddenLayer technical contact.
- API Client ID and Client Secret: A HiddenLayer API Client ID and Client Secret are required to interact with the AISec Platform Console. Get these from the Console or from your Console Admin.
  - API Permissions
    - Inferences: write
    - Model Inventory: write
  - Links to Console
    - Link to the US Console
    - Link to the EU Console

Disconnected Mode

The following licenses and keys are required for deploying Runtime Security in Disconnected mode. If your organization doesn't have a license, contact HiddenLayer for more information. See Hybrid and Disconnected Modes for information about Runtime Security Disconnected mode.

- Runtime Security License Key: HiddenLayer Support provides the license key. The LLM proxy container will not start without a valid key. Set the license as an environment variable; the installer will not run unless it is set.
- Credentials to download the Runtime Security container: Credentials for the HiddenLayer container repository are required to download the appropriate images. Obtain them from HiddenLayer Support or from your HiddenLayer technical contact.

API Keys for Disconnected Mode Not Required

Disconnected mode does not require an API Client ID and Secret.
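The difference between the two modes can be sketched as a small preflight check that runs before the installer. This is only an illustration: the variable names below (`EXAMPLE_LICENSE_KEY` and so on) are hypothetical placeholders, not names defined by HiddenLayer, so substitute the names given in your installation instructions.

```python
import os

# Hypothetical variable names, chosen only for this example. The real names
# come from the HiddenLayer installation instructions.
COMMON = ["EXAMPLE_LICENSE_KEY", "EXAMPLE_REGISTRY_USER", "EXAMPLE_REGISTRY_PASSWORD"]
HYBRID_ONLY = ["EXAMPLE_CLIENT_ID", "EXAMPLE_CLIENT_SECRET"]


def missing_settings(mode, env):
    """Return the required names that are unset or empty for the given mode."""
    if mode not in ("hybrid", "disconnected"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Hybrid mode needs the API client credentials; Disconnected mode does not.
    required = COMMON + (HYBRID_ONLY if mode == "hybrid" else [])
    return [name for name in required if not env.get(name)]


# Report what is missing from the current environment for each mode.
for mode in ("hybrid", "disconnected"):
    print(mode, "missing:", missing_settings(mode, os.environ))
```

Passing the environment in as an argument keeps the check easy to exercise with a plain dictionary before pointing it at `os.environ`.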

Tools

The following tools are required for deploying Runtime Security for GenAI.

- Docker Desktop: Used to deploy the container to your Kubernetes cluster.
- kubectl: The official Kubernetes CLI tool, used to issue commands to your Kubernetes cluster.
- Kubernetes cluster
  - For cloud deployments, you need kubectl access to the cluster.
  - For local deployments, use Minikube or a similar tool that runs on your local system.
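As a quick sanity check before deploying, the following sketch looks for the command-line tools on your `PATH`. It assumes Docker Desktop has installed the `docker` CLI and that `kubectl` is installed under its usual name; it does not verify that you can actually reach a cluster.

```python
import shutil

# CLI names assumed for the tools listed above.
REQUIRED_TOOLS = ["docker", "kubectl"]


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_tools().items():
    print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

To confirm cluster access as well, running `kubectl cluster-info` against your configured context is a common next step.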

Resource Guidelines

The following are resource recommendations for the virtual machine running the LLM Proxy in a production environment. Azure AKS is used as an example. These resource recommendations can be applied to any virtual environment. For information about an Azure AKS Dv3 virtual machine, see the Azure documentation.

| Resource | Resource Type | Count |
| --- | --- | --- |
| Azure AKS | Standard_D32_v3 | 2 |

CPU and Memory

| Resource Type | CPU (vCPU) | Memory (GB) | GPU |
| --- | --- | --- | --- |
| Standard_D32_v3 | 32 | 128 | 0 |

Testing

For testing Runtime Security for GenAI, the image can run with 8 CPU cores and 16 GB of memory, which most modern laptops provide. This configuration is not recommended for production environments because of its higher latency.

Scaling Recommendations

Runtime Security is horizontally scalable. The latency and throughput of each replica depend on many factors in the deployed environment, including the underlying node type, network conditions, and resource contention.

To make the best use of your underlying hardware, we recommend the following:

Replica Count
- Allocate 8 Kubernetes CPU units for each replica
- Run as many of these 8-CPU replicas as fit on the node
Example
If the underlying node type is Azure’s Standard_D32_v3, we recommend setting the following Kubernetes parameters:
```
replicas:
  min: 4    # 32 vCPUs / 8 CPUs per replica = 4
  max: 4
resources:
  requests:
    cpu: 8    # Kubernetes CPU units requested by each replica
    memory: 4096Mi
```
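The replica count in the example above follows from simple division. The sketch below reproduces that arithmetic; the function name is illustrative, and it assumes every vCPU on the node is available for scheduling. In practice Kubernetes reserves some CPU for system components, so the allocatable amount on a real node may be lower than the nominal vCPU count.

```python
def replicas_per_node(node_vcpus, cpus_per_replica=8):
    """Number of whole replicas that fit on a node at the given CPU request."""
    if node_vcpus <= 0 or cpus_per_replica <= 0:
        raise ValueError("CPU counts must be positive")
    # Integer division: a partial replica cannot be scheduled.
    return node_vcpus // cpus_per_replica


# Standard_D32_v3 has 32 vCPUs.
print(replicas_per_node(32))
```

For node types whose vCPU count is not a multiple of 8, the remainder is left unused by this rule.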
Thread Count per Replica
- Set the environment variable OMP_NUM_THREADS to 8
This setting only improves performance if the Replica Count guidance in the previous step is applied.
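One plausible reading of this dependency, offered as an interpretation and not as a statement from HiddenLayer, is that the thread count should match the CPUs allocated to each replica so that the node is not oversubscribed. `OMP_NUM_THREADS` is the standard OpenMP variable that limits the number of threads a process uses. The helper below is hypothetical and only illustrates the arithmetic.

```python
def is_oversubscribed(node_vcpus, replicas, threads_per_replica):
    """True if the replicas together would run more threads than the node has vCPUs."""
    return replicas * threads_per_replica > node_vcpus


# Recommended setup on a 32-vCPU node: 4 replicas x 8 threads = 32 threads.
print(is_oversubscribed(32, replicas=4, threads_per_replica=8))

# Same thread count, but with more replicas packed onto the node than recommended.
print(is_oversubscribed(32, replicas=8, threads_per_replica=8))
```

Under this reading, 8 threads per replica lines up with the 8 CPU units requested per replica, which is why the setting is tied to the replica guidance.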
