Deploy AI Runtime Security using Helm

Deploy AI Runtime Security to a Kubernetes cluster using Helm.

Prerequisites

Access to a Kubernetes cluster
kubectl — Kubernetes command-line tool
Helm — Kubernetes package manager (v3+)
Resource Requirements — license keys, tools, and scaling guidance
Hybrid and Disconnected Modes — connection mode details

Create a Helm Values File

Create a values.yaml file to override the chart's default settings. Application configuration goes inside config.settings.yaml as an embedded YAML string block. Secrets (auth credentials and license) go in the env: block.

Handling Secrets

Sensitive values in env: should use the secret: prefix followed by the base64-encoded value (e.g., the output of echo -n "my-value" | base64). The chart automatically creates a Kubernetes Secret and injects it via secretKeyRef, so the plaintext value never appears in the pod spec.
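
As a rough sketch of the encoding step, the helper below turns a plaintext value into the secret:-prefixed form. The function name to_secret_value is hypothetical and not part of the chart. It strips newlines because GNU base64 wraps long output at 76 characters, which would otherwise split a long license key across lines.

```shell
# Hypothetical helper (not part of the chart): print a value in the
# "secret:<base64>" form used in the env: block.
to_secret_value() {
  # printf '%s' avoids the trailing newline a plain echo would add;
  # tr removes the line wraps GNU base64 inserts every 76 characters.
  printf 'secret:%s\n' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}

to_secret_value "my-value"   # prints secret:bXktdmFsdWU=
```

Paste the printed string, including the secret: prefix, as the value of the corresponding key under env:.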

Choose the connection mode for your deployment. See Hybrid and Disconnected Modes for details on what data is sent in each mode.

In Hybrid mode, metadata per inference is sent to the HiddenLayer Console to power visualizations and alerting. This requires authentication credentials.

By default, prompts and responses are also sent to the Console so you can review Interactions in context. To disable prompt collection, set log-chat-context to false under aidr-genai.detector.engine.

aidr_genai:
  config:
    settings.yaml: |
      platform:
        auth-n:
          base-url: "https://auth.hiddenlayer.ai"
        api-connection:
          type: "hybrid"
          base-url: "https://api.us.hiddenlayer.ai"  # US region
          # base-url: "https://api.eu.hiddenlayer.ai"  # EU region
      aidr-genai:
        proxy:
          log-level: "info"
          device:
            type: "cpu"
        detector:
          engine:
            log-chat-context: true  # Set to false to disable sending prompts to the HL Console
  env:
    HL_LLM_PROXY_CLIENT_ID: "secret:<base64-encoded-client-id>"
    HL_LLM_PROXY_CLIENT_SECRET: "secret:<base64-encoded-client-secret>"
    HL_LICENSE: "secret:<base64-encoded-license-key>"

Deployment

Log In to the HiddenLayer Helm Registry

Run the following command in a terminal to log in to the HiddenLayer registry.
- The username is your Registry username.
- The password is your License ID.
- For more information, see Resource Requirements.
```
helm registry login registry.hiddenlayer.ai --username <email specified for registry> --password <License ID>
```
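
Passing the License ID with --password can leave it in your shell history. As one possible alternative, the sketch below pipes it to helm registry login through the --password-stdin flag. The function name registry_login and the variables HL_REGISTRY_USER and HL_LICENSE_ID are hypothetical names chosen for this example.

```shell
# Sketch: log in without putting the License ID on the command line.
# registry_login, HL_REGISTRY_USER, and HL_LICENSE_ID are hypothetical names.
registry_login() {
  user="${1:-$HL_REGISTRY_USER}"
  if [ -z "${HL_LICENSE_ID:-}" ]; then
    echo "HL_LICENSE_ID is not set" >&2
    return 1
  fi
  # The License ID is read from stdin instead of being passed as an argument.
  printf '%s' "$HL_LICENSE_ID" |
    helm registry login registry.hiddenlayer.ai \
      --username "$user" --password-stdin
}

# Usage (assumes both variables are exported in your shell):
#   registry_login
```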

Using Your Own Repository

To host the images in your own registry, follow the steps below to pull, tag, and push them. Skip this section if it does not apply to you.

Download AIDR Images

Run the following commands in the terminal to download the AIDR images.

Docker Command Fails

If a Docker command such as docker pull fails with a permission denied message, run it with sudo (for example, sudo docker pull).

Alternatively, add your user to the docker group so that it can reach the Docker daemon without sudo.

Run the following command in a terminal to log in to the HiddenLayer image registry.
- The username is the Registry Username.
- The password is the License ID.
```
docker login images.hiddenlayer.ai --username <email specified for registry> --password <License ID>
```

Run each of the following commands to pull the images.

docker pull --platform linux/amd64 images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/distro-enterprise-aidr-genai:26.5.0

docker pull --platform linux/amd64 images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/replicated-sdk-image:v1.14.0
docker pull --platform linux/amd64 images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/replicated-license-enforcer:0.6.0

Tag the images to a private registry. Replace %YOUR-REGISTRY% with your private registry information.

docker tag images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/distro-enterprise-aidr-genai:26.5.0 %YOUR-REGISTRY%/distro-enterprise-aidr-genai:26.5.0

docker tag images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/replicated-sdk-image:v1.14.0 %YOUR-REGISTRY%/replicated-sdk-image:v1.14.0
docker tag images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering/replicated-license-enforcer:0.6.0 %YOUR-REGISTRY%/replicated-license-enforcer:0.6.0

Push the images to a private registry. Replace %YOUR-REGISTRY% with your private registry information. Note: Make sure you are logged in to your private registry before pushing the images.
```
docker push %YOUR-REGISTRY%/distro-enterprise-aidr-genai:26.5.0
docker push %YOUR-REGISTRY%/replicated-sdk-image:v1.14.0
docker push %YOUR-REGISTRY%/replicated-license-enforcer:0.6.0
```
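
The pull, tag, and push steps above could also be scripted as a loop. The following is a minimal sketch: the function name mirror_images and the example destination registry.example.com/hiddenlayer are hypothetical, while the image names and tags are copied from the commands above.

```shell
# Sketch: mirror the three images to a private registry.
# mirror_images and its argument are hypothetical; image names and tags
# are taken from the pull commands in this section.
SRC="images.hiddenlayer.ai/proxy/aidr-genai/ghcr.io/hiddenlayer-engineering"
IMAGES="distro-enterprise-aidr-genai:26.5.0
replicated-sdk-image:v1.14.0
replicated-license-enforcer:0.6.0"

mirror_images() {
  dest="$1"   # for example: registry.example.com/hiddenlayer
  for image in $IMAGES; do
    # Stop at the first failure so a partial mirror does not go unnoticed.
    docker pull --platform linux/amd64 "$SRC/$image" || return 1
    docker tag "$SRC/$image" "$dest/$image" || return 1
    docker push "$dest/$image" || return 1
  done
}

# Usage, after logging in to both registries:
#   mirror_images "registry.example.com/hiddenlayer"
```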

Install

Create a values.yaml file to customize the installation, as described in Create a Helm Values File above.

Run the following command to deploy AI Runtime Security.

helm upgrade --install aidr-genai oci://registry.hiddenlayer.ai/aidr-genai/stable/aidr-genai \
  --namespace aidr-genai --create-namespace \
  -f values.yaml

Verify the Deployment

Check that all pods are running:
```
kubectl get pods -n aidr-genai
```

Port-forward the service to your local machine:

kubectl port-forward svc/aidr-genai 8000:80 -n aidr-genai

Verify the health endpoint:
```
curl http://localhost:8000/health
```
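
Pods can take some time to become ready, so the first health request may fail. As a sketch of one way to handle this, the helper below retries the health check until it succeeds. The name wait_for_health is hypothetical, and the sketch assumes the endpoint returns a success status once the service is healthy.

```shell
# Sketch: poll a health URL until it responds successfully.
# wait_for_health is a hypothetical helper. Arguments: the URL, the number
# of attempts (default 30), and the delay in seconds between attempts
# (default 2).
wait_for_health() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (status 400 and above).
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no healthy response after $attempts attempts" >&2
  return 1
}

# Usage, with the port-forward from the previous step running in
# another terminal:
#   wait_for_health http://localhost:8000/health
```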

Using the Interactions Endpoint

Once deployed, you can analyze LLM input and output by sending requests to the Interactions endpoint:

curl -X POST http://<service-endpoint>:8000/detection/v1/interactions \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "model": "gpt-5",
      "requester_id": "user-1234",
      "provider": "openai"
    },
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "What is the largest moon of Jupiter?"
        }
      ]
    },
    "output": {
      "messages": [
        {
          "role": "assistant",
          "content": "The largest moon of Jupiter is Ganymede."
        }
      ]
    }
  }'

For SDK examples and the full response format, see Getting Started with Interactions.
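
For scripted checks, it can be easier to keep the request body in a file and capture the HTTP status separately from the response body. The sketch below shows one way to do that. The function name post_interaction and the file names are hypothetical; the endpoint path is the one used in the request above.

```shell
# Sketch: POST a saved payload and print only the HTTP status code.
# post_interaction and the file names are hypothetical.
post_interaction() {
  endpoint="$1"                        # for example: http://localhost:8000
  payload_file="$2"                    # JSON body, as in the example above
  response_file="${3:-response.json}"  # where the response body is written
  # -o writes the body to a file; -w prints the status code to stdout.
  curl -sS -o "$response_file" -w '%{http_code}' \
    -X POST "$endpoint/detection/v1/interactions" \
    -H "Content-Type: application/json" \
    --data-binary @"$payload_file"
}

# Usage:
#   status=$(post_interaction http://localhost:8000 interaction.json)
#   echo "HTTP $status"; cat response.json
```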

Additional Configuration

The following sections cover additional configuration beyond the baseline. The examples in Create a Helm Values File include working defaults for all of these — adjust as needed after verifying your deployment.

Device Configuration

The aidr-genai.proxy.device.type setting in config.settings.yaml controls which hardware is used for running the ML-based detection models (e.g., prompt injection classifier). It does not affect LLM provider routing.

CPU is the default device type. Allocate 8 CPU units per replica for production workloads.

aidr_genai:
  config:
    settings.yaml: |
      aidr-genai:
        proxy:
          device:
            type: "cpu"

Horizontal Autoscaling

The chart creates a Horizontal Pod Autoscaler (HPA) when resources.targetUtilization.cpu and resources.requests.cpu are set. Use replicas.min and replicas.max to control scaling bounds.

Allocate 8 CPUs per replica and set OMP_NUM_THREADS: 8 to match, so that the thread count equals the CPUs available to each pod. Scale replicas to fill node capacity; for example, a 32-vCPU node fits 4 replicas. See Resource Requirements for detailed scaling guidance.

aidr_genai:
  resources:
    targetUtilization:
      cpu: 75
    requests:
      cpu: 8
      memory: 4096Mi
    limits:
      memory: 4096Mi
  replicas:
    min: 2
    max: 8
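
The sizing rule above is simple integer division. As a worked sketch, the hypothetical helper replicas_per_node below computes how many replicas fit on one node. It ignores CPU reserved for the kubelet and system pods, so treat the result as an upper bound.

```shell
# Sketch: replicas that fit on one node at 8 CPUs per replica.
# replicas_per_node is a hypothetical helper; it does not account for
# CPU reserved by the kubelet or system pods.
replicas_per_node() {
  node_vcpus="$1"; cpus_per_replica="${2:-8}"
  echo $((node_vcpus / cpus_per_replica))
}

replicas_per_node 32   # prints 4
```

After installing, kubectl get hpa -n aidr-genai shows whether the autoscaler was created.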

Detection Policy

Configure detection policy inside the config.settings.yaml block:

aidr_genai:
  config:
    settings.yaml: |
      aidr-genai:
        detector:
          prompt-injection:
            enabled: true
            severity: high
            on-alert:
              proxy-action: allow  # allow | block
          personally-identifiable-information:
            enabled: true
            on-alert:
              proxy-action: allow
              redaction-type: replace-with-entity
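
To apply a policy change, re-run the install command with the updated values file. Because settings.yaml is a single string value, Helm replaces it as a whole rather than merging individual keys, so keep every setting in one settings.yaml block instead of spreading them over several values files. The sketch below wraps the install command from this page; the function name apply_values is hypothetical.

```shell
# Sketch: re-apply a values file, then print the values stored with the
# release. apply_values is a hypothetical helper; the release name, chart
# URL, and namespace are the ones used in the install command.
apply_values() {
  values_file="${1:-values.yaml}"
  helm upgrade --install aidr-genai \
    oci://registry.hiddenlayer.ai/aidr-genai/stable/aidr-genai \
    --namespace aidr-genai --create-namespace \
    -f "$values_file" || return 1
  helm get values aidr-genai --namespace aidr-genai
}

# Usage:
#   apply_values values.yaml
```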