
# Policy Configuration

There are different policy groups, each with specific environment variables and header keys, which can be tailored to the requirements of your organization. You can also set conviction severity levels to determine the threat level required to trigger a conviction or a block. By default, the policy is to alert only for all detections (all blocks are set to False).

In most (though not all) cases, policy configurations can also be sent at runtime via request headers. Note that headers override deployment-level policy settings, enabling unique policies for different use cases within a single LLM proxy deployment.
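
For example, when header-based policy is enabled (see HL_LLM_PROXY_ENABLE_HEADER_POLICY in the Global table), a single request can turn on blocking for one detection category by including the corresponding header key:

```
X-LLM-Block-Prompt-Injection="true"
```

Requests without this header continue to use the deployment-level policy.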

## Global

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_BLOCK_MESSAGE | n/a | The message that displays when a message is blocked. | Message was blocked. | False |
| HL_LLM_BLOCK_UNSAFE | X-LLM-Block-Unsafe | If the overall verdict is true, the message will be blocked. | False | False |
| HL_LLM_BLOCK_UNSAFE_INPUT | X-LLM-Block-Unsafe-Input | If the unsafe input verdict is true, the message will be blocked. | False | False |
| HL_LLM_BLOCK_UNSAFE_OUTPUT | X-LLM-Block-Unsafe-Output | If the unsafe output verdict is true, the message will be blocked. | False | False |
| HL_LLM_CHAT_COMPLETION_CONTEXT_WINDOW | X-LLM-Chat-Completion-Context-Window | Size of the chat completion window to analyze: LAST analyzes only the last message in a chat completion request; FULL analyzes all messages. | LAST | False |
| HL_LLM_INCLUDE_BLOCK_MESSAGE_REASONS | X-LLM-Include-Block-Message-Reasons | When enabled, the block message reasons will be included in the response. | True | False |
| HL_LLM_PROXY_ENABLE_PASSTHROUGH_STREAMING | X-LLM-Proxy-Enable-Passthrough-Streaming | When enabled, the proxy will immediately start streaming the response back to the requester. Currently available for OpenAI. | False | False |
| HL_LLM_PROXY_ENABLE_HEADER_POLICY | n/a | Enables security rules to be set per request via HTTP headers. Note: setting this to False is recommended for production environments. | True | False |
| HL_LLM_PROXY_ENABLE_UNSECURED_ROUTE_PASSTHROUGH | X-LLM-Proxy-Enable-Unsecured-Route-Passthrough | When using AIDR as a reverse proxy, all transparent upstream requests pass through for unsecured routes. | True | False |
| HL_LLM_PROXY_MAX_REQUEST_SIZE_BYTES | X-LLM-Proxy-Max-Request-Size-Bytes | The maximum size for a request or a response, in bytes. | 1000000 | False |
| n/a | x-requester-id | The ID for the requester. Takes precedence over hl-user-id, the requesting IP address (the IP address communicating directly with the AIDR endpoint), and HL_LLM_PROXY_MLDR_DEFAULT_REQUESTER. | Requesting IP | False |
| n/a | hl-user-id | The ID for the HiddenLayer user. Takes precedence over the IP address and HL_LLM_PROXY_MLDR_DEFAULT_REQUESTER. | None | False |
| HL_LLM_PROXY_MLDR_DEFAULT_REQUESTER | n/a | The ID used if no other identification for the requester is found. | Unknown | False |
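
For example, a deployment-level policy that blocks on any overall unsafe verdict and returns a custom block message could set the following (the message text is illustrative):

```
HL_LLM_BLOCK_UNSAFE="true"
HL_LLM_BLOCK_MESSAGE="This request was blocked by security policy."
```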

## Prompt Injection

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_PROMPT_INJECTION_DETECTION | X-LLM-Skip-Prompt-Injection-Detection | Flag to skip prompt injection detection. | False | False |
| HL_LLM_BLOCK_PROMPT_INJECTION | X-LLM-Block-Prompt-Injection | If the prompt injection category is true, the message will be blocked. | False | False |
| HL_LLM_PROMPT_INJECTION_SCAN_TYPE | X-LLM-Prompt-Injection-Scan-Type | Type of prompt injection scan to perform: FULL or QUICK. | FULL | False |
| HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_{{id}} | X-LLM-Prompt-Injection-Allow-{{id}} | Optional. Identifier for a custom prompt injection allow expression. Note: {{id}} must contain only alphanumeric characters without spaces. Note: the allow list takes priority over the block list. | n/a | False |
| HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_{{id}}_SUBSTRING | X-LLM-Prompt-Injection-Allow-{{id}}-Substring | The substring that indicates a prompt is benign if detected within a prompt flagged as a prompt injection, mapped to its identifier. This is a string match. Note: {{id}} must contain only alphanumeric characters without spaces. Caution: take care when creating the substring to allow; a substring that is a commonly used word or phrase could allow more than expected. | n/a | False |
| HL_LLM_PROXY_PROMPT_INJECTION_BLOCK_{{id}} | X-LLM-Prompt-Injection-Block-{{id}} | Optional. Identifier for a custom prompt injection blocklist expression. Note: {{id}} must contain only alphanumeric characters without spaces. Note: the allow list takes priority over the block list. | n/a | False |
| HL_LLM_PROXY_PROMPT_INJECTION_BLOCK_{{id}}_SUBSTRING | X-LLM-Prompt-Injection-Block-{{id}}-Substring | The substring that indicates a prompt is malicious if detected as an input, mapped to its identifier. This is a string match. Note: {{id}} must contain only alphanumeric characters without spaces. Caution: take care when creating the substring to block; a substring that is a commonly used word or phrase could block more than expected. | n/a | False |

### Prompt Injection Scan Types

  • QUICK - Runs the classifier in a single pass over 512 tokens.
  • FULL - Runs the classifier in multiple passes: certain characters are stripped and the classifier runs on each line. A FULL scan adds latency, which increases with the size of the input.
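
For example, to trade detection depth for lower latency, the scan type can be set to QUICK with either the environment key or the header key:

```
HL_LLM_PROMPT_INJECTION_SCAN_TYPE="QUICK"
X-LLM-Prompt-Injection-Scan-Type="QUICK"
```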

## Examples

The following examples show how to use keys that include variables.

### Prompt Injection Allow

The following is an example `config/values.yaml` where the prompts "digital key" and "digitaler Schlüssel" are allowed.

```yaml
namespace:
  name: aidr-genai

image:
  repository: quay.io/hiddenlayer/distro-enterprise-aidr-genai

resources:
  requests:
    cpu: 8

replicas:
  min: 1
  max: 1

config:
  HL_LICENSE: <license>

  OMP_NUM_THREADS: 8

  HL_LLM_PROXY_CLIENT_ID: <client_id>
  HL_LLM_PROXY_CLIENT_SECRET: <client_secret>
  HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_DigitalKey: "digital key"
  HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_DigitalKey_SUBSTRING: "digital key"
  HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_DigitalKeyGerman: "digitaler Schlüssel"
  HL_LLM_PROXY_PROMPT_INJECTION_ALLOW_DigitalKeyGerman_SUBSTRING: "digitaler Schlüssel"
```
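
A custom blocklist follows the same pattern with the BLOCK keys; the identifier and substring below are illustrative:

```yaml
config:
  HL_LLM_PROXY_PROMPT_INJECTION_BLOCK_IgnoreRules: "ignore all previous instructions"
  HL_LLM_PROXY_PROMPT_INJECTION_BLOCK_IgnoreRules_SUBSTRING: "ignore all previous instructions"
```

If a prompt matches entries on both lists, the allow list takes priority over the block list.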

## Denial of Service

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_INPUT_DOS_DETECTION | X-LLM-Skip-Input-DOS-Detection | Flag to skip the LLM denial of service detection. | False | False |
| HL_LLM_BLOCK_INPUT_DOS_DETECTION | X-LLM-Block-Input-DOS-Detection | If the LLM denial of service category is true, the message will be blocked. | False | False |
| HL_LLM_INPUT_DOS_DETECTION_THRESHOLD | X-LLM-Input-DOS-Detection-Threshold | Threshold for input denial of service detection. | 4096 | False |
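
For example, to block inputs that exceed a stricter threshold than the default of 4096 (the threshold value below is illustrative):

```
HL_LLM_BLOCK_INPUT_DOS_DETECTION="true"
HL_LLM_INPUT_DOS_DETECTION_THRESHOLD="2048"
```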

## Personally Identifiable Information (PII)

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_REDACT_INPUT_PII | X-LLM-Redact-Input-PII | Flag to redact input before sending to the LLM. | False | False |
| HL_LLM_SKIP_INPUT_PII_DETECTION | X-LLM-Skip-Input-PII-Detection | Flag to skip input PII detection. | False | False |
| HL_LLM_BLOCK_INPUT_PII | X-LLM-Block-Input-PII | If the input PII category is true, the message will be blocked. | False | False |
| HL_LLM_SKIP_OUTPUT_PII_DETECTION | X-LLM-Skip-Output-PII-Detection | Flag to skip output PII detection. | False | False |
| HL_LLM_BLOCK_OUTPUT_PII | X-LLM-Block-Output-PII | If the output PII category is true, the message will be blocked. | False | False |
| HL_LLM_REDACT_OUTPUT_PII | X-LLM-Redact-Output-PII | Flag to redact output before sending to the caller. | False | False |
| HL_LLM_REDACT_TYPE | X-LLM-Redact-Type | Type of redaction to perform: ENTITY or STRICT. ENTITY redacts with the entity type identified ("The company phone number is [PHONE_NUMBER]."); STRICT redacts with the word REDACTED ("The company phone number is [REDACTED]."). | ENTITY | False |
| HL_LLM_ENTITY_TYPE | X-LLM-Entity-Type | Entity group: ALL or STRICT. See LLM Entity Types for the list of available types. | STRICT | False |
| HL_LLM_PROXY_PII_ALLOW_{{id}} | X-LLM-PII-Allow-{{id}} | Optional. Identifier for a custom PII allowlist expression. Note: {{id}} must contain only alphanumeric characters without spaces. Note: the allow list takes priority over the block list. | n/a | False |
| HL_LLM_PROXY_PII_ALLOW_{{id}}_EXPRESSION | X-LLM-Proxy-PII-Allow-{{id}}-Expression | The expression to allow if detected as PII, mapped to its identifier. This is a string match. Note: {{id}} must contain only alphanumeric characters without spaces. Note: the allow list takes priority over the block list. | n/a | False |
| HL_LLM_PROXY_PII_CUSTOM_{{name}} | X-LLM-PII-Custom-{{name}} | Name of a custom PII recognizer. If a name is supplied, an expression must also be provided under the same name. | None | False |
| HL_LLM_PROXY_PII_CUSTOM_{{name}}_ENTITY | X-LLM-PII-Custom-{{name}}-Entity | The entity to replace custom PII with, if found. | {{REDACTED}} | False |
| HL_LLM_PROXY_PII_CUSTOM_{{name}}_EXPRESSION | X-LLM-PII-Custom-{{name}}-Expression | The regex expression used to find custom PII. | None | False |
| HL_LLM_OVERRIDE_INPUT_PII_ENTITIES | X-LLM-Override-Input-PII-Entities | Overrides the list of input PII entities to look for in the text. | None | False |
| HL_LLM_OVERRIDE_OUTPUT_PII_ENTITIES | X-LLM-Override-Output-PII-Entities | Overrides the list of output PII entities to look for in the text. | None | False |
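
As an illustration, a custom recognizer for an internal employee ID format might be configured as follows; the name, entity label, and regex are hypothetical:

```yaml
config:
  HL_LLM_PROXY_PII_CUSTOM_EmployeeId: "EmployeeId"
  HL_LLM_PROXY_PII_CUSTOM_EmployeeId_ENTITY: "[EMPLOYEE_ID]"
  HL_LLM_PROXY_PII_CUSTOM_EmployeeId_EXPRESSION: "EMP-[0-9]{6}"
```

With redaction enabled, a matching value such as EMP-123456 would then be replaced with [EMPLOYEE_ID].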

## LLM Entity Types

### ALL

PERSON
LOCATION
ORGANIZATION
EMAIL_ADDRESS
CREDIT_CARD
PHONE_NUMBER
IP_ADDRESS
NATIONAL_ID
IBAN_CODE
US_BANK_NUMBER
UK_NATIONAL_INSURANCE_NUMBER
UK_PASSPORT
DOMAIN_NAME
URL
US_DRIVER_LICENSE
US_PASSPORT
US_SSN
US_ITIN
US_ABA_ROUTING_TRANSIT_NUMBER
US_HEALTHCARE_NPI

### STRICT

PHONE_NUMBER
ORGANIZATION
CREDIT_CARD
IP_ADDRESS
NATIONAL_ID
IBAN_CODE
US_BANK_NUMBER
UK_NATIONAL_INSURANCE_NUMBER
UK_PASSPORT
DOMAIN_NAME
US_DRIVER_LICENSE
US_PASSPORT
US_SSN
US_ITIN
US_ABA_ROUTING_TRANSIT_NUMBER
US_HEALTHCARE_NPI

## Code Detection

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_INPUT_CODE_DETECTION | X-LLM-Skip-Input-Code-Detection | Flag to skip input code detection. | False | False |
| HL_LLM_BLOCK_INPUT_CODE_DETECTION | X-LLM-Block-Input-Code-Detection | If the input code detection category is true, the message will be blocked. | False | False |
| HL_LLM_SKIP_OUTPUT_CODE_DETECTION | X-LLM-Skip-Output-Code-Detection | Flag to skip output code detection. | False | False |
| HL_LLM_BLOCK_OUTPUT_CODE_DETECTION | X-LLM-Block-Output-Code-Detection | If the output code detection category is true, the message will be blocked. | False | False |
| HL_LLM_TIMEOUT_INPUT_CODE_SECONDS | X-LLM-Timeout-Input-Code-Seconds | The number of seconds the code detector may run on an individual request before timing out. Accepted value: an integer (for example, 10). | False | False |
| HL_LLM_TIMEOUT_INPUT_CODE_IS_DETECTION | X-LLM-Timeout-Input-Code-Is-Detection | Whether a code detector timeout is considered a positive detection for code. Accepted values: true or false. | False | False |
| HL_LLM_TIMEOUT_OUTPUT_CODE_SECONDS | X-LLM-Timeout-Output-Code-Seconds | The number of seconds the code detector may run on an individual request before timing out. Accepted value: an integer (for example, 10). | False | False |
| HL_LLM_TIMEOUT_OUTPUT_CODE_IS_DETECTION | X-LLM-Timeout-Output-Code-Is-Detection | Whether a code detector timeout is considered a positive detection for code. Accepted values: true or false. | False | False |
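
For example, to cap the input code detector at 10 seconds per request and treat a timeout as a positive code detection:

```
HL_LLM_TIMEOUT_INPUT_CODE_SECONDS="10"
HL_LLM_TIMEOUT_INPUT_CODE_IS_DETECTION="true"
```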

## Guardrail

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_GUARDRAIL_DETECTION | X-LLM-Skip-Guardrail-Detection | Flag to skip guardrail detection. | False | False |
| HL_LLM_SKIP_GUARDRAIL_CLASSIFICATION_DETECTION | | Flag to skip guardrail classification. | False | False |
| HL_LLM_BLOCK_GUARDRAIL_DETECTION | X-LLM-Block-Guardrail-Detection | If the guardrail detection category is true, the message will be blocked. | False | False |

## Language Detection

Attackers attempting prompt injection may use multiple languages. HiddenLayer's language detector provides more visibility into your AI usage and helps control potentially malicious behavior. It runs on input prompts only.

The language detector has two components:

  • When enabled, it predicts whether a prompt is in one of the top 20 most spoken languages, or returns unknown.
  • It lets you select a set of allowed languages, so that only those languages are let through.

### Supported languages

HiddenLayer's prompt injection model is trained and evaluated for seven languages: English, French, German, Italian, Japanese, Korean, and Spanish.

Using language detection can block all unsupported languages, providing an extra layer of security against prompt injection.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_INPUT_LANGUAGE_DETECTION | X-LLM-Skip-Input-Language-Detection | Skips input language detection and allows the prompt to be analyzed. | True | False |
| HL_LLM_BLOCK_INPUT_LANGUAGE_DETECTION | X-LLM-Block-Input-Language-Detection | Blocks inputs that are not on the allowed language list. See HL_LLM_INPUT_ALLOWED_LANGUAGES for allowed languages. | False | False |
| HL_LLM_INPUT_ALLOWED_LANGUAGES | X-LLM-Input-Allowed-Languages | Allows the listed languages to be analyzed by the prompt injection model. When multiple languages are included in the input, AIDR chooses a language it identifies as principal and classifies the input as such. The default values are en, es, fr, it, de, ja, ko (English, Spanish, French, Italian, German, Japanese, and Korean), so not setting this variable means the default languages are used. See the language codes below. | n/a | False |

Available language codes:

  • AR - Arabic
  • BN - Bengali
  • DE - German
  • EN - English
  • ES - Spanish
  • FR - French
  • HI - Hindi
  • ID - Indonesian
  • IT - Italian
  • JA - Japanese
  • KO - Korean
  • MR - Marathi
  • PA - Punjabi
  • PT - Portuguese
  • RU - Russian
  • TA - Tamil
  • TE - Telugu
  • TR - Turkish
  • UR - Urdu
  • VI - Vietnamese
  • ZH - Chinese

## Examples

### Allowed languages

Example environment keys:

```
HL_LLM_SKIP_INPUT_LANGUAGE_DETECTION="false"
HL_LLM_BLOCK_INPUT_LANGUAGE_DETECTION="false"
HL_LLM_INPUT_ALLOWED_LANGUAGES="en, es, fr, it, de, ja, ko, ar, bn, hi, id, mr, pa, pt, ru, ta, te, tr, ur, vi, zh"
```

Example header keys:

```
X-LLM-Skip-Input-Language-Detection="false"
X-LLM-Block-Input-Language-Detection="false"
X-LLM-Input-Allowed-Languages="en, es, fr, it, de, ja, ko, ar, bn, hi, id, mr, pa, pt, ru, ta, te, tr, ur, vi, zh"
```

### Block input language detection

Example environment key:

```
HL_LLM_BLOCK_INPUT_LANGUAGE_DETECTION="true"
```

Example header key:

```
X-LLM-Block-Input-Language-Detection="true"
```

### Skip input language detection

Example environment key:

```
HL_LLM_SKIP_INPUT_LANGUAGE_DETECTION="true"
```

Example header key:

```
X-LLM-Skip-Input-Language-Detection="true"
```

## URL Detection

By default, the policy will be alert only for all detections.

Most configuration settings are true or false, with false being the default setting. For configurations with different settings, the settings are identified in the Description.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_SKIP_INPUT_URL_DETECTION | X-LLM-Skip-Input-URL-Detection | Flag to skip input URL detection. | False | False |
| HL_LLM_SKIP_OUTPUT_URL_DETECTION | X-LLM-Skip-Output-URL-Detection | Flag to skip output URL detection. | False | False |

## Conviction Severity Level

With these variables, you can set the threat level required for the model to convict.

| Environment Key | Header Key | Description | Default | Required |
| --- | --- | --- | --- | --- |
| HL_LLM_PROXY_CONVICTION_SEVERITY_GUARDRAIL | X-LLM-Conviction-Severity-Guardrail | Sets the severity for the Guardrail conviction category. Accepted values: "Low", "Medium", "High". | "Low" | False |
| HL_LLM_PROXY_CONVICTION_SEVERITY_DATA_LEAKAGE | X-LLM-Conviction-Severity-Data-Leakage | Sets the severity for the Data Leakage conviction category. Accepted values: "Low", "Medium", "High". | "Medium" | False |
| HL_LLM_PROXY_CONVICTION_SEVERITY_PROMPT_INJECTION | X-LLM-Conviction-Severity-Prompt-Injection | Sets the severity for the Prompt Injection conviction category. Accepted values: "Low", "Medium", "High". | "High" | False |
| HL_LLM_PROXY_CONVICTION_SEVERITY_DENIAL_OF_SERVICE | X-LLM-Conviction-Severity-Denial-Of-Service | Sets the severity for the Denial-of-Service conviction category. Accepted values: "Low", "Medium", "High". | "High" | False |
| HL_LLM_PROXY_CONVICTION_SEVERITY_MODALITY_RESTRICTION | X-LLM-Conviction-Severity-Modality-Restriction | Sets the severity for the Modality Restriction conviction category. Accepted values: "Low", "Medium", "High". | "Medium" | False |
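
For example, to require a High threat level before the Guardrail category convicts, while leaving the other categories at their defaults:

```
HL_LLM_PROXY_CONVICTION_SEVERITY_GUARDRAIL="High"
```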