System Rules and Default Policy

HiddenLayer provides a set of built-in detection rules (System Rules) and a Default Policy that is provisioned automatically on first use. Neither requires any configuration.

System Rules

System Rules are authored and maintained by HiddenLayer and are available to all tenants before any requests are made. They appear in the Detection Rules list alongside tenant-created rules and can be referenced in any policy. System Rules cannot be edited or deleted, but they can be removed from any policy other than the Default Policy, which cannot be modified.

Rule	Severity	Default Action	What it detects
[SYSTEM] Prompt Injection	High	Detect	Attempts to override, manipulate, or bypass intended instructions across prompts, model outputs, and tool interactions. Covers direct injection (adversarial user input) and indirect injection (malicious content delivered via tool return values, retrieved documents, or other agent context).
[SYSTEM] URL or Hyperlink Presence	Low	Detect	URLs and hyperlinks across prompts, model outputs, and tool interactions.
[SYSTEM] Source Code Presence	Low	Detect	Source code across prompts, model outputs, and tool interactions.

Severities reflect baseline harm potential in general agentic contexts. In deployments where outbound URLs or source code in tool calls are higher-value signals (for example, coding agents or data-egress-sensitive workflows), tune severity and enforcement action at the policy level.

Prompt Injection ships with a Detect default rather than Block to support baselining during initial deployment. The Default Policy itself is fixed; customers who want Block (or other) behavior should create a new policy that references the same System Rule with the desired enforcement action, then assign that policy to the relevant projects.

Default Policy

The Default Policy is created per tenant on the first API request, at which point the System Rules are automatically associated with it. Each project has exactly one policy in effect: the Default Policy applies unless another policy is explicitly assigned.
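The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the resolution logic, not HiddenLayer's API: the function name, the policy identifiers, and the assignment mapping are all hypothetical.

```python
from typing import Mapping

# Hypothetical identifier for the tenant's Default Policy (an assumption).
DEFAULT_POLICY = "default-policy"


def effective_policy(project_id: str, assignments: Mapping[str, str]) -> str:
    """Return the single policy in effect for a project.

    `assignments` maps project IDs to explicitly assigned policy IDs.
    A project with no explicit assignment falls back to the Default Policy.
    """
    return assignments.get(project_id, DEFAULT_POLICY)


# Hypothetical project and policy names, for illustration only.
assignments = {"checkout-agent": "strict-block-policy"}
print(effective_policy("checkout-agent", assignments))  # explicitly assigned policy
print(effective_policy("new-project", assignments))     # falls back to the default
```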

Detection Rules

The three System Rules above are associated with the Default Policy on creation, with Default Action set to Detect (activity is flagged but not blocked).

PII Redaction

The following entity types are redacted using the mask strategy (detected values are replaced with a placeholder):

EMAIL_ADDRESS
PHONE_NUMBER
CREDIT_CARD
US_SSN
IP_ADDRESS
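To make the mask strategy concrete, here is a minimal sketch that replaces detected values with a placeholder. It assumes simple regular expressions for two of the entity types and a "<ENTITY_TYPE>" placeholder format; neither the patterns nor the placeholder format come from HiddenLayer, and the product's actual detectors are likely more sophisticated.

```python
import re

# Illustrative patterns only (assumptions); they cover two of the five
# entity types and are far simpler than a production detector.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask(text: str) -> str:
    """Replace each detected value with a placeholder naming its entity type."""
    for entity_type, pattern in PATTERNS.items():
        # The "<ENTITY_TYPE>" placeholder format is an assumption.
        text = pattern.sub(f"<{entity_type}>", text)
    return text


print(mask("Contact jane@example.com, SSN 123-45-6789."))
# prints: Contact <EMAIL_ADDRESS>, SSN <US_SSN>.
```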

Secrets Redaction

Values detected as the GENERIC_API_KEY entity type are redacted: each detected value is replaced with a placeholder.

Customizing Enforcement

System Rules are detection capabilities; policies bind rules to enforcement actions. To change how a System Rule is enforced (for example, Block instead of Detect, or Redact for specific entity types), create a new policy that references the same System Rule with the desired action, and assign that policy to the projects where it should apply. The original System Rule and the Default Policy are unchanged; the new behavior applies only where the new policy is assigned.
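The relationship between rules, policies, and enforcement actions can be modeled as follows. This is a conceptual sketch rather than HiddenLayer's data model or SDK: the Policy class, the derive_policy helper, and the policy and project names are hypothetical. Only the rule names and the Detect and Block actions come from this page.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping

PROMPT_INJECTION = "[SYSTEM] Prompt Injection"
URL_PRESENCE = "[SYSTEM] URL or Hyperlink Presence"
SOURCE_CODE = "[SYSTEM] Source Code Presence"


@dataclass(frozen=True)
class Policy:
    """Hypothetical model: a policy binds rule names to enforcement actions."""
    name: str
    actions: Mapping[str, str]


# The Default Policy binds all three System Rules to Detect and is read-only.
DEFAULT_POLICY = Policy(
    "Default Policy",
    MappingProxyType({
        PROMPT_INJECTION: "Detect",
        URL_PRESENCE: "Detect",
        SOURCE_CODE: "Detect",
    }),
)


def derive_policy(base: Policy, name: str, overrides: Mapping[str, str]) -> Policy:
    """Build a new policy that references the same rules with different actions."""
    merged = dict(base.actions)
    merged.update(overrides)
    return Policy(name, MappingProxyType(merged))


# A new policy that blocks prompt injection; the Default Policy is untouched.
strict = derive_policy(DEFAULT_POLICY, "Strict Policy", {PROMPT_INJECTION: "Block"})

# The new behavior applies only to projects the new policy is assigned to.
project_policies = {"coding-agent": strict}
```

In this sketch the override produces a separate policy object, mirroring the point above that the System Rule and the Default Policy are unchanged.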

System Rule Deprecation

If HiddenLayer improves a System Rule (for example, with a better prompt injection detector), existing tenants inherit the update automatically and the rule's identity in policies is preserved. If a System Rule is deprecated, it is excluded from newly created Default Policies; existing policies that already reference it are unaffected until the customer removes it.
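The deprecation behavior can be illustrated with a small sketch. It assumes a hypothetical rule catalog with a deprecated flag and hypothetical rule IDs; it is meant only to show that deprecation affects which rules a newly created Default Policy receives, while policies that already reference the rule keep it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemRule:
    """Hypothetical catalog entry; rule_id stays stable across updates."""
    rule_id: str
    deprecated: bool = False


def rules_for_new_default_policy(catalog: List[SystemRule]) -> List[str]:
    """Return the rule IDs a newly created Default Policy would reference."""
    return [rule.rule_id for rule in catalog if not rule.deprecated]


catalog = [
    SystemRule("prompt-injection"),
    SystemRule("url-presence"),
    SystemRule("source-code"),
]

# A policy created before the deprecation keeps its rule references.
existing_policy_rules = rules_for_new_default_policy(catalog)

# Deprecating a rule changes only what later-created Default Policies receive.
catalog[1].deprecated = True
new_policy_rules = rules_for_new_default_policy(catalog)

print(existing_policy_rules)  # still includes "url-presence"
print(new_policy_rules)       # excludes "url-presence"
```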

The Default Policy cannot be deleted or modified. To apply different rules, enforcement actions, or redaction settings, create a new policy and assign it to the relevant projects. New projects without an explicitly assigned policy continue to inherit the Default Policy.