
Red Team Evaluation

Red Team Evaluation simulates real-world adversarial attacks against your AI system by generating prompts based on the APE objectives and techniques.
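
To make the generation step concrete, the sketch below pairs objectives with techniques to produce seed attack prompts. It is illustrative only: the objective and technique names and the build_attack_prompt helper are hypothetical placeholders, not the console's actual APE catalog or API.

```python
from itertools import product

# Illustrative placeholders only: these are not the console's actual
# APE objectives or techniques.
objectives = ["reveal the system prompt", "produce disallowed content"]
techniques = ["role-play framing", "payload obfuscation"]

def build_attack_prompt(objective: str, technique: str) -> str:
    """Combine one objective with one technique into a seed attack prompt."""
    return f"Using {technique}, try to get the assistant to {objective}."

# Each objective/technique pair becomes one adversarial seed prompt that the
# simulator sends to the target model in its own attack session.
attack_prompts = [build_attack_prompt(o, t) for o, t in product(objectives, techniques)]
for prompt in attack_prompts:
    print(prompt)
```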

Prerequisites

Before starting a Red Team evaluation, you should have:

  • A system prompt to test
  • The model(s) that power the application

Best Practices

  • Compare Versions: Run evaluations on both original and enhanced prompts to measure improvement.
  • Review Failed Attacks: Understanding why attacks failed is as important as knowing which succeeded.
  • Use Appropriate Models: Match the target model to what you're actually using in production.
Red Team Table

Run New Evaluation

  1. In the Console, select Security Testing > Red Teaming.

  2. Click Run New Evaluation. The Create Red Team Evaluation slide-out displays.

  3. Enter a name for the evaluation.

  4. Enter the target system prompt.

  5. Select a target model.

    • Select a model similar to the one you use in your environment so that the simulated attacks against your system prompt reflect real-world behavior.
    • Disclaimer: Models marked with a beta designation may have lower usage quotas, limited availability, or ongoing development changes. As a result, they may produce unexpected results, reduced performance, or intermittent failures during testing. Account for these limitations when selecting beta models.
    Red Team Create Evaluation
  6. Optionally, click Advanced Options to expand the section.

    • Select a project to apply its runtime rulesets to interaction tagging.

      • If no project is selected, the default project is used.
    • Select an execution strategy.

      • Single: Runs each technique once per objective.

      • Random: Runs all techniques plus N additional random techniques.

        • Select the number of additional random techniques.
        Red Team Evaluation Random
      • Static prompt set: Uses a predefined set of static prompts for evaluation.

        • Select the prompt set from the drop-down list.
        Red Team Evaluation Static Prompt Set
    • Set the maximum number of conversation turns allowed per technique when attempting to achieve an objective. The minimum is one and the maximum is five.

      • The attack simulator runs a multi-turn conversation, attempting the objective against the target for up to N turns before moving on to the next session (see the loop sketch after these steps).
      • Note: If you selected the Static prompt set execution strategy, Attacker Max Turns to Complete Objective is not available.
    • Set the number of independent sessions to run for each technique. The minimum is one and the maximum is five.

      • This is the number of times you want to run the same technique or static prompt.
  7. Click Start Evaluation.

  8. When the evaluation completes, click the green arrow to view the results. See Red Team Evaluation Summary for more information.
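
The loop below sketches how the execution strategy settings, Attacker Max Turns, and session count interact. It is a simplified illustration under stated assumptions: RedTeamConfig, send_attack_turn, and objective_achieved are invented stand-ins for the console's internal simulator, not a published API.

```python
from dataclasses import dataclass
import random

@dataclass
class RedTeamConfig:
    execution_strategy: str = "single"   # "single", "random", or "static_prompt_set"
    max_turns_per_objective: int = 3     # Attacker Max Turns: 1-5 (not used for static prompt sets)
    sessions_per_technique: int = 2      # independent sessions per technique: 1-5

def send_attack_turn(technique: str, objective: str, history: list[str]) -> str:
    """Stub for one attacker turn against the target model (hypothetical)."""
    return f"[turn {len(history) + 1}] {technique} -> {objective}"

def objective_achieved(reply: str) -> bool:
    """Stub success check; a real simulator would score the target's reply."""
    return random.random() < 0.2

def run_evaluation(config: RedTeamConfig, techniques: list[str], objective: str) -> None:
    for technique in techniques:
        # Each technique is retried in N independent sessions.
        for session in range(1, config.sessions_per_technique + 1):
            history: list[str] = []
            # Within a session the attacker gets up to max_turns attempts,
            # then the simulator moves on to the next session.
            for _ in range(config.max_turns_per_objective):
                reply = send_attack_turn(technique, objective, history)
                history.append(reply)
                if objective_achieved(reply):
                    break
            print(f"{technique}, session {session}: {len(history)} turn(s)")

run_evaluation(RedTeamConfig(), ["role-play framing"], "reveal the system prompt")
```

The key point is the nesting: each technique gets its own independent sessions, and each session gives the attacker at most the configured number of turns before the simulator moves on.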

Red Team Table Descriptions

  • Name: The name of the evaluation.
  • Start Time: The date and time the evaluation started.
  • End Time: The date and time the evaluation ended.
  • Elapsed: The time the evaluation took to complete, in hh:mm:ss.
  • Actions: The actions available for the evaluation.
    • View Summary: View the Evaluation Summary slide-out.
    • Download CSV: Download the evaluation summary as a CSV file (see the parsing sketch below).
  • Status: The status of the evaluation. Statuses: Completed, Failed.
  • View Status (green arrow): Click to go to the Evaluation Summary page. This page contains Metrics, Interactions, and Config data. See Evaluation Summary for more information.
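
After downloading the evaluation summary as a CSV, a short script can aggregate the results, for example counting attacks per status. The sketch below assumes a status column; the actual column names in your export may differ, so check the header row first.

```python
import csv
from collections import Counter

def summarize(csv_path: str) -> Counter:
    """Count rows per status so successful and failed attacks can be compared."""
    counts: Counter = Counter()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # "status" is an assumed column name; adjust to match your export.
            counts[row.get("status", "unknown")] += 1
    return counts

if __name__ == "__main__":
    print(summarize("red_team_evaluation.csv"))
```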

Browse Prompt Sets

Clicking Browse Prompt Sets takes you to the Red Team Prompt Sets tab. See Red Team Prompt Sets for more information.

Filter Results

  1. Click Filter. The Filters slide-out displays.
  2. Select the statuses you want to view.
  3. Click Show Results.