AGENTIC AI AND RAG SECURITY

Prompt-Injection Sandbox.

Additional page sections

Classifies suspicious instruction patterns in test text and maps them to defensive containment recommendations.

Version 2.4 Released Protected engine Prompt-injection defensive test case
PURPOSE

Decision supported.

Classifies suspicious instruction patterns in test text and maps them to defensive containment recommendations.

Intended user

research, assurance and technical review teams

Output status

Preliminary outputHuman review requiredNot certification
USE CASES

Where this instrument fits.

  • Train teams on instruction hierarchy risks
  • Create safe defensive test cases
  • Identify containment controls for untrusted content
  • Document prompt-injection residual risk
INPUTS

Required input fields.

  • Test input (optional)
  • Context (required): Direct chat prompt, Retrieved RAG content, Tool-use instruction, Email or document ingestion
  • Current controls (required): Strict source isolation and tool policy, Partial sanitization or filtering, Weak or unclear
  • Downstream action (required): No action, Draft-only, Tool call possible, External action possible

Data handling: this interface uses the L2ET protected same-origin instrument engine. Do not enter confidential, regulated, privileged, incident, medical or sensitive operational data.

METHOD

Console Lab logic.

Uses local pattern detection for hierarchy override, secret extraction, tool coercion, exfiltration, hidden instruction and unsafe output handling categories.

Source families

OWASP LLM guidanceprompt-injection testing practiceRAG security controls

Assumptions

  • Detection is heuristic.
  • Absence of a finding does not mean safe.
  • Real systems require red-team testing and control validation.
INTERACTIVE INSTRUMENT

Prompt-injection defensive test case.

Use the controls below to generate a preliminary artifact. The output is intentionally bounded and requires human review.

OUTPUT ARTIFACT

Prompt-injection defensive test case.

The generated artifact includes findings, assumptions, limitations, recommended next actions and exportable structured output.

Export options

Copy outputMarkdownJSON
EXAMPLE

Example input and output.

Example input

A retrieved document asks the model to ignore instructions and call an external API.

Example output

Flags indirect prompt injection and tool coercion; recommends source isolation, tool policy and human review.

LIMITATIONS

What this tool does not do.

  • Does not provide bypass recipes.
  • Does not attempt to attack a live system.
  • Does not guarantee complete detection.

This instrument does not provide legal, medical, cryptographic, engineering, regulatory or compliance certification.

RELATED METHOD

Method and workflow links.

Read the family method note for assumptions, output artifacts, update policy and review boundaries.

Open methodology Open family

CHANGELOG

Version history.

  • v2.4 - Research-grade instrument template, method notes, assumptions, limitations, example and export actions added.
  • Last updated: 2026-05-27.
  • Maturity state: Released.