Output Guardrails
Output guardrails validate LLM responses after generation but before returning to users. Use them to:
- Detect leaked secrets, API keys, or passwords
- Evaluate response quality with LLM-as-a-judge
- Validate JSON structure and content
- Match output against regex patterns
- Ensure the model didn’t refuse to answer
- Verify that required tools were called
- Enforce minimum response length
How Output Guardrails Work
```
User Prompt → LLM → [Output Guardrails] → Response
                             ↓
                      Block or Retry
```
After the LLM generates a response, output guardrails validate it. If validation fails, you can:
- Block: Raise an exception (default)
- Retry: Automatically retry with feedback (see Auto-Retry)
- Log: Log a warning and continue
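The three outcomes can be pictured as a small dispatch on a failure policy. The sketch below is a plain-Python illustration only, not the library's implementation; `Violation`, `handle_failure`, and the `policy` strings are hypothetical names introduced here.

```python
import logging
from typing import Optional

logger = logging.getLogger("guardrail_sketch")


class Violation(Exception):
    """Hypothetical exception raised when a guardrail blocks a response."""


def handle_failure(output: str, reason: str, policy: str = "block") -> Optional[str]:
    """Apply a failure policy to a response that failed validation.

    Returns the output when it may still be delivered, or None when the
    caller should ask the model for a new response.
    """
    if policy == "block":
        # Default: stop and surface the failure to the caller.
        raise Violation(reason)
    if policy == "retry":
        # Signal that a fresh generation is needed.
        return None
    if policy == "log":
        # Record the problem but let the response through.
        logger.warning("guardrail failed: %s", reason)
        return output
    raise ValueError(f"unknown policy: {policy!r}")
```

Blocking is the safest default because a failed response never reaches the user unless the caller explicitly opts into a softer policy.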
Basic Usage
```python
from pydantic_ai import Agent
from pydantic_ai_guardrails import GuardedAgent
from pydantic_ai_guardrails.guardrails.output import (
    secret_redaction,
    llm_judge,
    min_length,
)

agent = Agent('openai:gpt-4o')

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[
        secret_redaction(),
        min_length(min_chars=50),
        llm_judge(criteria='Is the response helpful?'),
    ],
)
```
Available Output Guardrails
| Guardrail | Purpose | Key Parameters |
|---|---|---|
| `secret_redaction()` | Detect leaked secrets | `patterns` |
| `llm_judge()` | LLM-as-a-judge evaluation | `criteria`, `threshold` |
| `json_validator()` | Validate JSON output | `schema` |
| `regex_match()` | Match against patterns | `pattern`, `must_match` |
| `no_refusals()` | Detect model refusals | `refusal_patterns` |
| `min_length()` | Ensure minimum length | `min_chars` |
| `require_tool_use()` | Ensure tools were called | `tool_names` |
| `tool_allowlist()` | Restrict allowed tools | `allowed_tools` |
| `validate_tool_parameters()` | Validate tool arguments | `schemas` |
Secret Redaction
Detect API keys, passwords, and other secrets in responses:
```python
from pydantic_ai_guardrails.guardrails.output import secret_redaction

# Default patterns (API keys, passwords, tokens)
guardrail = secret_redaction()

# Custom patterns
guardrail = secret_redaction(
    patterns=[
        r'sk-[a-zA-Z0-9]{32,}',   # OpenAI keys
        r'AKIA[A-Z0-9]{16}',      # AWS keys
        r'password[=:]\s*\S+',    # Passwords
    ]
)
```
Default detected patterns:
- OpenAI API keys (`sk-...`)
- AWS access keys (`AKIA...`)
- GitHub tokens (`ghp_...`, `gho_...`)
- Generic API key patterns
- Password assignments
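To see what pattern-based detection amounts to, the following sketch scans text with the three custom regexes from the example above using only the standard `re` module. It is an illustration under stated assumptions: `find_secrets` and `SECRET_PATTERNS` are hypothetical names, and the library's own matching logic may differ.

```python
import re

# Patterns copied from the custom-pattern example above.
SECRET_PATTERNS = [
    r'sk-[a-zA-Z0-9]{32,}',   # OpenAI-style keys
    r'AKIA[A-Z0-9]{16}',      # AWS access key IDs
    r'password[=:]\s*\S+',    # Password assignments
]


def find_secrets(text: str, patterns: list[str] = SECRET_PATTERNS) -> list[str]:
    """Return every substring of `text` matched by any pattern."""
    hits = []
    for pattern in patterns:
        hits.extend(m.group(0) for m in re.finditer(pattern, text))
    return hits
```

Note that these regexes are case-sensitive as written, so `Password=...` would not match the third pattern unless it is compiled with `re.IGNORECASE` or rewritten.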
LLM Judge
Use another LLM to evaluate response quality:
```python
from pydantic_ai_guardrails.guardrails.output import llm_judge

# Single criterion
guardrail = llm_judge(
    criteria='Is the response helpful and accurate?',
    threshold=0.7,
)

# Multiple criteria
guardrail = llm_judge(
    criteria=[
        'Is the response factually accurate?',
        'Is the tone professional?',
        'Does it directly answer the question?',
    ],
    threshold=0.7,
    judge_model='openai:gpt-4o-mini',  # Use cheaper model for judging
)
```
The judge returns a score from 0 to 1. If the score is below `threshold`, the guardrail triggers.
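The threshold rule can be written out as a one-line comparison. This sketch takes "below" literally, so a score exactly equal to the threshold passes; that boundary behavior is an assumption, and `judge_triggers` is a hypothetical helper rather than part of the library.

```python
def judge_triggers(score: float, threshold: float = 0.7) -> bool:
    """Return True when the guardrail should trigger (score below threshold)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Strict comparison: a score equal to the threshold is treated as passing.
    return score < threshold
```

With the default of 0.7, a judged score of 0.65 triggers the guardrail, while 0.7 and 0.9 do not.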
JSON Validator
Ensure output is valid JSON, optionally matching a schema:
```python
from pydantic_ai_guardrails.guardrails.output import json_validator

# Just validate it's valid JSON
guardrail = json_validator()

# Validate against a schema
guardrail = json_validator(
    schema={
        'type': 'object',
        'properties': {
            'name': {'type': 'string'},
            'age': {'type': 'integer'},
        },
        'required': ['name', 'age'],
    }
)
```
Regex Match
Validate output against regex patterns:
```python
from pydantic_ai_guardrails.guardrails.output import regex_match

# Output MUST match this pattern
guardrail = regex_match(
    pattern=r'^[A-Z][a-z]+',  # Must start with capital letter
    must_match=True,
)

# Output must NOT match this pattern
guardrail = regex_match(
    pattern=r'TODO|FIXME|XXX',
    must_match=False,  # Block if pattern is found
)
```
No Refusals
Detect when the model refuses to answer:
```python
from pydantic_ai_guardrails.guardrails.output import no_refusals

guardrail = no_refusals()
```
Detects phrases like:
- “I cannot help with that”
- “I’m not able to”
- “As an AI, I don’t”
- “I apologize, but I cannot”
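Phrase-based refusal detection of this kind can be approximated with a case-insensitive search. The sketch below uses only the four phrases listed above; `looks_like_refusal` and `REFUSAL_PATTERNS` are hypothetical names, and the library's real pattern set and matching rules may differ.

```python
import re

# Phrases taken from the list above; the library's real pattern set may differ.
REFUSAL_PATTERNS = [
    r"i cannot help with that",
    r"i'm not able to",
    r"as an ai, i don't",
    r"i apologize, but i cannot",
]


def looks_like_refusal(output: str, patterns: list[str] = REFUSAL_PATTERNS) -> bool:
    """Return True if the output contains any refusal phrase (case-insensitive)."""
    # Normalize typographic apostrophes so "I’m" and "I'm" match alike.
    normalized = output.replace("\u2019", "'")
    return any(re.search(p, normalized, re.IGNORECASE) for p in patterns)
```

Substring matching like this is cheap but can flag legitimate answers that merely quote a refusal, which is one reason to keep the pattern list configurable.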
Tool Validation Guardrails
Require Tool Use
Ensure specific tools were called:
```python
from pydantic_ai_guardrails.guardrails.output import require_tool_use

# At least one of these tools must be called
guardrail = require_tool_use(
    tool_names=['search', 'calculate'],
    mode='any',  # or 'all' to require all tools
)
```
Tool Allowlist
Restrict which tools can be called:
```python
from pydantic_ai_guardrails.guardrails.output import tool_allowlist

# Only these tools are allowed
guardrail = tool_allowlist(
    allowed_tools=['search', 'get_weather'],
)
```
Validate Tool Parameters
Validate arguments passed to tools:
```python
from pydantic_ai_guardrails.guardrails.output import validate_tool_parameters

guardrail = validate_tool_parameters(
    schemas={
        'search': {
            'type': 'object',
            'properties': {
                'query': {'type': 'string', 'minLength': 3},
            },
            'required': ['query'],
        },
    }
)
```
Accessing Message History
Output guardrails can access the full conversation via `GuardrailContext`:
```python
from pydantic_ai_guardrails import GuardrailContext, GuardrailResult, OutputGuardrail

async def check_tool_calls(
    ctx: GuardrailContext, output: str
) -> GuardrailResult:
    # Access message history
    for msg in ctx.messages or []:
        # Inspect tool calls in the conversation
        if hasattr(msg, 'parts'):
            for part in msg.parts:
                if hasattr(part, 'tool_name'):
                    print(f"Tool called: {part.tool_name}")

    return {'tripwire_triggered': False}

guardrail = OutputGuardrail(check_tool_calls)
```
Auto-Retry on Violation
Instead of blocking, you can automatically retry with feedback:
```python
guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[secret_redaction()],
    max_retries=3,  # Retry up to 3 times
)
```
When a guardrail fails, the library sends structured feedback to the LLM so it can self-correct. See Auto-Retry for details.
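Conceptually, retry-with-feedback is a loop that regenerates until validation passes or the retry budget runs out. The following is a minimal standalone sketch, not the library's implementation: `run_with_retries`, `GuardrailFailed`, and the feedback wording are hypothetical, and it assumes `max_retries` counts retries after one initial attempt.

```python
from typing import Callable, Optional


class GuardrailFailed(Exception):
    """Hypothetical error raised once all attempts are used up."""


def run_with_retries(
    generate: Callable[[str], str],
    validate: Callable[[str], Optional[str]],
    prompt: str,
    max_retries: int = 3,
) -> str:
    """Call `generate`, re-prompting with feedback while `validate` fails.

    `validate` returns None on success or a feedback message on failure.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    current_prompt = prompt
    feedback: Optional[str] = None
    # One initial attempt plus up to `max_retries` retries.
    for _ in range(max_retries + 1):
        output = generate(current_prompt)
        feedback = validate(output)
        if feedback is None:
            return output
        # Append the failure reason so the next attempt can correct it.
        current_prompt = f"{prompt}\n\nPrevious attempt was rejected: {feedback}"
    raise GuardrailFailed(f"still failing after {max_retries} retries: {feedback}")
```

Each retry costs another model call, so a small `max_retries` keeps latency and spend bounded when the model cannot satisfy the guardrail.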
Handling Violations
```python
from pydantic_ai_guardrails import OutputGuardrailViolation

try:
    result = await guarded_agent.run(prompt)
except OutputGuardrailViolation as e:
    print(f"Blocked by: {e.guardrail_name}")
    print(f"Reason: {e.message}")
    print(f"Retry count: {e.retry_count}")
```
Next Steps
- Auto-Retry - Let the LLM self-correct on violations
- Custom Guardrails - Write your own output validation
- Tool Validation - Deep dive into tool guardrails