
Auto-Retry

When an output guardrail fails, you can automatically retry with structured feedback. This gives the LLM a chance to fix issues like PII leakage, quality problems, or policy violations.

Prompt → LLM → [Output Guardrail] → FAIL
                                      ↓
                               Build Feedback
                                      ↓
Prompt + Feedback → LLM → [Output Guardrail] → PASS → Response
  1. Output guardrail detects a violation
  2. Feedback is built from the violation’s message and suggestion
  3. Feedback is appended to the prompt
  4. LLM retries with the additional context
  5. Process repeats until success or max retries reached
from pydantic_ai import Agent
from pydantic_ai_guardrails import GuardedAgent
from pydantic_ai_guardrails.guardrails.output import secret_redaction

agent = Agent('openai:gpt-4o')

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[secret_redaction()],
    max_retries=3,  # Retry up to 3 times
)

# If the first response contains secrets, it will retry
result = await guarded_agent.run('Generate an example API configuration')
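
The numbered steps at the top of this section can be sketched as a plain loop. This is a minimal illustration, not the library's implementation: `run_with_retry`, `RetriesExhausted`, `fake_model`, and `no_keys` are hypothetical names, the model is a toy stand-in, and the sketch assumes `max_retries` counts retries after one initial attempt.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-ins: none of these names come from pydantic_ai_guardrails.
Guardrail = Callable[[str], Awaitable[dict]]
Model = Callable[[str], Awaitable[str]]


class RetriesExhausted(Exception):
    """Raised when every attempt still violates the guardrail."""

    def __init__(self, retry_count: int) -> None:
        super().__init__(f"guardrail still failing after {retry_count} retries")
        self.retry_count = retry_count


async def run_with_retry(
    model: Model, guardrail: Guardrail, prompt: str, max_retries: int
) -> str:
    current_prompt = prompt
    # Assumption: one initial attempt plus up to max_retries retries.
    for _ in range(max_retries + 1):
        output = await model(current_prompt)
        result = await guardrail(output)  # step 1: detect a violation
        if not result["tripwire_triggered"]:
            return output
        # Step 2: build feedback from the violation's message and suggestion.
        feedback = f"Issue: {result['message']}\nSuggestion: {result['suggestion']}"
        # Step 3: append the feedback to the original prompt, then loop (steps 4-5).
        current_prompt = f"{prompt}\n\n{feedback}"
    raise RetriesExhausted(max_retries)


async def fake_model(prompt: str) -> str:
    # Toy model: leaks a key unless the prompt already carries feedback.
    return "key=[API_KEY]" if "Suggestion:" in prompt else "key=sk-12345"


async def no_keys(output: str) -> dict:
    if "sk-" in output:
        return {
            "tripwire_triggered": True,
            "message": "API key detected in output",
            "suggestion": "Replace API keys with placeholder like [API_KEY]",
        }
    return {"tripwire_triggered": False}


def demo() -> str:
    return asyncio.run(
        run_with_retry(fake_model, no_keys, "Generate an example API configuration", 3)
    )
```

Calling `demo()` returns the redacted output from the second attempt, because the toy model only redacts once it sees feedback in the prompt.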

When a guardrail fails, the library builds feedback from the GuardrailResult:

# Guardrail returns:
{
    'tripwire_triggered': True,
    'message': 'API key detected in output',
    'severity': 'high',
    'suggestion': 'Replace API keys with placeholder like [API_KEY]',
}

# Feedback sent to LLM:
"""
The previous response violated the 'secret_redaction' guardrail (severity: high).
Issue: API key detected in output
Suggestion: Replace API keys with placeholder like [API_KEY]
Please revise your response to address this issue.
"""

The LLM receives this feedback and generates a new response.
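
To make the mapping from result fields to feedback text concrete, here is a small sketch that reproduces the message format shown above. `build_feedback` is a hypothetical helper written for illustration; the library's internal formatting code may differ.

```python
def build_feedback(guardrail_name: str, result: dict) -> str:
    """Render one failed guardrail result as retry feedback (illustrative only)."""
    lines = [
        f"The previous response violated the '{guardrail_name}' guardrail "
        f"(severity: {result.get('severity', 'unknown')}).",
        f"Issue: {result['message']}",
    ]
    # In this sketch the suggestion is optional; skip the line when it is absent.
    if result.get("suggestion"):
        lines.append(f"Suggestion: {result['suggestion']}")
    lines.append("Please revise your response to address this issue.")
    return "\n".join(lines)


feedback = build_feedback(
    "secret_redaction",
    {
        "tripwire_triggered": True,
        "message": "API key detected in output",
        "severity": "high",
        "suggestion": "Replace API keys with placeholder like [API_KEY]",
    },
)
```

With the example result above, `feedback` holds the same four lines as the feedback text shown earlier.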

The suggestion field in your guardrail result tells the LLM how to fix the problem, so a specific, actionable suggestion is crucial for successful retries:

from pydantic_ai_guardrails import GuardrailResult

async def check_pii(output: str) -> GuardrailResult:
    # contains_email stands for your own detection helper
    if contains_email(output):
        return {
            'tripwire_triggered': True,
            'message': 'Email address detected',
            'severity': 'high',
            # Good: specific, actionable instruction
            'suggestion': (
                'Replace all email addresses with generic placeholders '
                'like [EMAIL] or describe them without including the '
                'actual address (e.g., "contact us at our support email").'
            ),
        }
    return {'tripwire_triggered': False}
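
The guardrail above relies on a `contains_email` helper. One possible version, assuming a simple regex check is good enough for your data, reuses the email pattern from the complete example at the end of this page. The name `EMAIL_PATTERN` is introduced here for illustration.

```python
import re

# Same email pattern as the complete example at the end of this page.
EMAIL_PATTERN = re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b")


def contains_email(text: str) -> bool:
    """Return True if the text appears to contain an email address."""
    return EMAIL_PATTERN.search(text) is not None
```

A regex like this is a heuristic: it will miss unusual address formats and can flag strings that only look like addresses, so treat it as a starting point.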

If multiple guardrails fail, all feedback is combined:

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[
        secret_redaction(),
        min_length(min_chars=100),
    ],
    max_retries=3,
)

Combined feedback:

The previous response violated 2 guardrails. Please revise to address all issues:
1. 'secret_redaction' (severity: high): API key detected
   Suggestion: Replace with [API_KEY] placeholder
2. 'min_length' (severity: medium): Response only 45 characters
   Suggestion: Provide a more detailed response of at least 100 characters
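
As a rough sketch of how several violations could be merged into one message like the one above, the function below numbers each failed guardrail and lists its suggestion underneath. `combine_feedback` is a hypothetical helper, not a library function, and the exact wording the library emits may differ.

```python
def combine_feedback(violations: list[tuple[str, dict]]) -> str:
    """Merge (guardrail_name, result) pairs into one feedback message."""
    lines = [
        f"The previous response violated {len(violations)} guardrails. "
        "Please revise to address all issues:"
    ]
    for index, (name, result) in enumerate(violations, start=1):
        lines.append(
            f"{index}. '{name}' (severity: {result['severity']}): {result['message']}"
        )
        lines.append(f"   Suggestion: {result['suggestion']}")
    return "\n".join(lines)


combined = combine_feedback(
    [
        (
            "secret_redaction",
            {
                "severity": "high",
                "message": "API key detected",
                "suggestion": "Replace with [API_KEY] placeholder",
            },
        ),
        (
            "min_length",
            {
                "severity": "medium",
                "message": "Response only 45 characters",
                "suggestion": "Provide a more detailed response of at least 100 characters",
            },
        ),
    ]
)
```

Sending all violations in a single message lets the LLM address them in one retry instead of fixing one issue per attempt.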

If every retry still fails, OutputGuardrailViolation is raised. The exception includes retry information:

from pydantic_ai_guardrails import OutputGuardrailViolation

try:
    result = await guarded_agent.run(prompt)
except OutputGuardrailViolation as e:
    print(f"Failed after {e.retry_count} retries")
    # e.retry_count will be 3 if max_retries=3

If you have telemetry enabled, retry attempts are automatically traced:

from pydantic_ai_guardrails import configure_telemetry

configure_telemetry(enabled=True)

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[secret_redaction()],
    max_retries=3,
)

# Retry attempts are logged:
# - Attempt 1/3: violation_count=1, feedback="..."
# - Attempt 2/3: violation_count=1, feedback="..."
# - Success on attempt 3

See Logfire Integration for full observability.

# Too few: might not give LLM enough chances
max_retries=1
# Good for most cases
max_retries=3
# Too many: wastes tokens and time if LLM can't fix it
max_retries=10
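
To see why a large max_retries gets expensive, a back-of-the-envelope estimate helps. The sketch below rests on assumptions this page does not confirm: that max_retries counts retries after one initial attempt, that each retry resends the prompt plus one feedback block of roughly constant size, and that output tokens are ignored. Both function names and the token figures are hypothetical.

```python
def worst_case_calls(max_retries: int) -> int:
    """Model calls if every attempt fails: one initial attempt plus each retry."""
    return 1 + max_retries


def worst_case_prompt_tokens(
    prompt_tokens: int, feedback_tokens: int, max_retries: int
) -> int:
    """Total prompt tokens sent when every attempt fails.

    Assumes the first call sends only the prompt and each retry sends the
    prompt plus one feedback block of about feedback_tokens tokens.
    """
    return prompt_tokens + max_retries * (prompt_tokens + feedback_tokens)


# Illustrative figures: a 500-token prompt and 60 tokens of feedback per retry.
for retries in (1, 3, 10):
    calls = worst_case_calls(retries)
    tokens = worst_case_prompt_tokens(500, 60, retries)
    print(f"max_retries={retries}: {calls} calls, {tokens} prompt tokens")
```

Under these assumptions the worst case grows linearly with max_retries, so raising it from 3 to 10 nearly triples the prompt tokens spent on a request the LLM cannot fix.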

Auto-retry works best for issues the LLM can fix:

  • PII/secrets in output (LLM can redact)
  • Response too short (LLM can expand)
  • Wrong format (LLM can reformat)
  • Hedging language (LLM can be more direct)

It’s less effective for:

  • Factual errors (LLM might repeat them)
  • Hallucinations (LLM might not know it’s wrong)

Auto-retry only works with on_block='raise':

# Works: retries then raises if all fail
guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[...],
    max_retries=3,
    on_block='raise',  # Default
)

# Warning logged: retries won't happen
guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[...],
    max_retries=3,
    on_block='log',  # Retries ignored
)

Use llm_judge for subjective quality checks that benefit from retry:

from pydantic_ai_guardrails.guardrails.output import llm_judge

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[
        llm_judge(
            criteria='Is the response professional and helpful?',
            threshold=0.8,
        ),
    ],
    max_retries=2,
)

The judge’s feedback helps the LLM improve quality.

import asyncio
import re

from pydantic_ai import Agent
from pydantic_ai_guardrails import (
    GuardedAgent,
    GuardrailResult,
    OutputGuardrail,
    OutputGuardrailViolation,
)


async def check_pii(output: str) -> GuardrailResult:
    """Check for PII and provide actionable feedback."""
    pii_patterns = {
        'email': r'\b[\w.-]+@[\w.-]+\.\w+\b',
        'phone': r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b',
        'ssn': r'\b\d{3}-\d{2}-\d{4}\b',
    }
    found = []
    for pii_type, pattern in pii_patterns.items():
        if re.search(pattern, output):
            found.append(pii_type)
    if found:
        return {
            'tripwire_triggered': True,
            'message': f'PII detected: {", ".join(found)}',
            'severity': 'high',
            'suggestion': (
                f'Replace all {", ".join(found)} with placeholders like '
                f'[{found[0].upper()}]. Do not include any real personal data.'
            ),
        }
    return {'tripwire_triggered': False}


async def main():
    agent = Agent(
        'openai:gpt-4o',
        system_prompt='Generate example user profiles with contact info.',
    )
    guarded_agent = GuardedAgent(
        agent,
        output_guardrails=[OutputGuardrail(check_pii)],
        max_retries=3,
    )
    try:
        result = await guarded_agent.run('Create 3 example user profiles')
        print(result.output)
        # Output will have [EMAIL], [PHONE] placeholders
    except OutputGuardrailViolation as e:
        print(f"Could not generate safe output after {e.retry_count} retries")


asyncio.run(main())