LLM Judge

The llm_judge guardrail uses a second LLM to score the agent's response against criteria you specify, and triggers when the score falls below a threshold.

Import

from pydantic_ai_guardrails.guardrails.output import llm_judge

Basic Usage

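from pydantic_ai import Agent
from pydantic_ai_guardrails import GuardedAgent
from pydantic_ai_guardrails.guardrails.output import llm_judge

# The agent whose output will be judged (the model name here is only an example)
agent = Agent('openai:gpt-4o')
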
guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[
        llm_judge(
            criteria='Is the response helpful and accurate?',
            threshold=0.7,
        ),
    ],
)

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `criteria` | `str \| list[str]` | Required | Evaluation criteria |
| `threshold` | `float` | `0.7` | Minimum passing score (0.0-1.0) |
| `judge_model` | `str` | Same as agent | Model to use for judging |
| `mode` | `'score' \| 'binary'` | `'score'` | Evaluation mode |
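
To make the interaction between `threshold` and `mode` concrete, here is a minimal standalone sketch of the pass/fail decision. It does not call the library: `passes_threshold` is a hypothetical helper, and the inclusive comparison (`score >= threshold`) is an assumption based on the "minimum passing score" wording in the table.

```python
def passes_threshold(score: float, threshold: float = 0.7) -> bool:
    """Hypothetical helper: True when the score meets the minimum passing score.

    Assumption: the comparison is inclusive (score >= threshold).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    return score >= threshold


# 'score' mode: the judge returns a value anywhere in 0.0-1.0.
print(passes_threshold(0.45, threshold=0.7))  # False
print(passes_threshold(0.82, threshold=0.7))  # True

# 'binary' mode: the judge returns 0 or 1, so only a 1 clears a 0.5 threshold.
print(passes_threshold(1.0, threshold=0.5))  # True
print(passes_threshold(0.0, threshold=0.5))  # False
```

Under this reading, raising `threshold` makes the guardrail stricter in `'score'` mode, while in `'binary'` mode any threshold above 0 and up to 1 gives the same result.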

Examples

Single Criterion

guardrail = llm_judge(
    criteria='Is the response helpful and addresses the user question?',
    threshold=0.7,
)

Multiple Criteria

guardrail = llm_judge(
    criteria=[
        'Is the response factually accurate?',
        'Is the tone professional?',
        'Does it directly answer the question?',
    ],
    threshold=0.7,
)

Custom Judge Model

# Use a faster/cheaper model for judging
guardrail = llm_judge(
    criteria='Is this response appropriate?',
    judge_model='openai:gpt-4o-mini',
    threshold=0.8,
)

Binary Mode

# Pass/fail instead of a graded score
guardrail = llm_judge(
    criteria='Does this response avoid giving medical advice?',
    mode='binary',  # Judge returns 0 (fail) or 1 (pass)
    threshold=0.5,  # With 0/1 scores, only a 1 clears the threshold
)

Violation Result

When the guardrail is triggered, it returns a result of this form:

{
    'tripwire_triggered': True,
    'message': 'Response failed quality evaluation',
    'severity': 'medium',
    'metadata': {
        'score': 0.45,
        'threshold': 0.7,
        'criteria': ['Is the response helpful?'],
    },
    'suggestion': 'Provide a more complete and helpful response that directly addresses the question.',
}
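
As one way to consume a result of this shape, the sketch below turns it into a single log line. `describe_violation` is a hypothetical helper, not part of the library, and it assumes the result is a plain dict with exactly the keys shown above.

```python
from __future__ import annotations

from typing import Any


def describe_violation(result: dict[str, Any]) -> str:
    """Hypothetical helper: summarize a guardrail result as one log line."""
    if not result.get("tripwire_triggered"):
        return "passed"
    meta = result["metadata"]
    return (
        f"[{result['severity']}] {result['message']} "
        f"(score {meta['score']:.2f} < threshold {meta['threshold']:.2f})"
    )


# Same values as the example result above.
violation = {
    "tripwire_triggered": True,
    "message": "Response failed quality evaluation",
    "severity": "medium",
    "metadata": {
        "score": 0.45,
        "threshold": 0.7,
        "criteria": ["Is the response helpful?"],
    },
    "suggestion": "Provide a more complete and helpful response that directly addresses the question.",
}

print(describe_violation(violation))
# [medium] Response failed quality evaluation (score 0.45 < threshold 0.70)
```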

Use Cases

Quality assurance: Ensure responses meet quality standards
Brand voice: Verify tone and style guidelines
Compliance: Check for policy-violating content
Accuracy: Evaluate factual correctness
Completeness: Ensure thorough answers

Combining with Auto-Retry

The llm_judge guardrail pairs well with auto-retry: a response that fails the evaluation is regenerated, up to max_retries times.

guarded_agent = GuardedAgent(
    agent,
    output_guardrails=[
        llm_judge(
            criteria='Is the response professional and helpful?',
            threshold=0.8,
        ),
    ],
    max_retries=2,  # Let LLM improve quality
)

The judge’s feedback helps the LLM understand what to improve.
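
The retry idea can be sketched without the library as a loop that feeds the judge's feedback into the next attempt. Everything here is hypothetical: `run_with_retries`, `fake_generate`, and `fake_judge` are stand-ins, and raising an error once retries are exhausted is an assumption rather than documented `GuardedAgent` behavior.

```python
from __future__ import annotations

from typing import Callable, Optional, Tuple


def run_with_retries(
    generate: Callable[[Optional[str]], str],
    judge: Callable[[str], Tuple[float, str]],
    threshold: float = 0.8,
    max_retries: int = 2,
) -> Tuple[str, int]:
    """Hypothetical loop: regenerate until the judge score meets the threshold.

    Returns the accepted response and the number of retries it took.
    """
    feedback: Optional[str] = None
    for attempt in range(max_retries + 1):  # first try plus max_retries retries
        response = generate(feedback)
        score, feedback = judge(response)
        if score >= threshold:
            return response, attempt
    raise RuntimeError(f"response still below threshold after {max_retries} retries")


# Stand-ins for the agent and the judge model, for demonstration only.
def fake_generate(feedback: Optional[str]) -> str:
    if feedback is None:
        return "Short answer."
    return "A fuller, more professional answer."


def fake_judge(response: str) -> Tuple[float, str]:
    score = 0.9 if "professional" in response else 0.4
    return score, "Be more professional and complete."


print(run_with_retries(fake_generate, fake_judge))
# ('A fuller, more professional answer.', 1)
```

In this sketch the first response scores 0.4 and is rejected, and the second attempt, which sees the feedback, scores 0.9 and is accepted after one retry.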

Example Criteria

Customer Support

criteria=[
    'Does the response address the customer issue?',
    'Is the tone empathetic and professional?',
    'Are next steps clearly provided?',
]

Technical Documentation

criteria=[
    'Is the explanation technically accurate?',
    'Are code examples correct and runnable?',
    'Is the language clear and unambiguous?',
]

Content Moderation

criteria='Does this response comply with content policies and avoid harmful content?'