Skip to content

Input Guardrails

Input guardrails validate user prompts before they’re sent to the LLM. Use them to:

  • Block prompts that are too long or too short
  • Detect and block PII (emails, phone numbers, SSNs)
  • Prevent prompt injection attacks
  • Filter toxic or inappropriate content
  • Rate limit requests
  • Block specific keywords or patterns
User Prompt → [Input Guardrails] → LLM → Response
Block if violated

When you call guarded_agent.run(), input guardrails run first. If any guardrail’s tripwire_triggered is True, the request is blocked before reaching the LLM.

from pydantic_ai import Agent
from pydantic_ai_guardrails import GuardedAgent
from pydantic_ai_guardrails.guardrails.input import (
length_limit,
pii_detector,
prompt_injection,
)
agent = Agent('openai:gpt-4o')
guarded_agent = GuardedAgent(
agent,
input_guardrails=[
length_limit(max_chars=2000),
pii_detector(),
prompt_injection(),
],
)
GuardrailPurposeKey Parameters
length_limit()Limit prompt lengthmax_chars, max_tokens
pii_detector()Detect PIIdetect_types, threshold
prompt_injection()Detect injection attackssensitivity
toxicity_detector()Detect toxic contentcategories, threshold
blocked_keywords()Block specific wordskeywords, case_sensitive
rate_limiter()Rate limit requestsmax_requests_per_minute

Prevent overly long prompts that could be expensive or abusive:

from pydantic_ai_guardrails.guardrails.input import length_limit
# By character count
guardrail = length_limit(max_chars=1000)
# By token count (requires tiktoken)
guardrail = length_limit(max_tokens=500)
# Both
guardrail = length_limit(max_chars=2000, max_tokens=500)

Detect personally identifiable information in prompts:

from pydantic_ai_guardrails.guardrails.input import pii_detector
# Default: detect all PII types
guardrail = pii_detector()
# Specific types only
guardrail = pii_detector(
detect_types=['email', 'phone', 'ssn', 'credit_card']
)

Detected PII types:

  • email - Email addresses
  • phone - Phone numbers
  • ssn - Social Security Numbers
  • credit_card - Credit card numbers
  • ip_address - IP addresses

Detect attempts to manipulate the LLM through prompt injection:

from pydantic_ai_guardrails.guardrails.input import prompt_injection
# Default sensitivity
guardrail = prompt_injection()
# High sensitivity (more false positives, fewer misses)
guardrail = prompt_injection(sensitivity='high')
# Low sensitivity (fewer false positives, more misses)
guardrail = prompt_injection(sensitivity='low')

Detects patterns like:

  • “Ignore previous instructions”
  • “You are now…”
  • “Forget everything”
  • System prompt extraction attempts

Filter toxic, harmful, or inappropriate content:

from pydantic_ai_guardrails.guardrails.input import toxicity_detector
# Default: all categories
guardrail = toxicity_detector()
# Specific categories
guardrail = toxicity_detector(
categories=['hate', 'violence', 'sexual'],
threshold=0.7
)

Block prompts containing specific words or phrases:

from pydantic_ai_guardrails.guardrails.input import blocked_keywords
guardrail = blocked_keywords(
keywords=['confidential', 'secret', 'password'],
case_sensitive=False,
)

Prevent abuse by limiting request frequency:

from pydantic_ai_guardrails.guardrails.input import rate_limiter
# Simple rate limit
guardrail = rate_limiter(max_requests_per_minute=10)
# Per-user rate limiting
guardrail = rate_limiter(
max_requests_per_minute=20,
key_func=lambda ctx: ctx.deps.get('user_id'),
)

Guardrails are evaluated in order. If any fails, the request is blocked:

guarded_agent = GuardedAgent(
agent,
input_guardrails=[
# Fast checks first
length_limit(max_chars=2000),
blocked_keywords(keywords=['hack', 'exploit']),
# More expensive checks last
pii_detector(),
prompt_injection(),
],
)

For better performance, run guardrails in parallel:

guarded_agent = GuardedAgent(
agent,
input_guardrails=[
length_limit(max_chars=2000),
pii_detector(),
prompt_injection(),
],
parallel=True, # Run all guardrails concurrently
)

See Parallel Execution for details.

By default, violations raise InputGuardrailViolation:

from pydantic_ai_guardrails import InputGuardrailViolation
try:
result = await guarded_agent.run(malicious_prompt)
except InputGuardrailViolation as e:
print(f"Blocked by: {e.guardrail_name}")
print(f"Reason: {e.message}")
print(f"Severity: {e.severity}") # low, medium, high, critical

For alternative handling, see Error Handling.