AI / LLM Workflows
Autotel provides all the building blocks for comprehensive AI/LLM observability:
- Automatic LLM instrumentation via OpenLLMetry integration
- Workflow orchestration via nested trace() calls
- Context propagation via AsyncLocalStorage (correlation IDs, user context, etc.)
- Business event tracking via ctx.setAttribute() and track()
- Multi-destination events via adapters (PostHog, Mixpanel, etc.)
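The AsyncLocalStorage-based propagation mentioned above can be illustrated with Node's built-in API. This is a minimal sketch of the mechanism only: the RequestContext shape, contextStore, and the step functions are hypothetical names for illustration, and autotel's internal store may be organized differently.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical context shape for illustration; autotel's internal store may differ.
interface RequestContext {
  correlationId: string;
  userId: string;
}

const contextStore = new AsyncLocalStorage<RequestContext>();

// Reads the context without receiving it as an argument.
async function innerStep(): Promise<string> {
  // Cross an async boundary to show the context survives it.
  await new Promise<void>((resolve) => setTimeout(resolve, 1));
  return contextStore.getStore()?.correlationId ?? 'no-context';
}

async function outerStep(): Promise<string> {
  return innerStep();
}

// Everything started inside run() sees the same context object.
const insideRun: Promise<string> = contextStore.run(
  { correlationId: 'abc123', userId: 'user-1' },
  () => outerStep(),
);

// Outside run() there is no active context.
const outsideRun: RequestContext | undefined = contextStore.getStore();
```

Because the context travels with the async call chain, nested steps do not need a context parameter threaded through every function signature.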
When to Use OpenLLMetry
| Use Case | Recommendation | Why |
| ---------------------------------------- | -------------------------------- | -------------------------------------------------------------------- |
| Using LLM SDKs (OpenAI, Anthropic, etc.) | Enable OpenLLMetry | Automatic capture of prompts, completions, tokens |
| Custom LLM integrations | Manual trace() only | OpenLLMetry won't detect custom integrations |
| Workflow orchestration | Always use trace() | Critical for tracking workflow steps |
| Business metrics | Always use trace() + track() | Domain events require explicit instrumentation |
| Production applications | Use both together | OpenLLMetry handles LLM internals, trace() handles everything else |
What Each Approach Provides
OpenLLMetry Automatic Instrumentation
When enabled via init({ openllmetry: { enabled: true } }), OpenLLMetry automatically captures:

    // Example: Using Vercel AI SDK
    import { generateText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    // OpenLLMetry automatically instruments this call - zero code changes needed!
    const result = await generateText({
      model: openai('gpt-4o'),
      prompt: 'Explain quantum computing',
    });

    // Automatic span attributes captured:
    // - llm.request.model: "gpt-4o"
    // - llm.provider: "openai"
    // - llm.request.temperature: 0.7
    // - llm.usage.prompt_tokens: 45
    // - llm.usage.completion_tokens: 128
    // - llm.usage.total_tokens: 173
    // - llm.prompts.0.content: "Explain quantum computing"
    // - llm.completions.0.content: "[full response text]"

What you get automatically:
- LLM API request/response details (prompts, completions, model parameters)
- Token usage tracking (prompt, completion, total)
- Timing and latency for each LLM call
- Error capture for failed LLM requests
- Support for streaming responses
- Works with 20+ LLM providers/SDKs (OpenAI, Anthropic, LangChain, LlamaIndex, Vercel AI SDK, etc.)
What you DON'T get:
- Business workflow context (which agent? which step? why called?)
- Business metrics (escalations, user satisfaction, custom events)
- Correlation across workflow steps
- Custom attributes for your domain logic
Manual trace() Instrumentation
Using autotel's trace() function provides full control over observability:

    import { trace } from 'autotel';
    import { generateText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    const triageAgent = trace('agent.triage', (ctx) => async (input: string) => {
      // Business context
      ctx.setAttributes({
        'agent.role': 'triage',
        'agent.purpose': 'route_to_specialist',
        'workflow.step': 1,
      });

      // Call LLM (OpenLLMetry will auto-instrument this call)
      const result = await generateText({
        model: openai('gpt-4o-mini'),
        prompt: `Triage this request: ${input}`,
      });

      // Business metrics
      const requiresEscalation = result.text.includes('ESCALATE');
      ctx.setAttribute('triage.escalation_required', requiresEscalation);

      return { decision: result.text, escalate: requiresEscalation };
    });

What you get with trace():
- Named workflow steps (clear span names like "agent.triage")
- Business attributes (agent roles, workflow state, custom logic)
- Correlation IDs automatically propagated
- Parent-child span relationships for complex workflows
- Integration with events via track()
- Works with ANY code (LLM or non-LLM)

To enable OpenLLMetry alongside trace(), pass the openllmetry option to init():

    import { init } from 'autotel';

    init({
      service: 'my-ai-app',
      endpoint: process.env.OTLP_ENDPOINT,
      openllmetry: {
        enabled: true, // Enable automatic LLM instrumentation
        options: {
          disableBatch: process.env.NODE_ENV !== 'production',
        },
      },
    });

Setup Guide
Option 1: OpenLLMetry Only (Not Recommended)
If you only enable OpenLLMetry without using trace(), you'll get LLM call details but miss business context:

    import { init } from 'autotel';

    init({
      service: 'my-ai-app',
      openllmetry: { enabled: true },
    });

    // You'll see LLM spans but no workflow context
    const result = await generateText({ model: openai('gpt-4o'), prompt: 'test' });
    // No way to know: which agent? which step? which user? why called?

Option 2: Manual trace() Only (Good for Custom Models)
If you're using custom LLM integrations or direct HTTP calls:

    import { trace } from 'autotel';

    const callCustomLLM = trace('llm.custom_model', (ctx) => async (prompt: string) => {
      ctx.setAttributes({
        'llm.model': 'my-custom-model-v2',
        'llm.provider': 'self-hosted',
        'llm.prompt': prompt,
      });

      const response = await fetch('https://my-llm-api.com/generate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt }),
      });

      // Surface HTTP failures so the span records the error
      if (!response.ok) {
        throw new Error(`Custom LLM request failed: ${response.status}`);
      }

      const data = await response.json();
      ctx.setAttributes({
        'llm.completion': data.text,
        'llm.tokens': data.usage.totalTokens,
      });

      return data.text;
    });

Option 3: Both Together (Recommended)
For production applications using LLM SDKs:

    import { init, trace } from 'autotel';

    init({
      service: 'production-ai-app',
      openllmetry: { enabled: true }, // Auto-instrument LLM SDKs
    });

    // Your workflow code uses trace() for business logic
    const workflow = trace('workflow.main', (ctx) => async (input: string) => {
      // OpenLLMetry will auto-instrument any LLM calls inside
      // trace() provides workflow context and business metrics
      // Both appear as child spans in the same trace tree
    });

Quick Decision Tree

    Are you using LLM SDKs (OpenAI, Anthropic, Vercel AI SDK, LangChain)?
    ├─ Yes
    │  └─ Enable OpenLLMetry
    │     └─ Do you need business context/metrics?
    │        ├─ Yes → Also use trace() (RECOMMENDED)
    │        └─ No → OpenLLMetry only (you'll regret this later)
    │
    └─ No (custom models, direct HTTP)
       └─ Use trace() only
          └─ Add AI semantic conventions manually

Basic AI Operation

    import { trace } from 'autotel';

    const generateResponse = trace(
      'ai.generate',
      (ctx) => async (prompt: string) => {
        ctx.setAttributes({
          'ai.model': 'gpt-4o',
          'ai.provider': 'openai',
        });

        // `llm` stands in for whichever LLM client your application uses
        const response = await llm.generate(prompt);
        ctx.setAttribute('ai.tokens', response.usage.totalTokens);

        return response;
      },
    );

Core Concepts
Correlation IDs
Correlation IDs automatically propagate through your entire workflow, making it easy to trace requests across multiple agents, services, and LLM calls.

    import { trace, track } from 'autotel';

    export const processUserRequest = trace(
      'ai.user_request',
      (ctx) => async (userId: string, message: string) => {
        // Correlation ID is automatically available
        console.log('Trace ID:', ctx.traceId);
        console.log('Correlation ID:', ctx.correlationId); // First 16 chars of traceId

        // All nested operations inherit this correlation context
        const analysis = await analyzeIntent(message);
        const response = await generateResponse(analysis);

        // Events automatically include correlation IDs
        track('ai.request_completed', {
          userId,
          intent: analysis.intent,
          // correlationId, traceId, spanId are auto-added!
        });

        return response;
      },
    );

What you get automatically:
- ctx.traceId - Full OpenTelemetry trace ID
- ctx.correlationId - Short correlation ID (first 16 chars)
- ctx.spanId - Current span ID
- Automatic propagation to all nested trace() calls
- Enrichment of all track() events
- Inclusion in structured logs (via autotel/logger)
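
The relationship between the two IDs can be sketched as a plain string operation. This assumes, as the comment in the example above states, that the correlation ID is simply the first 16 characters of the trace ID; toCorrelationId is a hypothetical helper for illustration and not part of autotel's API.

```typescript
// Hypothetical helper, not part of autotel: derives a short correlation ID
// from a full 32-character hex OpenTelemetry trace ID.
function toCorrelationId(traceId: string): string {
  return traceId.slice(0, 16);
}

const exampleTraceId = '4bf92f3577b34da6a3ce929d0e0e4736';
const exampleCorrelationId = toCorrelationId(exampleTraceId);
```

A prefix relationship like this would let a short ID found in logs or analytics events be matched back to the full trace with a simple starts-with search.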
Multi-Step Workflows
Create parent-child span hierarchies naturally with nested trace() calls. Each step becomes a child span with automatic error handling and lifecycle management.

    import { trace } from 'autotel';

    export const processDocument = trace(
      'document.processing',
      (ctx) => async (docId: string) => {
        ctx.setAttribute('document.id', docId);
        ctx.setAttribute('workflow.type', 'document_processing');

        // Step 1: Load document (creates child span)
        const document = await trace('document.load', async () => {
          return await loadDocument(docId);
        });

        // Step 2: Analyze with LLM (creates child span, OpenLLMetry auto-instruments LLM call)
        const analysis = await trace('document.analyze', async () => {
          const result = await llm.analyze(document.content);
          return result;
        });

        // Step 3: Store results (creates child span)
        const stored = await trace('document.store', async () => {
          return await storeAnalysis(docId, analysis);
        });

        return stored;
      },
    );

Span Hierarchy Created:

    document.processing (parent)
    ├── document.load (child)
    ├── document.analyze (child)
    │   └── openai.chat.completions (child, auto-instrumented by OpenLLMetry)
    └── document.store (child)

Domain Events
Track business-level events alongside technical telemetry using ctx.setAttribute() for span attributes and track() for events.

    import { trace, track } from 'autotel';

    export const handleAgentHandoff = trace(
      'agent.handoff',
      (ctx) => async (task: Task) => {
        const startTime = performance.now();

        // Set domain-specific span attributes
        ctx.setAttributes({
          'agent.from': 'triage',
          'agent.to': 'specialist',
          'task.priority': task.priority,
          'task.category': task.category,
        });

        // Perform handoff
        const result = await specialistAgent.process(task);

        // Track business metric with precise duration
        track('agent.handoff_completed', {
          from: 'triage',
          to: 'specialist',
          duration_ms: Math.round(performance.now() - startTime),
          success: true,
        });

        return result;
      },
    );

Multi-Step Workflow

    const workflow = trace('ai.workflow', (ctx) => async (input: string) => {
      const analysis = await trace('step1.analyze', async () => {
        return await analyzeInput(input);
      });

      const response = await trace('step2.generate', async () => {
        return await generateResponse(analysis);
      });

      return response;
    });

Agent Handoffs

    const runAgentWorkflow = trace(
      'workflow.agents',
      (ctx) => async (input: string) => {
        ctx.setAttributes({
          'workflow.type': 'multi_agent',
          'workflow.correlation_id': ctx.correlationId,
        });

        const triageResult = await triageAgent(input);
        ctx.setAttribute('handoff.from', 'triage');

        const specialistResult = await specialistAgent(triageResult);

        return specialistResult;
      },
    );

Pattern: Multi-Agent Workflows
Multi-agent systems require tracking "baton passes" between agents with full context propagation.
Triage, Specialist, and QA Escalation

    import { trace, track } from 'autotel';
    import { generateText, generateObject } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { z } from 'zod';

    // Agent 1: Triage
    const triageAgent = trace('agent.triage', (ctx) => async (userRequest: string) => {
      ctx.setAttributes({
        'agent.role': 'triage',
        'agent.model': 'gpt-4o-mini',
      });

      const result = await generateText({
        model: openai('gpt-4o-mini'),
        prompt: `Analyze this request and create a plan: ${userRequest}`,
      });

      track('agent.triage_completed', {
        request_length: userRequest.length,
        plan_length: result.text.length,
      });

      return {
        plan: result.text,
        requiresSpecialist: true,
      };
    });

    // Agent 2: Specialist
    const specialistAgent = trace('agent.specialist', (ctx) => async (plan: string) => {
      ctx.setAttributes({
        'agent.role': 'specialist',
        'agent.model': 'gpt-4o',
      });

      ctx.track('specialist_engaged', { plan_length: plan.length });

      const result = await generateText({
        model: openai('gpt-4o'),
        prompt: `Execute this plan: ${plan}`,
      });

      track('agent.specialist_completed', {
        plan_length: plan.length,
        response_length: result.text.length,
      });

      return {
        response: result.text,
        requiresQA: true,
      };
    });

    // Agent 3: QA
    const qaAgent = trace('agent.qa', (ctx) => async (response: string) => {
      ctx.setAttributes({
        'agent.role': 'qa',
        'agent.model': 'gpt-4o',
      });

      const result = await generateObject({
        model: openai('gpt-4o'),
        schema: z.object({
          approved: z.boolean(),
          feedback: z.string().optional(),
          requiresFollowUp: z.boolean(),
        }),
        prompt: `Review this response for quality: ${response}`,
      });

      ctx.setAttribute('qa.approved', result.object.approved);

      track('agent.qa_completed', {
        approved: result.object.approved,
        requires_follow_up: result.object.requiresFollowUp,
      });

      return result.object;
    });

    // Orchestrator: Workflow coordinator
    export const runMultiAgentWorkflow = trace(
      'workflow.multi_agent_escalation',
      (ctx) => async (userRequest: string, userId: string) => {
        ctx.setAttributes({
          'workflow.type': 'multi_agent_escalation',
          'workflow.user_id': userId,
          'workflow.correlation_id': ctx.correlationId,
        });

        // Step 1: Triage
        const triage = await triageAgent(userRequest);
        ctx.track('triage_complete', { requires_specialist: triage.requiresSpecialist });

        // Step 2: Specialist (if needed)
        let response;
        if (triage.requiresSpecialist) {
          response = await specialistAgent(triage.plan);
          ctx.track('specialist_complete', { requires_qa: response.requiresQA });
        }

        // Step 3: QA (if needed)
        let qa;
        if (response?.requiresQA) {
          qa = await qaAgent(response.response);
          ctx.track('qa_complete', { approved: qa.approved });
        }

        // Track workflow completion
        track('workflow.completed', {
          workflow_type: 'multi_agent_escalation',
          user_id: userId,
          agents_involved: qa ? 3 : response ? 2 : 1,
          final_approval: qa?.approved ?? true,
        });

        return {
          plan: triage.plan,
          response: response?.response,
          qa: qa,
        };
      },
    );

RAG Pipeline

    import { trace, track } from 'autotel';
    import { embed, generateText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    // Step 1: Generate embeddings
    const generateEmbeddings = trace('rag.embeddings', (ctx) => async (query: string) => {
      ctx.setAttribute('query.length', query.length);

      const { embedding } = await embed({
        model: openai.embedding('text-embedding-3-small'),
        value: query,
      });

      ctx.setAttribute('embedding.dimensions', embedding.length);

      return embedding;
    });

    // Step 2: Vector search
    const vectorSearch = trace(
      'rag.search',
      (ctx) => async (embedding: number[], topK: number = 5) => {
        ctx.setAttributes({
          'search.top_k': topK,
          'search.embedding_dimensions': embedding.length,
        });

        // `vectorDb` stands in for your vector database client
        const results = await vectorDb.search(embedding, topK);

        ctx.setAttribute('search.results_count', results.length);

        return results;
      },
    );

    // Step 3: Generate response with context
    const generateWithContext = trace(
      'rag.generate',
      (ctx) => async (query: string, context: string[]) => {
        ctx.setAttributes({
          'generation.context_chunks': context.length,
          'generation.model': 'gpt-4o',
        });

        // Assemble the prompt line by line so indentation does not leak into it
        const prompt = [
          'Context:',
          context.join('\n\n'),
          '',
          `Question: ${query}`,
          '',
          'Answer based on the context above:',
        ].join('\n');

        const result = await generateText({
          model: openai('gpt-4o'),
          prompt,
        });

        ctx.setAttributes({
          'generation.tokens_used': result.usage.totalTokens,
          'generation.response_length': result.text.length,
        });

        return result.text;
      },
    );

    // Complete RAG Pipeline
    export const ragPipeline = trace(
      'rag.pipeline',
      (ctx) => async (query: string, userId: string) => {
        ctx.setAttributes({
          'pipeline.type': 'rag',
          'pipeline.user_id': userId,
          'pipeline.query': query,
        });

        const embedding = await generateEmbeddings(query);
        ctx.track('embeddings_generated');

        const searchResults = await vectorSearch(embedding);
        ctx.track('search_completed', { results_count: searchResults.length });

        const context = searchResults.map((r) => r.content);
        const response = await generateWithContext(query, context);
        ctx.track('generation_completed', { response_length: response.length });

        track('rag.pipeline_completed', {
          user_id: userId,
          query_length: query.length,
          results_retrieved: searchResults.length,
          response_length: response.length,
        });

        return {
          query,
          response,
          sources: searchResults.map((r) => r.metadata),
        };
      },
    );

Span Hierarchy:

    rag.pipeline (parent)
    ├── rag.embeddings (child)
    │   └── openai.embeddings (auto-instrumented by OpenLLMetry)
    ├── rag.search (child)
    │   └── pinecone.query (auto-instrumented by OpenLLMetry)
    └── rag.generate (child)
        └── openai.chat.completions (auto-instrumented by OpenLLMetry)

Pattern: Streaming Responses
Track streaming LLM responses with progress events and final metrics.

    import { trace, track } from 'autotel';
    import { streamText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    export const generateStreamingResponse = trace(
      'ai.stream',
      (ctx) => async (prompt: string) => {
        ctx.setAttributes({
          'stream.model': 'gpt-4o',
          'stream.prompt_length': prompt.length,
        });

        const stream = await streamText({
          model: openai('gpt-4o'),
          prompt,
        });

        let chunkCount = 0;
        let totalLength = 0;

        const chunks: string[] = [];
        for await (const chunk of stream.textStream) {
          chunkCount++;
          totalLength += chunk.length;
          chunks.push(chunk);

          // Add event for significant milestones
          if (chunkCount % 10 === 0) {
            ctx.track('streaming_progress', {
              chunks_received: chunkCount,
              total_length: totalLength,
            });
          }
        }

        ctx.setAttributes({
          'stream.chunks_count': chunkCount,
          'stream.total_length': totalLength,
          // Guard against an empty stream to avoid recording NaN
          'stream.avg_chunk_size': chunkCount > 0 ? Math.round(totalLength / chunkCount) : 0,
        });

        track('ai.stream_completed', {
          model: 'gpt-4o',
          chunks: chunkCount,
          total_length: totalLength,
        });

        return chunks.join('');
      },
    );

Pattern: Evaluation Loops
Implement quality checks and iterative refinement with full observability.

    import { trace, track } from 'autotel';
    import { generateText, generateObject } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { z } from 'zod';

    const generateContent = trace(
      'ai.generate_content',
      (ctx) => async (prompt: string, model: string) => {
        ctx.setAttribute('generation.model', model);

        const result = await generateText({
          model: openai(model),
          prompt,
        });

        return result.text;
      },
    );

    const evaluateQuality = trace('ai.evaluate_quality', (ctx) => async (content: string) => {
      const result = await generateObject({
        model: openai('gpt-4o'),
        schema: z.object({
          score: z.number().min(0).max(100),
          feedback: z.string(),
          passesThreshold: z.boolean(),
        }),
        prompt: `Evaluate this content quality (0-100): ${content}`,
      });

      ctx.setAttributes({
        'evaluation.score': result.object.score,
        'evaluation.passes': result.object.passesThreshold,
      });

      return result.object;
    });

    export const generateWithQualityCheck = trace(
      'ai.generate_with_qa',
      (ctx) => async (
        prompt: string,
        options: { maxAttempts?: number; qualityThreshold?: number } = {},
      ) => {
        const { maxAttempts = 3, qualityThreshold = 75 } = options;

        ctx.setAttributes({
          'qa.max_attempts': maxAttempts,
          'qa.threshold': qualityThreshold,
        });

        let attempt = 0;
        let content: string;
        let evaluation: any;

        do {
          attempt++;
          ctx.track('generation_attempt', { attempt });

          content = await generateContent(prompt, 'gpt-4o');
          evaluation = await evaluateQuality(content);

          if (evaluation.passesThreshold) {
            ctx.track('quality_passed', {
              attempt,
              score: evaluation.score,
            });
            break;
          } else if (attempt < maxAttempts) {
            ctx.track('quality_failed_retrying', {
              attempt,
              score: evaluation.score,
              feedback: evaluation.feedback,
            });
            // Feed the evaluator's feedback into the next attempt
            prompt = `${prompt}\n\nPrevious attempt feedback: ${evaluation.feedback}`;
          }
        } while (attempt < maxAttempts);

        ctx.setAttributes({
          'qa.attempts_used': attempt,
          'qa.final_score': evaluation.score,
          'qa.success': evaluation.passesThreshold,
        });

        track('ai.qa_loop_completed', {
          attempts: attempt,
          final_score: evaluation.score,
          success: evaluation.passesThreshold,
          threshold: qualityThreshold,
        });

        return {
          content,
          evaluation,
          attempts: attempt,
        };
      },
    );

Semantic Conventions
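Consistent attribute names, modeled on OpenTelemetry semantic conventions, keep telemetry comparable across your AI applications. The llm.*, agent.*, workflow.*, rag.*, and evaluation.* keys below are the conventions used in this guide; the official OpenTelemetry GenAI semantic conventions use the gen_ai.* namespace.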
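
One way to keep attribute keys consistent is to build them in a single place. The sketch below is a hypothetical helper (llmAttributes and LlmUsage are illustrative names, not part of autotel) that assembles the llm.* keys used in this guide from a model name and token counts. The resulting object could then be passed to ctx.setAttributes().

```typescript
// Hypothetical usage shape; adapt it to whatever your LLM client returns.
interface LlmUsage {
  promptTokens: number;
  completionTokens: number;
}

// Builds span attributes with the llm.* key names used in this guide.
function llmAttributes(
  model: string,
  provider: string,
  usage: LlmUsage,
): Record<string, string | number> {
  return {
    'llm.model': model,
    'llm.provider': provider,
    'llm.prompt_tokens': usage.promptTokens,
    'llm.response_tokens': usage.completionTokens,
    'llm.total_tokens': usage.promptTokens + usage.completionTokens,
  };
}

const exampleAttributes = llmAttributes('gpt-4o', 'openai', {
  promptTokens: 100,
  completionTokens: 250,
});
```

Centralizing the keys this way means a rename touches one function instead of every call site.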
LLM attributes:

    ctx.setAttributes({
      'llm.model': 'gpt-4o',
      'llm.provider': 'openai',
      'llm.temperature': 0.7,
      'llm.max_tokens': 4096,
      'llm.response_tokens': 250,
      'llm.prompt_tokens': 100,
      'llm.total_tokens': 350,
    });

Agent attributes:

    ctx.setAttributes({
      'agent.role': 'specialist',
      'agent.model': 'gpt-4o',
      'agent.provider': 'openai',
      'agent.temperature': 0.7,
    });

Workflow attributes:

    ctx.setAttributes({
      'workflow.type': 'multi_agent_escalation',
      'workflow.correlation_id': ctx.correlationId,
      'workflow.user_id': userId,
      'workflow.session_id': sessionId,
    });

RAG attributes:

    ctx.setAttributes({
      'rag.embedding_model': 'text-embedding-3-small',
      'rag.chunks_retrieved': 5,
      'rag.search_top_k': 5,
      'rag.rerank_enabled': true,
    });

Evaluation attributes:

    ctx.setAttributes({
      'evaluation.score': 85,
      'evaluation.threshold': 75,
      'evaluation.passes': true,
      'evaluation.attempts': 2,
    });

Business events:

    import { track } from 'autotel';

    track('workflow.completed', {
      type: 'multi_agent',
      agents_used: 3,
      // traceId, spanId, correlationId auto-added!
    });

Best Practice: Use Both Together

    import { init, trace, track } from 'autotel';
    import { generateText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    init({
      service: 'customer-support-ai',
      endpoint: process.env.OTLP_ENDPOINT,
      openllmetry: { enabled: true },
    });

    const handleCustomerQuery = trace(
      'workflow.customer_query',
      (ctx) => async (query: string, userId: string) => {
        ctx.setAttributes({
          'workflow.type': 'customer_support',
          'user.id': userId,
        });

        // Step 1: Triage (OpenLLMetry auto-instruments the LLM call)
        const triage = await trace('step.triage', async () => {
          return await generateText({
            model: openai('gpt-4o-mini'),
            prompt: `Triage: ${query}`,
          });
        });

        const needsEscalation = triage.text.includes('ESCALATE');

        if (needsEscalation) {
          const specialist = await trace('step.specialist', async () => {
            return await generateText({
              model: openai('gpt-4o'),
              prompt: `Expert response needed: ${query}`,
            });
          });

          track('escalation_occurred', {
            category: triage.text,
            userId,
            correlationId: ctx.correlationId,
          });
          return { response: specialist.text, escalated: true };
        }

        return { response: triage.text, escalated: false };
      },
    );

What you get with both:

    Trace Tree:
    workflow.customer_query (trace)
    ├─ user.id: "user123"
    ├─ workflow.type: "customer_support"
    ├─ correlation.id: "abc-123-def"
    │
    ├─ step.triage (trace)
    │  ├─ llm.chat (OpenLLMetry auto-span)
    │  │  ├─ llm.request.model: "gpt-4o-mini"
    │  │  ├─ llm.usage.prompt_tokens: 23
    │  │  ├─ llm.usage.completion_tokens: 45
    │  │  └─ llm.prompts.0.content: "Triage: ..."
    │  └─ triage.category: "billing_issue"
    │
    └─ step.specialist (trace)
       ├─ llm.chat (OpenLLMetry auto-span)
       │  ├─ llm.request.model: "gpt-4o"
       │  ├─ llm.usage.prompt_tokens: 78
       │  ├─ llm.usage.completion_tokens: 234
       │  └─ llm.prompts.0.content: "Expert response needed: ..."
       └─ escalated: true

    Events:
    escalation_occurred
    ├─ category: "billing_issue"
    ├─ userId: "user123"
    └─ correlationId: "abc-123-def"

Key benefits of combining both:
- Zero-effort LLM telemetry: OpenLLMetry captures all SDK calls automatically
- Business context: trace() adds workflow meaning and business logic
- Perfect correlation: All spans and events share the same correlation ID
- Complete picture: See both "what the LLM did" (OpenLLMetry) and "why it did it" (your trace spans)
- Events integration: Business events automatically correlated with technical traces
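
As an illustration of the correlation benefit, the sketch below groups spans and business events by a shared correlation ID, the kind of join a downstream analysis might perform. The SpanRecord and EventRecord shapes and the groupByCorrelation function are hypothetical; real exporter payloads will differ.

```typescript
// Hypothetical record shapes for exported telemetry; real exporters will differ.
interface SpanRecord {
  name: string;
  correlationId: string;
}

interface EventRecord {
  name: string;
  correlationId: string;
}

interface CorrelatedGroup {
  spans: string[];
  events: string[];
}

// Groups span names and event names under their shared correlation ID.
function groupByCorrelation(
  spans: SpanRecord[],
  events: EventRecord[],
): Map<string, CorrelatedGroup> {
  const groups = new Map<string, CorrelatedGroup>();
  const bucket = (id: string): CorrelatedGroup => {
    let entry = groups.get(id);
    if (!entry) {
      entry = { spans: [], events: [] };
      groups.set(id, entry);
    }
    return entry;
  };
  for (const span of spans) bucket(span.correlationId).spans.push(span.name);
  for (const event of events) bucket(event.correlationId).events.push(event.name);
  return groups;
}

const grouped = groupByCorrelation(
  [
    { name: 'workflow.customer_query', correlationId: 'abc-123-def' },
    { name: 'step.triage', correlationId: 'abc-123-def' },
  ],
  [{ name: 'escalation_occurred', correlationId: 'abc-123-def' }],
);
```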
Real-World Examples
example-ai-agent: Multi-agent escalation systems (simulated and real LLM with OpenLLMetry), RAG pipelines, and @openai/agents integration.
See apps/example-ai-agent/src/multi-agent-workflow-with-openllmetry.ts for a complete example showing OpenLLMetry enabled in init(), multi-agent workflow using trace() for business context, and real OpenAI SDK calls auto-instrumented by OpenLLMetry.
Compare with apps/example-ai-agent/src/multi-agent-workflow.ts which uses simulated LLM calls (no OpenLLMetry needed).
See apps/example-ai-agent/src/rag-pipeline.ts for a complete RAG pipeline example showing embeddings generation tracking, vector search observability, context assembly monitoring, and end-to-end pipeline metrics.