Strands Agents SDK Deploy — Visualize Agent Traces with OpenTelemetry
Introduction
From part 1 through part 3, we deployed agents to Docker, Lambda, and AgentCore. The deployment target is set, but production operation needs one more thing: a way to monitor agent behavior.
In practical part 5, we used result.metrics.get_summary() to check cycle counts and token usage. But that was post-hoc analysis of a single request. In production, you need to analyze across multiple requests, identify bottlenecks, and detect anomalies.
The Strands Agents SDK has built-in OpenTelemetry support. Set the exporter endpoint, add two lines of setup code, and the agent's reasoning and tool calls are recorded as traces.
A trace records how a single request was processed through the system. Traces consist of multiple spans (individual processing steps), showing the duration and parent-child relationships of each step.
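To make the trace and span vocabulary concrete, here is a minimal sketch of a trace as a tree of spans. The `Span` class, the span names, and the timings are all made up for illustration; this is not the OpenTelemetry SDK's own span type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One processing step inside a trace (illustrative model, not the OTEL SDK class)."""
    name: str
    start_ms: int
    end_ms: int
    # Excluded from repr/compare so parent and child links do not recurse forever.
    parent: Optional["Span"] = field(default=None, repr=False, compare=False)
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms

    def child(self, name: str, start_ms: int, end_ms: int) -> "Span":
        """Create a child span and link it to this span."""
        span = Span(name, start_ms, end_ms, parent=self)
        self.children.append(span)
        return span


def depth(span: Span) -> int:
    """Count the ancestors between this span and the root span."""
    levels = 0
    while span.parent is not None:
        span = span.parent
        levels += 1
    return levels


# A made-up request: one root span with a model call and a tool call inside it.
root = Span("handle_request", 0, 120)
model_call = root.child("model_call", 5, 90)
tool_call = root.child("tool_call", 92, 118)

for s in (root, model_call, tool_call):
    print("  " * depth(s) + f"{s.name}: {s.duration_ms}ms")
```

The root span covers the whole request, and each child records its own duration, which is the same shape the agent traces take later in this article.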
This article covers:
- Starting Jaeger locally — preparing the trace visualization environment
- Enabling OTEL in the Strands agent — adding 2 lines of code to send traces
- Examining traces — verifying span structure and mapping to practical part 5's metrics
- Console exporter — inspecting detailed attributes recorded in each span
- Custom attributes — making traces searchable by session ID and user ID
For details, see the Traces page in the official documentation.
Setup
Prerequisites:
- The part 1 environment (strands-agents installed)
- Docker installed (used for Jaeger)
Install the OTEL dependency extras:
pip install 'strands-agents[otel]' strands-agents-tools
strands-agents[otel] adds OpenTelemetry dependencies (opentelemetry-api, opentelemetry-sdk, etc.).
Starting Jaeger Locally
Jaeger is an open-source distributed tracing tool. The all-in-one container provides trace collection and visualization out of the box.
docker run -d --name jaeger \
-e COLLECTOR_OTLP_ENABLED=true \
-p 16686:16686 \
-p 4318:4318 \
jaegertracing/all-in-one:latest
- Port 16686: Jaeger UI (view traces in the browser)
- Port 4318: OTLP HTTP endpoint (receives traces from the agent)
After starting, verify the Jaeger UI is accessible at http://localhost:16686.
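If you prefer to check from a script rather than the browser, a small TCP probe works. This is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name, and a successful connection only shows that something is listening on the port, not that Jaeger itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # The two ports published by the docker run command in this section.
    for name, port in (("Jaeger UI", 16686), ("OTLP HTTP", 4318)):
        status = "reachable" if is_port_open("localhost", port) else "NOT reachable"
        print(f"{name} (port {port}): {status}")
```

Both ports should report as reachable a few seconds after the container starts.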
Enabling OTEL in the Strands Agent
Only 2 lines need to be added to the agent code.
import os
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
from strands import Agent
from strands.models import BedrockModel
from strands.telemetry import StrandsTelemetry
from strands_tools import calculator
# Enable OTEL (just add these 2 lines)
strands_telemetry = StrandsTelemetry()
strands_telemetry.setup_otlp_exporter()
bedrock_model = BedrockModel(
model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
region_name="us-east-1",
)
agent = Agent(model=bedrock_model, tools=[calculator], callback_handler=None)
result = agent("What is 123 * 456?")
print(f"Response: {result.message['content'][0]['text']}")
Key points:
- OTEL_EXPORTER_OTLP_ENDPOINT: The trace destination. It points to Jaeger's OTLP HTTP endpoint.
- StrandsTelemetry().setup_otlp_exporter(): Enables the SDK's OTEL exporter. No other agent code changes are needed. The SDK automatically records the agent loop, model calls, and tool executions as spans.
Save the code as trace_agent.py and run it:
python -u trace_agent.py
Response: The result of 123 * 456 is **56,088**.
The agent works normally. Traces are sent to Jaeger in the background.
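The sample hard-codes the endpoint in the script. One possible variation, sketched below, keeps any value already exported in the shell and falls back to the local Jaeger address only when nothing is set, so the same script can run unchanged in other environments. `configure_otlp_endpoint` is a hypothetical helper name. As in the sample, we assume it runs before the exporter is set up.

```python
import os

# Local Jaeger from this article; used only when nothing else is configured.
DEFAULT_OTLP_ENDPOINT = "http://localhost:4318"


def configure_otlp_endpoint(default: str = DEFAULT_OTLP_ENDPOINT) -> str:
    """Keep an endpoint already set in the environment, else fall back to the default."""
    return os.environ.setdefault("OTEL_EXPORTER_OTLP_ENDPOINT", default)


endpoint = configure_otlp_endpoint()
print(f"Sending traces to: {endpoint}")
```

With this in place, running the script with OTEL_EXPORTER_OTLP_ENDPOINT exported in the shell sends traces to that endpoint instead of localhost.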
Examining Traces
Open the Jaeger UI (http://localhost:16686), select strands-agents from the Service dropdown, and click "Find Traces". A list of traces appears — click any trace to see the span timeline.
Querying traces via the Jaeger API
curl -s "http://localhost:16686/api/traces?service=strands-agents&limit=1" \
| python3 -c "
import sys, json
data = json.load(sys.stdin)
trace = data['data'][0]
print(f'Trace ID: {trace[\"traceID\"]}')
print(f'Spans: {len(trace[\"spans\"])}')
for span in trace['spans']:
    name = span['operationName']
    duration = span['duration'] / 1000  # Jaeger reports durations in microseconds
    print(f'  {name}: {duration:.0f}ms')
"
In this run, 6 spans were recorded.
invoke_agent Strands Agents (5511ms) ← Entire agent
├── execute_event_loop_cycle (5511ms) ← Cycle 1
│ ├── chat (4312ms) ← Model call (reasoning + tool selection)
│ └── execute_tool calculator (3ms) ← Tool execution
└── execute_event_loop_cycle (1191ms) ← Cycle 2
    └── chat (1191ms) ← Model call (final answer)
The "2 cycles" and "1 calculator call" we confirmed with result.metrics in practical part 5 are now visualized as trace spans. While result.metrics provided post-hoc numerical data, traces show a timeline with duration and parent-child relationships for each step.
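The API returns spans as a flat list, so the hierarchy has to be rebuilt from each span's parent reference. The sketch below shows one way to do that. It assumes the Jaeger JSON shape in which each span carries a `references` list with `refType` `CHILD_OF`. The sample data is hand-written: the span IDs and start times are invented, and only the names and durations mirror the run above.

```python
from collections import defaultdict

# Hand-written sample shaped like one entry of data[] from /api/traces.
# Durations and start times are in microseconds; IDs and start times are invented.
sample_trace = {
    "traceID": "example",
    "spans": [
        {"spanID": "a", "operationName": "invoke_agent Strands Agents",
         "startTime": 0, "duration": 5511000, "references": []},
        {"spanID": "b", "operationName": "execute_event_loop_cycle",
         "startTime": 0, "duration": 5511000,
         "references": [{"refType": "CHILD_OF", "spanID": "a"}]},
        {"spanID": "c", "operationName": "chat",
         "startTime": 0, "duration": 4312000,
         "references": [{"refType": "CHILD_OF", "spanID": "b"}]},
        {"spanID": "d", "operationName": "execute_tool calculator",
         "startTime": 4312000, "duration": 3000,
         "references": [{"refType": "CHILD_OF", "spanID": "b"}]},
        {"spanID": "e", "operationName": "execute_event_loop_cycle",
         "startTime": 4320000, "duration": 1191000,
         "references": [{"refType": "CHILD_OF", "spanID": "a"}]},
        {"spanID": "f", "operationName": "chat",
         "startTime": 4320000, "duration": 1191000,
         "references": [{"refType": "CHILD_OF", "spanID": "e"}]},
    ],
}


def parent_id(span):
    """Return the spanID of the CHILD_OF reference, or None for a root span."""
    for ref in span.get("references", []):
        if ref.get("refType") == "CHILD_OF":
            return ref["spanID"]
    return None


def format_tree(trace):
    """Return indented lines for the span tree, siblings ordered by start time.

    Spans whose parent is missing from the trace are skipped.
    """
    children = defaultdict(list)
    for span in trace["spans"]:
        children[parent_id(span)].append(span)

    lines = []

    def walk(pid, level):
        for span in sorted(children[pid], key=lambda s: s["startTime"]):
            ms = span["duration"] / 1000
            lines.append(f"{'  ' * level}{span['operationName']} ({ms:.0f}ms)")
            walk(span["spanID"], level + 1)

    walk(None, 0)
    return lines


print("\n".join(format_tree(sample_trace)))
```

To use it on real data, pass `data['data'][0]` from the curl response to `format_tree` in place of the sample.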
Inspecting Span Details with the Console Exporter
The Jaeger UI shows span hierarchy and duration. To examine the detailed attributes recorded in each span, use the console exporter, which prints spans as JSON to stdout. In trace_agent.py, call setup_console_exporter() instead of (or alongside) setup_otlp_exporter():
from strands.telemetry import StrandsTelemetry
StrandsTelemetry().setup_console_exporter()
The complete script is below, followed by the key attributes from its output.
console_trace.py (code and run command)
from strands import Agent
from strands.models import BedrockModel
from strands.telemetry import StrandsTelemetry
from strands_tools import calculator
strands_telemetry = StrandsTelemetry()
strands_telemetry.setup_console_exporter()
bedrock_model = BedrockModel(
model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
region_name="us-east-1",
)
agent = Agent(model=bedrock_model, tools=[calculator], callback_handler=None)
result = agent("What is 123 * 456?")
python -u console_trace.py
Model call span (chat):
{
"name": "chat",
"attributes": {
"gen_ai.request.model": "us.anthropic.claude-sonnet-4-20250514-v1:0",
"gen_ai.usage.input_tokens": 1514,
"gen_ai.usage.output_tokens": 69,
"gen_ai.usage.total_tokens": 1583,
"gen_ai.server.time_to_first_token": 3293
}
}
Tool execution span (execute_tool calculator):
{
"name": "execute_tool calculator",
"attributes": {
"gen_ai.tool.name": "calculator",
"gen_ai.tool.call.id": "tooluse_1DfAEkjH8xUXal4n5oLtpd",
"gen_ai.tool.status": "success"
}
}
The inputTokens, outputTokens, and totalTokens from result.metrics in practical part 5 appear as gen_ai.usage.* attributes in traces. Traces also include information not available in result.metrics, such as gen_ai.server.time_to_first_token (time until the first token is returned).
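Because every chat span carries its own gen_ai.usage.* attributes, per-request totals can be rebuilt by summing over the chat spans. The following is a minimal sketch that assumes spans have already been parsed into dicts with `name` and `attributes` keys, as in the excerpts above. `sum_token_usage` is a hypothetical helper, and the second chat span's numbers are invented for illustration.

```python
# Spans reduced to the fields used here. The first chat span uses the values
# shown above; the second chat span's values are invented for this example.
sample_spans = [
    {"name": "chat", "attributes": {
        "gen_ai.usage.input_tokens": 1514,
        "gen_ai.usage.output_tokens": 69,
        "gen_ai.usage.total_tokens": 1583,
    }},
    {"name": "execute_tool calculator", "attributes": {
        "gen_ai.tool.name": "calculator",
    }},
    {"name": "chat", "attributes": {
        "gen_ai.usage.input_tokens": 1600,
        "gen_ai.usage.output_tokens": 20,
        "gen_ai.usage.total_tokens": 1620,
    }},
]


def sum_token_usage(spans):
    """Add up gen_ai.usage.* token counts across all chat spans."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for span in spans:
        if span.get("name") != "chat":
            continue  # only model-call spans carry token usage
        attrs = span.get("attributes", {})
        for key in totals:
            totals[key] += attrs.get(f"gen_ai.usage.{key}", 0)
    return totals


print(sum_token_usage(sample_spans))
```

Filtering on the chat span name avoids double counting in case a parent span also reports aggregated usage.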
Making Traces Searchable with Custom Attributes
In production, you often need to track specific users' requests or filter by session ID. The Agent's trace_attributes parameter adds custom attributes to spans. Just add trace_attributes to the Agent(...) in trace_agent.py (run with Jaeger still running).
custom_trace.py (code and run command)
import os
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
from strands import Agent
from strands.models import BedrockModel
from strands.telemetry import StrandsTelemetry
from strands_tools import calculator
strands_telemetry = StrandsTelemetry()
strands_telemetry.setup_otlp_exporter()
bedrock_model = BedrockModel(
model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
region_name="us-east-1",
)
agent = Agent(
model=bedrock_model,
tools=[calculator],
callback_handler=None,
trace_attributes={
"session.id": "user-session-abc123",
"user.id": "taro",
"environment": "production",
},
)
result = agent("What is 123 * 456?")
print(f"Response: {result.message['content'][0]['text']}")
python -u custom_trace.py
After running the agent, custom attributes are recorded on the invoke_agent span.
session.id: user-session-abc123
user.id: taro
environment: production
gen_ai.agent.name: Strands Agents
gen_ai.request.model: us.anthropic.claude-sonnet-4-20250514-v1:0
gen_ai.usage.total_tokens: 3197
In the Jaeger UI, entering user.id=taro in the Tags filter searches for only that user's traces. This is useful for tracking specific requests in production where multiple users' requests are mixed.
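The same filter can probably be applied from a script through the /api/traces endpoint used earlier. The sketch below only builds the URL. It assumes the endpoint accepts a `tags` query parameter holding a URL-encoded JSON object, which is how the Jaeger UI appears to encode its own searches. This endpoint is internal to the UI rather than a stable public API, so treat the parameter name as an assumption to verify against your Jaeger version. `build_trace_search_url` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode

JAEGER_QUERY_URL = "http://localhost:16686/api/traces"


def build_trace_search_url(service, tags, limit=20, base_url=JAEGER_QUERY_URL):
    """Build a trace search URL filtered by tags.

    tags maps tag names to string values and is sent as a compact JSON object
    (assumed format, see the note above).
    """
    params = {
        "service": service,
        "limit": limit,
        "tags": json.dumps(tags, separators=(",", ":")),
    }
    return f"{base_url}?{urlencode(params)}"


url = build_trace_search_url("strands-agents", {"user.id": "taro"})
print(url)
```

The printed URL can be passed to curl in place of the URL in the earlier query.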
Summary
- Only 2 lines of code change: Add StrandsTelemetry() and setup_otlp_exporter(), and the SDK automatically sends traces. Agent behavior is unchanged.
- Span structure mirrors the agent loop: invoke_agent → execute_event_loop_cycle → chat / execute_tool maps directly to the "reasoning → tool selection → tool execution → reasoning" loop from introductory part 1.
- Extends result.metrics: Cycle counts and tool execution times from practical part 5 are visualized as timeline spans. This marks the transition from post-hoc analysis to real-time monitoring.
- Works with any OTEL-compatible backend: Since it uses the OpenTelemetry standard, you can switch to AWS X-Ray, Grafana Tempo, Datadog, or any other backend.
- trace_attributes adds custom attributes to spans: Record session IDs and user IDs on spans, then search by tag in Jaeger. Useful for tracking specific requests in production.
Cleanup
docker rm -f jaeger