> ## Documentation Index
> Fetch the complete documentation index at: https://docs.honeycomb.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Instrumenting AI Agents

> Track input prompts, responses, token usage, tool calls, and agent invocations using OpenTelemetry.

AI agents may invoke tools, call other agents, or prompt a Generative AI (GenAI) model with multi-step user input.
The distributed and non-deterministic nature of agentic workflows makes monitoring and debugging difficult.

Instrument your AI agents with OpenTelemetry (OTel) GenAI semantic conventions to get full visibility into agent sessions and explore your agents in the [Agent Timeline](/investigate/observe/agent-timeline).

* Refer to [Fast AI Feedback Loops with Honeycomb and OpenTelemetry](https://www.honeycomb.io/blog/fast-ai-feedback-loops-honeycomb-opentelemetry) for an example of agent instrumentation using [Pydantic](https://ai.pydantic.dev/logfire/#setting-opentelemetry-sdk-providers).
* Visit the [example storechat](https://github.com/honeycombio/devrel-opentelemetry-demo/tree/main/src/storechat) application repo for further agent instrumentation examples.

## Enriching your traces with GenAI context

Add the following [OTel GenAI attributes](https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/) to your agent spans to give Honeycomb the context it needs to group spans into conversations, identify agents, and surface meaningful data in the Agent Timeline.

<ResponseField name="gen_ai.conversation.id" type="string" required>
  Unique identifier for the conversation or session. Used to group all traces and spans belonging to the same agent conversation.
</ResponseField>

<ResponseField name="gen_ai.agent.name" type="string" required>
  Name of the agent emitting the span.

  In multi-agent workflows [each agent should have a unique name](#use-unique-names-for-each-agent).
</ResponseField>

<ResponseField name="gen_ai.operation.name" type="string" required>
  Type of agentic operation occurring:

  * `chat`
  * `create_agent`
  * `embeddings`
  * `execute_tool`
  * `generate_content`
  * `invoke_agent`
  * `invoke_workflow`
  * `retrieval`
  * `text_completion`

  For span naming conventions, refer to the [Naming generative AI operation spans](#naming-generative-ai-operation-spans) section.
</ResponseField>

<ResponseField name="gen_ai.usage.input_tokens" type="int">
  Number of tokens used in the GenAI input prompt.
</ResponseField>

<ResponseField name="gen_ai.usage.output_tokens" type="int">
  Number of tokens used in the GenAI response.
</ResponseField>

<ResponseField name="gen_ai.request.model" type="string">
  Name of the model requested.
</ResponseField>

<ResponseField name="gen_ai.response.model" type="string">
  Name of the model that generated the response. This can differ from the requested model.
</ResponseField>

<ResponseField name="gen_ai.response.finish_reasons" type="string[]">
  Why the model stopped generating tokens.

  Examples: `["stop"]`, `["tool_calls"]`, `["stop", "length"]`
</ResponseField>

<ResponseField name="gen_ai.tool.name" type="string">
  Name of the tool called by the agent.
</ResponseField>

<ResponseField name="gen_ai.tool.call.id" type="string">
  Unique identifier for the tool call.
</ResponseField>

<ResponseField name="gen_ai.tool.call.arguments" type="object | json">
  Parameters passed to the tool call.
</ResponseField>

<ResponseField name="gen_ai.tool.call.result" type="string">
  Result returned by the tool call (if any).

  To learn how to handle failed tool calls, refer to the [Recording errors and exceptions](#recording-errors-and-exceptions) section.
</ResponseField>

### Span events for input prompts, completions, and evaluations

Full prompts, chat history, and completion responses may be too large or contain personally identifiable information (PII) or other sensitive data.
Store them in span events where your OTel Collector can filter them before they reach Honeycomb.

<ResponseField name="gen_ai.input.messages" type="object | json">
  Chat history or input prompts provided to the model.

  <Danger>GenAI prompts or chats may contain PII or other sensitive data.</Danger>
</ResponseField>

<ResponseField name="gen_ai.output.messages" type="object | json">
  Messages returned by the model. Each message represents a specific model response.

  <Danger>GenAI responses may contain PII or other sensitive data.</Danger>
</ResponseField>

<ResponseField name="gen_ai.evaluation.result" type="string">
  Attach `gen_ai.evaluation.result` events to the GenAI operation span to [review evaluations in the GenAI tab](/investigate/analyze/explore-traces#gen-ai-tab).
</ResponseField>

## Keeping agent names unique

The [Agent Timeline](/investigate/observe/agent-timeline) groups spans by `gen_ai.agent.name`, so duplicate or missing names make it impossible to distinguish between agents during an investigation.

Each agent should have its own unique `gen_ai.agent.name`.
Sub-agents should use their own distinct name, instead of inheriting the parent agent's name.

<Note>
  If `gen_ai.agent.name` is omitted on a span, it will show up as `"Unknown"` on the Agent Timeline.
</Note>

## Instrumenting agent-to-agent calls

In multi-agent workflows, correctly attributing spans to the right agent is what makes the Agent Timeline useful.
The calling agent should emit the `invoke_agent` span, not the agent being called.
The called agent then emits its own spans (`chat`, `execute_tool`, and so on) under its own unique `gen_ai.agent.name`, keeping each agent's work distinct and traceable in the timeline.

To learn more about agent invocation spans, visit the OTel documentation:

* [Invoke agent client span (remote invocation)](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/#invoke-agent-client-span)
* [Invoke agent internal span (same process invocation)](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/#invoke-agent-internal-span)

## Handling errors and exceptions

Agentic workflows can fail in subtle ways: a tool call that silently drops its result, or an error that surfaces in a child span but never reaches the parent.
Recording errors consistently gives you the full picture when something goes wrong.

Record errors or exceptions following [the OTel specification](https://opentelemetry.io/docs/specs/semconv/general/recording-errors/) and include as many attributes as apply:

* `error.type` / `exception.type`
* `error.message` / `exception.message`
* `error.stacktrace` / `exception.stacktrace`

For tool call failures, propagate the error status to the parent span.

## Naming generative AI operation spans

Generative AI operation spans should follow these naming conventions.
Naming your spans this way ensures the [Agent Timeline](/investigate/observe/agent-timeline) can understand the operation type and display it meaningfully.

| Operation                     | `gen_ai.operation.name` | Span Name Pattern           |
| ----------------------------- | ----------------------- | --------------------------- |
| Chat                          | `chat`                  | `chat {model}`              |
| Create GenAI agent            | `create_agent`          | `create_agent {agent_name}` |
| Tool execution                | `execute_tool`          | `execute_tool {tool_name}`  |
| Agent invocation              | `invoke_agent`          | `invoke_agent {agent_name}` |
| Embeddings                    | `embeddings`            | `embeddings {model}`        |
| RAG retrieval                 | `retrieval`             | `retrieval {data_source}`   |
| Multimodal content generation | `generate_content`      | `generate_content {model}`  |
| Text completions              | `text_completion`       | `text_completion {model}`   |

## Remapping existing telemetry

In some cases, your existing telemetry may not conform to OpenTelemetry semantic conventions.
The [`transform` processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor) lets you remap those attributes before they reach Honeycomb.

For example, [Claude Code can emit telemetry](https://code.claude.com/docs/en/monitoring-usage) using the OpenTelemetry protocol, but does not yet use the OpenTelemetry semantic conventions.
The following configuration remaps Claude Code's emitted traces to match the OpenTelemetry semantic conventions, making them visible in the Agent Timeline.

```yml theme={}
processors:
  # Detailed Beta names spans `claude_code.*` and uses bare-namespace
  # attribute keys (session.id, input_tokens, …). The transform processor
  # remaps both names and attributes to the GenAI semconv on the way
  # through. Statements run in order, per OTTL context.
  transform:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          # Set service.name to identify your agent.
          - set(attributes["service.name"], "<AGENT NAME>")

      - context: span
        statements:
          # ---- Identity attributes added to every span ----
          # The AI Conversations viewer and Agent Timeline both key on
          # gen_ai.conversation.id; Detailed Beta carries the same value
          # under `session.id`.
          - set(attributes["gen_ai.agent.name"], "<AGENT NAME>")
          - set(attributes["gen_ai.conversation.id"], attributes["session.id"]) where attributes["session.id"] != nil

          # ---- claude_code.interaction → invoke_agent claude ----
          # Order matters. Rename the span first, then set
          # gen_ai.operation.name keyed off the new name. Detailed Beta
          # emits exactly one interaction span per Claude invocation.
          - set(name, "invoke_agent <AGENT NAME>") where name == "claude_code.interaction"
          - set(attributes["gen_ai.operation.name"], "invoke_agent") where name == "invoke_agent <AGENT NAME>"

          # ---- claude_code.llm_request → chat {model} ----
          # gen_ai.request.model is already populated by Detailed Beta.
          # If it is missing on some build, leave the original name in
          # place rather than renaming to "chat ".
          - set(name, Concat(["chat ", attributes["gen_ai.request.model"]], "")) where name == "claude_code.llm_request" and attributes["gen_ai.request.model"] != nil
          - set(attributes["gen_ai.operation.name"], "chat") where IsMatch(name, "^chat ")

          # ---- claude_code.tool → execute_tool {tool_name} ----
          # Detailed Beta names the wrapping tool span `claude_code.tool`
          # and carries the tool identity in the `tool_name` attribute.
          - set(name, Concat(["execute_tool ", attributes["tool_name"]], "")) where name == "claude_code.tool" and attributes["tool_name"] != nil
          - set(attributes["gen_ai.operation.name"], "execute_tool") where IsMatch(name, "^execute_tool ")
          - set(attributes["gen_ai.tool.name"], attributes["tool_name"]) where IsMatch(name, "^execute_tool ")

          # ---- Token alias (Detailed Beta llm_request) ----
          # Detailed Beta puts some tokens in gen_ai.* and others in the
          # bare namespace. Alias the bare ones up so the GenAI consumers
          # see the full set on each chat {model} span.
          - set(attributes["gen_ai.usage.input_tokens"], attributes["input_tokens"]) where attributes["input_tokens"] != nil
          - set(attributes["gen_ai.usage.output_tokens"], attributes["output_tokens"]) where attributes["output_tokens"] != nil
          - set(attributes["gen_ai.usage.cache_read_input_tokens"], attributes["cache_read_tokens"]) where attributes["cache_read_tokens"] != nil
          - set(attributes["gen_ai.usage.cache_creation_input_tokens"], attributes["cache_creation_tokens"]) where attributes["cache_creation_tokens"] != nil

          # ---- Tool args / result remap ----
          # With OTEL_LOG_TOOL_DETAILS=1 + OTEL_LOG_TOOL_CONTENT=1
          # claude_code.tool spans carry two content attributes:
          #   * `tool_input`  — the tool arguments JSON, prefixed with
          #                     "[TOOL INPUT: <Name>]\n"
          #   * `new_context` — the tool result JSON, prefixed with
          #                     "[TOOL RESULT: <Name>]\n"
          # Alias them onto gen_ai.tool.call.arguments / .result.
          # gen_ai.tool.call.id has no Detailed Beta equivalent, so it
          # stays absent.
          - set(attributes["gen_ai.tool.call.arguments"], attributes["tool_input"]) where IsMatch(name, "^execute_tool ") and attributes["tool_input"] != nil
          - set(attributes["gen_ai.tool.call.result"], attributes["new_context"]) where IsMatch(name, "^execute_tool ") and attributes["new_context"] != nil
```
