These patterns can help you get more out of Honeycomb MCP, whether you are running quick investigations or building fully autonomous workflows.
Querying Honeycomb
Agents can use Model Context Protocol (MCP) tools to explore your data and answer detailed questions about system behavior. Both goal-directed queries (like responding to an alert) and broader investigations (like identifying performance issues) tend to work well with modern large language models (LLMs). To get useful results:
- Give specific instructions: If you are responding to a Trigger, mention it by name and tell the agent to use it as a starting point.
- Point to known issues: For example, if you have observed a latency spike or anomaly, describe it in your prompt so the agent can focus on the relevant time window or service.
api-gateway service” can be productive.
In our testing, agents often begin with duration_ms percentiles (p50, p95, p99) as a baseline.
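As a rough illustration of that baseline, the sketch below builds a percentile query for duration_ms. The dictionary shape is modeled on Honeycomb's Query Specification (calculations, filters, time_range). The helper name baseline_latency_query, the api-gateway value, and the two-hour window are assumptions made for the example, and the arguments an MCP query tool actually accepts may differ.

```python
def baseline_latency_query(service: str, window_seconds: int = 7200) -> dict:
    """Build a p50/p95/p99 baseline query for one service (illustrative shape)."""
    return {
        # Relative time window in seconds (assumed default: last two hours).
        "time_range": window_seconds,
        # One calculation per percentile over the duration_ms column.
        "calculations": [
            {"op": op, "column": "duration_ms"} for op in ("P50", "P95", "P99")
        ],
        # Restrict the baseline to a single service.
        "filters": [{"column": "service.name", "op": "=", "value": service}],
    }


query = baseline_latency_query("api-gateway")
```

Starting from a narrow baseline like this gives the agent a reference point before it adds breakdowns or widens the time window.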
Improving instrumentation
MCP can help agents understand and improve your instrumentation, especially when paired with code access or examples. Some patterns that work well:
- Use live examples: Ask the agent to look at how other services are instrumented in your codebase. For example: “Write a new service and base its instrumentation on other Golang services in this repo.”
- Combine auto-instrumentation with refinement: Apply zero-code OpenTelemetry instrumentation, then let the agent analyze the results using MCP. The agent can:
  - Identify duplicated telemetry
  - Consolidate or remove redundant spans
  - Create new instrumentation based on actual business logic
- Audit and iterate: Pair with an agent and ask it to evaluate your overall instrumentation quality against your actual data shape. Once the agent builds understanding, you can commit its artifacts or share them with teammates or other agents as part of a continuous instrumentation improvement loop.
Migrating queries to Honeycomb
LLMs are generally very good at translating between observability query languages, especially when you already have telemetry available in Honeycomb that maps to your old system. If you are migrating from PromQL, Datadog, or another system:
- Paste the existing query into the prompt.
- Ask the agent to use MCP to generate an equivalent Honeycomb query.
- Let it iterate until the result is either a match or a useful approximation.
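The steps above can be folded into a reusable prompt template. The following is a minimal sketch: the function name build_migration_prompt, the prompt wording, and the sample PromQL query are all hypothetical, so adjust them to your own metrics and agent.

```python
def build_migration_prompt(source_system: str, source_query: str) -> str:
    """Assemble a migration prompt that follows the three steps above."""
    return "\n".join([
        # Step 1: paste the existing query.
        f"Here is an existing {source_system} query:",
        source_query,
        # Step 2: ask for an equivalent Honeycomb query through MCP.
        "Use the Honeycomb MCP tools to write an equivalent Honeycomb query.",
        # Step 3: let the agent iterate toward a match or a useful approximation.
        "Run it, compare the result with what the original query measures, "
        "and iterate until it matches or is a useful approximation.",
    ])


prompt = build_migration_prompt(
    "PromQL",
    "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))",
)
```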
Running autonomous agents with Honeycomb
If you are building fully autonomous agents that use Honeycomb regularly, you will get better results by helping your agents build context and avoid unnecessary work. Iterating on your prompts and agent guidance is key.
- Be explicit about what matters: Tell the agent exactly how to query your data. For example, list which environments and datasets are relevant. This prevents the agent from relearning the structure of your system each time.
- Reduce ambiguity: Provide access to source-of-truth files beyond Honeycomb, like your telemetry schemas. These help the agent investigate more effectively.
- Capture useful patterns: Save reliable prompts, queries, or instructions in agent memory files. Reusing these lets the agent build on past successes instead of starting from scratch.
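One way to make this guidance repeatable is to generate it from a small piece of structured data and store the result in the agent's memory file. The sketch below assumes a hypothetical helper, render_agent_guidance, and made-up environment, dataset, and file names. The output format is only a suggestion.

```python
def render_agent_guidance(
    environments: dict[str, list[str]], schema_paths: list[str]
) -> str:
    """Render a plain-text guidance block listing where the agent should look."""
    lines = ["Honeycomb guidance", "Relevant environments and datasets:"]
    for env, datasets in environments.items():
        lines.append(f"- {env}: {', '.join(datasets)}")
    lines.append("Source-of-truth files:")
    lines.extend(f"- {path}" for path in schema_paths)
    return "\n".join(lines)


# Hypothetical values; replace with your own environments and schema files.
guidance = render_agent_guidance(
    {"production": ["api-gateway", "checkout"]},
    ["telemetry/schema.yaml"],
)
```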
Using the Canvas agent
Honeycomb’s Canvas gives teams a collaborative workspace for incident investigations. Because the Canvas agent is exposed over MCP, you can wire it into your own agentic workflows as a durable coordination point for observability work. Agents call canvas_agent_invoke to initiate a turn and canvas_agent_poll_response to retrieve the result.
Passing the same investigation_id across calls lets multiple turns (or multiple agents) build on the same investigation over time.
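The invoke-then-poll flow might look like the sketch below. Only the tool names canvas_agent_invoke and canvas_agent_poll_response and the investigation_id parameter come from this page. The call_tool callable, the prompt argument, the status field, and the "complete" value are assumptions that stand in for whatever your MCP client and the real tool schemas provide.

```python
import time
from typing import Callable, Optional

# Hypothetical signature: (tool_name, arguments) -> result dictionary.
ToolCaller = Callable[[str, dict], dict]


def ask_canvas(
    call_tool: ToolCaller,
    question: str,
    investigation_id: Optional[str] = None,
    max_polls: int = 30,
    delay_seconds: float = 2.0,
) -> dict:
    """Start a Canvas agent turn, then poll until a response is ready."""
    args = {"prompt": question}  # assumed argument name
    if investigation_id is not None:
        # Reuse an existing investigation so turns build on each other.
        args["investigation_id"] = investigation_id
    started = call_tool("canvas_agent_invoke", args)

    # Assumes the invoke result echoes back the investigation_id.
    poll_args = {"investigation_id": started["investigation_id"]}
    for _ in range(max_polls):
        response = call_tool("canvas_agent_poll_response", poll_args)
        if response.get("status") == "complete":  # assumed response field
            return response
        time.sleep(delay_seconds)
    raise TimeoutError("Canvas agent did not respond in time")
```

Bounding the number of polls keeps an unattended agent from waiting indefinitely on a turn that never completes.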
Some patterns that work well:
- Hand off observability work to Canvas: Direct your local or cloud coding agent to call the Canvas agent for observability questions rather than running queries inline. This keeps observability context out of your coding session and produces an auditable record of the investigation for the rest of your team. You can also paste a Canvas URL into a local coding agent to give it the full context of an in-progress investigation before making code changes.
- Canvas-driven code review: Code review agents that support MCP can call the Canvas agent as part of their review process. For example, they can project how a PR’s changes will affect system state, verify telemetry changes, or check the health of a canary deployment alongside the diff.
- Coordinate short-lived agents through an investigation: A Canvas investigation is durable and visible to the whole team, which makes it a useful touchpoint for sandboxed or ephemeral agents. Multiple agents can contribute findings to the same investigation, and another agent or a person can pick up the thread later by referencing the investigation_id.
Managing Boards, Triggers, and SLOs
Honeycomb MCP includes write tools for creating and editing Boards, Triggers, SLOs, and the notification recipients that route their alerts. You can use these to let agents capture investigation results, bootstrap alerting and reliability targets for new services, or migrate definitions from other observability tools. Some patterns that work well:
- Migrate alerts and dashboards: Paste a Datadog monitor definition or Grafana dashboard JSON into the prompt and ask the agent to create equivalent Honeycomb Triggers and Boards. Agents typically read your existing telemetry first to make sure the translation is grounded in what is actually being emitted.
- Bootstrap SLOs for a new service: Ask the agent to look at error rates and latency for a service over the last week, propose a Service Level Indicator (SLI) expression, and create an SLO at a reasonable target. The create_slo tool auto-creates the SLI derived column as part of the call, so you do not need to define it ahead of time.
- Audit and tune existing definitions: Ask the agent to review your Triggers or SLOs against recent data and suggest or apply changes.
- Route alerts to cloud agents: In addition to Honeycomb’s Anomaly Detection features and automatic investigations, you can create webhook recipients for Triggers/SLOs that initiate other agentic workflows.
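When reviewing a target the agent proposes for a new SLO, it can help to check the arithmetic yourself. The sketch below applies the standard error budget calculation to a week of event counts. The function name error_budget and the sample numbers are made up for illustration and do not reflect how create_slo works internally.

```python
def error_budget(total_events: int, failed_events: int, target: float) -> dict:
    """Compare observed reliability against a candidate SLO target."""
    observed = 1 - failed_events / total_events
    # Failures the target tolerates over the same volume of events.
    allowed_failures = (1 - target) * total_events
    return {
        "observed_success_rate": observed,
        "allowed_failures": allowed_failures,
        # Fraction of the budget left; negative means the target is already blown.
        "budget_remaining": 1 - failed_events / allowed_failures,
    }


# Hypothetical week: 2,000,000 events, 1,200 failures, 99.9% target.
report = error_budget(total_events=2_000_000, failed_events=1_200, target=0.999)
```

With these sample numbers the target allows roughly 2,000 failures, so about 40% of the budget would remain. A target that leaves no budget on recent data is likely too strict to start with.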
Using semantic conventions in MCP
OpenTelemetry’s semantic conventions define standard names, types, and units for common attributes, like http.request.method, db.system, or service.name.
Honeycomb MCP exposes these conventions to agents through search_semconv, get_semconv_attribute, and list_semconv_namespaces, and overlays them with your team’s custom attribute descriptions from the Weaver registry.
Agents use these tools to write better instrumentation and queries, and to ground their reasoning in standard attribute names rather than relying on training data alone.
Some patterns that work well:
- Generate instrumentation that matches the spec: When asking an agent to add OpenTelemetry instrumentation to a service, tell it to use semantic conventions for any attribute that already has one. The agent uses search_semconv and get_semconv_attribute to confirm the canonical name, value type, and units for attributes like http.response.status_code or db.query.text before writing code.
- Orient an agent in a new domain: For prompts about an unfamiliar area (databases, messaging, GenAI), ask the agent to call list_semconv_namespaces first to see what attribute families exist. This lets it ask better follow-up questions and converge faster than guessing.
- Encode team-specific knowledge in Weaver: If your team uses non-standard attributes or has stronger opinions about a standard attribute’s semantics, add them to your Weaver registry. Agents see your team’s descriptions through find_columns, get_dataset_columns, and search_semconv, so customization translates directly into better suggestions.
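To show what grounding attribute names in the conventions can look like in code, the sketch below rewrites a few deprecated OpenTelemetry attribute names to their current equivalents. The hard-coded table is a stand-in for what an agent would confirm through search_semconv and get_semconv_attribute. The names RENAMED and canonicalize are hypothetical, and the table covers only three well-known renames.

```python
# Deprecated OpenTelemetry attribute names mapped to their current names.
RENAMED = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "db.statement": "db.query.text",
}


def canonicalize(attributes: dict) -> dict:
    """Return a copy of the attributes with known old names replaced."""
    return {RENAMED.get(name, name): value for name, value in attributes.items()}


span_attributes = canonicalize(
    {"http.method": "GET", "http.status_code": 200, "service.name": "api-gateway"}
)
```

Names that are not in the table pass through unchanged, so team-specific attributes are left alone.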
Monitoring Claude Code with Honeycomb
If your team uses Claude Code, you can point its OpenTelemetry exporter at Honeycomb and then use MCP to investigate what Claude Code is doing, such as token spend, tool failures, hook activity, session errors, and compaction triggers, directly from your agent. This is meta-observability: using Honeycomb’s agent to debug another agent. Claude Code emits traces that follow the OpenTelemetry GenAI semantic conventions, which is the same data shape that list_aiconversations and get_aiconversation are built around.
Once telemetry is flowing, you can ask the Honeycomb agent natural-language questions about specific sessions and get useful answers without writing a query.
To learn how to set up the exporter, visit Anthropic’s monitoring guide.
Enable the traces beta (CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1) to get the richest data, and point OTEL_EXPORTER_OTLP_ENDPOINT at Honeycomb’s OTLP endpoint with your ingest API key set as the x-honeycomb-team header in OTEL_EXPORTER_OTLP_HEADERS.
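A minimal sketch of that configuration, expressed as environment variables you could pass to a Claude Code process, follows. The helper name claude_code_otel_env, the HONEYCOMB_INGEST_KEY variable, and the default endpoint URL are assumptions: use the OTLP endpoint for your Honeycomb region, and check Anthropic's guide for any additional variables the exporter needs.

```python
import os


def claude_code_otel_env(
    ingest_key: str, endpoint: str = "https://api.honeycomb.io"
) -> dict:
    """Build the environment variables described above."""
    return {
        # Turn on the traces beta for the richest data.
        "CLAUDE_CODE_ENHANCED_TELEMETRY_BETA": "1",
        # Assumed default endpoint; confirm the right one for your region.
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        # OTLP headers use a comma-separated key=value format.
        "OTEL_EXPORTER_OTLP_HEADERS": f"x-honeycomb-team={ingest_key}",
    }


# HONEYCOMB_INGEST_KEY is a hypothetical variable name for this example.
key = os.environ.get("HONEYCOMB_INGEST_KEY", "YOUR_INGEST_KEY")
env = {**os.environ, **claude_code_otel_env(key)}
```

Reading the key from the environment keeps it out of source control. The resulting env mapping can be passed to subprocess.run when launching the process.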
Some patterns to ask the Honeycomb agent about:
- Find the most expensive sessions: “Which Claude Code sessions used the most tokens this week?” The agent sums gen_ai.usage.input_tokens and gen_ai.usage.output_tokens, grouped by session.id, user.email, vcs.branch, or session.cwd.
- Triage long or failing sessions: Call list_aiconversations to rank conversations by event count and error count, then get_aiconversation on the worst offender to see every LLM call, tool call, and error in order, with token totals and durations.
- Audit tool failures: “Which Claude Code tools have the highest failure rate?” The agent groups by tool.name and tool.outcome, which helps identify brittle MCP servers, hooks, or bash patterns the agent keeps tripping over.
- Track permission friction: “Which tool calls got blocked on permission prompts today, and how long did they wait?” This drills into claude_code.tool.blocked_on_user spans, which is useful for tuning unattended workflows.
- Compare models and skills: Group token usage by gen_ai.request.model or gen_ai.skill.names to see which models are doing the work and which skills are loading most frequently.
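To make the token-spend aggregation concrete, the sketch below performs the same sum-and-group step locally over a list of event dictionaries. The attribute names come from the patterns above. The function name tokens_by_session and the sample events are invented for illustration and are not real telemetry.

```python
from collections import defaultdict


def tokens_by_session(events: list[dict]) -> dict[str, int]:
    """Sum input and output tokens per session, largest total first."""
    totals: dict[str, int] = defaultdict(int)
    for event in events:
        totals[event["session.id"]] += (
            event.get("gen_ai.usage.input_tokens", 0)
            + event.get("gen_ai.usage.output_tokens", 0)
        )
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Made-up sample events; a missing token field counts as zero.
sample = [
    {"session.id": "a", "gen_ai.usage.input_tokens": 1200, "gen_ai.usage.output_tokens": 300},
    {"session.id": "b", "gen_ai.usage.input_tokens": 400, "gen_ai.usage.output_tokens": 100},
    {"session.id": "a", "gen_ai.usage.input_tokens": 800},
]
ranked = tokens_by_session(sample)
```

Swapping session.id for user.email or vcs.branch gives the other groupings mentioned above.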