Introduction to Honeycomb

Honeycomb is an observability platform built for understanding complex software systems in production. When something goes wrong or behaves unexpectedly, Honeycomb gives you the tools to find out why, without knowing in advance what question you need to ask. Many observability tools are built around known failure modes: metrics you decided to track, thresholds you set in advance, dashboards you built before the incident. That works when your systems fail in ways you have already seen. These tools fall short when your system surprises you, which happens more often as systems grow more distributed and complex.

Why Honeycomb?

The core difference is in how Honeycomb stores and queries data. Many observability tools store raw telemetry but limit what you can query: a fixed set of indexed tags, pre-defined dashboards, or aggregations computed at ingest. Honeycomb runs arbitrary queries across all fields at interactive speed, so the question you ask at 2am during an incident doesn’t have to be one you anticipated when you set up your dashboards. A few specific things follow from that design:

You can instrument richly without penalty: Add as many fields as you want with many unique values per field, such as user IDs, request IDs, or feature flags. Honeycomb is built to handle high-cardinality data efficiently, with pricing based on event volume rather than field cardinality or dimension count.
You can ask questions you haven’t thought of yet: Every field in every event is automatically indexed when it arrives. There’s no schema to define in advance, no index to build, no field to “activate.” If you sent the data, you can query it.
Queries run in seconds, not minutes: Honeycomb’s purpose-built columnar store returns results across terabytes of raw event data in sub-second to low-second times. The difference between a two-second query and a two-minute query isn’t just convenience; it makes iterative investigation practical.
The whole team can use it: All Honeycomb plans include unlimited seats, so you don’t have to ration or rotate access. An on-call engineer, a product manager, and a senior engineer debugging together can all query, annotate, and share findings in real time without worrying about per-seat charges.
AI fits naturally into this model: Honeycomb’s data is high-cardinality, fast to query, and accessible via API and MCP. That makes it a strong foundation for AI-assisted investigation: the data AI needs to reason about your system is already there, already indexed, and already queryable at the speed investigation requires.
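
To make the "if you sent the data, you can query it" idea concrete, here is a minimal in-memory sketch in Python. It is not Honeycomb's storage engine or API; the event fields (`service`, `user_id`, `feature_flag`, and so on) and the `query` helper are hypothetical, chosen only to show filtering, grouping, and aggregating on fields that were never declared in a schema, including a high-cardinality one.

```python
from collections import defaultdict

# Hypothetical events: plain dicts with no shared schema.
# Some events carry fields (feature_flag) that others lack.
events = [
    {"service": "api", "user_id": "u-1001", "duration_ms": 120, "status": 200},
    {"service": "api", "user_id": "u-1002", "duration_ms": 950, "status": 500,
     "feature_flag": "new-cart"},
    {"service": "api", "user_id": "u-1002", "duration_ms": 870, "status": 500,
     "feature_flag": "new-cart"},
    {"service": "worker", "user_id": "u-1003", "duration_ms": 40, "status": 200},
]


def query(events, where, group_by, measure):
    """Filter on exact field matches, group by one field, and
    return the count and max of `measure` for each group."""
    groups = defaultdict(list)
    for event in events:
        matches = all(event.get(field) == value for field, value in where.items())
        if matches and measure in event:
            groups[event.get(group_by)].append(event[measure])
    return {key: {"count": len(vals), "max": max(vals)} for key, vals in groups.items()}


# Group by a high-cardinality field (user_id)...
by_user = query(events, where={"service": "api"}, group_by="user_id", measure="duration_ms")

# ...or by a field that only some events carry (feature_flag).
by_flag = query(events, where={"status": 500}, group_by="feature_flag", measure="duration_ms")
```

Neither call required declaring `user_id` or `feature_flag` ahead of time, which is the property the list above describes. A real columnar store does this at a very different scale, so treat the sketch as an illustration of the query shape only.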

How Honeycomb works

Honeycomb is built around a single foundational concept: the event. An event is a structured record of a single unit of work. It captures what happened, when it happened, how long it took, and any context your instrumentation includes: user IDs, feature flags, service names, error messages, build IDs. Honeycomb supports three telemetry signals, each of which answers different questions about your system:

Traces: Collections of spans that share a trace ID. Each span represents one unit of work in your distributed system. Honeycomb uses the parent-child relationships between spans to render a waterfall view of execution flow across your services.
Logs: Records of discrete events: errors, state changes, and application output. Structured logs map directly to Honeycomb events and are immediately queryable. When sent via OpenTelemetry, logs are automatically correlated with the traces they belong to. To learn more about Honeycomb’s approach to Logs, visit Logs in Honeycomb.
Metrics: Numeric measurements of your system captured over time, such as CPU utilization, request rates, and error counts. Honeycomb stores metrics in dedicated metrics datasets built on the OpenTelemetry Metrics Data Model, separate from your trace and log data. You can query metrics alongside your traces and logs, and Honeycomb can surface relevant metrics automatically when you investigate a latency spike or error pattern. To learn more about Honeycomb’s approach to Metrics, visit Metrics in Honeycomb.
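
As a rough illustration of how events and traces relate, the sketch below models spans as structured events and uses their parent-child links to produce an indented, waterfall-style listing. It is a simplified assumption of the idea, not Honeycomb's rendering code; the span data, the field names, and the `waterfall` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical spans: each is a structured event that shares a trace ID
# and points at its parent span (None marks the root span).
spans = [
    {"trace.trace_id": "t1", "trace.span_id": "a", "trace.parent_id": None,
     "name": "GET /checkout", "service.name": "frontend", "duration_ms": 180},
    {"trace.trace_id": "t1", "trace.span_id": "b", "trace.parent_id": "a",
     "name": "charge_card", "service.name": "payments", "duration_ms": 120},
    {"trace.trace_id": "t1", "trace.span_id": "c", "trace.parent_id": "b",
     "name": "SELECT orders", "service.name": "db", "duration_ms": 35},
    {"trace.trace_id": "t1", "trace.span_id": "d", "trace.parent_id": "a",
     "name": "render", "service.name": "frontend", "duration_ms": 20},
]


def waterfall(spans, trace_id):
    """Return one indented line per span, ordered parent before child."""
    # Index the spans of this trace by their parent span ID.
    children = defaultdict(list)
    for span in spans:
        if span["trace.trace_id"] == trace_id:
            children[span["trace.parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            indent = "  " * depth
            lines.append(
                f"{indent}{span['name']} [{span['service.name']}] {span['duration_ms']}ms"
            )
            walk(span["trace.span_id"], depth + 1)

    walk(None, 0)  # start from the root span(s)
    return lines


lines = waterfall(spans, "t1")
print("\n".join(lines))
```

Each dictionary is also a complete event in its own right: it records what happened, how long it took, and the surrounding context, so the same records can be queried individually or assembled into a trace.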

To learn how Honeycomb organizes this data into datasets, environments, and teams, visit Honeycomb’s Data Model.

What you can do with your data

Once your data is in Honeycomb, you can:

Query across any dimension: The Query Builder lets you filter, group, and aggregate on any field in your data, including fields you didn’t know you’d need when you started instrumenting.
Explain unusual behavior: Select any region of your data and BubbleUp highlights which dimensions differ most from the baseline, so you can narrow down what changed without manually checking each field.
Get proactive alerts on anomalies: Anomaly Detection (Early Access) learns normal patterns per service from your trace and event data, such as error rate and data presence, and notifies you when behavior deviates, without requiring you to define a Trigger for every failure mode.
Investigate traces: The trace waterfall shows execution flow across services, with span-level detail and direct links to correlated logs and metrics.
Explore logs: The Logs view surfaces log volume, severity breakdowns, and top messages at a glance, so you can scan and filter log data without building queries from scratch.
Monitor metrics: The Metrics view surfaces time series data alongside your traces and logs, so you can correlate a latency spike with a CPU saturation event without switching tools.
Alert and notify: Set up Triggers to notify your team when your data crosses defined thresholds. SLOs track error budget burn over time and alert when burn rate accelerates.
Share findings: Boards collect queries and visualizations into a reusable view. Query links and trace links preserve full context for teammates.
Investigate with AI: With Honeycomb Intelligence enabled, Canvas can auto-investigate alerts and anomalies you configure, so findings can already be waiting when you open the incident. Query Assistant translates natural language into valid queries, and the Honeycomb MCP server lets supported AI tools query your production data directly. These features build on the same event data and fast query model as the rest of the product.
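
The idea behind comparing a selected region against a baseline can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not BubbleUp's actual algorithm: it assumes categorical fields only and ranks each field value by how much its share of events differs between the two groups. The data, field names, and `compare_dimensions` helper are hypothetical.

```python
from collections import Counter


def compare_dimensions(selection, baseline, fields):
    """Rank (field, value) pairs by how much their share of events
    differs between the selection and the baseline."""
    if not selection or not baseline:
        raise ValueError("selection and baseline must both be non-empty")

    findings = []
    for field in fields:
        sel_counts = Counter(event.get(field) for event in selection)
        base_counts = Counter(event.get(field) for event in baseline)
        for value in set(sel_counts) | set(base_counts):
            sel_share = sel_counts[value] / len(selection)
            base_share = base_counts[value] / len(baseline)
            findings.append((abs(sel_share - base_share), field, value, sel_share, base_share))

    # Largest difference first.
    findings.sort(key=lambda finding: finding[0], reverse=True)
    return [
        (field, value, round(sel_share, 2), round(base_share, 2))
        for _, field, value, sel_share, base_share in findings
    ]


# Hypothetical baseline traffic: three builds, two regions, evenly spread.
baseline = [
    {"build_id": build, "region": region}
    for build, region in [
        ("v40", "us-east"), ("v40", "eu-west"), ("v40", "us-east"),
        ("v41", "eu-west"), ("v41", "us-east"), ("v41", "eu-west"),
        ("v42", "us-east"), ("v42", "eu-west"),
    ]
]

# Hypothetical selection (for example, the slow requests): all from build v42.
selection = [
    {"build_id": "v42", "region": region}
    for region in ["us-east", "eu-west", "us-east", "eu-west"]
]

ranked = compare_dimensions(selection, baseline, fields=["build_id", "region"])
top_field, top_value, selection_share, baseline_share = ranked[0]
```

In this made-up data, `region` is distributed identically in both groups while `build_id` is not, so the build surfaces first. That is the kind of narrowing the comparison is meant to provide without checking each field by hand.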

Where to start

Your first step is getting data in. Honeycomb recommends OpenTelemetry as the standard for instrumentation and ingestion.

Instrument your application

Send traces, logs, and metrics from your application using OpenTelemetry SDKs.

Instrument your infrastructure

Collect logs and metrics from Kubernetes, AWS, and other infrastructure using the OpenTelemetry Collector.

Explore the sandbox

Try Honeycomb with sample data before connecting your own systems.

Understand the core analysis loop

Learn the methodology behind debugging from first principles with Honeycomb.
