Traces, Metrics, and Logs

Honeycomb supports three telemetry signals: traces, metrics, and logs. Each one captures different information about your system and works best for different kinds of questions.

How signals work in Honeycomb

In Honeycomb, traces and logs are stored as structured events: labeled JSON objects sent over HTTP, indexed automatically on every field at ingest, and queryable without a predefined schema. Metrics are different; they arrive as time-series data points in dedicated metrics datasets rather than as events. Understanding this distinction helps explain why each signal is queried differently in Honeycomb. To learn how Honeycomb organizes events into datasets, environments, and teams, visit Honeycomb Resource Structure.

Traces

A trace is a record of a request as it travels through your system. It is made up of spans, each representing one unit of work, such as a database query or a service call. Spans share a trace ID that identifies which trace they belong to. In Honeycomb, each span is stored as a structured event: a labeled JSON object you can filter, group, and aggregate on any field. Honeycomb uses the parent-child relationships between spans, expressed via trace.parent_id and trace.span_id, to render a waterfall view of execution flow across your services. Traces are the right signal when you want to:

Follow a specific request across multiple services
Understand execution flow and latency at the span level
Correlate errors or slowdowns with the exact code path that produced them
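The parent-child relationship between spans can be sketched in a few lines. This is an illustration only, not how Honeycomb renders its waterfall view: the span names and IDs are hypothetical, and a root span is assumed to be one with no trace.parent_id.

```python
from collections import defaultdict


def build_waterfall(spans):
    """Return (depth, name) rows in parent-before-child order."""
    children = defaultdict(list)
    for span in spans:
        # Root spans have no parent, so they are grouped under None.
        children[span.get("trace.parent_id")].append(span)

    rows = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            rows.append((depth, span["name"]))
            walk(span["trace.span_id"], depth + 1)

    walk(None, 0)
    return rows


# Hypothetical spans that all belong to one trace.
spans = [
    {"name": "GET /checkout", "trace.span_id": "a1"},
    {"name": "auth.verify", "trace.span_id": "b2", "trace.parent_id": "a1"},
    {"name": "db.query", "trace.span_id": "c3", "trace.parent_id": "a1"},
    {"name": "cache.get", "trace.span_id": "d4", "trace.parent_id": "b2"},
]

for depth, name in build_waterfall(spans):
    print("  " * depth + name)
```

Indentation in the printed output stands in for nesting in a waterfall: each span appears under the span whose trace.span_id matches its trace.parent_id.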

You create trace spans by instrumenting your code with OpenTelemetry. This example creates a span, adds context as attributes, and closes it when the work is done:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Set up the tracer
provider = TracerProvider()
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("my-service")

# Create a span (event) with rich context
with tracer.start_as_current_span("bucketRequest") as span:
    span.set_attribute("http.request.method", "GET")
    span.set_attribute("app.group_bucket", 11)
    span.set_attribute("db.rows_returned", 42)

    # do some work

Here is an example of what a span looks like as a structured event in Honeycomb:

{
    "service.name": "retriever",
    "service.version": "1.4.2",
    "deployment.environment": "production",
    "host.name": "retriever-0a8b688312e490d1c",
    "host.type": "m6gd.2xlarge",
    "cloud.availability_zone": "us-east-1a",
    "http.request.method": "GET",
    "url.path": "/api/v1/bucketRequest",
    "http.response.status_code": 200,
    "http.response_content_length": 2326,
    "app.group_bucket": 11,
    "app.query_source": "trigger-cron",
    "app.total_segments": 0,
    "app.flag.golden_retrievers": false,
    "db.system": "postgresql",
    "db.name": "datasets",
    "duration_ms": 11.668,
    "build.id": 275663,
    "build.commit_hash": "9cb3de12faf709cdc9bca5c9900e8259c1719b02",
    "process.pid": 7585,
    "process.uptime_seconds": 2298,
    "trace.trace_id": "845a4de7-8e3a-4605-8476-6aa1592c3134",
    "trace.span_id": "84c82b34145c22f9",
    "trace.parent_id": "ab0e166fbbea7dc6"
}

With OpenTelemetry, Honeycomb calculates duration_ms from each span’s start and end time during ingest.
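As a rough sketch of that calculation (an illustration, not Honeycomb's ingest code), duration_ms is the span's end time minus its start time, converted to milliseconds. OpenTelemetry records span timestamps in nanoseconds since the Unix epoch; the function name and the timestamp values below are made up for the example.

```python
def duration_ms(start_time_unix_nano: int, end_time_unix_nano: int) -> float:
    """Convert a span's start and end timestamps into milliseconds."""
    return (end_time_unix_nano - start_time_unix_nano) / 1_000_000


# Hypothetical timestamps, 11,668,000 nanoseconds apart.
start = 1_700_000_000_000_000_000
end = start + 11_668_000

print(duration_ms(start, end))
```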

Every field is available for querying. You can group by service.name, filter by http.response.status_code, compute a P95 on duration_ms, or use BubbleUp to identify which dimensions differ most across a selected region of your data. To learn how to send traces to Honeycomb, visit Send Data.
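To make "filter, group, and aggregate on any field" concrete, here is a minimal sketch in plain Python. It is not Honeycomb's query engine: the helper names, the sample events, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def p95_by(events, group_field, value_field, where=lambda event: True):
    """Filter events, group them by one field, and compute P95 of another."""
    groups = defaultdict(list)
    for event in events:
        if where(event):
            groups[event[group_field]].append(event[value_field])
    return {key: p95(values) for key, values in groups.items()}


# Hypothetical events; real spans carry many more fields.
events = [
    {"service.name": "retriever", "http.response.status_code": 200, "duration_ms": 11.7},
    {"service.name": "retriever", "http.response.status_code": 200, "duration_ms": 95.0},
    {"service.name": "retriever", "http.response.status_code": 500, "duration_ms": 480.2},
    {"service.name": "gateway", "http.response.status_code": 200, "duration_ms": 3.1},
]


def is_ok(event):
    return event["http.response.status_code"] == 200


print(p95_by(events, "service.name", "duration_ms", where=is_ok))
```

Because no field is special, the same helper could group by host.name or filter on app.group_bucket without any change to how the events were recorded.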

Metrics

A metric is a numeric measurement of something in your system, captured at regular intervals over time. Metrics describe the state or behavior of a resource (a host, a service, a database connection pool) rather than a specific unit of work. Honeycomb supports metrics as a native signal built on the OpenTelemetry Metrics Data Model, including gauges, counters, sums, and histograms. Metrics are ingested as time-series data points in dedicated metrics datasets and are queried with metric-specific functions like RATE(), INCREASE(), and LAST() that account for the time-based nature of metric data. Metrics are the right signal when you want to:

Monitor infrastructure health consistently: CPU utilization, memory usage, request rates
Alert on threshold conditions over time using rates and trends
Track known quantities you always care about, regardless of what any individual request is doing

To learn more, visit Metrics in Honeycomb.
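The idea behind rate-style and increase-style functions can be sketched conceptually as follows. This is not the implementation of Honeycomb's RATE() or INCREASE(), whose exact semantics (for example, how resets and aggregation temporality are handled) may differ; the function names, the reset rule, and the sample points are assumptions for illustration.

```python
def increase(points):
    """Total growth of a cumulative counter across (timestamp_sec, value) points.

    A drop in value is treated as a counter reset, so the new value
    counts as growth since the reset.
    """
    total = 0.0
    for (_, previous), (_, current) in zip(points, points[1:]):
        total += (current - previous) if current >= previous else current
    return total


def rate(points):
    """Average per-second growth over the window covered by the points."""
    elapsed = points[-1][0] - points[0][0]
    return increase(points) / elapsed if elapsed > 0 else 0.0


# Hypothetical request counter sampled every 60 seconds. The process
# restarts, and the counter resets, before the last sample.
samples = [(0, 100), (60, 160), (120, 250), (180, 30)]

print(increase(samples))
print(rate(samples))
```

The raw counter values (100, 160, 250, 30) say little on their own; it is the change between points over time that carries the signal, which is why metric queries use time-aware functions.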

How Honeycomb metrics differ from pre-aggregated metrics

Some systems emit pre-aggregated metrics: rollups such as totals, averages, and percentiles that are computed before the data is written or sent to Honeycomb. Pre-aggregated metrics are efficient to produce, but they have a fundamental limitation: you can only ask questions that were anticipated at instrumentation time. For example, if you want to know how your system performs when the storage engine has a cache hit, that dimension must already be part of the rollup. A rollup without it cannot answer the question:

{
  time: 4:03 pm,
  duration_sec: 60,
  total_hits: 500,
  avg_duration: 113,
  p95_duration: 236,
  ... (and so on) ...
}

Adding a new dimension, such as cache hit status, means adding new values for every combination: avg_duration_cache_hit_true, avg_duration_cache_hit_false, p95_duration_cache_hit_true, and so on. Each new dimension multiplies the number of pre-aggregated values required; this is the “curse of dimensionality.” In contrast, Honeycomb’s native metrics preserve the full resolution of your data and let you ask questions at query time rather than instrumentation time. To see this difference in practice, refer to the Events vs. pre-aggregated metrics section.
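The multiplication is easy to see with a little arithmetic. The sketch below counts how many pre-aggregated values one reporting interval needs as dimensions are added; the function name, dimension names, and cardinalities are hypothetical.

```python
from math import prod


def rollup_value_count(aggregates, dimension_cardinalities):
    """Number of pre-aggregated values needed per reporting interval.

    Each aggregate (total, average, P95, ...) must be stored once for
    every combination of dimension values.
    """
    return aggregates * prod(dimension_cardinalities.values())


aggregates = 3  # total_hits, avg_duration, p95_duration

print(rollup_value_count(aggregates, {}))
print(rollup_value_count(aggregates, {"cache_hit": 2}))
print(rollup_value_count(aggregates, {"cache_hit": 2, "status_code": 5}))
print(rollup_value_count(aggregates, {"cache_hit": 2, "status_code": 5, "host": 40}))
```

With only three modest dimensions, three aggregates already expand to 1,200 values per interval, and every one of them has to be chosen before the data is written.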

Logs

A log is a record of a discrete event in your system: an error, a state change, a user action, or any output your application emits at a specific point in time. Honeycomb receives logs as structured events, which means you can query them the same way you query trace data.

Structured logs

A structured log has consistent, labeled fields, like the JSON output of an application logger or the rows of a spreadsheet or comma-separated values (CSV) file. Because of this structure, a structured log maps naturally to a structured event: its fields become event fields, immediately queryable in the Query Builder. Here is an example of a structured log line, shown with its column labels:

host-ip   username  datetime             cmd URL            protocol status size
127.0.0.1 frank     10/Oct/2000:13:55:36 GET /apache_pb.gif HTTP/1.0 200    2326

For new services, the recommended path is to send structured logs via OpenTelemetry, which maps log fields directly to event fields and enables automatic trace correlation. For legacy setups, Honeytail can ingest existing structured log files and emit structured events.
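The mapping from a structured log line to a structured event can be sketched as follows. This is only an illustration of the idea, not how OpenTelemetry or Honeytail parse logs; the function name and the choice of which columns to convert to integers are assumptions.

```python
def parse_structured_line(header, line, numeric_fields=("status", "size")):
    """Pair each whitespace-separated value with its column label."""
    labels = header.split()
    values = line.split()
    if len(labels) != len(values):
        raise ValueError("line does not match header")
    event = dict(zip(labels, values))
    # Convert numeric columns so they can be aggregated, not just matched.
    for field in numeric_fields:
        event[field] = int(event[field])
    return event


header = "host-ip   username  datetime             cmd URL            protocol status size"
line = "127.0.0.1 frank     10/Oct/2000:13:55:36 GET /apache_pb.gif HTTP/1.0 200    2326"

event = parse_structured_line(header, line)
print(event)
```

Once the line is a labeled record, questions such as "total size served per username" become a group-and-sum over fields rather than a text search.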

Unstructured logs

An unstructured log is a sequence of free-form messages, convenient for a person to read but harder to query. When Honeycomb starts its retriever service, the console prints something like this:

Running serve cmd: cd cmd/retriever && go run main.go -debug  -reader
time="2021-01-06T17:44:13-08:00" level=info msg="I'm a reader, using RPC for dataset flushes"
DEBU[2021-01-06T17:44:13.368024402-08:00] starting *secrets.YamlSecrets
DEBU[2021-01-06T17:44:13.368461794-08:00] starting *retrieverclient.YamlConfig named retriever_config
DEBU[2021-01-06T17:44:13.368679356-08:00] starting *config.YamlConfig
DEBU[2021-01-06T17:44:13.369046236-08:00] starting *s3.DefaultService
DEBU[2021-01-06T17:44:13.369518352-08:00] starting *lambda.DefaultService
WARN[2021-01-06T17:44:13.369694698-08:00] debug http server error                       error="listen tcp 127.0.0.1:6060: bind: address already in use"
INFO[2021-01-06T17:44:13.369727168-08:00] Debug service listening on localhost:6061
DEBU[2021-01-06T17:44:13.369728616-08:00] starting *beelineinit.Beeline
WARN[2021-01-06T17:44:13.369839218-08:00] debug http server error                       error="listen tcp 127.0.0.1:6061: bind: address already in use"
INFO[2021-01-06T17:44:13.369933256-08:00] Debug service listening on localhost:6062
DEBU[2021-01-06T17:46:15.618320662-08:00] starting *app.ReadApp
INFO[2021-01-06T17:46:15.618436383-08:00] Serving at 0.0.0.0:8089...
INFO[2021-01-06T17:47:15.618821038-08:00] I'm alive. 2021-01-06 17:47:15.618683749 -0800 PST m=+60.893288862

To understand what happened during startup, you need to read a whole sequence of lines, and even then causality is hard to establish. To find out how long the service took to start, you would need to subtract one timestamp from another. To find out whether an error occurred during startup, you would have to search every line for error text. Imagine summarizing the same startup as a single structured event:

{
  cmdline: "cd cmd/retriever && go run main.go -debug  -reader",
  startTime: "2021-01-06T17:44:13-08:00",
  mode: "reader",
  yamlSecrets_offset: "368024402",
  yamlConfig_offset: "368461794",
  debug_http_error: "listen tcp 127.0.0.1:6061: bind: address already in use",
  servicePort: 8089,
  duration_ms: 180262,
  ...
}

In this format, you can query for questions like “Which services take the longest to start?” or “Do any slow startups correlate with particular error states?” To learn more, visit Logs in Honeycomb.
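The timestamp subtraction and line searching that unstructured logs require can be sketched as follows, using a few lines from the startup output above. The regular expressions, the function names, and the definition of "startup" as the time from the first prefixed line to the "Serving at" line are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches prefixes such as INFO[2021-01-06T17:44:13.369727168-08:00].
LINE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")
TIMESTAMP = re.compile(r"^(?P<whole>.+?)\.(?P<frac>\d+)(?P<offset>[+-]\d{2}:\d{2})$")


def parse_timestamp(text):
    """Parse a timestamp, truncating nanoseconds to microseconds."""
    match = TIMESTAMP.match(text)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text}")
    micros = match["frac"][:6].ljust(6, "0")
    return datetime.fromisoformat(f"{match['whole']}.{micros}{match['offset']}")


def summarize_startup(lines):
    """Collapse startup log lines into one event-like dict."""
    started = serving = None
    warnings = []
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # skip lines without a LEVEL[timestamp] prefix
        when = parse_timestamp(match["ts"])
        if started is None:
            started = when
        if match["level"] == "WARN":
            warnings.append(match["msg"])
        if match["msg"].startswith("Serving at"):
            serving = when
    if started is None or serving is None:
        raise ValueError("could not find both start and serving lines")
    return {
        "startup_ms": (serving - started) / timedelta(milliseconds=1),
        "warning_count": len(warnings),
    }


log_lines = [
    "Running serve cmd: cd cmd/retriever && go run main.go -debug  -reader",
    "DEBU[2021-01-06T17:44:13.368024402-08:00] starting *secrets.YamlSecrets",
    'WARN[2021-01-06T17:44:13.369694698-08:00] debug http server error    error="listen tcp 127.0.0.1:6060: bind: address already in use"',
    "DEBU[2021-01-06T17:46:15.618320662-08:00] starting *app.ReadApp",
    "INFO[2021-01-06T17:46:15.618436383-08:00] Serving at 0.0.0.0:8089...",
]

print(summarize_startup(log_lines))
```

All of this parsing exists only to recover two facts that a structured event would have carried as plain fields.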

Using signals together

Honeycomb supports all three signals in the same platform, so you can move between them without switching tools. A common investigation pattern:

A metrics alert surfaces a problem: error rate is elevated.
A log query narrows the scope: these specific error messages are occurring most frequently.
A trace investigation explains the cause: this is where in the request execution the error originates.

Correlations

When you run an events-based query, Honeycomb surfaces related metrics in the Correlations view when metrics data exists for your environment. For example, if a latency spike coincides with a host running out of memory, you can see both signals side by side without leaving the Query Builder. To learn more, visit Correlations.

Metrics-based Triggers

You can alert on metrics data using the same Trigger system you use for events. A metrics alert can tell you that something is wrong; a trace investigation tells you why. To learn more, visit Metrics-based Triggers.

Logs as events

Because Honeycomb receives logs as events, you can query log data the same way you query any other dataset: filter, group, and aggregate across any field. To learn more, visit Logs in Honeycomb.

Managing data volume

If event volume is a concern, you can sample your data to reduce volume while keeping aggregates statistically accurate. For example, you could send one in a hundred status:200s but send every status:500. When your instrumentation includes a sample_rate on each event, Honeycomb uses it to scale counts so query results reflect your true traffic volume. To learn more, visit Sampling Guidelines. For metrics, factors like your collection interval and the number of active time series affect how many data points you send. To learn more, visit Manage Metrics Data Volume and How Honeycomb Calculates Usage.
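The sampling example above can be sketched as follows. This is a simplified illustration rather than Honeycomb's sampling implementation; the function names, the traffic mix, and the fixed random seed are assumptions, while the sample_rate field name comes from the text above.

```python
import random


def sample_rate_for(event):
    """Keep every server error; keep roughly 1 in 100 of everything else."""
    return 1 if event["status"] >= 500 else 100


def sample(events, rng):
    """Yield kept events, each tagged with the rate it was sampled at."""
    for event in events:
        rate = sample_rate_for(event)
        if rng.randrange(rate) == 0:
            yield {**event, "sample_rate": rate}


def estimated_count(kept_events):
    """Each kept event stands in for sample_rate original events."""
    return sum(event["sample_rate"] for event in kept_events)


# Hypothetical traffic: 100,000 successes and 25 server errors.
traffic = [{"status": 200}] * 100_000 + [{"status": 500}] * 25
kept = list(sample(traffic, random.Random(42)))

errors = [event for event in kept if event["status"] == 500]
print(len(kept), "events sent")
print(estimated_count(errors), "errors (exact, because none were dropped)")
print(estimated_count(kept), "estimated total events")
```

Far fewer events are sent, yet weighting each kept event by its sample_rate gives an estimate of the true total, and the error count stays exact.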

Code examples

The following examples show how the same piece of code can be instrumented in different ways and how the outputs differ. Understanding these differences helps you choose the right signal for each job.

Events vs. pre-aggregated metrics

This example instruments the same code two ways: once with events and once with pre-aggregated metrics. Both capture the duration, but the output is quite different. An event is built up over the course of a unit of work, gaining context as it goes, whereas pre-aggregated metrics are updated one value at a time and don’t carry that context.

Events output

In this example, Honeycomb was used for events. Although a graph was rendered, the raw data gives a more equal comparison.

Pre-aggregated metrics output

In this example, pre-aggregated metrics were used.

Comparison

Both examples capture the duration in milliseconds and how many times the example app was run. At first glance, the metrics output appears to have more data, but all of those values can be calculated from the event data at query time. Pre-aggregated metrics tools need to do this calculation before write time, which limits what you can ask later. Events let you add richer context, like input, output, and timestamps, and compute any aggregation at query time.
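A minimal sketch of this comparison, in plain Python with no vendor SDK: the workload, field names, and helper function are hypothetical and stand in for whatever your application records.

```python
import statistics

# Hypothetical workload: (input, cache_hit, duration_ms) for five runs.
runs = [("a", True, 12.0), ("b", False, 230.0), ("c", True, 9.0),
        ("d", True, 14.0), ("e", False, 310.0)]

# Events: one record per run, carrying whatever context was available.
events = [{"input": name, "cache_hit": hit, "duration_ms": ms}
          for name, hit, ms in runs]

# Pre-aggregated metrics: only running totals survive; per-run context is gone.
rollup = {"total_hits": 0, "sum_duration": 0.0, "max_duration": 0.0}
for _, _, ms in runs:
    rollup["total_hits"] += 1
    rollup["sum_duration"] += ms
    rollup["max_duration"] = max(rollup["max_duration"], ms)
rollup["avg_duration"] = rollup["sum_duration"] / rollup["total_hits"]


def avg_duration(events, **filters):
    """Average duration_ms over events whose fields match every filter."""
    matching = [event["duration_ms"] for event in events
                if all(event[field] == value for field, value in filters.items())]
    return statistics.fmean(matching)


# Every rollup value can be recomputed from the events...
print(avg_duration(events), rollup["avg_duration"])
# ...but a breakdown that was never pre-aggregated exists only in the events.
print(avg_duration(events, cache_hit=True))
print(avg_duration(events, cache_hit=False))
```

The rollup has no way to answer the cache-hit question after the fact, because the per-run cache_hit value was discarded when the totals were updated.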

Events vs. unstructured logs

This example instruments the same code two ways: once with events and once with logs. Both capture the duration in milliseconds. With events, duration is available as duration_ms on the stored span. With logs, you write it as part of a log statement. In this example, each log statement includes its timestamp inline, which can be parsed out with a regular expression. Each log line contains a description of the data followed by the value itself, in this case the duration in milliseconds.

Events output

In this example, Honeycomb was used for events. Although a graph was rendered, the raw data gives a more equal comparison.

Logs output

The logged output:

Comparison

All the information is included in both outputs, but events let you see patterns in your data much faster. Both outputs show that one request was much slower than the other. With events, that’s visible at a glance. With logs, you have to read through the entire output to find the duration lines and match them to the right input and output.
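The extra work that logs require can be sketched as follows. The inputs, timestamps, log message formats, and function names are hypothetical; the point is only that the event keeps related facts in one record while the log spreads them across lines that must be matched up again.

```python
import re

# Hypothetical work: two requests, one much slower than the other.
requests = [("small.csv", 12.4), ("large.csv", 1875.0)]

# Events: input and duration live together in one record.
events = [{"input": name, "duration_ms": ms} for name, ms in requests]

# Logs: the same facts are spread across separate free-form lines.
log_lines = []
for name, ms in requests:
    log_lines.append(f"2021-01-06T17:44:13Z processing input {name}")
    log_lines.append(f"2021-01-06T17:44:15Z duration in ms: {ms}")


def slowest_from_events(events):
    """One expression: pick the record with the largest duration_ms."""
    return max(events, key=lambda event: event["duration_ms"])["input"]


def slowest_from_logs(lines):
    """Re-associate each duration line with the most recent input line."""
    current_input = None
    slowest_input, slowest_ms = None, -1.0
    for line in lines:
        if match := re.search(r"processing input (\S+)", line):
            current_input = match.group(1)
        elif match := re.search(r"duration in ms: ([\d.]+)", line):
            ms = float(match.group(1))
            if ms > slowest_ms:
                slowest_input, slowest_ms = current_input, ms
    return slowest_input


print(slowest_from_events(events))
print(slowest_from_logs(log_lines))
```

Both functions reach the same answer, but the log version depends on line order and message wording staying stable, which is the fragility the comparison above describes.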

Summary

All three approaches take roughly the same amount of time and effort to implement. The events and pre-aggregated metrics required slightly more setup initially because they needed configuration with an outside service. For logs, if you are doing more than writing to a terminal, you will need a similar amount of setup.
