Sampling | Honeycomb


Sampling is the concept of selecting a few elements from a large collection and learning about the entire collection by extrapolating from the selected set. This page covers terminology, reasons for sampling, and different sampling methods and tools, such as head sampling and tail sampling.


It’s important to use consistent terminology when discussing sampling. A trace or span is considered “sampled” or “not sampled”:

  • Sampled: A trace or span is processed and exported. Because it is chosen by the sampler as a representative of the population, it is considered “sampled”.
  • Not sampled: A trace or span is not processed or exported. Because it is not chosen by the sampler, it is considered “not sampled”.

Sometimes, the definitions of these terms get mixed up in conversation or online. You may find someone state that they are “sampling out data” or that data not processed or exported is considered “sampled”. While the behavior they describe may be the same, these are incorrect terms.

Why Sampling 

Primary reasons to sample data, all of which are typically related to one another, include:

  • Reduce total data volume. A representative sample of your data will typically be much smaller than the entire volume of data produced.
  • Ensure you sample interesting traces. The question of representativeness can be nuanced if you have a wide variety of traffic, especially if it is irregular.
  • Filter out noise. For services with predictable traffic patterns, a small sample can often be enough to capture representative behavior of your services.

The biggest question that comes up with sampling is: “How do I make sure I am not missing important data I might need?”. The answer depends on your data. If you do not have a lot of data in the first place, sampling is more trouble than it is worth. If you have a lot of data, but it is fairly uniform or it is not critical that you capture everything that may be interesting to you right now, you can often get away with a simple sampling strategy. If you have a lot of conditions that matter to you, or irregular traffic patterns across your services, you will need a more sophisticated sampling strategy.

Honeycomb offers tools to help you sample your data in a way that is tailored to your needs.

When to Sample: 1000 requests per second 

If your service receives more than 1000 requests per second, you should strongly consider sampling. Choose your sample rate by weighing cost against the resolution you need from your data.

Sampling and Tracing 

Sampling is an essential part of tracing at a large scale. Consider different kinds of traces:

  • Traces that finish successfully with no errors
  • Traces with specific attributes on them
  • Traces with high latency
  • Traces with errors on them

The large majority of the time, traces of the first kind (those that finish successfully with no errors) far outnumber all the other kinds. Traces in the first category are still critical to have because they represent what “healthy” behavior looks like, and you will need them for comparison when you are looking at the other kinds of traces. However, you most likely do not need all of them. A sample of these traces will be enough to understand the overall health of your system.

The other three categories are much more interesting, and depending on your needs, you may want to take larger samples (or even 100%) of these traces.

Head Sampling and Tail Sampling 

There are many different techniques for sampling data, all with different tradeoffs. Each technique falls into one of two categories: head sampling and tail sampling.

Head Sampling 

Head sampling is when you sample traces without looking at the entire trace. The decision to sample or not sample a span in a trace is often made as early as possible. In OpenTelemetry, a head sampling decision is made during span creation, so unsampled spans are never even created.

The most common form of head sampling is deterministic probability sampling. Given a constant sampling rate that represents a fixed percentage of traces to sample, the sampler will make a decision to sample or not sample spans based on using the trace ID as a random number. Using the trace ID allows disparate samplers to make consistent decisions for all of the spans in a trace.
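The idea of treating the trace ID as a random number can be sketched in a few lines of Python. This is a minimal illustration, not the hashing scheme of any particular Honeycomb or OpenTelemetry SDK: the function name `should_sample`, the use of SHA-1, and the 32-bit threshold are all assumptions made for the example.

```python
import hashlib


def should_sample(trace_id: str, sample_rate: int) -> bool:
    """Keep roughly 1 in `sample_rate` traces, deciding only from the trace ID.

    Hypothetical helper for illustration; real SDKs may hash differently.
    """
    if sample_rate <= 1:
        return True  # a rate of 1 means "keep everything"
    # Hash the trace ID so the value is spread evenly over 32-bit integers.
    digest = hashlib.sha1(trace_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")
    # Keep the trace when the value lands in the lowest 1/sample_rate of the range.
    threshold = (2**32) // sample_rate
    return value < threshold


# Every span that carries the same trace ID gets the same decision,
# no matter which service or sampler evaluates it.
print(should_sample("abcd1234", 10) == should_sample("abcd1234", 10))
```

Because the decision depends only on the trace ID and the rate, no coordination between services is needed for a trace to be kept or dropped as a whole.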

All of Honeycomb’s SDKs support deterministic probability sampling. It is also supported by every other OpenTelemetry SDK.

When to Use Head Sampling 

Head sampling is a blunt instrument. It is simple to configure and requires no additional infrastructure or operational overhead.

But what head sampling offers in simplicity, it loses in flexibility:

  • You cannot sample traces based on errors they contain or their overall latency
  • You cannot sample traces based on attributes on different spans in a trace
  • You cannot dynamically adjust your sampling rate based on traffic to a service

To accomplish the above, you need to use tail sampling instead.

Tail Sampling 

Tail sampling is where the decision to sample a trace takes place by considering all or most of the spans within the trace. Honeycomb offers Refinery as a tail sampling solution to install in your environment. Because tail sampling is done by inspecting whole traces, it enables you to apply many different sampling techniques. Some of these techniques include:

  • Dynamic sampling - By configuring a set of fields on a trace that make up a key, the sampler automatically increases or decreases the sampling rate based on how frequently each unique value of that key occurs. For example, a key made up of http.status_code will keep a much smaller proportion of requests that return 200 than of requests that return 404, because 200 responses are typically far more common.
  • Rules-based sampling - This enables you to define sampling rates for well-known conditions. For example, you can sample 100% of traces with an error and then fall back to dynamic sampling for all other traffic.
  • Throughput-based sampling - This enables you to sample traces based on a fixed upper bound on the number of spans per second.
  • Deterministic probability sampling - Although deterministic probability sampling is also used in head sampling, it is still possible to use it in tail sampling.

Tail sampling with Refinery lets you combine all of the above techniques in arbitrary ways to create a sampling strategy that is tailored to your needs.
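As a rough sketch of how rules and a dynamic fallback might combine, the Python below picks a sample rate for a completed trace. It is not Refinery's algorithm or configuration format: the function `pick_sample_rate`, the `error` and `http.route` fields, the `/healthz` path, and the `target_per_key` parameter are hypothetical, and the dynamic step is a deliberately naive stand-in.

```python
from collections import Counter


def pick_sample_rate(trace: dict, key_counts: Counter, target_per_key: int = 10) -> int:
    """Return N, meaning "keep 1 in N traces like this one" (illustrative only)."""
    # Rule: keep every trace that contains an error.
    if trace.get("error"):
        return 1
    # Rule: drop most health checks (hypothetical route name).
    if trace.get("http.route") == "/healthz":
        return 1000
    # Fallback: a naive dynamic rate. The more often a key value has been
    # seen, the larger N becomes, so common values are sampled more heavily.
    seen = key_counts[trace.get("http.status_code")]
    return max(1, seen // target_per_key)


# Counts of recently seen values for the key http.status_code (made-up numbers).
recent = Counter({200: 5000, 404: 20})

print(pick_sample_rate({"http.status_code": 200}, recent))  # 500
print(pick_sample_rate({"http.status_code": 404}, recent))  # 2
print(pick_sample_rate({"http.status_code": 200, "error": True}, recent))  # 1
```

The rules run first so that well-known conditions always get a predictable rate, and only traffic that matches no rule falls through to the frequency-based rate.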

When to Use Tail Sampling 

Tail sampling with Refinery lets you sample traces in just about any way you can imagine. How you configure tail sampling depends on your needs and the complexity of your system.

Most people tend to follow some common patterns:

  • Configure several rules to use a high or low sampling rate for well-known conditions, like keeping all errors in traces and dropping most health checks
  • Configure a dynamic sampler based on a low-cardinality key like http.status_code to sample traces proportionally across all values of that key

The rules and key configuration will often have to take into account attributes that are unique to your system.

The flexibility and sophistication of tail sampling comes at a price: it is more effort to configure and requires additional infrastructure and operational overhead to run. For extremely high-volume systems, you may also need to combine head sampling and tail sampling to protect your infrastructure from huge spikes of data.

How Honeycomb Handles Sampled Data 

When you sample your data with our sampling techniques, each span in a trace is given a SampleRate attribute whose value is N when you keep only 1 in N traces. This allows Honeycomb to weight counts to compensate for the fact that you are sampling your data.

Example: You are doing head sampling at a 10% sampling rate, meaning that only 10% of traces are exported to Honeycomb:

Trace ID    Sample Rate (on each span)    duration_ms
abcd1234    10                            200
4321dcba    10                            1100

In this case, the SampleRate attribute is set to 10 because you are sampling 10% of traces, or 1 in 10 traces. Because Honeycomb has this information, it can calculate accurate values for various aggregations:

  • COUNT of traces: (2 * 10) = 20
  • AVG(duration_ms): ((200 * 10) + (1100 * 10)) / (10 + 10) = 650

In other words, you can send less data and yet still see usefully accurate data in Honeycomb. This is true for SUM and Percentile aggregations as well.
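The weighting described above can be reproduced with a short Python sketch. It uses the two example spans from the table and treats each one as standing in for SampleRate original events. The helper names are hypothetical, and this illustrates the arithmetic only, not Honeycomb's backend implementation.

```python
# The two sampled spans from the example, each kept at a rate of 1 in 10.
spans = [
    {"trace_id": "abcd1234", "SampleRate": 10, "duration_ms": 200},
    {"trace_id": "4321dcba", "SampleRate": 10, "duration_ms": 1100},
]


def weighted_count(events):
    """Each kept event represents SampleRate original events."""
    return sum(e["SampleRate"] for e in events)


def weighted_sum(events, field):
    """Scale each value by the number of events it stands in for."""
    return sum(e[field] * e["SampleRate"] for e in events)


def weighted_avg(events, field):
    return weighted_sum(events, field) / weighted_count(events)


print(weighted_count(spans))               # 20
print(weighted_avg(spans, "duration_ms"))  # 650.0
```

Because the weight is read per event, the same functions still work when different traces carry different SampleRate values, as they do with dynamic or rules-based sampling.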

The math done on the Honeycomb backend combined with the flexibility of setting the SampleRate attribute means that you can use sampling techniques as simple or sophisticated as you need, and Honeycomb will do the rest provided that the sampler sets the SampleRate attribute. When you use Refinery, this is done automatically for its dynamic samplers.

COUNT_DISTINCT and Sampling 

Use the COUNT_DISTINCT operator in Query Builder with care when working with sampled data.

COUNT_DISTINCT uses the HyperLogLog algorithm, which is designed to work on an entire population of data. Therefore, it is only possible for COUNT_DISTINCT to count items that are actually present in the data.

COUNT_DISTINCT cannot and does not compensate for sampling rate. However, when using COUNT_DISTINCT in a query, users can view the average sample rate for the query. Locate it in the metadata below the result summary table, alongside the elapsed query time and rows examined fields. The average sample rate field displays the average sample rate across all underlying events included in the query result.
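To see why a distinct count cannot simply be scaled by the sample rate, consider the small Python sketch below. It uses an exact set-based count rather than HyperLogLog, a simplistic "keep every Nth event" sampler, and made-up `user_id` data, all of which are assumptions chosen to keep the example deterministic.

```python
def distinct_counts(events, field, sample_rate):
    """Return (true distinct, distinct in sample, sample distinct * rate)."""
    kept = events[::sample_rate]  # stand-in sampler: keep every Nth event
    true_count = len({e[field] for e in events})
    sampled_count = len({e[field] for e in kept})
    return true_count, sampled_count, sampled_count * sample_rate


# Case A: 1000 events produced by only 7 users.
few_users = [{"user_id": i % 7} for i in range(1000)]
# Case B: 1000 events, each produced by a different user.
many_users = [{"user_id": i} for i in range(1000)]

print(distinct_counts(few_users, "user_id", 10))   # (7, 7, 70)
print(distinct_counts(many_users, "user_id", 10))  # (1000, 100, 1000)
```

In the first case the unscaled count is correct and scaling overcounts tenfold. In the second case the unscaled count is ten times too low and scaling happens to be correct. No single correction factor works for both, which is why the result depends on how values are distributed in the unsampled data.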

Until there’s a way to accurately count unique values in sampled sets, use COUNT_DISTINCT with caution when working with sampled data. Use Honeycomb’s Usage Mode to examine data without compensating for sample rates.

Usage Mode 

Sometimes, you may want to run calculations on only the sampled data with no weighting of counts. You can use the Usage Center’s Usage Mode feature to do this.