Sampling Techniques

Sampling is the concept of selecting a few elements from a large collection and learning about the entire collection by extrapolating from the selected set. This page covers terminology and different sampling methods and tools, such as head sampling, and tail sampling.

Terminology

It’s important to use consistent terminology when discussing sampling. A trace or span is considered “sampled” or “not sampled”:

Sampled: A trace or span is processed and exported. Because it is chosen by the sampler as a representative of the population, it is considered “sampled”.
Not sampled: A trace or span is not processed or exported. Because it is not chosen by the sampler, it is considered “not sampled”.

Sometimes, the definitions of these terms get mixed up in conversation or online. You may find someone state that they are “sampling out data” or that data not processed or exported is considered “sampled”. While the behavior they describe may be the same, these are incorrect terms.

Head Sampling

Head sampling is when you sample traces without looking at the entire trace. The decision to sample or not sample a span in a trace is often made as early as possible. In OpenTelemetry, a head sampling decision is made during span creation–unsampled spans are never even created.

The most common form of head sampling is deterministic probability sampling. Given a constant sampling rate that represents a fixed percentage of traces to sample, the sampler will make a decision to sample or not sample spans based on using the trace ID as a random number. Using the trace ID allows disparate samplers to make consistent decisions for all of the spans in a trace.

All of OpenTelemetry’s SDKs support deterministic probability sampling:

When to Use Head Sampling

Head sampling is a blunt instrument. It is simple to configure and requires no additional infrastructure or operational overhead.

But what head sampling offers in simplicity, it loses in flexibility:

You cannot sample traces based on errors they contain or their overall latency
You cannot sample traces based on attributes on different spans in a trace
You cannot dynamically adjust your sampling rate based on traffic to a service

To accomplish the above, you need to use tail sampling instead.

Tail Sampling

Tail sampling is where the decision to sample a trace takes place by considering all or most of the spans within the trace. Honeycomb offers Refinery as a tail sampling solution to install in your environment. Because tail sampling is done by inspecting whole traces, it enables you to apply many different sampling techniques. Some of these techniques include:

Dynamic sampling - By configuring a set of fields on a trace that make up a key, the sampler automatically increases or decreases the sampling rate based on how frequently each unique value of that key occurs. For example, a key made up of http.status_code will sample much less traffic for requests that return 200 than for requests that return 404.
Rules-based sampling - This enables you to define sampling rates for well-known conditions. For example, you can sample 100% of traces with an error and then fall back to dynamic sampling for all other traffic.
Throughput-based sampling - This enables you to sample traces based on a fixed upper bound on the number of spans per second.
Deterministic probability sampling - Although deterministic probability sampling is also used in head sampling, it is still possible to use it in tail sampling.

Tail sampling with Refinery lets you combine all of the above techniques in arbitrary ways to create a sampling strategy that is tailored to your needs.

When to Use Tail Sampling

Tail sampling with Refinery lets you sample traces in just about any way you can imagine. How you configure tail sampling depends on your needs and the complexity of your system.

Most people tend to follow some common patterns:

Configure several rules to use a high or low sampling rate for well-known conditions, like keeping all errors in traces and dropping most health checks
Configure a dynamic sampler based on a low-cardinality key like http.status_code to sample traces proportionally across all values of that key

The rules and key configuration will often have to take into account attributes that are unique to your system.

The flexibility and sophistication of tail sampling comes at a price: it is more effort to configure and requires additional infrastructure and operational overhead to run. For extremely high-volume systems, you may also need to combine head sampling and tail sampling to protect your infrastructure from huge spikes of data.

Honeycomb.io Documentation