Honeycomb Refinery

Important

This documentation reflects Honeycomb Refinery 2.0, the latest major release.

If you’re using a previous version of Refinery, we recommend you migrate to Refinery 2.0 or later to take advantage of new and improved features.

Refinery is a tail-based sampling proxy and operates at the level of an entire trace.

Refinery examines whole traces and intelligently applies sampling decisions to each trace. These decisions determine whether to include or discard the trace data in the sampled data sent to Honeycomb.

A tail-based sampling model lets you inspect an entire trace at one time and decide whether to sample it based on its contents. For example, your data may have a root span that records the HTTP status code returned for a request, and another span that records whether the data was served from a cache. Using Refinery, you can choose to keep only traces that had a 500 status code and were also served from a cache.
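The cache example above can be sketched as a decision made over a complete trace. This is an illustrative sketch only, not Refinery's implementation or configuration format: the function `keep_trace` and the field names `parent_id`, `http.status_code`, and `cache.hit` are assumptions chosen for the example.

```python
from typing import Any

# A span is modeled as a plain dict of fields (hypothetical shape).
Span = dict[str, Any]


def keep_trace(spans: list[Span]) -> bool:
    """Keep a trace only if its root span returned 500 AND some span hit a cache."""
    # The root span is assumed to be the one without a parent.
    root_is_500 = any(
        s.get("parent_id") is None and s.get("http.status_code") == 500
        for s in spans
    )
    # Any span in the trace may carry the (assumed) cache field.
    served_from_cache = any(s.get("cache.hit") is True for s in spans)
    return root_is_500 and served_from_cache
```

The decision needs fields from two different spans, which is why it can only be made once the whole trace is available.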

Refinery’s tail-based sampling capabilities

Refinery supports several kinds of tail sampling:

Dynamic sampling - This sampling type configures a key based on a trace’s set of fields and automatically increases or decreases the sampling rate based on how frequently each unique value of that key occurs. For example, using a key based on http.status_code, you can include in your sampled data:
- one out of every 1,000 traces for requests that return 2xx
- one out of every 10 traces for requests that return 4xx
- every request that returns 5xx
Rules-based sampling - This sampling type enables you to define sampling rates for well-known conditions. For example, in your sampled data, you can keep 100% of traces with an error and then apply dynamic sampling to all other traffic.
Throughput-based sampling - This sampling type enables you to sample traces based on a fixed upper bound on the number of spans per second. The sampler dynamically samples traces with the goal of keeping throughput below the specified limit.
Deterministic probability sampling - This sampling type applies sampling decisions consistently, considering nothing about a trace other than its trace ID. For example, you can include 1 out of every 12 traces in the sampled data sent to Honeycomb. Head sampling can make this kind of decision too; if you use both, Refinery takes the head sampling into account.
OpenTelemetry traces and logs - Refinery handles both OpenTelemetry trace and log signals. Log records associated with a trace are sampled as part of that trace. Log events not associated with a trace are forwarded directly to Honeycomb.
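To make the idea behind dynamic sampling more concrete, the sketch below derives a per-key sample rate from how often each key value occurs. It is a deliberately simplified calculation, not the algorithm Refinery uses: it gives every key an equal share of the kept-trace budget. The function `dynamic_rates` and its parameters are hypothetical.

```python
def dynamic_rates(counts: dict[str, int], goal_rate: int) -> dict[str, float]:
    """Map each key to a sample rate (keep 1 in N); frequent keys get a larger N."""
    if goal_rate < 1:
        raise ValueError("goal_rate must be >= 1")
    total = sum(counts.values())
    if total == 0:
        return {}
    budget = total / goal_rate      # traces we aim to keep overall
    per_key = budget / len(counts)  # equal share of the budget for each key
    # A rate below 1 is meaningless, so rare keys are clamped to "keep everything".
    return {key: max(1.0, n / per_key) for key, n in counts.items()}


# Usage: a common 200, a less common 404, and a rare 500.
rates = dynamic_rates({"200": 100_000, "404": 1_000, "500": 10}, goal_rate=100)
```

With these counts, the rare `500` key is clamped to a rate of 1, so every such trace is kept, while the common `200` key receives the highest rate. Because of the clamping, the overall budget is a target rather than a guarantee.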

Refinery lets you combine all of the above techniques to achieve your desired sampling behavior.
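As a rough illustration of combining techniques, the sketch below applies a rule first (keep every trace containing an error) and falls back to deterministic, trace-ID-based sampling for everything else. The fallback here is deterministic rather than dynamic to keep the example short. None of this is Refinery's API or implementation: `hash_keep`, `decide`, the `error` field, and the SHA-256 hashing scheme are all assumptions made for the example.

```python
import hashlib
from typing import Any

Span = dict[str, Any]  # hypothetical span shape: a dict of fields


def hash_keep(trace_id: str, sample_rate: int) -> bool:
    """Deterministically keep roughly 1 in `sample_rate` traces, based only on the ID."""
    if sample_rate < 1:
        raise ValueError("sample_rate must be >= 1")
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform value in [0, 2**64)
    return bucket < 2**64 // sample_rate


def decide(trace_id: str, spans: list[Span], fallback_rate: int = 12) -> tuple[bool, int]:
    """Return (keep, sample_rate) for a complete trace."""
    # Rule: keep 100% of traces that contain an error (rate 1).
    if any(s.get("error") is True for s in spans):
        return True, 1
    # Fallback: the same trace ID always yields the same decision.
    return hash_keep(trace_id, fallback_rate), fallback_rate
```

Returning the sample rate alongside the decision is a design choice in this sketch: it records how many traces each kept trace stands for, so counts can be re-weighted later.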

Next Steps

Explore our Refinery setup instructions.

The default configuration installed with Refinery contains the minimum needed to run it. Customize it through the general configuration and the sampling method configuration.

While configuring, you may need to scale and troubleshoot your Refinery instance.
