Honeycomb Refinery

Warning

This set of documentation reflects the behaviors of our latest major release, Honeycomb Refinery 2.0.

If you use Refinery, we recommend that you migrate to Refinery 2.0 or later to take advantage of new and improved features. To explore the benefits of migration and access resources to help you migrate, visit Recommended Migrations: Refinery 2.0.

Refinery is a tail-based sampling proxy that operates at the level of an entire trace. Refinery examines whole traces and applies a sampling decision to each one. That decision determines whether the trace's data is kept in the sampled data forwarded to Honeycomb or discarded.

A tail-based sampling model allows you to inspect an entire trace at one time and make a sampling decision based on its contents. For example, your data may have a root span that contains the HTTP status code returned for a request, and another span that records whether the data was served from a cache. Using Refinery, you can choose to keep only traces that had a 500 status code and were also served from a cache.
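The whole-trace decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Refinery's implementation or configuration format: the span layout and the field names `parent_id`, `http.status_code`, and `cache.hit` are assumptions made for the example.

```python
from typing import Any

# Hypothetical span shape: a flat dictionary of fields.
Span = dict[str, Any]


def keep_trace(spans: list[Span]) -> bool:
    """Decide on a whole trace: keep only 500s that were served from a cache."""
    # Assumption: the root span is the one without a parent ID.
    root = next((s for s in spans if s.get("parent_id") is None), None)
    if root is None:
        return False
    returned_500 = root.get("http.status_code") == 500
    # The cache information may live on any span, so every span is inspected.
    served_from_cache = any(s.get("cache.hit") is True for s in spans)
    return returned_500 and served_from_cache


trace = [
    {"span_id": "a", "parent_id": None, "http.status_code": 500},
    {"span_id": "b", "parent_id": "a", "cache.hit": True},
]
print(keep_trace(trace))
```

The decision needs fields from two different spans, which is why it can only be made once the whole trace is available.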

Refinery’s tail-based sampling capabilities 

Refinery supports several kinds of tail sampling:

  • Dynamic sampling - This sampling type configures a key based on a trace’s set of fields and automatically increases or decreases the sampling rate based on how frequently each unique value of that key occurs. For example, using a key based on http.status_code, you can include in your sampled data:
    • one out of every 1,000 traces for requests that return 2xx
    • one out of every 10 traces for requests that return 4xx
    • every request that returns 5xx
  • Rules-based sampling - This sampling type enables you to define sampling rates for well-known conditions. For example, in your sampled data, you can keep 100% of traces with an error and then apply dynamic sampling to all other traffic.
  • Throughput-based sampling - This sampling type enables you to sample traces based on a fixed upper bound on the number of spans per second. The sampler dynamically adjusts how it samples traces, with the goal of keeping throughput below the specified limit.
  • Deterministic probability sampling - This sampling type consistently applies sampling decisions without considering the contents of the trace other than its trace ID. For example, you can include 1 out of every 12 traces in the sampled data sent to Honeycomb. This kind of sampling can also be done using head sampling, and if you use both, Refinery takes that into account.
  • Support for OpenTelemetry trace and log signals - Refinery handles both OpenTelemetry trace and log signals. Log records associated with a trace are sampled as part of that trace. Log records not associated with a trace are forwarded directly to Honeycomb.
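To make the dynamic sampling idea from the list above more concrete, here is a minimal Python sketch that derives a 1-in-N rate for each key value from how often that value was observed. It is a simplified stand-in, not Refinery's actual algorithm: the function name `dynamic_rates`, the `goal_kept` parameter, and the rule of giving every key value an equal share of the kept traces are all assumptions made for illustration.

```python
from collections import Counter


def dynamic_rates(key_counts: dict[str, int], goal_kept: int) -> dict[str, int]:
    """Give each key value an equal share of goal_kept and derive a 1-in-N rate."""
    if not key_counts:
        return {}
    per_key_goal = goal_kept / len(key_counts)
    # Frequent values get a high rate (few kept); rare values approach 1 (all kept).
    return {
        key: max(1, round(count / per_key_goal))
        for key, count in key_counts.items()
    }


# Hypothetical traffic observed in one window, keyed by HTTP status class.
observed = Counter({"2xx": 10_000, "4xx": 100, "5xx": 10})
rates = dynamic_rates(observed, goal_kept=30)
print(rates)  # {'2xx': 1000, '4xx': 10, '5xx': 1}
```

With these made-up counts, the sketch reproduces the rates from the example above: one in 1,000 for 2xx, one in 10 for 4xx, and every 5xx trace.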

Refinery lets you combine all of the above techniques to achieve your desired sampling behavior.
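As a rough illustration of combining techniques, the Python sketch below applies a rule first (keep every trace with an error) and falls back to deterministic probability sampling keyed only on the trace ID. The hashing scheme, the function names, and the `has_error` flag are assumptions for this example and are not taken from Refinery's source or configuration.

```python
import hashlib


def deterministic_keep(trace_id: str, rate: int) -> bool:
    """Keep roughly 1 in `rate` traces, using nothing but the trace ID."""
    digest = hashlib.sha1(trace_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer in [0, 2**32 - 1].
    value = int.from_bytes(digest[:4], "big")
    return value <= 0xFFFFFFFF // rate


def decide(trace_id: str, has_error: bool, fallback_rate: int = 12) -> tuple[bool, int]:
    """Return (keep, sample_rate): the rule first, then the deterministic fallback."""
    if has_error:
        return True, 1  # rule: keep 100% of traces with an error
    return deterministic_keep(trace_id, fallback_rate), fallback_rate


print(decide("trace-with-error", has_error=True))
print(decide("ordinary-trace", has_error=False))
```

Because the fallback depends only on the trace ID, any process that sees the same trace ID reaches the same decision, which is the property that makes this kind of sampling consistent.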

Next Steps 

Explore our Refinery setup instructions.

The default configuration at installation contains the minimum configuration needed to run Refinery. Customize your configuration with general configuration and sampling method configuration.

While configuring, you may need to scale and troubleshoot your Refinery instance.