When you sample your data with our sampling techniques, each span in a trace is given a SampleRate attribute whose value is N when you keep only 1 in N traces.
This allows Honeycomb to weight counts to compensate for the fact that you are sampling your data.
Example: You are doing head sampling at a 10% sampling rate, meaning that only 10% of traces are exported to Honeycomb:
| Trace ID | SampleRate (on each span) | duration_ms |
|---|---|---|
| abcd1234 | 10 | 200 |
| 4321dcba | 10 | 1100 |
In this case, the SampleRate attribute is set to 10 because you are sampling 10% of traces, or 1 in 10 traces.
Because Honeycomb has this information, it can calculate accurate values for various aggregations:
COUNT of traces: (2 * 10) = 20
AVG(duration_ms): ((200 * 10) + (1100 * 10)) / (10 + 10) = 650
In other words, you can send less data and yet still see usefully accurate data in Honeycomb. This is true for SUM and Percentile aggregations as well.
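
The arithmetic above can be reproduced with a short Python sketch. This is only an illustration of the weighting idea, not Honeycomb's actual backend code: the `Span` class and the `weighted_*` helper names are hypothetical, and each table row is treated as a single span for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, minimal stand-in for one sampled span.
    trace_id: str
    sample_rate: int   # the SampleRate attribute: this span represents N originals
    duration_ms: float


def weighted_count(spans):
    # Each kept span stands in for sample_rate original spans.
    return sum(s.sample_rate for s in spans)


def weighted_sum(spans):
    # Scale each value by the number of originals it represents.
    return sum(s.duration_ms * s.sample_rate for s in spans)


def weighted_avg(spans):
    # Weighted sum divided by weighted count.
    return weighted_sum(spans) / weighted_count(spans)


spans = [
    Span("abcd1234", 10, 200),
    Span("4321dcba", 10, 1100),
]

print(weighted_count(spans))  # 20
print(weighted_avg(spans))    # 650.0
```

Because the weight is carried per span, the same helpers also work when different spans have different SampleRate values.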
The math done on the Honeycomb backend, combined with the flexibility of setting the SampleRate attribute, means that you can use sampling techniques as simple or as sophisticated as you need. Honeycomb will do the rest, provided that the sampler sets the SampleRate attribute.
When you use Refinery, this is done automatically for its dynamic samplers.
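
For a hand-written sampler, setting the attribute might look like the following Python sketch. It is an assumption-laden illustration, not Refinery's or any Honeycomb SDK's actual algorithm: `should_keep` and `sample_span` are hypothetical names, spans are modeled as plain dictionaries, and the hash-based keep decision is just one simple way to make every span of a trace get the same decision.

```python
import hashlib


def should_keep(trace_id: str, sample_rate: int) -> bool:
    # Hypothetical deterministic decision: hash the trace ID so that all
    # spans sharing a trace ID are kept or dropped together.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    return value % sample_rate == 0  # keeps roughly 1 in sample_rate traces


def sample_span(span: dict, sample_rate: int):
    # Drop the span, or keep it and record how many spans it represents.
    if not should_keep(span["trace_id"], sample_rate):
        return None
    return {**span, "SampleRate": sample_rate}


# Usage: sample 1 in 10 traces and tag the spans that survive.
incoming = [{"trace_id": f"trace-{i}", "duration_ms": 100 + i} for i in range(1000)]
kept = [s for s in (sample_span(span, 10) for span in incoming) if s is not None]
print(len(kept), "of", len(incoming), "spans kept")
```

The important part is the last line of `sample_span`: whatever logic decides which traces to keep, the kept spans must carry the rate that was applied so that counts can be weighted correctly.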