Sampling traces

Sampling your data is a great way to get data volume down to a manageable size.

Many folks are curious about how sampling works with tracing, given that simply sampling 1/N requests at random will not guarantee that you retain all of the spans for a given trace. The story of how sampling and tracing fit together with Honeycomb is still evolving, but here are some thoughts on how to approach it.

Traditionally, traces are sampled with head-based sampling: when the root span is processed, a random sampling decision is made (e.g., if randint(10) == 0, the span will be sampled). If the span is sampled, it is sent, and the decision is propagated to the descendant spans, which follow suit, usually via an HTTP header (something like X-B3-Sampled: 1). That way, all the spans for a particular trace are preserved. Our integrations do not support head-based sampling out of the box today, but you could implement such a system yourself.
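As a rough illustration of head-based sampling, here is a minimal Python sketch. It is not part of any Honeycomb integration: the function names are hypothetical, and the only detail taken from the description above is the X-B3-Sampled header. The root makes one random decision, and every downstream service simply follows whatever arrives in the header.

```python
import random

SAMPLE_RATE = 10                 # keep roughly 1 in 10 traces
SAMPLED_HEADER = "X-B3-Sampled"  # header name borrowed from the text above


def decide_at_root(sample_rate=SAMPLE_RATE, rng=random):
    """Make the sampling decision once, when the root span is processed."""
    return rng.randrange(sample_rate) == 0


def outgoing_headers(sampled):
    """Headers a service attaches to downstream calls to propagate the decision."""
    return {SAMPLED_HEADER: "1" if sampled else "0"}


def follow_head_decision(incoming_headers, sample_rate=SAMPLE_RATE, rng=random):
    """Downstream services follow the propagated decision.

    A request that arrives without the header is treated as a new root,
    so the service makes the decision itself.
    """
    value = incoming_headers.get(SAMPLED_HEADER)
    if value is None:
        return decide_at_root(sample_rate, rng)
    return value == "1"
```

A service would call follow_head_decision on each incoming request and pass outgoing_headers(decision) along with every call it makes, so that the whole trace is either kept or dropped together.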

Some of our integrations do support what we call deterministic sampling. In deterministic sampling, a hash is made of a specific field in the event/span such as the request ID, and a decision to sample is made based on that hash and the intended sample rate. Hence, an approximately correct number of traces will be selected and the decision whether or not to sample a given trace does not need to be propagated around: actors can sample full traces whether they can communicate or not.
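The idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the code the integrations actually run: the function name keep_trace, the choice of SHA-1, and the use of the first four bytes of the digest are all picked here for the example.

```python
import hashlib

MAX_UINT32 = 2**32 - 1


def keep_trace(trace_id, sample_rate):
    """Decide whether to keep a trace using only its ID and the sample rate.

    Every service that hashes the same trace ID with the same rate reaches
    the same answer, so no decision has to be propagated between them.
    """
    if sample_rate <= 1:
        return True
    digest = hashlib.sha1(trace_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer in [0, 2**32 - 1].
    value = int.from_bytes(digest[:4], "big")
    # Keep the trace if the value lands in the lowest 1/sample_rate of the range.
    return value <= MAX_UINT32 // sample_rate
```

Because the hash output is spread evenly over its range, about 1 in sample_rate trace IDs fall under the threshold, which is why the number of kept traces is approximately, rather than exactly, correct.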

There is another option: tail-based sampling, where sampling decisions are made once the full trace information has been gathered. This ensures that if an error or slowness happens far down the tree of service calls, the full set of events for the trace is more likely to get sampled in. To use this method of sampling, all spans must first be collected in a buffer.
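A minimal Python sketch of tail-based sampling might look like the following. Everything in it is an assumption made for illustration: the TailSampler class, the span field names (trace_id, error, duration_ms), and the policy of always keeping traces that contain an error or a slow span while sampling the rest at random.

```python
import random
from collections import defaultdict


class TailSampler:
    """Buffer spans per trace and decide only once the trace is complete."""

    def __init__(self, sample_rate=10, slow_ms=1000, rng=random):
        self.sample_rate = sample_rate  # keep ~1 in N uninteresting traces
        self.slow_ms = slow_ms          # spans at least this slow are interesting
        self.rng = rng
        self.buffer = defaultdict(list)

    def add_span(self, span):
        """Hold every span until the whole trace has been seen."""
        self.buffer[span["trace_id"]].append(span)

    def finish_trace(self, trace_id):
        """Return the spans to send for this trace, or an empty list to drop it."""
        spans = self.buffer.pop(trace_id, [])
        interesting = any(
            s.get("error") or s.get("duration_ms", 0) >= self.slow_ms
            for s in spans
        )
        if interesting or self.rng.randrange(self.sample_rate) == 0:
            return spans
        return []
```

The cost of this approach is visible in the buffer: every span has to be held somewhere until the trace finishes, which is the buffering requirement described above.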

Client-based Sampling (at send time)

Deterministic Sampling

The Honeycomb Beelines provide out-of-the-box support for deterministic sampling of traces. By default, when a sampling rate is configured, the trace ID of each trace-enabled event (span) is used to compute whether the event should be kept. All events that share the same trace ID receive the same sampling decision. If you propagate trace context to other services, events originating from those services also receive the same sampling decision, provided those services are instrumented with a Beeline.

    beeline.Init(beeline.Config{
        WriteKey:   "YOUR_API_KEY",
        Dataset:    "MyGoApp",
        Debug:      true,
        SampleRate: 10,
    })

For more information, see the Go Beeline docs.

require("honeycomb-beeline")({
  writeKey: "YOUR_API_KEY",
  dataset: "my-dataset-name",
  // deterministic sampling enabled at a rate of 10
  // i.e. keep 1 in every 10 traces
  sampleRate: 10
  /* ... additional optional configuration ... */
});

For more information, see the Node.js Beeline docs.

import io.honeycomb.beeline.tracing.Beeline;
import io.honeycomb.beeline.tracing.Span;
import io.honeycomb.beeline.tracing.SpanBuilderFactory;
import io.honeycomb.beeline.tracing.SpanPostProcessor;
import io.honeycomb.beeline.tracing.Tracer;
import io.honeycomb.beeline.tracing.Tracing;
import io.honeycomb.beeline.tracing.sampling.Sampling;
import io.honeycomb.libhoney.HoneyClient;
import io.honeycomb.libhoney.LibHoney;

public class TracerSpans {
    private static final String WRITE_KEY = "test-write-key";
    private static final String DATASET = "test-dataset";

    private static final HoneyClient client;
    private static final Beeline beeline;

    static {
        client                          = LibHoney.create(LibHoney.options().setDataset(DATASET).setWriteKey(WRITE_KEY).build());
        // deterministic sampling enabled at a rate of 10
        // i.e. keep one in 10 traces
        SpanPostProcessor postProcessor = Tracing.createSpanProcessor(client, Sampling.DeterministicTraceSampler(10));
        SpanBuilderFactory factory      = Tracing.createSpanBuilderFactory(postProcessor, Sampling.DeterministicTraceSampler(10));
        Tracer tracer                   = Tracing.createTracer(factory);
        beeline                         = Tracing.createBeeline(tracer, factory);
    }
}

For more information (for example, using the Beeline with Spring), see the Java Beeline docs.

import beeline

beeline.init(
   writekey='YOUR_API_KEY',
   dataset='my-app',
   service_name='my-app',
   debug=True,
   # deterministic sampling enabled at a rate of 10
   # i.e. keep one in 10 traces
   sample_rate=10,
)

For more information, see the Python Beeline docs.

require 'honeycomb-beeline'

Honeycomb.init(
  # deterministic sampling enabled at a rate of 10
  # i.e. keep one in 10 traces
  sample_rate: 10
)

For more information, see the Ruby Beeline docs.