Getting Fastly Data Into Honeycomb

Fastly supports log streaming, which gives you more insight into the behavior of its content delivery network. You can send this data to Honeycomb for querying.

Basic details on setting this up can be found in the Fastly documentation.

Details about sampling the streamed Fastly data are included below.

Sampling with VCL

Sampling reduces the volume of Fastly data sent to your Honeycomb datasets.

To implement sampling, we recommend the following configuration:

  1. A logging rule, which forwards logs only for requests that are included in the sampled data
  2. VCL snippets, which determine whether a request should be sampled and at what rate

First, update your configuration to create a logging rule that forwards a request to Honeycomb only if the req.http.log_request variable is set to "1".
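
As a rough sketch, the condition attached to the logging rule could be a single VCL expression like the one below. This assumes the condition is entered as a VCL expression on the Honeycomb logging endpoint; the exact place to enter it depends on how your Fastly service is configured.

```vcl
# Forward the log line only when the sampling snippet marked this request.
req.http.log_request == "1"
```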

Next, create two VCL snippets:

  1. A table of sample rates for status codes.
  2. The code which sets the sample rate based on HTTP status.

The code of the first snippet should be similar to the following. The rates here should be adjusted based on your production traffic.

table codes {
    "200s": "20",
    "300s": "5",
    "400s": "3",
    "500s": "1",
}

This table maps each class of status code to a sample rate, which is the number of events represented by each event that is sent. A rate of 1 keeps every event, a rate of 20 keeps roughly one event in 20, and so on.

The second VCL snippet should be exactly as follows:

set req.http.samplerate = table.lookup(codes, regsub(resp.status, "^([1-5])..", "\100s"), "1");
if (randombool(1, std.atoi(req.http.samplerate))) {
    set req.http.log_request = "1";
} else {
    set req.http.log_request = "0";
}

This sets the req.http.log_request variable (mentioned above) to "1" when the request is selected by sampling and should be logged, and to "0" otherwise.

Lastly, ensure that the sample rate is included as a property of the JSON event sent to Honeycomb. In the formatted JSON provided as part of the streaming log configuration, add this line at the same level as the time and data keys:

"samplerate": %{req.http.samplerate}V,

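This encodes samplerate as a top-level key sent to the Honeycomb API. Honeycomb weights each event by its sample rate, so visualizations appear as if all of the events, even the ones that were sampled out, had been sent. For example, if 1,000 requests return a 200 status at a sample rate of 20, roughly 50 events are sent, and Honeycomb counts each of them as 20 requests.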

You can extend this basic configuration to sample based on cache status or other fields.
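
One possible way to sample on cache status is sketched below. It is an illustration under assumptions, not a tested configuration: the cache_rates table name and its rates are hypothetical, the code is assumed to run in the same place as the status-based snippet, and it reads the cache state from Fastly's fastly_info.state variable. Any state that is not listed in the table falls back to a rate of 1, so those events are all kept.

```vcl
# Snippet 1 (hypothetical): sample rates keyed by cache state.
# Adjust the rates to match your production traffic.
table cache_rates {
    "HIT": "50",
    "MISS": "10",
    "PASS": "5",
}

# Snippet 2 (hypothetical): choose the rate by cache state.
# States missing from the table use the default rate of "1".
set req.http.samplerate = table.lookup(cache_rates, fastly_info.state, "1");

# Keep every server error, regardless of cache state.
if (resp.status >= 500) {
    set req.http.samplerate = "1";
}

# Same sampling decision as the status-based snippet.
if (randombool(1, std.atoi(req.http.samplerate))) {
    set req.http.log_request = "1";
} else {
    set req.http.log_request = "0";
}
```

Because the sample rate is still written to req.http.samplerate, the samplerate key in the JSON event does not need to change.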
