Filter and Send Logstash Data

Thanks to Logstash’s flexible plugin architecture, you can send a copy of all the traffic that Logstash is processing to Honeycomb. This topic explains how to use Logstash plugins to convert incoming log data into events and then send them to Honeycomb.

Tip
If your system uses logspout as a log router for Docker containers, you can send logs to Honeycomb with one of the logspout third-party modules that integrate with Logstash or Fluentd.

Data Format Requirements 

Note
Honeycomb is at its best when the events you send are broad and capture lots of information about a given process or transaction. For guidance on how to think about building events, start with Building Better Events. To learn more, check out the rest of the “Event Foo” series on our blog.

To turn the log data flowing through Logstash into Honeycomb events, use Logstash filter plugins. These plugins parse the incoming data and break it out into top-level fields based on its original format. We have found the following plugins especially useful:

  • grok matches regular expressions and ships with predefined patterns for many common log formats (such as apache, nginx, or haproxy).
  • json parses JSON-encoded strings and breaks them up into individual fields.
  • kv matches key=value patterns and breaks them out into individual fields.

To add and configure filter plugins, refer to Working with Filter Plugins on the Logstash documentation site.
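
For example, here is a minimal sketch (not specific to Honeycomb) of the json filter, assuming your incoming message field holds a JSON-encoded string:

filter {
  # Parse the JSON-encoded message field into individual top-level fields.
  json {
    source => "message"
  }
}

The kv filter works similarly for key=value formatted lines; by default it also reads the message field.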

Example: Using Logstash Filter Plugins to Process Haproxy Logs for Honeycomb Ingestion 

Let us say you are sending haproxy logs (in HTTP mode) to Logstash. A log line describing an individual request looks something like this (borrowed from the haproxy config manual):

Feb  6 12:14:14 localhost \
          haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in \
          static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} \
          {} "GET /index.html HTTP/1.1"

Logstash puts this line in a message field, so in the filter block of the Logstash pipeline configuration fragment below, we use the grok filter plugin and tell it to parse the message field and make all of its content available as top-level fields. Since we no longer need the raw line afterward, we also tell grok to remove the message field.

The mutate filter plugin takes the numeric fields that grok extracted from the haproxy log line and converts them to integers so that Honeycomb can do math on them later.

filter {
  grok {
    match => ["message", "%{HAPROXYHTTP}"]
    remove_field => ["message"]
  }
  mutate {
    convert => {
      "actconn" => "integer"
      "backend_queue" => "integer"
      "beconn" => "integer"
      "bytes_read" => "integer"
      "feconn" => "integer"
      "http_status_code" => "integer"
      "retries" => "integer"
      "srv_queue" => "integer"
      "srvconn" => "integer"
      "time_backend_connect" => "integer"
      "time_backend_response" => "integer"
      "time_duration" => "integer"
      "time_queue" => "integer"
      "time_request" => "integer"
    }
  }
}

Sending Data to Honeycomb 

Now that all the fields in the message are nicely extracted into events, send them on to Honeycomb! To send events, configure an output plugin.

You can use Logstash’s HTTP output plugin to craft HTTP requests to the Honeycomb API.

This configuration example sends the data to a dataset called “logstash.”

filter {
  ruby {
    code => 'event.to_hash.each { |k, v| event.set("[data][" + k + "]" , v) }'
  }
  prune {
    whitelist_names => [ "^data$" ]
  }
}
output {
  http {
    url => "https://api.honeycomb.io/1/batch/logstash" # US instance
    #url => "https://api.eu1.honeycomb.io/1/batch/logstash" # EU instance
    http_method => "post"
    headers => {
      "X-Honeycomb-Team" => "YOUR_API_KEY"
    }
    format => "json_batch"
    http_compression => true
  }
}

To complete the configuration in the example above:

  • Use the filter block (the ruby and prune filters above) to nest the Logstash event fields under a data element in the JSON payload sent to Honeycomb. This filter is required for Honeycomb to ingest your Logstash logs. Learn more about filter in the Elastic documentation.
  • Specify a URL (url) to send the data to:
    • for our US instance: https://api.honeycomb.io/1/batch/<dataset_name>
    • for our EU instance: https://api.eu1.honeycomb.io/1/batch/<dataset_name>
  • Add your Honeycomb API key to "X-Honeycomb-Team" so that Logstash is authorized to send data to Honeycomb.
  • Specify the output format as JSON batch (json_batch).
  • Specify the use of HTTP compression (http_compression => true).

Then, restart Logstash. When it is back up, you will find the new dataset on your landing page.
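
If the dataset does not appear, one way to check what Logstash is producing (a temporary debugging aid, not part of the Honeycomb integration itself) is to add a stdout output alongside the http output:

output {
  # Print each outgoing event to the Logstash process output so you can
  # confirm that fields are nested under "data" before they reach Honeycomb.
  stdout {
    codec => rubydebug
  }
}

Remove this block again once events look the way you expect.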

Set Event Timestamps 

In Logstash, each event has a special @timestamp field. In general, use the date filter plugin to extract the timestamp from your log lines and set it on the event.

For example, if you have a JSON log line containing a timestamp in the following format:

{"timestamp": "2018-02-04T14:55:10Z", "host": "app22", ...}

Then, extract the time value using the following Logstash configuration:

filter {
  mutate {
    rename => {"timestamp" => "time"}
  }
  date {
    match => ["time", "ISO8601"]
  }
  ruby {
    code => 'event.to_hash.each { |k, v| event.set("[data][" + k + "]" , v) unless k == "time" }'
  }
  prune {
    allowlist_names => [ "^data$", "^time$" ]
  }
}
output {
  http {
    url => "https://api.honeycomb.io/1/batch/logstash" # US instance
    #url => "https://api.eu1.honeycomb.io/1/batch/logstash" # EU instance
    http_method => "post"
    headers => {
      "X-Honeycomb-Team" => "YOUR_API_KEY"
    }
    format => "json_batch"
    http_compression => true
  }
}
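
If your log lines carry timestamps in a format other than ISO8601, give the date filter an explicit pattern instead. As a sketch only: assuming the HAPROXYHTTP grok pattern from the earlier example captured the haproxy accept date (for example, 06/Feb/2009:12:14:14.655) into a field named accept_date, you could parse it like this:

filter {
  date {
    # Hypothetical: parse the haproxy accept date into the event's @timestamp.
    match => ["accept_date", "dd/MMM/yyyy:HH:mm:ss.SSS"]
  }
}

Keep in mind that the prune filter in the configuration above only passes the data and time fields along, so if you want Honeycomb to use this value as the event timestamp, rename it to a top-level time field as in the ISO8601 example.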