Send Logs Using the OpenTelemetry Collector

This guide details how to send logs to Honeycomb using the OpenTelemetry Collector.

Using the OpenTelemetry Collector as a Logging Agent 

The OpenTelemetry Collector supports a wide variety of log sources and formats, and can be used as a drop-in replacement for many logging agents. It works by configuring a log source, structuring all collected logs, optionally transforming the logs to add, remove, or update fields, and then sending the logs to Honeycomb.
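For example, the optional transform step can be handled by the attributes processor in a logs pipeline. The sketch below is illustrative only: the `environment` and `user.email` fields are hypothetical, and the processor would still need to be listed in your logs pipeline alongside `batch`.

processors:
  attributes/logs:
    actions:
      # Add a static field to every log record
      - action: insert
        key: environment
        value: production
      # Drop a sensitive field if present
      - action: delete
        key: user.email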

The OpenTelemetry Collector translates any log it collects into the OpenTelemetry Log format, a structured format that wraps the bodies of existing logs and optionally correlates them with traces. Having logs in the OpenTelemetry Logs format lets you process log data centrally alongside traces and metrics within the OpenTelemetry Collector.
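As a simplified illustration (not the complete data model), an OpenTelemetry log record carries the original log line as its body, along with structured metadata and optional trace correlation fields. The values below are hypothetical:

{
  "timestamp": "2024-03-19T14:49:51.998-06:00",
  "severity_text": "INFO",
  "body": "Listening on http://localhost:8000/name",
  "attributes": { "log.file.name": "0.log" },
  "trace_id": "5b8efff798038103d269b633813fc60c",
  "span_id": "eee19b7ec3c1b174"
}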

Use a Receiver to Collect Logs 

The OpenTelemetry Collector supports a large number of receivers that can be used to collect logs from a variety of sources.

Setup 

  1. Make sure you’re using the Collector Contrib distribution of the OpenTelemetry Collector, which includes components that are not part of the core distribution.
  2. Prepare your collector configuration file by adding the following boilerplate:
receivers:
  # Add your receiver here
  # ...

processors:
  batch:
  # Add any additional processors here
  # ...

exporters:
  otlp/logs:
    endpoint: "api.honeycomb.io:443"
    headers:
      "x-honeycomb-team": "YOUR_API_KEY"
      "x-honeycomb-dataset": "YOUR_LOGS_DATASET_NAME"

service:
  pipelines:
    logs:
      receivers: [receiver1,receiver2,etc]
      processors: [batch]
      exporters: [otlp/logs]
Note
If you are sending OTLP logs from a service with a service.name defined, then the dataset for those logs will be the name of the service, and the x-honeycomb-dataset header will not be used.

Collect any Log with the Filelog Receiver 

The Filelog Receiver supports reading and parsing arbitrary logs written to a file on a server.

The Filelog Receiver is the most flexible receiver, but depending on the shape of your logs, it may require additional configuration to parse your logs correctly.

For example, here is a configuration that reads an NGINX access log and parses it into a structured log:

receivers:
  filelog:
    include: ["/var/log/nginx/access.log"]
    operators:
      - type: "regex_parser"
        regex: "(?P<remote>[^ ]*) - - \\[(?P<time>[^\\]]*)\\] \"(?P<method>\\S+)(?: +(?P<path>[^ ]*) +\\S*)?\" (?P<status>\\d+) (?P<size>\\d+) \"(?P<referer>[^\"]*)\" \"(?P<agent>[^\"]*)\""
        timestamp:
          parse_from: attributes.time
          layout: "%d/%b/%Y:%H:%M:%S %z"

Here’s a configuration that reads a JSON log:

receivers:
  filelog:
    include: [ /var/log/myservice/*.json ]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'

In-Depth Filelog Receiver Example 

Sometimes, logs mix structured and unstructured information, such as an informational log that contains a JSON object. To handle these, you need to parse each piece of the log into a specific element, and also decide how to process the structured portion.

For example, let’s say you have a log at /var/log/name-service/0.log that mixes text and JSON:

2024-03-19T14:49:51.998-0600 info name-service/main.go:212 Listening on http://localhost:8000/name {"name": "name-service", "function": "main", "featureflag.allow-future": true, "name-format": "lowercase"}

You need a filelog receiver configuration that does the following tasks:

  1. Reads the log from /var/log/name-service/*.log.
  2. Parses the full text of the log into a timestamp, severity, file name, message, and “details”.
  3. Parses the “details” text using a JSON parser, which takes each key in the JSON object and turns them into attributes for the log.
  4. Parses out the service name from the file path.
  5. Sets the service.name resource with the parsed service name.

The following filelog receiver configuration completes these tasks:

filelog:
  include:
    - /var/log/name-service/*.log
  include_file_path: true
  operators:
    # Parse the text. Each capture group becomes an attribute on the log and the original full string becomes the body
    - type: regex_parser
      regex: (?P<timestamp>^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}-\d{4}) (?P<severity>\w+) (?P<filename>[\w\/\.\:\-]+) (?P<message>[^\{]*) (?P<details>.*)
      timestamp:
        layout: "%Y-%m-%dT%H:%M:%S.%f%z"
        parse_from: attributes.timestamp
      severity:
        parse_from: attributes.severity
    # The "details" attribute is a JSON string. This operator parses it into a map, which Honeycomb will flatten on ingest.
    - type: json_parser
      parse_from: attributes.details
      parse_to: attributes.details
    # We can extract the service name, name-service, from the file path
    - type: regex_parser
      regex: \/var\/log\/(?P<servicename>.*)\/\d+\.log
      parse_from: attributes["log.file.path"]
    # Move the extracted attribute to the service name resource attribute
    - type: move
      from: attributes.servicename
      to: resource["service.name"]

The configuration produces a structured log that, when exported to Honeycomb, contains the following fields:

{
    "body": "2024-03-19T14:49:51.998-0600 info name-service/main.go:212 Listening on http://localhost:8000/name {\"name\": \"name-service\", \"function\": \"main\", \"featureflag.allow-future\": true, \"name-format\": \"lowercase\"}",
    "severity": "info",
    "severity_code": 9,
    "service.name": "name-service",
    "message": "Listening on http://localhost:8000/name",
    "timestamp": 1710881391998,
    "details.featureflag.allow-future": true,
    "details.function": "main",
    "details.name": "name-service",
    "details.name-format": "lowercase",
    "log.file.name": "0.log",
    "log.file.path": "/var/log/name-service/0.log",
    "flags": 0
}

Use the Explore Data tab in Query Results to view the structured log in Honeycomb.

The OpenTelemetry Collector lets you configure many different receivers, each collecting logs from a specific source. The following are some of the most popular.

AWS CloudWatch Receiver 

The AWS CloudWatch Receiver supports autodiscovery of log groups and log streams in Amazon CloudWatch, with optional filtering of those sources.

For example, here is a configuration that autodiscovers only EKS logs from us-west-1:

receivers:
  awscloudwatch:
    region: us-west-1
    logs:
      poll_interval: 1m
      groups:
        autodiscover:
          limit: 100
          prefix: /aws/eks/

Fluent Forward Receiver 

The Fluent Forward Receiver runs a TCP server that accepts logs via the Fluent Forward protocol, which enables collecting logs from Fluentbit and Fluentd.

For example, here is a configuration that reads all logs on port 8006:

receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006

Splunk HEC Receiver 

The Splunk HEC Receiver accepts events in the Splunk HEC format.

For example, here is a configuration that reads JSON HEC events and raw log data:

receivers:
  splunk_hec:
    endpoint: 0.0.0.0:8088
  splunk_hec/advanced:
    endpoint: 0.0.0.0:8088
    access_token_passthrough: true
    tls:
      cert_file: /test.crt
      key_file: /test.key
    raw_path: "/raw"
    hec_metadata_to_otel_attrs:
      source: "mysource"
      sourcetype: "mysourcetype"
      index: "myindex"
      host: "myhost"

Collect Logs from Other Sources 

The following sources are less commonly used, but still supported with the OpenTelemetry Collector.

Azure Blob Receiver 

The Azure Blob Receiver reads logs and trace data from Azure Blob Storage.

For example, here is a configuration that reads logs from a specific container in Azure Blob Storage:

receivers:
  azureblob:
    connection_string: DefaultEndpointsProtocol=https;AccountName=accountName;AccountKey=<your-key>;EndpointSuffix=core.windows.net
    event_hub:
      endpoint: Endpoint=sb://oteldata.servicebus.windows.net/;SharedAccessKeyName=otelhubbpollicy;SharedAccessKey=<access-key>;EntityPath=otellhub
    logs:
        container_name: name-of-container

Azure Event Hub Receiver 

The Azure Event Hub Receiver pulls logs from an Azure Event Hub and transforms them.

For example, here is a configuration that reads logs from a specific partition and consumer group, then structures them as JSON logs:

receivers:
  azureeventhub:
    connection: Endpoint=<your-endpoint>;SharedAccessKeyName=<your-key>;SharedAccessKey=<your-key>;EntityPath=hubName
    partition: my-partition
    group: my-consumer-group

Cloudflare Receiver 

The Cloudflare Receiver accepts logs from Cloudflare’s Logpush jobs.

For example, here is a configuration that reads logs from a specific LogPush Job:

receivers:
  cloudflare:
    logs:
      tls:
        key_file: some_key_file
        cert_file: some_cert_file
      endpoint: <your-endpoint>
      secret: <your-secret>
      timestamp_field: EdgeStartTimestamp
      attributes:
        ClientIP: http_request.client_ip
        ClientRequestURI: http_request.uri

Google Pubsub Receiver 

The Google Pubsub Receiver reads logs from a Google Pubsub subscription.

For example, here is a configuration that reads raw text logs and wraps them into an OpenTelemetry Log:

receivers:
  googlecloudpubsub:
    project: otel-project
    subscription: projects/otel-project/subscriptions/otlp-logs
    encoding: raw_text

Journald Receiver 

The Journald Receiver parses journald events from systemd.

For example, here is a configuration that reads all logs from Journald from some specific units:

receivers:
  journald:
    directory: /run/log/journal
    units:
      - ssh
      - kubelet
      - docker
      - containerd
    priority: info

Kubernetes Receivers 

The OpenTelemetry Collector has several receivers that can be used to collect logs from Kubernetes. To learn more, visit Kubernetes Log Collection and Kubernetes Event Collection.
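As a minimal sketch of what those guides describe, container logs can often be tailed with the Filelog Receiver from the standard kubelet log path. This assumes a recent Collector Contrib release that includes the container parser operator:

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      # Parses Docker, containerd, and CRI-O log formats automatically
      - type: container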

Kafka Receiver 

The Kafka Receiver reads logs, metrics, and traces from Kafka.

For example, here is a configuration that reads all Kafka data:

receivers:
  kafka:
    protocol_version: 2.0.0

Loki Receiver 

The Loki Receiver allows Promtail instances to send logs to the OpenTelemetry Collector.

For example, here is a configuration that reads all logs from an endpoint:

receivers:
  loki:
    protocols:
      http:
        endpoint: 0.0.0.0:3500
      grpc:
        endpoint: 0.0.0.0:3600
    use_incoming_timestamp: true

MongoDB Atlas Receiver 

The MongoDB Atlas Receiver reads logs from MongoDB Atlas.

For example, here is a configuration that reads all logs from a specific project:

receivers:
  mongodbatlas:
    logs:
      enabled: true
      projects: 
        - name: "project 1"
          collect_audit_logs: true
          collect_host_logs: true

OTLPJson File Receiver 

The OTLPJson File Receiver reads any existing OTLP Logs, Metrics, or Traces from a file on a server.

For example, here is a configuration that reads from a specific directory and excludes a specific file:

receivers:
  otlpjsonfile:
    include:
      - "/var/log/*.log"
    exclude:
      - "/var/log/example.log"

Apache Pulsar Receiver 

The Pulsar Receiver collects logs, metrics, and traces from Apache Pulsar.

For example, here is a configuration that reads data from a Pulsar cluster:

receivers:
  pulsar:
    endpoint: pulsar://localhost:6650
    topic: otlp-spans
    subscription: otlp_spans_sub
    consumer_name: otlp_spans_sub_1
    encoding: otlp_proto
    auth:
      tls:
        cert_file: cert.pem
        key_file: key.pem
    tls_allow_insecure_connection: false
    tls_trust_certs_file_path: ca.pem

SignalFx Receiver 

The SignalFx Receiver reads logs from a SignalFx endpoint.

For example, here is a configuration that reads data from a SignalFx endpoint:

receivers:
  signalfx:
    endpoint: 0.0.0.0:9943
  signalfx/advanced:
    endpoint: 0.0.0.0:9943
    access_token_passthrough: true
    tls:
      cert_file: /test.crt
      key_file: /test.key

Syslog Receiver 

The Syslog Receiver parses syslog messages received over UDP or TCP.

For example, here is a configuration that reads Syslogs from TCP:

receivers:
  syslog:
    tcp:
      listen_address: "0.0.0.0:54526"
    protocol: rfc5424

TCP Receiver 

The TCP Receiver receives logs over TCP.

For example, here is a configuration that reads logs from TCP over a particular address:

receivers:
  tcplog:
    listen_address: "0.0.0.0:54525"

UDP Receiver 

The UDP Receiver receives logs over UDP.

For example, here is a configuration that reads logs from UDP over a particular address:

receivers:
  udplog:
    listen_address: "0.0.0.0:54525"

Webhook Event Receiver 

The Webhook Event Receiver allows any webhook-style data source to send logs to the OpenTelemetry Collector.

For example, here is a configuration that reads logs from a webhook:

receivers:
    webhookevent:
        endpoint: localhost:8088
        read_timeout: "500ms"
        path: "eventsource/receiver"
        health_path: "eventreceiver/healthcheck"
        required_header:
            key: "required-header-key"
            value: "required-header-value"

Windows Log Event Receiver 

The Windows Log Event Receiver tails and parses logs from the Windows event log API.

For example, here is a configuration that reads logs from a named channel:

receivers:
    windowseventlog:
        channel: application