This guide details how to send logs to Honeycomb using the OpenTelemetry Collector.
The OpenTelemetry Collector supports a wide variety of log sources and formats, and can be used as a drop-in replacement for many logging agents. It works by configuring a logs source, structuring all collected logs, optionally transforming the logs to add, remove, or update any fields, and then sending the logs to Honeycomb.
The OpenTelemetry Collector translates any log it collects into the OpenTelemetry Log format, which is a structured log that wraps the bodies of existing logs and optionally correlates them with traces. Having logs in the OpenTelemetry Logs format enables you to centrally process log data along with traces and metrics within the OpenTelemetry Collector.
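For illustration, a hypothetical log record rendered as OTLP/JSON might look like the following, with the original log text preserved in the body and optional trace correlation fields alongside it (all values here are made up):

{
  "timeUnixNano": "1710881391998000000",
  "severityText": "INFO",
  "severityNumber": 9,
  "body": { "stringValue": "Listening on http://localhost:8000/name" },
  "attributes": [
    { "key": "service.name", "value": { "stringValue": "name-service" } }
  ],
  "traceId": "5b8efff798038103d269b633813fc60c",
  "spanId": "eee19b7ec3c1b174"
}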
The OpenTelemetry Collector supports a large number of receivers that can be used to collect logs from a variety of sources.
receivers:
  # Add your receiver here
  # ...
processors:
  batch:
  # Add any additional processors here
  # ...
exporters:
  otlp/logs:
    endpoint: "api.honeycomb.io:443"
    headers:
      "x-honeycomb-team": "YOUR_API_KEY"
      "x-honeycomb-dataset": "YOUR_LOGS_DATASET_NAME"
service:
  pipelines:
    logs:
      receivers: [receiver1, receiver2, etc]
      processors: [batch]
      exporters: [otlp/logs]
If a log sent to Honeycomb has the resource attribute service.name defined, then the dataset for those logs will be the name of the service, and the x-honeycomb-dataset header will not be used.
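If your logs do not already carry a service.name resource attribute, one way to set it is with the resource processor. This is a minimal sketch, where my-service is a placeholder value and resource/logs must also be added to the processors list of the logs pipeline:

processors:
  resource/logs:
    attributes:
      - key: service.name
        value: my-service
        action: upsert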
The Filelog Receiver supports reading and parsing any arbitrary log written to a file on a server.
The Filelog Receiver is the most flexible receiver, but depending on the shape of your logs, it may require additional configuration to parse your logs correctly.
For example, here is a configuration that reads an NGINX access log and parses it into a structured log:
receivers:
  filelog:
    include: ["/var/log/nginx/access.log"]
    operators:
      - type: "regex_parser"
        regex: "(?P<remote>[^ ]*) - - \\[(?P<time>[^\\]]*)\\] \"(?P<method>\\S+)(?: +(?P<path>[^ ]*) +\\S*)?\" (?P<status>\\d+) (?P<size>\\d+) \"(?P<referer>[^\"]*)\" \"(?P<agent>[^\"]*)\""
        timestamp:
          parse_from: attributes.time
          layout: "%d/%b/%Y:%H:%M:%S %z"
Here’s a configuration that reads a JSON log:
receivers:
  filelog:
    include: [ /var/log/myservice/*.json ]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'
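With that configuration, a JSON log line shaped like the following hypothetical example would have each field parsed into an attribute, with time promoted to the log's timestamp:

{"time": "2024-03-19 14:49:51", "level": "info", "message": "request handled"}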
Sometimes, logs contain mixed structured and unstructured information, such as an informational log that contains a JSON object inside of it. To parse these, you need to parse each piece of the log into a specific element, and also know how to handle the structured information.
For example, let’s say you have a log at /var/log/name-service/0.log that mixes text and JSON:
2024-03-19T14:49:51.998-0600 info name-service/main.go:212 Listening on http://localhost:8000/name {"name": "name-service", "function": "main", "featureflag.allow-future": true, "name-format": "lowercase"}
You require a filelogreceiver configuration that does the following tasks:
- Reads log files matching /var/log/name-service/*.log
- Parses the timestamp, severity, source file, message, and trailing JSON out of each line
- Parses the embedded JSON string into structured attributes
- Extracts the service name from the file path and populates the service.name resource with the parsed service name

The following filelogreceiver configuration completes the required tasks:
receivers:
  filelog:
    include:
      - /var/log/name-service/*.log
    include_file_path: true
    operators:
      # Parse the text. Each capture group becomes an attribute on the log
      # and the original full string becomes the body.
      - type: regex_parser
        regex: (?P<timestamp>^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}-\d{4}) (?P<severity>\w+) (?P<filename>[\w\/\.\:\-]+) (?P<message>[^\{]*) (?P<details>.*)
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%f%z"
          parse_from: attributes.timestamp
        severity:
          parse_from: attributes.severity
      # The "details" attribute is a JSON string. This operator parses it into
      # a map that Honeycomb will flatten on ingest.
      - type: json_parser
        parse_from: attributes.details
        parse_to: attributes.details
      # Extract the service name, name-service, from the file path.
      - type: regex_parser
        regex: \/var\/log\/(?P<servicename>.*)\/\d+\.log
        parse_from: attributes["log.file.path"]
      # Move the extracted attribute to the service.name resource attribute.
      - type: move
        from: attributes.servicename
        to: resource["service.name"]
The configuration produces a structured log that, when exported to Honeycomb, contains the following fields:
{
  "body": "2024-03-19T14:49:51.998-0600 info name-service/main.go:212 Listening on http://localhost:8000/name {\"name\": \"name-service\", \"function\": \"main\", \"featureflag.allow-future\": true, \"name-format\": \"lowercase\"}",
  "severity": "info",
  "severity_code": 9,
  "service.name": "name-service",
  "message": "Listening on http://localhost:8000/name",
  "timestamp": 1710881391998,
  "details.featureflag.allow-future": true,
  "details.function": "main",
  "details.name": "name-service",
  "details.name-format": "lowercase",
  "log.file.name": "0.log",
  "log.file.path": "/var/log/name-service/0.log",
  "flags": 0
}
Use the Explore Data tab in Query Results to view the structured log in Honeycomb.
The OpenTelemetry Collector lets you configure many different receivers that collect logs from a specific source. These are the most popular ones.
The AWS CloudWatch Receiver supports autodiscovery of log groups and log streams in AWS CloudWatch, with optional filtering of those sources.
For example, here is a configuration that autodiscovers only EKS logs from us-west-1:
receivers:
  awscloudwatch:
    region: us-west-1
    logs:
      poll_interval: 1m
      groups:
        autodiscover:
          limit: 100
          prefix: /aws/eks/
The Fluent Forward Receiver runs a TCP server that accepts logs via the Fluent Forward protocol, which enables collecting logs from Fluentbit and Fluentd.
For example, here is a configuration that reads all logs on port 8006:
receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006
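On the sending side, a Fluent Bit instance could forward its logs to that port with an output section like this sketch, where my-collector-host is a placeholder for wherever your Collector runs:

[OUTPUT]
    Name   forward
    Match  *
    Host   my-collector-host
    Port   8006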
The Splunk HEC Receiver accepts events in the Splunk HEC format.
For example, here is a configuration that reads JSON HEC events and raw log data:
receivers:
  splunk_hec:
    endpoint: 0.0.0.0:8088
  splunk_hec/advanced:
    endpoint: 0.0.0.0:8088
    access_token_passthrough: true
    tls:
      cert_file: /test.crt
      key_file: /test.key
    raw_path: "/raw"
    hec_metadata_to_otel_attrs:
      source: "mysource"
      sourcetype: "mysourcetype"
      index: "myindex"
      host: "myhost"
The following sources are less commonly used, but still supported with the OpenTelemetry Collector.
The Azure Blob Receiver reads logs and trace data from Azure Blob Storage.
For example, here is a configuration that reads logs from a specific container in Azure Blob Storage:
receivers:
  azureblob:
    connection_string: DefaultEndpointsProtocol=https;AccountName=accountName;AccountKey=<your-key>;EndpointSuffix=core.windows.net
    event_hub:
      endpoint: Endpoint=sb://oteldata.servicebus.windows.net/;SharedAccessKeyName=otelhubbpollicy;SharedAccessKey=<access-key>;EntityPath=otellhub
    logs:
      container_name: name-of-container
The Azure Event Hub Receiver pulls logs from an Azure Event Hub and transforms them.
For example, here is a configuration that reads logs from a specific partition and consumer group, then structures them as JSON logs:
receivers:
  azureeventhub:
    connection: Endpoint=<your-endpoint>;SharedAccessKeyName=<your-key>;SharedAccessKey=<your-key>;EntityPath=hubName
    partition: my-partition
    group: my-consumer-group
The Cloudflare Receiver accepts logs from Cloudflare’s Logpush jobs.
For example, here is a configuration that reads logs from a specific LogPush Job:
receivers:
  cloudflare:
    logs:
      tls:
        key_file: some_key_file
        cert_file: some_cert_file
      endpoint: <your-endpoint>
      secret: <your-secret>
      timestamp_field: EdgeStartTimestamp
      attributes:
        ClientIP: http_request.client_ip
        ClientRequestURI: http_request.uri
The Google Pubsub Receiver reads logs from a Google Pubsub subscription.
For example, here is a configuration that reads raw text logs and wraps them into an OpenTelemetry Log:
receivers:
  googlecloudpubsub:
    project: otel-project
    subscription: projects/otel-project/subscriptions/otlp-logs
    encoding: raw_text
The Journald Receiver parses journald events from systemd.
For example, here is a configuration that reads all logs from Journald from some specific units:
receivers:
  journald:
    directory: /run/log/journal
    units:
      - ssh
      - kubelet
      - docker
      - containerd
    priority: info
The OpenTelemetry Collector has several receivers that can be used to collect logs from Kubernetes. To learn more, visit Kubernetes Log Collection and Kubernetes Event Collection.
The Kafka Receiver reads logs, metrics, and traces from Kafka.
For example, here is a configuration that reads all Kafka data:
receivers:
  kafka:
    protocol_version: 2.0.0
The Loki Receiver allows Promtail instances to send logs to the OpenTelemetry Collector.
For example, here is a configuration that reads all logs from an endpoint:
receivers:
  loki:
    protocols:
      http:
        endpoint: 0.0.0.0:3500
      grpc:
        endpoint: 0.0.0.0:3600
    use_incoming_timestamp: true
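A Promtail instance can then be pointed at the receiver's HTTP port using the Loki push API path; in this sketch, my-collector-host is a placeholder:

clients:
  - url: http://my-collector-host:3500/loki/api/v1/push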
The MongoDB Atlas Receiver reads logs from MongoDB Atlas.
For example, here is a configuration that reads all logs from a specific project:
receivers:
  mongodbatlas:
    logs:
      enabled: true
      projects:
        - name: "project 1"
          collect_audit_logs: true
          collect_host_logs: true
The OTLPJson File Receiver reads any existing OTLP Logs, Metrics, or Traces from a file on a server.
For example, here is a configuration that reads from a specific directory and excludes a specific file:
receivers:
  otlpjsonfile:
    include:
      - "/var/log/*.log"
    exclude:
      - "/var/log/example.log"
The Pulsar Receiver collects logs, metrics, and traces from Apache Pulsar.
For example, here is a configuration that reads data from a Pulsar cluster:
receivers:
  pulsar:
    endpoint: pulsar://localhost:6650
    topic: otlp-spans
    subscription: otlp_spans_sub
    consumer_name: otlp_spans_sub_1
    encoding: otlp_proto
    auth:
      tls:
        cert_file: cert.pem
        key_file: key.pem
    tls_allow_insecure_connection: false
    tls_trust_certs_file_path: ca.pem
The SignalFx Receiver reads logs from a SignalFx endpoint.
For example, here is a configuration that reads data from a SignalFx endpoint:
receivers:
  signalfx:
    endpoint: 0.0.0.0:9943
  signalfx/advanced:
    endpoint: 0.0.0.0:9943
    access_token_passthrough: true
    tls:
      cert_file: /test.crt
      key_file: /test.key
The Syslog Receiver parses Syslogs received over UDP or TCP.
For example, here is a configuration that reads Syslogs from TCP:
receivers:
  syslog:
    tcp:
      listen_address: "0.0.0.0:54526"
    protocol: rfc5424
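To verify the receiver, you can send a hand-written RFC 5424 message over TCP with netcat; every value in this line is illustrative:

echo '<165>1 2024-03-19T14:49:51.998Z myhost myapp 1234 ID47 - Listening on port 8000' | nc 127.0.0.1 54526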
The TCP Receiver receives logs over TCP.
For example, here is a configuration that reads logs from TCP over a particular address:
receivers:
  tcplog:
    listen_address: "0.0.0.0:54525"
The UDP Receiver receives logs over UDP.
For example, here is a configuration that reads logs from UDP over a particular address:
receivers:
  udplog:
    listen_address: "0.0.0.0:54525"
The Webhook Event Receiver allows for any webhook-style data source to send logs to the OpenTelemetry Collector.
For example, here is a configuration that reads logs from a webhook:
receivers:
  webhookevent:
    endpoint: localhost:8088
    read_timeout: "500ms"
    path: "eventsource/receiver"
    health_path: "eventreceiver/healthcheck"
    required_header:
      key: "required-header-key"
      value: "required-header-value"
The Windows Log Event Receiver tails and parses logs from the Windows event log API.
For example, here is a configuration that reads logs from a named channel:
receivers:
  windowseventlog:
    channel: application