Honeycomb supports rehydrating archived, unsampled OpenTelemetry (OTel) trace and log data stored in an Amazon S3 bucket.
In this guide, you will learn how to configure an OpenTelemetry Collector to export unsampled trace and log data to an Amazon S3 bucket using Honeycomb’s Enhance Indexing S3 Exporter. This exporter extends the standard OpenTelemetry AWS S3 Exporter by adding field-based indexing, so you can rehydrate data more quickly and cost-effectively.
When you send data to your S3 archive, the exporter automatically indexes these fields:
- trace.trace_id: Unique identifier of the trace
- service.name: Name of the service
- session.id: Unique identifier of the session

You can also index additional fields that you frequently query, such as user.id, customer.id, or environment.
When you run a query that requires rehydration, you can use these indexes to locate and retrieve only the relevant subset of your archived data, which makes rehydration faster and more cost-effective.

The exporter requires configuration in several areas: authentication, S3 upload settings, data encoding, batching and timeouts, retries, and indexed fields.
The exporter uses a Honeycomb Management API key with the enhance:write scope for authentication and usage tracking.
Configure these fields in your exporter configuration:
- api_key: Your Honeycomb Management API key. Must have the enhance:write scope.
- api_secret: Your Honeycomb Management API secret.
- api_endpoint: URL of the Honeycomb API endpoint. Options:
  - https://api.honeycomb.io
  - https://api.eu1.honeycomb.io
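For example, the authentication fields can be set from environment variables, as the full examples later in this guide do:

exporters:
  enhance_indexing_s3_exporter:
    api_key: ${env:HONEYCOMB_MANAGEMENT_API_KEY}
    api_secret: ${env:HONEYCOMB_MANAGEMENT_API_SECRET}
    api_endpoint: https://api.honeycomb.io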
Set these attributes in the s3uploader section:

- s3_bucket: Amazon S3 bucket in which to store the data. Not required when endpoint is set.
- region: AWS region for your bucket. Default: us-east-1.
- s3_prefix: (Optional) Directory-like prefix for organizing files in your S3 bucket. Example: traces-logs-directory.
- s3_partition_format: (Optional) Partition format to use when writing files to S3. Default: "year=%Y/month=%m/day=%d/hour=%H/minute=%M" (minute-level resolution).
- compression: Compression algorithm for S3 files. Options: gzip, none.
- retry_mode: (Optional) Retry behavior for failed requests. Default: standard. Options: standard (fixed intervals), adaptive (adjusts based on server response).
- retry_max_attempts: (Optional) Maximum number of times to retry a failed request. Default: 3.
- endpoint: (Optional) Custom S3 endpoint for S3-compatible services, such as MinIO or on-premises object storage. Overrides the default AWS S3 endpoint. Example: http://localhost:9000.
- s3_force_path_style: (Optional) Use path-style rather than virtual-hosted-style URLs. Required for MinIO and some S3-compatible services. Default: false.
- disable_ssl: (Optional) Allow unencrypted HTTP connections. Use only for local development. Default: false.

The file_prefix attribute is not supported and will cause validation to fail.
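As an illustration, an s3uploader block that enables adaptive retries might look like the following sketch (the bucket name and prefix are placeholders):

exporters:
  enhance_indexing_s3_exporter:
    s3uploader:
      region: 'us-east-1'
      s3_bucket: 'my-telemetry-archive'   # placeholder bucket name
      s3_prefix: 'telemetry-data'
      compression: 'gzip'
      retry_mode: 'adaptive'
      retry_max_attempts: 5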
Choose how telemetry is encoded before being written to S3:

- marshaler: Format for encoding telemetry data. Default: otlp_protobuf. Options: otlp_protobuf (smaller files, better performance), otlp_json (human-readable output).
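For example, to get human-readable objects while testing (a sketch; expect larger files and slower performance than with the default otlp_protobuf):

exporters:
  enhance_indexing_s3_exporter:
    marshaler: 'otlp_json'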
Batch data before sending to S3 by configuring the sending_queue section:

- flush_timeout: Maximum time before a batch is sent to S3, even if it is not full. Must be a non-zero value.
- min_size: Minimum size of a batch, measured in units defined by sizer. Default: 50000.
- max_size: Maximum size of a batch. Enables batch splitting. Must be greater than or equal to min_size. Measured in units defined by sizer. Default: 50000.
- queue_size: Maximum number of units the queue can accept, measured in units defined by sizer. Default: 500000.
- sizer: Unit used to measure the queue and batch size. Default: items. Options:
  - requests: Number of incoming batches of traces and logs (the most performant option).
  - items: Number of the smallest parts of each signal (spans, log records).
  - bytes: Size of serialized data in bytes (the least performant option).
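Putting these options together, a sending_queue block sized in items might look like this sketch (the values mirror the defaults and the complete example later in this guide):

exporters:
  enhance_indexing_s3_exporter:
    sending_queue:
      enabled: true
      sizer: items
      queue_size: 500000
      batch:
        flush_timeout: 30s
        min_size: 50000
        max_size: 50000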
You can also set a custom timeout for the exporter:

- timeout: Maximum time to wait for each S3 send attempt. Default: 30s.
Control how the exporter retries failed send attempts by configuring the retry_on_failure section:

- enabled: Allow the exporter to retry failed sends.
- initial_interval: Amount of time to delay before the first retry. Example: 5s.
- max_interval: Maximum amount of time to delay. Retries use exponential backoff, so each retry waits longer than the previous one. Example: 30s.
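A sketch combining the timeout and retry settings described above (using the documented default and example values):

exporters:
  enhance_indexing_s3_exporter:
    timeout: 30s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s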
Add indexed fields beyond the built-in ones (trace.trace_id, service.name, session.id) using the indexed_fields section. Choose fields that are frequently used in your queries or that help narrow down your investigations. High-cardinality fields are especially useful because they make rehydration more selective. Examples:

- user.id
- customer.id
- environment
- deployment.version

indexed_fields:
  - "user.id"
  - "customer.id"
  - "environment"
These examples show common configurations, so you can choose the setup that best fits your needs.
A basic example of the Enhance Indexing S3 Exporter configuration:
exporters:
  enhance_indexing_s3_exporter:
    # Honeycomb API credentials
    api_key: ${env:HONEYCOMB_MANAGEMENT_API_KEY}
    api_secret: ${env:HONEYCOMB_MANAGEMENT_API_SECRET}
    api_endpoint: https://api.honeycomb.io
    # S3 configuration
    s3uploader:
      region: 'us-east-1'
      s3_bucket: 'my-test-bucket'
      s3_partition_format: "year=%Y/month=%m/day=%d/hour=%H/minute=%M"
      compression: 'gzip'
    # Data format
    marshaler: 'otlp_protobuf'
Honeycomb supports processing both trace and log OpenTelemetry signal types for data archival and rehydration. Set up a separate pipeline configuration block for each signal type.
In this example, the logs and traces pipelines intended for object storage (storage in Amazon S3) are labeled with an /objectstorage suffix:
service:
  pipelines:
    traces:
      # [...]
    logs:
      # [...]
    logs/objectstorage:
      receivers:
        - otlp
      exporters:
        - enhance_indexing_s3_exporter
    traces/objectstorage:
      receivers:
        - otlp
      exporters:
        - enhance_indexing_s3_exporter
The example below shows a simple but complete OpenTelemetry (OTel) Collector configuration for exporting both log and trace OTel signal types through the Enhance Indexing S3 Exporter. This configuration:

- Stores data in the S3 bucket my-test-bucket under the prefix telemetry-data.
- Indexes the custom fields user.id, customer.id, environment, and deployment.version.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  enhance_indexing_s3_exporter:
    # Honeycomb API credentials
    api_key: ${env:HONEYCOMB_MANAGEMENT_API_KEY}
    api_secret: ${env:HONEYCOMB_MANAGEMENT_API_SECRET}
    api_endpoint: https://api.honeycomb.io
    # S3 configuration
    s3uploader:
      region: 'us-east-1'
      s3_bucket: 'my-test-bucket'
      s3_prefix: 'telemetry-data'
      s3_partition_format: "year=%Y/month=%m/day=%d/hour=%H/minute=%M"
      compression: 'gzip'
      retry_mode: 'adaptive'
      retry_max_attempts: 5
    # Data format
    marshaler: 'otlp_protobuf'
    # Custom indexed fields
    indexed_fields:
      - "user.id"
      - "customer.id"
      - "environment"
      - "deployment.version"
    # Batching, timeout, and retry configuration
    sending_queue:
      batch:
        flush_timeout: 30s
        max_size: 50000
        min_size: 50000
      enabled: true
      queue_size: 500000
      sizer: items
    timeout: 30s

# Pipeline configuration
service:
  pipelines:
    logs:
      receivers:
        - otlp
      exporters:
        - enhance_indexing_s3_exporter
    traces:
      receivers:
        - otlp
      exporters:
        - enhance_indexing_s3_exporter
For local development and testing, you can use MinIO as an S3-compatible object storage service:
exporters:
  enhance_indexing_s3_exporter:
    # Honeycomb API credentials (required even for local development)
    api_key: ${env:HONEYCOMB_MANAGEMENT_API_KEY}
    api_secret: ${env:HONEYCOMB_MANAGEMENT_API_SECRET}
    api_endpoint: https://api.honeycomb.io
    # MinIO configuration
    s3uploader:
      region: 'us-east-1'
      endpoint: 'http://localhost:9000'
      s3_bucket: 'telemetry-bucket'
      s3_force_path_style: true
      disable_ssl: true
      s3_partition_format: "year=%Y/month=%m/day=%d/hour=%H/minute=%M"
      compression: 'gzip'
    # Data format
    marshaler: 'otlp_json'
    # Custom indexed fields
    indexed_fields:
      - "user.id"
      - "customer.id"