Query results showing no data or incomplete information? Sampling may have filtered out the events you need, or the data may have expired from your retention period. While sampling and retention limits help control costs, they mean some data isn’t available in Honeycomb. Archive rehydration solves this by retrieving your full, unsampled dataset from S3 storage on demand, so you can investigate without the gaps.
If you have configured an Amazon S3 bucket as an archive for OpenTelemetry trace and log data, you can rehydrate that data and query it in Honeycomb. This is useful for investigating data that was sampled out or data that has expired from your standard retention period.
Time ranges and indexed fields let you filter and retrieve only the part of your archived data needed for your investigation, resulting in faster rehydration and lower costs.
Indexed fields are attributes configured when you set up your S3 exporter.
When you rehydrate data from your archive, Honeycomb:
2024-01-15 10:00 and 2024-01-15 11:00 where app.customer.id=12345.
Rehydrated data persists for your standard retention period from the time of ingestion. During this time, you can query it as many times as needed without rehydrating again.
When you request a rehydration, Honeycomb checks which data has already been ingested. If some of the requested data already exists in your environment, only the missing data is ingested. Your queries then run against all the rehydrated data in your environment.
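As a rough illustration of "only the missing data is ingested", the sketch below subtracts already-ingested time ranges from a requested range. It is a simplified model that considers time ranges only, not indexed fields, and the function name `missing_ranges` is hypothetical; it is not Honeycomb's actual implementation.

```python
from datetime import datetime


def missing_ranges(requested, ingested):
    """Return the parts of `requested` not covered by any range in `ingested`.

    Each range is a (start, end) tuple of datetimes, treated as half-open.
    This is an illustrative model only, not Honeycomb's real logic.
    """
    start, end = requested
    gaps = []
    cursor = start  # earliest point in the request not yet known to be covered
    for seg_start, seg_end in sorted(ingested):
        if seg_end <= cursor:
            continue  # segment ends before the uncovered region begins
        if seg_start >= end:
            break  # segment starts after the request ends
        if seg_start > cursor:
            gaps.append((cursor, seg_start))
        cursor = max(cursor, seg_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


requested = (datetime(2024, 1, 15, 10, 0), datetime(2024, 1, 15, 11, 0))
already_ingested = [(datetime(2024, 1, 15, 10, 15), datetime(2024, 1, 15, 10, 45))]

for gap_start, gap_end in missing_ranges(requested, already_ingested):
    print(gap_start.isoformat(), "to", gap_end.isoformat())
```

In this example, only 10:00 to 10:15 and 10:45 to 11:00 would still need to be ingested, while queries would run against the full hour.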
You can enhance an existing query by pulling in relevant archived data that matches your time range and indexed fields.
To enhance a query with archived data from your S3 bucket:
Run a query in the Query Builder and receive your query results.
From the Query Results, select Enhance from Archive.
In the Enhance from Archive modal, define the scope of your rehydration:
| Field | Description |
|---|---|
| Start time | Start of the event time range to rehydrate. Automatically populated with the start time of your query range. |
| End time | End of the event time range to rehydrate. Automatically populated with the end time of your query range. |
| Index | Indexed field to filter by. Automatically populated when the field is included in your query. Choose a field with high cardinality for more precise filtering. |
| Values | Value(s) to filter by. Automatically populated when the field is included in your query. Use specific values to minimize the number of events ingested during rehydration. |
Review your usage estimate to confirm:
Recalculate as often as needed.
Select Rehydrate data to begin ingesting the archived data that matches your query and chosen rehydration scope. You will be redirected to the History page in your Amazon S3 Environment.
When ingestion completes, Honeycomb automatically re-runs your query using the rehydrated data. A notification appears with a link to your query results.
Select the link in the notification to view your query with the rehydrated data.
When you rehydrate data using a filtered index (such as customer.id), your trace waterfall may show gaps where spans are missing.
This happens because not every span in a trace contains the indexed field you filtered by.
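To see why the gaps appear, consider this minimal sketch. The span names and IDs are made up for illustration, and the `rehydrate` helper is a hypothetical stand-in for an index filter, not a Honeycomb API.

```python
# Hypothetical archived spans. Only the root span carries customer.id.
archived_spans = [
    {"trace.trace_id": "abc123", "name": "GET /checkout", "customer.id": "12345"},
    {"trace.trace_id": "abc123", "name": "auth.verify"},
    {"trace.trace_id": "abc123", "name": "db.query"},
    {"trace.trace_id": "def456", "name": "GET /health"},
]


def rehydrate(archive, field, value):
    """Return only the archived spans whose `field` equals `value`."""
    return [span for span in archive if span.get(field) == value]


# Filtering on customer.id returns one span, so the waterfall has gaps.
by_customer = rehydrate(archived_spans, "customer.id", "12345")

# Filtering on the trace ID returns every span in that trace.
by_trace = rehydrate(archived_spans, "trace.trace_id", "abc123")

print(len(by_customer), len(by_trace))  # prints "1 3"
```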
To retrieve the missing spans, rehydrate again using the trace ID:
From the Trace Waterfall view, select Enhance again.
To reach the Trace Waterfall:
From your query results, select a data point on the graph and choose View Trace from the context menu.
In the modal, review the automatically populated scope of your rehydration:
| Field | Description |
|---|---|
| Start time | Start of the event time range to rehydrate. Automatically set to two hours before the trace timestamp to capture all related spans. |
| End time | End of the event time range to rehydrate. Automatically set to two hours after the trace timestamp to capture all related spans. |
| Index | Indexed field to filter by. Automatically set to your trace ID field to retrieve all spans for this trace. |
| Values | Value(s) to filter by. Automatically populated with the trace ID to ensure all spans are retrieved. |
Review your usage estimate to confirm:
Select Rehydrate data to ingest the missing spans. You will be redirected to the History page in your Amazon S3 Environment.
When ingestion completes, Honeycomb automatically re-runs your query using the rehydrated data. A notification appears with a link to your results. Your trace waterfall with all spans is now available for investigation.
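The scope that the modal pre-populates for a trace can be sketched as follows. The dictionary keys mirror the modal's fields and are illustrative only, the function name is hypothetical, and the example assumes the trace ID field is named `trace.trace_id`.

```python
from datetime import datetime, timedelta, timezone

# The modal sets the range to two hours on either side of the trace timestamp.
WINDOW = timedelta(hours=2)


def trace_rehydration_scope(trace_timestamp, trace_id, index_field="trace.trace_id"):
    """Build an illustrative scope for rehydrating every span in one trace."""
    return {
        "start_time": trace_timestamp - WINDOW,
        "end_time": trace_timestamp + WINDOW,
        "index": index_field,
        "values": [trace_id],
    }


scope = trace_rehydration_scope(
    datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
    "abc123",  # made-up trace ID
)
print(scope["start_time"].isoformat(), "to", scope["end_time"].isoformat())
```

The four-hour window is wide on purpose: spans belonging to a long-running trace may carry timestamps well before or after the one you selected.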
You can explore archived data independently from your live telemetry by rehydrating and querying it in your dedicated archive environment.
To query only archived data from your S3 bucket:
Select Manage Data from the navigation menu, and choose Environments.
Select your S3 Archive Environment.
Define the scope of your rehydration:
| Field | Description |
|---|---|
| Start time | Start of the event time range to rehydrate. |
| End time | End of the event time range to rehydrate. |
| Index | Indexed field to filter by. Choose a field with high cardinality for more precise filtering. |
| Values | Value(s) to filter by. Use specific values to minimize the number of events ingested during rehydration. |
Review your usage estimate and adjust your criteria to optimize cost and event count.
Select Rehydrate data.
After rehydration completes, select New Query to begin querying your ingested archived data.
Honeycomb pre-populates the query with your chosen rehydration scope as filters. Add fields, filters, or visualizations to further refine your query.
You can review all rehydration requests for your team in your archive environment:
Select Manage Data from the navigation menu, and choose Environments.
Select your Amazon S3 Environment.
Select History from the navigation menu.
For each rehydration request, the history shows:
Choose high-cardinality indexed fields like user.id or trace.trace_id rather than low-cardinality fields like environment to reduce the number of events ingested.
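The effect of cardinality on the number of events ingested can be illustrated with a small sketch. The events and values here are made up, and `count_matches` is a hypothetical helper, not part of any Honeycomb tooling.

```python
# Made-up archived events: environment has two distinct values,
# user.id has four, so user.id is the higher-cardinality field.
events = [
    {"environment": "production", "user.id": "u-1001"},
    {"environment": "production", "user.id": "u-1002"},
    {"environment": "production", "user.id": "u-1003"},
    {"environment": "production", "user.id": "u-1001"},
    {"environment": "staging", "user.id": "u-1004"},
]


def count_matches(events, field, value):
    """Count the events that an index filter on `field` == `value` would select."""
    return sum(1 for event in events if event.get(field) == value)


broad = count_matches(events, "environment", "production")
narrow = count_matches(events, "user.id", "u-1001")
print(broad, narrow)  # prints "4 2"
```

Filtering on the low-cardinality field selects most of the archive, while the high-cardinality field selects only the events tied to one user.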