Query Builder allows you to construct queries against your data to produce results for investigation and further exploration. With Query Builder, enter field names, operators, and/or field values in clauses to construct or modify a query.
Use Query Builder to explore your data.
In the left navigation menu, select Query with its bar chart and magnifying glass icon. When the left navigation menu is compact, only the icon appears.
Below is an example of a blank Query Builder.
A query in Honeycomb consists of up to six clauses:
HEATMAP
visualization shows the distribution of data over timeYou can use relational fields in the WHERE and GROUP BY clauses to narrow your query to find traces that contain fields and values within a specific span type. You can identify span types by their prefix.
Relational fields are available for use only within the Query Builder in the Honeycomb UI and only within the WHERE and GROUP BY clauses.
Using relational fields automatically applies AND operators to the containing clause; all conditions within the clause must be met.
Relational field span prefixes follow your trace structure and include:
root.
- Identifies the
within a trace. To find a match, any additional filters in your query will search through fields only in the specified root.
span.
parent.
- Identifies a direct
within a trace. To find a match, any additional filters in your query will search through fields only in the specified parent.
span and its immediate
.
any.
- Identifies any span within a trace matching a particular field. This prefix allows you to find multiple spans with a certain field and/or value rather than limit to a single span result. Any additional filters will search for spans that match the specified any.
span.
Using more than one any.
term will match many fields on a single span.
Using any.
in a GROUP BY clause requires specifying a field.
To use Query Builder, enter terms. For examples to try, refer to our Query Examples. As you type, Honeycomb autocomplete prompts with relational field prefixes and selectable query components. Execute the query by selecting Run Query or pressing Shift + Enter on the keyboard.
?
while on the Query Builder page.When you apply relational fields to your query, you can extend queries to use qualifying relationships that indicate where and how spans appear in a trace.
For example, if you managed an e-commerce platform and were trying to determine how many users received an error when they applied a discount code during checkout, you would be interested in finding any applicable spans in your trace.
This example query uses both the root.
and any.
prefixes to find the count of all spans where an error exists, where any span in the same trace is named getDiscounts
and where its root span is named /cart/checkout
.
CLAUSE | VALUE | DESCRIPTION |
---|---|---|
VISUALIZE | COUNT | Calculates count. |
WHERE | error exists AND any.name = getDiscounts AND root.name = /cart/checkout |
Applies conditions to narrow your query:
|
GROUP BY | error |
Groups by the error. |
After querying, you might see the following trace summary when you select a span from the query results:
To learn more about how to use relational fields with specific span types, visit Query Reference: Relational Fields.
The default output for most queries will include:
The precise composition depends on what your query contains.
Click on any box in the Query Builder to edit the clauses there.
Selecting the X removes the field from the clause.
In the image below, the user already entered VISUALIZE COUNT
and GROUP BY hostname
.
When editing, they add a new WHERE clause on app.status_code
.
Honeycomb autocomplete helps construct the query.
In the upper left corner above the Query Builder, use the Dataset Switcher to choose to query a specific Dataset or all Datasets within your Environment. The latter option may be useful if you need to reference events that span across multiple services.
When you switch datasets, Honeycomb will load the existing query on that dataset, but it will not run it automatically.
In the bottom right corner below the Query Builder, use the time picker to modify the selected time span of the data in your query. Use a preset time range or a custom time range.
Honeycomb supports running queries across different periods of time, so you can compare results to see how your systems change.
Run a Time Comparison query by selecting a time range under Compare to previous time range in the time picker. Honeycomb then will run a query as defined in the Query Builder and another query for the time comparison range selected.
Time comparison queries are currently only in the Query Builder UI and are not supported in the API.
Query Assistant is an experimental feature that generates Honeycomb queries based on your natural language query (NLQ) input.
Use Query Assistant to:
You can access Query Assistant underneath the Query Builder. To use Query Assistant, enter your prompt in the search box and select Get Query, or select one of the suggested questions below the search box. Based on your entry or selection, Query Assistant creates and runs a query in the Query Builder. Results appear after the screen refreshes.
You can expand and collapse the Query Assistant display. Any changes you make will persist. You can also control your team’s ability to use Query Assistant in Team Settings. To learn how to enable or disable Query Assistant, visit [Teams: Manage Behavior](/get-started/configure/teams/manage-behavior#manage-query-assistant}.
Query Assistant uses machine learning systems, such as a Large Language Model (LLM) with a generative pre-trained transformer, to assist in creating Honeycomb Queries using natural language.
In addition to your schema, Query Assistant uses the following as context:
When modifying the current query, Query Assistant translates your prompts into additional query clauses.
For example, when given a query that displays overall latency and the prompt “only show errors”, Query Assistant usually adds a WHERE clause to return spans with an error field present and set to true
.
When a dataset has Suggested Queries configured, Query Assistant analyzes its fields and generates better results. We recommend configuring your own Suggested Queries for datasets that do not conform to common standards defined by OpenTelemetry instrumentation.
Query Assistant uses the fields defined in Dataset Definitions.
For example, OpenTelemetry (and Honeycomb, by default) recognizes any error as a boolean value in the error
field.
When a Dataset Definition overrides this default with a string value in the app.error
field, Query Assistant uses the app.error
field instead of the error
field when it evaluates prompts.
A user can use up to 50 natural language queries per 24 hours.
Query Assistant is not available to any Honeycomb customer who has signed a HIPAA Business Associate Agreement (BAA).
Honeycomb uses OpenAI’s API for Query Assistant. Honeycomb sends information to OpenAI’s API for the purpose of generating a runnable query based on your input. Data is only sent when you execute a natural language query. In addition, Honeycomb does not use any data to train ML models. In the future, we are interested in using data to create more personalized user experiences, but we have no plans to incorporate data itself, and all data is still subject to our Data Retention window.
What Honeycomb sends to OpenAI:
What Honeycomb does NOT send to OpenAI:
OpenAI does not train models on data sent via their API. OpenAI does retain all data for a short period of time to monitor for abuse and misuse. Honeycomb does not use their opt-in mechanism for training and has no plans to offer that as an option for users at this time.
OpenAI’s API exposes the base Large Language Model (LLM) that ChatGPT also uses. ChatGPT adds additional layers of machine learning systems suited for a general-purpose chat application and uses a subset of data it receives to further train their systems. The systems that ChatGPT adds on top of the LLM are not part of Honeycomb’s product implementation.
Honeycomb supports a wide range of calculations to provide insight into your events. When a grouping is provided, calculations occur within each group; otherwise, anything calculated is done so over all matching events.
For example, say you have collected the following events from your web server logs:
Timestamp | uri | status_code | response_time_ms |
---|---|---|---|
2016-08-01 07:30 | /about | 500 | 126 |
2016-08-01 07:45 | /about | 200 | 57 |
2016-08-01 07:57 | /docs | 200 | 82 |
2016-08-01 08:03 | /docs | 200 | 23 |
Specifying a visualization for a particular attribute (P95(response_time_ms)
for example) means to apply the aggregation function (in this case, P95
, or taking the 95th percentile) over the values for the attribute (response_time_ms
) across all input events.
Defining multiple VISUALIZE clauses is common and can be useful, especially when comparing their outputs to each other.
For example, it can be useful to see both the COUNT
and the P95(duration)
for a set of events to understand whether latency changes follow volume changes.
While most VISUALIZE queries return a line graph, the HEATMAP
visualization allows you see the distribution of data in a rich and interactive way.
Heatmaps also allow you to use BubbleUp.
Scenario: we want to capture overall statistics for our web server. Given our four-event dataset described above, consider a query which contains:
COUNT
AVG
of response_time_ms
valuesP95
of response_time_ms
valuesThese calculations would return statistics across the entirety of our dataset:
COUNT | AVG(response_time_ms) | P95(response_time_ms) |
---|---|---|
4 | 72 | 119.4 |
The VISUALIZE clause returns both a time series graph and a column of values in the result summary table. Note that in the graph, the values shown are aggregated for each interval at the current granularity. Conversely, the values shown in the summary table are calculated across the entire time range for that query.
These results can be a little surprising for some calculations.
The P95(duration_ms)
across the entire time range may not look quite like the P95
value at any given point of the curve, because there may be spikes and bumps in the underlying data that are hidden in the time intervals.
The summary table for a HEATMAP
shows a histogram of values of that field across the full time range.
Most VISUALIZE operations take a single argument, with the exception of COUNT
and CONCURRENCY
, which take no arguments.
Events that do not have a relevant attribute are ignored, and will not be counted in aggregations.
In the chart below, <field_name>
refers to any field name while the <numeric_field_name>
argument refers to a float or integer field.
Aggregate | Meaning |
---|---|
COUNT |
The number of events |
COUNT_DISTINCT(<field_name>) |
The number of different values |
SUM(<numeric_field_name>) |
The sum of the field value |
AVG(<numeric_field_name>) |
The average value of the field |
MAX(<numeric_field_name>) |
The maximum value of the field |
MIN(<numeric_field_name>) |
The minimum value of the field |
P001(<numeric_field_name>) |
The .1-th percentile of the field |
P01(<numeric_field_name>) |
The 1st percentile of the field |
P05(<numeric_field_name>) |
The 5th percentile of the field |
P10(<numeric_field_name>) |
The 10th percentile of the field |
P20(<numeric_field_name>) |
The 20th percentile of the field |
P25(<numeric_field_name>) |
The 25th percentile of the field |
P50(<numeric_field_name>) |
The 50th percentile of the field |
P75(<numeric_field_name>) |
The 75th percentile of the field |
P80(<numeric_field_name>) |
The 80th percentile of the field |
P90(<numeric_field_name>) |
The 90th percentile of the field |
P95(<numeric_field_name>) |
The 95th percentile of the field |
P99(<numeric_field_name>) |
The 99th percentile of the field |
P999(<numeric_field_name>) |
The 99.9th percentile of the field |
HEATMAP(<numeric_field_name>) |
A heatmap of the distribution of that field |
CONCURRENCY |
The concurrency of the current query, for tracing-enabled datasets only. This is a complex operation; please see the detailed explanatory page on it. |
RATE_AVG(<numeric_field_name>) |
The difference between subsequent field values after applying the RATE_AVG operator. Learn more about the RATE operators. |
RATE_SUM(<numeric_field_name>) |
The difference between subsequent field values after applying the RATE_SUM operator. Learn more about the RATE operators. |
RATE_MAX(<numeric_field_name>) |
The difference between subsequent field values after applying the RATE_MAX operator. Learn more about the RATE operators. |
The WHERE clause allows you to constrain events by some attribute besides time. For example, you may want to ignore an outlier case or isolate events triggered by a particular actor or circumstance.
For example, say you have collected the following events from your web server logs:
Timestamp | uri | status_code |
---|---|---|
2016-08-01 08:15 | /about | 500 |
2016-08-01 08:22 | /about | 200 |
2016-08-01 08:27 | /docs | 403 |
You can define any number of arbitrary constraints based on event values. WHERE clauses work in concert with the specified time range to define the events that are ultimately considered by any GROUP BY or VISUALIZE clauses.
/docs
, you can enter uri = /docs
.Scenario: we want to understand the frequency of unsuccessful web requests. Given our three-event dataset described above, consider a query which contains:
COUNT
status_code != 200
The WHERE clause removes the successful event (our /about
web request returning a 200
) from consideration, and only counts the first and third events towards our VISUALIZE clause:
COUNT |
---|
2 |
Scenario: we want to refine our constraints further, to span multiple attributes for each event. Combining WHERE clauses returns events that satisfy either the intersection of all specified WHERE clauses, or the union. Given our three-event dataset described above, consider a query which contains:
uri = "/about"
status_code != 200
As all three events are considered by the WHERE clauses, only the first one satisfies both:
Timestamp | uri | status_code |
---|---|---|
2016-08-01 08:55 | /about | 500 |
Honeycomb also allows you to look at the union of clauses by setting to an OR
.
status_code = 500
status_code = 403
Timestamp | uri | status_code |
---|---|---|
2016-08-01 08:55 | /about | 500 |
2016-08-01 08:27 | /docs | 403 |
You can use relational fields in your WHERE clause to narrow your query to find traces that contain fields and values within a specific span type. To learn more and explore some examples, visit Relational Fields in this reference.
WHERE
operations may take one or more attributes.
Events are only counted if they have a relevant attribute, except in the case of the does-not-exist
operation.
Operation | Opposite | Arguments | Meaning |
---|---|---|---|
= |
!= |
1 | Exact numerical or string match |
starts-with |
does-not-start-with |
1 | String start match |
ends-with |
does-not-end-with |
1 | String end match |
exists |
does-not-exist |
0 | Checks for non-null values |
> , >= |
< , <= |
1 | numerical comparison |
contains |
does-not-contain |
1 | string inclusion: checks whether the attribute matches a substring of the value |
in |
not-in |
1+ | list inclusion: checks whether the attribute matches any item in the list |
The syntax for the in
operator does not use parentheses.
For example, request_method in GET,POST
Being able to separate a series of events into groups by attribute is a powerful way to compare segments of your dataset against each other.
For example, say you have collected the following events from your web server logs:
Timestamp | uri | status_code |
---|---|---|
2016-08-01 07:30 | /about | 500 |
2016-08-01 07:45 | /about | 200 |
2016-08-01 07:57 | /docs | 200 |
You might want to analyze your web traffic in groups based on the uri
("/about" vs “/docs”) or the status_code
(500
versus 200
).
Choosing to group by uri
would return two result rows: one representing events in which uri="/about"
and another representing events in which uri="/docs"
.
Each of these grouped results rows will be represented by a single line on a graph.
When using GROUP BY for more than one attribute, Honeycomb will consider each unique combination of values as a single group.
Here, choosing to group by both uri
and status_code
will return three groups: /about
+500
, /about
+200
, and /docs
+200
.
Grouping, paired with calculation, can often reveal interesting patterns in your underlying events—grouping by uri
, for example, and calculating response time stats will show you the slowest (or fastest) uri
s.
The GROUP BY dropdown list also has a shortcut to create a derived column.
Scenario: we want to examine performance of our web server by endpoint. Given our four-event dataset described above, consider a query which contains:
uri
COUNT
AVG
of response_time_ms
valuesPairing a Grouping clause with a Visualize clause results in events being grouped by uri
; Honeycomb draws one line for each group, and calculates statistics within each group:
uri | COUNT | AVG(response_time_ms) |
---|---|---|
/about | 2 | 91.5 |
/docs | 2 | 52.5 |
This technique is particularly powerful when paired with an Order By and a Limit to return “Top K”-style results.
In this figure, the user has a VISUALIZE COUNT
, and GROUP BY eventtype
.
The two curves, in purple and orange, show the two groups.
The popup shows that the user is hovering the trace_span
eventtype, which has a count of 405 in that 15-second time range.
When you move your cursor over the results table at the bottom of the page, each row is highlighted in turn.
In the image below, the user has highlighted request
and sees the orange line highlighted, and the purple line dimmed.
Rollover for heatmaps is slightly different, as described on the Heatmaps page.
You can use relational fields in your GROUP BY clause to separate a series of events into groups. To learn more and explore some examples, visit Relational Fields in this reference.
ORDER BY clauses define how rows will be sorted in the results table.
For example, say you have collected the following events from your web server logs:
Timestamp | uri | status_code | response_time_ms |
---|---|---|---|
2016-08-01 09:17 | /about | 200 | 57 |
2016-08-01 09:18 | /about | 500 | 234 |
2016-08-01 09:20 | /404 | 200 | 12 |
2016-08-01 09:25 | /docs | 200 | 82 |
You can define any number of ORDER BY clauses in a query and they will be respected in the order they are specified.
The ORDER BY clauses available to you for a particular query are influenced by whether any GROUP BY or VISUALIZE clauses are also specified. If none are, you may order by any of the attributes contained in the dataset. However, once a GROUP BY or VISUALIZE clause exists, you may only order by the values generated by those clauses.
Scenario: we just want to get a sense of the slowest endpoints in our web server. Given our four-event dataset described above, consider a query which contains:
response_time_ms
in descending (DESC
) orderRemember that when no Visualize clauses are defined, we simply return raw events as the result rows:
Timestamp | uri | status_code | response_time_ms |
---|---|---|---|
2016-08-01 09:18 | /about | 500 | 234 |
Scenario: we want to capture statistics for our web server and know what we are looking for (long response_time_ms
s).
Given our four-event dataset described above, consider a query which contains:
P95(response_time_ms)
, or the p95 of response_time_ms
valuesuri
P95(response_time_ms)
in descending (DESC
) order2
resultsOur GROUP BY and VISUALIZE queries influence what will be returned as result rows (uri
and the P95(response_time_ms)
for events within each distinct uri
group), while the ORDER BY determines the sort order of those results (longest P95(response_time_ms)
first) and the LIMIT throws away any results beyond the top 2
:
uri | P95(response_time_ms) |
---|---|
/about | 225.15 |
/docs | 82 |
As you can see, any results referencing the event with uri="/404"
was excluded from our result set as a result of its relatively low response_time_ms
.
This sort of Top K query is particularly valuable when working with high-cardinality data sets, where a GROUP BY clause might split your dataset into a very large number of groups.
The LIMIT clause provides a maximum number of result rows to return. By default, queries return 100 result rows. The LIMIT clause allows you to specify up to 1000 rows.
The HAVING clause allows you to filter on the results table. This operation is distinct from the WHERE clause, which filters the underlying events. A HAVING filter can help further refine your query results when grouping on a high-cardinality attribute, which can result in many different rows in the result table. It can also work in tandem with an ORDER BY clause. ORDER BY allows you to order the results, and HAVING filters them.
Like ORDER BY, HAVING selects its series from the results table.
For example, consider a query on VISUALIZE COUNT, P95(duration_ms)
and GROUP BY endpoint
.
This query would show how many times each endpoint ran, and how often it did so.
The results table from this query might look something like this:
endpoint | COUNT | P95( duration_ms ) |
---|---|---|
/add-to-cart | 521 | 45 |
/remove-from-cart | 1021 | 54 |
/unused-endpoint | 2 | 1500 |
/empty-page | 10 | 1700 |
You might want to ignore the rarest entries when they are least likely to be useful. One way to do that is to add a HAVING clause:
The clause HAVING COUNT > 100
will filter to only results with more than 100 hits on them.
You can then ORDER BY P95( duration_ms )
to sort the results to find the slowest endpoints.
HAVING works by filtering specifically on aggregations across all results in the selected time range. It currently does not filter time periods that match the criteria. Take a look at the following example. Each event has counts ranging from 0 - 9 across the time range.
You may want to filter specifically on counts greater than 5 in the example and add the clause HAVING COUNT > 5
.
This will still return all results in the image, since HAVING filters based off the total in the result table where all values are shown to have counts > 5.
The HAVING clause always refers to one of the VISUALIZE clauses. It then takes one or more numeric arguments.
Operator | Opposite | Arguments | Meaning |
---|---|---|---|
= | != | 1 | numerical equality |
>, >= | <, <= | 1 | comparison |
in | not-in | 1+ | existence in a list |
The in
operator compares where a value is one of a set: COUNT in 10, 20, 30
checks whether the COUNT is precisely one of those values.
The syntax for the in
operator does not use parentheses.
You can use relational fields to narrow your query to find traces that contain fields and values within a specific span type. Span types are identified by prefixes that follow your trace structure.
You can use relational fields in WHERE or GROUP BY clauses in the Honeycomb UI.
The root.
prefix identifies the
within a trace.
When you use it, additional filters in your query will match only for spans in traces that match the conditions on the root span.
You can also query with root.
on its own.
If you are looking for spans somewhere in a trace that you know starts with certain fields, use root.
.
So you might use a query that contains:
COUNT
name = HGET
AND root.name = /cart
AND root.duration_ms > 1000
And you might see the following trace summary when you select a span from the query results:
To explore more examples related to using relational fields with the root.
prefix, visit Examples: Query for Traces.
The parent.
prefix identifies a direct
within a trace.
When you use it, additional filters in your query will search through fields only in the specified parent.
span’s immediate
.
For example, if you know a service is being called at a lower level inside a trace, and you want to find one of its direct child spans, use parent.
.
You can also use parent.
with no additional filters to count the child spans of a parent span.
When you do so, Honeycomb will find one parent span match per trace.
So you might use a query that contains:
P95(duration_ms)
name = retriever_client_fetch
AND parent.name = fetch_dataset_samples
And you might see the following trace summary when you select a span from the query results:
To explore more examples related to using relational fields with the parent.
prefix, visit Examples: Query for Traces.
The any.
prefix identifies any span within a trace.
When you use it, additional filters in your query will match only for spans in traces that match your any.
span conditions.
any.
in your GROUP BY clause, you must also use any.
in your WHERE clause.
Doing so will ensure your joins are accurate.any.
relational fields in the same clause, Honeycomb will apply them all to the same span.any.
returns results in random order.So you might use a query that contains:
COUNT
name = getDiscounts
AND any.http.host = cartservice:7070
AND any.duration_ms > 100
And you might see the following trace summary when you select a span from the query results:
You can use all of the relational fields prefixes together to perform a more complex query.
So you might use a query that contains:
COUNT
error exists
AND root.name = /cart/checkout
AND parent.duration_ms > 200
AND any.name = getDiscounts
any.k8s.pod.name
And you might see the following trace summary when you select a span from the query results:
To explore more examples related to using relational fields with the any.
prefix, visit Examples: Query for Traces.
Refer to Common Issues with Queries.