Best Practices for Querying using Relational Fields

To make your queries run faster, we recommend that you follow certain best practices when querying using relational fields.

Use more filters 

Tip
Using more filters will give the greatest performance increase. Wherever possible, use filters!

To get the greatest performance increase:

  • Add filters on fields both with and without relational field prefixes, even if some of those filters are duplicated.
  • If you group by a field with a relational field prefix, add a filter that uses the same prefix to the WHERE clause. For example, if you are grouping by parent.name, also add a filter on a field with the parent. prefix to the WHERE clause.
  • Make sure each additional filter excludes a meaningful number of events.

If all else fails, identify a field that applies to all spans (for example, service.name) and include it in the WHERE clause.

Following these recommendations is particularly helpful for the more expensive anyX. (any., any2., any3.) and parent. prefixes.
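The checklist above can be sketched as a small pre-flight check. This is a minimal illustration, not part of Honeycomb: the helper names (relational_prefix, unfiltered_group_by_prefixes, has_unprefixed_filter) are hypothetical, and the query is represented simply as two lists of field names rather than any real Honeycomb query format.

```python
# Relational prefixes named in this section; any, any2, and any3 form the anyX. family.
RELATIONAL_PREFIXES = ("root", "parent", "any", "any2", "any3")


def relational_prefix(field):
    """Return the relational prefix of a field name, or None if it has none."""
    head, sep, _rest = field.partition(".")
    if sep and head in RELATIONAL_PREFIXES:
        return head
    return None


def unfiltered_group_by_prefixes(where_fields, group_by_fields):
    """List relational prefixes used in GROUP BY that have no filter in WHERE."""
    filtered = {relational_prefix(f) for f in where_fields}
    missing = []
    for field in group_by_fields:
        prefix = relational_prefix(field)
        if prefix is not None and prefix not in filtered and prefix not in missing:
            missing.append(prefix)
    return missing


def has_unprefixed_filter(where_fields):
    """True if at least one WHERE filter uses a field without a relational prefix."""
    return any(relational_prefix(f) is None for f in where_fields)


# Hypothetical query: grouped by parent.name and root.name,
# but filtered only on service.name and root.http.route.
where = ["service.name", "root.http.route"]
group_by = ["parent.name", "root.name"]

print(unfiltered_group_by_prefixes(where, group_by))  # ['parent']
print(has_unprefixed_filter(where))                   # True
```

Here the check reports that parent. appears in GROUP BY without a matching WHERE filter, which is the case the second bullet warns about. The check cannot tell whether a filter excludes a meaningful number of events; that still has to be judged against your own data.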

Root > anyX > parent 

Queries involving the root. prefix generally run faster than queries involving the anyX. (any., any2., any3.) prefix, which generally run faster than queries involving the parent. prefix.

When you use the root. prefix, you get a free, implied is_root filter in the WHERE clause, which will usually filter out a substantial number of spans.

When you use anyX. (any., any2., any3.), Honeycomb chooses the first span that matches your criteria, which also filters out a substantial number of spans.

When you use parent., Honeycomb must find the parent span for every event that matches the criteria defined by your non-prefixed fields, which can take some time. To improve performance, add a filter on parent.name to the WHERE clause.
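To make the difference in work concrete, here is a toy model in Python. It is only an illustration under simplifying assumptions, not a description of Honeycomb's query engine: spans are plain dictionaries with hypothetical keys (trace, id, parent, name), and the functions root_candidates and parent_lookups are invented for this sketch.

```python
# One hypothetical trace with four spans; "parent" is None for the root span.
spans = [
    {"trace": "t1", "id": "a", "parent": None, "name": "GET /cart"},
    {"trace": "t1", "id": "b", "parent": "a", "name": "db.query"},
    {"trace": "t1", "id": "c", "parent": "a", "name": "db.query"},
    {"trace": "t1", "id": "d", "parent": "c", "name": "serialize"},
]


def root_candidates(spans):
    """root.: the implied is_root filter leaves only spans without a parent."""
    return [s for s in spans if s["parent"] is None]


def parent_lookups(spans, matches, parent_name=None):
    """parent.: resolve the parent of every span that matches the unprefixed criteria.

    `matches` stands in for the non-prefixed WHERE filters; `parent_name`
    stands in for an optional filter on parent.name.
    """
    by_id = {(s["trace"], s["id"]): s for s in spans}
    pairs = []
    for span in spans:
        if span["parent"] is None or not matches(span):
            continue
        parent = by_id.get((span["trace"], span["parent"]))
        if parent is None:
            continue
        if parent_name is not None and parent["name"] != parent_name:
            continue
        pairs.append((span["id"], parent["id"]))
    return pairs


print(len(root_candidates(spans)))                                     # 1
print(len(parent_lookups(spans, lambda s: True)))                      # 3
print(len(parent_lookups(spans, lambda s: s["name"] == "db.query")))   # 2
print(len(parent_lookups(spans, lambda s: True, "GET /cart")))         # 2
```

In this toy trace, the root. case considers a single span, while the unfiltered parent. case pairs every child span with its parent. Narrowing either the non-prefixed criteria or parent.name shrinks the set of pairs, which mirrors the advice above to filter on both sides.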

Use shorter time ranges 

Use shorter time ranges for queries, including queries using relational fields.

While Honeycomb can do an impressive amount of parallel processing of infrequently accessed data, it can do only so much within a given time frame. Expect queries with long time ranges to take a while.

Use traces with fewer spans 

Smaller traces mean fewer events to hold in memory at once and less work for Honeycomb.

Use similar services in a single trace 

Honeycomb uses the ingest time to determine what fits into the “window” of events that it keeps in memory at one time. If a specific service has a significant ingest delay, relational field queries that rely on joining that service’s data with data from other services might suffer.

For example, if you use a mix of AWS Lambda and non-Lambda services in a single trace, your ingest delay will likely vary significantly. AWS freezes the Lambda execution environment before spans can be flushed, which increases the ingest delay for Lambda spans compared with spans from other services.

Note
This only matters if the amount of ingest delay varies by service/span. If all of your services have roughly the same amount of ingest delay (for example, all consistently two to three minutes late), then your queries should not be affected.
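One way to check whether this applies to you is to compare the typical ingest delay of each service. The sketch below assumes you can obtain, for a sample of spans, the service name, the span's own timestamp, and the time it was ingested, all in seconds; how you collect those values is outside the scope of this page. The function names and the sample service names (checkout-api, resize-lambda) are hypothetical.

```python
from statistics import median


def median_delay_by_service(events):
    """Median ingest delay per service.

    `events` is an iterable of (service, event_time_s, ingest_time_s) tuples.
    """
    delays = {}
    for service, event_time, ingest_time in events:
        delays.setdefault(service, []).append(ingest_time - event_time)
    return {service: median(values) for service, values in delays.items()}


def delay_spread(events):
    """Gap in seconds between the slowest and fastest service's median delay."""
    medians = median_delay_by_service(events)
    return max(medians.values()) - min(medians.values())


# Hypothetical sample: the Lambda-based service arrives about two minutes late.
mixed = [
    ("checkout-api", 1000.0, 1002.0),
    ("checkout-api", 1010.0, 1013.0),
    ("resize-lambda", 1001.0, 1121.0),
    ("resize-lambda", 1011.0, 1141.0),
]

# Hypothetical sample: every service is consistently 150 seconds late.
uniform = [
    ("checkout-api", 1000.0, 1150.0),
    ("inventory-api", 1005.0, 1155.0),
]

print(median_delay_by_service(mixed))  # {'checkout-api': 2.5, 'resize-lambda': 125.0}
print(delay_spread(mixed))             # 122.5
print(delay_spread(uniform))           # 0.0
```

A large spread, as in the first sample, suggests that spans from the same trace arrive far apart and relational field queries across those services may be affected. A spread near zero, as in the second sample, matches the case described in the note, where a uniform delay should not affect your queries. This page does not state how large a spread is too large, so treat the number as a relative signal only.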