Heatmaps are a visualization that shows the statistical distribution of the values in a dataset column over time.
Take the graph below, which shows the statistical distribution of duration_ms over the selected time period:
Each vertical column in that graph is a histogram for one time bucket. The color of each cell is chosen based on the number of events that fall into that time and value range: pale teal at the low end, dark blue at the high end. For example, there are many events that took almost 1.0k ms, but few events that took between 0 and 0.5k ms.
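To make the idea of "a histogram per time bucket" concrete, here is a minimal Python sketch of how events could be counted into heatmap cells. It is only an illustration, not how Honeycomb computes heatmaps: the function name `heatmap_counts`, the bucket sizes, and the sample events are all assumptions made for this example.

```python
from collections import Counter

def heatmap_counts(events, time_bucket_s, value_bucket):
    """Count events per (time bucket, value bucket) cell.

    events: iterable of (timestamp_seconds, value) pairs.
    Returns a Counter keyed by (time_bucket_start, value_bucket_start).
    """
    cells = Counter()
    for ts, value in events:
        t = (ts // time_bucket_s) * time_bucket_s    # start of the time bucket
        v = (value // value_bucket) * value_bucket   # start of the value bucket
        cells[(t, v)] += 1
    return cells

# Hypothetical sample data: (timestamp in seconds, duration_ms)
events = [(0, 980), (5, 990), (12, 310), (61, 995), (70, 1010), (75, 985)]
cells = heatmap_counts(events, time_bucket_s=60, value_bucket=100)

for (t, v), count in sorted(cells.items()):
    print(f"time bucket {t}s, value bucket {v}-{v + 100} ms: {count} event(s)")
```

A cell with a higher count would be drawn in a darker color; cells with no events are left empty.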
To add a heatmap to a query, click in the Calculate clause and scroll down:
Heatmaps look best when you have a lot of events to visualize, and where the spread of values is wide enough to see some differentiation, but not complete noise.
Any column representing a duration or size is a perfect fit, but any column you might run a percentile or average calculation on may benefit from being rendered as a heatmap as well.
The rollover for heatmaps is different from the one for normal line and stacked graphs:
“This time bucket” shows the histogram for the column under the time bar. You can see outliers represented as small bumps on the right-hand side of that histogram. “Entire time range” shows the merged histogram for all data within the time range displayed in the graph. Note the different X and Y axes for these histograms.
In the results view, a heatmap shows a histogram as its summary row. The histogram is similar to the “entire time range” – it shows the distribution of the values across all the data.
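The relationship between the two rollover histograms can be sketched in Python: the histogram for the entire time range is the sum of the per-bucket histograms. The data and the `merge_histograms` helper below are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical per-time-bucket histograms: value bucket (ms) -> event count
per_bucket = {
    0: Counter({300: 1, 900: 2}),
    60: Counter({900: 2, 1000: 1}),
}

def merge_histograms(histograms):
    """Sum several histograms into one covering the whole time range."""
    total = Counter()
    for histogram in histograms:
        total.update(histogram)  # Counter.update adds counts together
    return total

entire_range = merge_histograms(per_bucket.values())
print(dict(entire_range))
```

Because the merged histogram holds more events than any single bucket, its count axis naturally covers a larger range, which is one reason the axes of the two histograms can differ.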
One of the most powerful features of Honeycomb is the ability to group results based on values in columns. Heatmaps work well with this. Take the query below, where we have grouped by customer. By default, the heatmap of all customers is shown (approximating what you would see if you ran the query without grouping).
Note that the rollover and results table below display independent histograms for each group. This makes it especially easy to see the reason for the bimodal distribution: while all customers appear in the lower of the two peaks, only Value Walking and Bicycles Green appear in the higher one.
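As a rough illustration of per-group histograms, the Python sketch below builds one histogram per customer and then lists the customers that contribute to the higher peak. The sample durations, the "Other Customer" group, the bucket width, and the 1500 ms threshold are invented for this example.

```python
from collections import Counter, defaultdict

def histograms_by_group(events, value_bucket):
    """Build an independent histogram for each group.

    events: iterable of (group, value) pairs.
    Returns {group: Counter({value_bucket_start: count})}.
    """
    groups = defaultdict(Counter)
    for group, value in events:
        groups[group][(value // value_bucket) * value_bucket] += 1
    return groups

# Hypothetical sample data: (customer, duration_ms)
events = [
    ("Value Walking", 950), ("Value Walking", 2010),
    ("Bicycles Green", 920), ("Bicycles Green", 1980),
    ("Other Customer", 900), ("Other Customer", 940),
]

by_customer = histograms_by_group(events, value_bucket=500)

# Customers with at least one event in a bucket starting at 1500 ms or above
high_peak = sorted(
    customer
    for customer, histogram in by_customer.items()
    if any(bucket >= 1500 for bucket in histogram)
)
print(high_peak)
```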
Just as we highlight the corresponding line in line graphs as you mouse over the summary table, we also show the heatmap corresponding to that group:
You can clearly see the difference between the top two customers and everyone else: those customers have higher durations, near the upper part of the heatmap, while the other customers are closer to 0.9k.
For tracing-enabled datasets, clicking on a cell in the histogram will choose an arbitrary trace that corresponds to the cell: that is, it has a span that fulfills the WHERE clause, started at that time, and has that value.
For example, if the user were to click in the group by image above, they would see a trace at 11:51 with a span of duration 2.0k.
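The three conditions for a matching trace can be expressed as a small Python sketch. This is not Honeycomb's implementation: the span fields (`trace_id`, `start`, `duration_ms`, `service`), the `pick_trace` function, and the sample spans are all hypothetical.

```python
def pick_trace(spans, where, t_start, t_end, v_low, v_high):
    """Return the trace ID of any span that matches the clicked cell.

    A span matches when it passes the `where` filter, starts inside the
    cell's time bucket, and has a value inside the cell's value bucket.
    Returns None when no span matches.
    """
    for span in spans:
        if (
            where(span)
            and t_start <= span["start"] < t_end
            and v_low <= span["duration_ms"] < v_high
        ):
            return span["trace_id"]
    return None

# Hypothetical spans; field names are assumptions for this sketch
spans = [
    {"trace_id": "a1", "start": 100, "duration_ms": 950, "service": "api"},
    {"trace_id": "b2", "start": 130, "duration_ms": 2010, "service": "api"},
    {"trace_id": "c3", "start": 135, "duration_ms": 2040, "service": "db"},
]

chosen = pick_trace(
    spans,
    where=lambda s: s["service"] == "api",  # stands in for the WHERE clause
    t_start=120, t_end=180,                 # the cell's time bucket
    v_low=2000, v_high=2100,                # the cell's value bucket
)
print(chosen)
```

Here the sketch returns the first match it finds; "arbitrary" in the text above means any matching trace is a valid choice.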
Once you have a heatmap, you can use it to identify how values differ. See the documentation on BubbleUp to learn more.
If a heatmap has outlier values, it may be helpful to use a logarithmic scale to visualize them. The Log Scale option in Graph Settings does not work with heatmaps.
Instead, use a derived column with the function LOG10($column).
This technique is illustrated in the blog entry Handle Unruly Outliers with Log Scale Heatmaps.
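To see why a base-10 logarithm helps with outliers, the Python sketch below applies `math.log10` to a handful of made-up durations and buckets the results. It only illustrates the effect of the transformation; the sample values, the bucket width, and the choice to skip non-positive values are assumptions of this example, not a description of how the derived column behaves.

```python
import math
from collections import Counter

def log10_bucket_counts(values, bucket_width=0.5):
    """Histogram of log10(value), bucketed by `bucket_width`."""
    counts = Counter()
    for v in values:
        if v <= 0:
            continue  # log10 is undefined for zero and negative values
        logged = math.log10(v)
        bucket = math.floor(logged / bucket_width) * bucket_width
        counts[bucket] += 1
    return counts

# Hypothetical durations in ms, including one extreme outlier
durations_ms = [8, 12, 95, 110, 900, 1200, 250000]
counts = log10_bucket_counts(durations_ms)

for bucket in sorted(counts):
    print(f"log10 bucket {bucket:.1f}: {counts[bucket]} event(s)")
```

On a linear axis, the 250000 ms outlier would squeeze every other value into the bottom of the graph. After the transformation the values span roughly 0.5 to 5.0, so both typical values and the outlier stay distinguishable.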