Explore with BubbleUp

About BubbleUp

BubbleUp helps explain how some data points differ from the other points returned by a query. It surfaces potential places to look for signal within your data.

For example, consider the graph below, which shows the statistical distribution of roundtrip_dur for an application’s requests over the selected time period.

In this set of points, the analyst might want to distinguish the strange group of events that have a surprisingly high roundtrip_dur:

Screenshot illustrating use of BubbleUp

In this screenshot:

This can help your analysis by pointing to the fields that are the most likely next starting points. In this case, it seems clear that one particular endpoint slowed down requests for a transient period.

Using BubbleUp

Currently, BubbleUp mode is only supported for heatmaps.

Learn how to make a heatmap in “Using Heatmaps.” Learn about interpreting heatmaps from “Heatmaps make Ops Better.”

To access BubbleUp mode:

BubbleUp mode works based on a selection you make within a heatmap. Create a Heatmap, and then click on BubbleUp.

Click within the heatmap to select one corner, and drag to cover the opposite corner. Ensure your selection covers some or all of the points that you want to investigate.

The selected area is called the selection and is highlighted in orange; the remaining area of the heatmap is the baseline and is shown in blue. BubbleUp separates events in the baseline from events in the selection and shows them as distinct groups.
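The split into selection and baseline can be pictured as a simple partition of events by a rectangle. The following is a minimal sketch, not BubbleUp's actual implementation: the `Rect` class, the `split_events` function, and the `timestamp` key are hypothetical names chosen for illustration, and it assumes each event is a plain dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical heatmap selection: a time range and a value range."""
    t_min: float
    t_max: float
    y_min: float
    y_max: float

    def contains(self, t, y):
        return self.t_min <= t <= self.t_max and self.y_min <= y <= self.y_max


def split_events(events, rect, time_key="timestamp", value_key="roundtrip_dur"):
    """Return (selection, baseline): events inside the rectangle, and the rest."""
    selection, baseline = [], []
    for event in events:
        t, y = event.get(time_key), event.get(value_key)
        inside = t is not None and y is not None and rect.contains(t, y)
        (selection if inside else baseline).append(event)
    return selection, baseline


# Illustrative data only; field values are made up.
events = [
    {"timestamp": 10, "roundtrip_dur": 40, "endpoint": "/home"},
    {"timestamp": 20, "roundtrip_dur": 900, "endpoint": "/export"},
    {"timestamp": 25, "roundtrip_dur": 950, "endpoint": "/export"},
    {"timestamp": 30, "roundtrip_dur": 55, "endpoint": "/login"},
]
selection, baseline = split_events(events, Rect(15, 28, 800, 1000))
print(len(selection), "selected;", len(baseline), "in baseline")
```

Every event lands in exactly one of the two groups, which is what allows the charts to compare them side by side.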

The BubbleUp charts are displayed below the heatmap.

Interpreting BubbleUp Charts

A BubbleUp is based on a selection of points queried from the dataset. It shows every (non-empty) column in the dataset. For each column, it shows a histogram of values within the baseline in blue, and those from the selection in orange. The histogram shows the distribution of different values for the dataset. The height of each bar is proportional to the number of times the value occurs in the results of the query.
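As a rough illustration of the per-column histograms described above, the sketch below counts how often each value occurs in the baseline and in the selection. It is an assumption-laden sketch rather than the product's code: `column_histograms` is a hypothetical helper, events are assumed to be dictionaries, and a column counts as empty when every value is `None`.

```python
from collections import Counter


def column_histograms(selection, baseline):
    """Map each non-empty column to (baseline_counts, selection_counts)."""
    columns = set()
    for event in selection + baseline:
        columns.update(k for k, v in event.items() if v is not None)

    histograms = {}
    for col in sorted(columns):
        base = Counter(e[col] for e in baseline if e.get(col) is not None)
        sel = Counter(e[col] for e in selection if e.get(col) is not None)
        histograms[col] = (base, sel)
    return histograms


# Illustrative data only.
selection = [
    {"platform": "ios", "status_code": 500},
    {"platform": "ios", "status_code": 200},
]
baseline = [
    {"platform": "android", "status_code": 200},
    {"platform": "ios", "status_code": 200},
    {"platform": "js", "status_code": 200, "hostname": None},
]
histograms = column_histograms(selection, baseline)
for col, (base, sel) in histograms.items():
    print(col, "baseline:", dict(base), "selection:", dict(sel))
```

Each count corresponds to a bar height: the baseline counts would be drawn in blue and the selection counts in orange.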

A BubbleUp shows a series of miniature histograms, one for each column in the dataset. The columns are divided into two groups, for categorical dimensions and continuous measures.

Dimensions

A dimension is a column that can be used to group, separate, or filter data items. In BubbleUp, categorical and ordinal data are visualized together. Categorical columns are those in which the values do not fall in a meaningful order. Examples of categorical columns include user_id, hostname, and is_responding.

Screenshot A low-cardinality, categorical dimension. In BubbleUp, categorical dimensions are shown captioned with the relevant value. The field platform has five distinct values; in both the baseline and selection, there are more “android” and “ios” values than “js” and “rest”. The donut charts in the top right show that most events in the selection have a platform, hostname, or endpoint; only a few events in the baseline do.

Screenshot A high-cardinality, categorical dimension. When there are many distinct values, only the top fifty are shown, including some from each of baseline and selection. In hostname, both sets are truncated.
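The text does not say how the fifty displayed entries are chosen, only that some come from each group. One possible rule, shown here purely as an assumption, is to alternate between the most common selection values and the most common baseline values until the limit is reached. The function name `top_values` and the sample hostnames are hypothetical.

```python
from collections import Counter


def top_values(base_counts, sel_counts, limit=50):
    """Pick up to `limit` values, drawing from both groups.

    Assumed rule (not confirmed by the documentation): alternate between
    the next most common selection value and the next most common
    baseline value, skipping values that were already chosen.
    """
    sel_ranked = [v for v, _ in sel_counts.most_common()]
    base_ranked = [v for v, _ in base_counts.most_common()]
    chosen, seen = [], set()
    for i in range(max(len(sel_ranked), len(base_ranked))):
        for ranked in (sel_ranked, base_ranked):
            if i < len(ranked) and ranked[i] not in seen:
                if len(chosen) == limit:
                    return chosen
                seen.add(ranked[i])
                chosen.append(ranked[i])
    return chosen


# Illustrative data: 60 distinct hostnames in each group.
base_counts = Counter({f"host-b{i}": 100 - i for i in range(60)})
sel_counts = Counter({f"host-s{i}": 100 - i for i in range(60)})
shown = top_values(base_counts, sel_counts)
print(len(shown), "values shown out of", len(base_counts) + len(sel_counts))
```

With this rule, both groups are truncated when each has more than twenty-five distinct values, which matches the behavior described for hostname.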

Screenshot In endpoint, the one bar of the selection stands out as a visible outlier. It can be interpreted to mean “there is only one value for endpoint within the selection.”

Screenshot An ordinal dimension is one that has a meaningful order. In status_code, the values are numeric, and so are arranged in ascending order. The value 200 occurs frequently in both baseline and selection. Code 500 occurs less frequently than 200 in the selection, but almost never occurs in the baseline.

Very different bar heights in the baseline and selection can indicate that a column is worth investigating. For example, it could be valuable to learn how status_code differs, or what happens with the one specific endpoint.
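To make "very different bar heights" concrete, one option is to score each column by how far apart the two distributions are. The sketch below uses total variation distance on normalized counts. This is an illustrative choice, not a description of how BubbleUp ranks columns, and `distribution_distance` is a hypothetical name.

```python
from collections import Counter


def distribution_distance(base_counts, sel_counts):
    """Total variation distance between two value distributions.

    Returns a number between 0.0 (identical proportions) and 1.0
    (no values in common). Counts are normalized by group size so a
    small selection can be compared against a large baseline.
    """
    base_total = sum(base_counts.values())
    sel_total = sum(sel_counts.values())
    if base_total == 0 or sel_total == 0:
        return 0.0  # nothing to compare in at least one group
    values = set(base_counts) | set(sel_counts)
    return 0.5 * sum(
        abs(base_counts[v] / base_total - sel_counts[v] / sel_total)
        for v in values
    )


# Illustrative counts only.
status_distance = distribution_distance(
    Counter({200: 98, 500: 2}), Counter({200: 60, 500: 40})
)
platform_distance = distribution_distance(
    Counter({"android": 50, "ios": 50}), Counter({"android": 5, "ios": 5})
)
print("status_code:", round(status_distance, 2))
print("platform:", round(platform_distance, 2))
```

In this made-up example, status_code scores much higher than platform, so it would be the more promising column to break down by next.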

Measures

Measures are continuous, numerical columns where individual values are not as important as the overall distribution. In the screenshot below, the baseline and selection are very different for durationMs and mysql_dur; they seem very similar for fraud_dur. This can help validate hypotheses: for example, the fact that mysql_dur is as different as roundtrip_dur might suggest that roundtrip time is being driven by mysql time. The donut charts in the top right show that all rows in both the baseline and the selection have a durationMs field; most events in the selection have a mysql_dur, and just under half of the events in the selection have a fraud_dur. Screenshot
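For a measure, comparing distributions only makes sense when both groups are bucketed over the same range. The sketch below bins baseline and selection values into equal-width buckets with shared edges. The bucketing scheme and the name `shared_bins` are assumptions for illustration; the documentation does not state how BubbleUp chooses its buckets.

```python
def shared_bins(base_values, sel_values, bins=10):
    """Bucket two sets of numbers using the same equal-width bin edges.

    Returns (edges, base_counts, sel_counts).
    """
    combined = list(base_values) + list(sel_values)
    if not combined:
        return [], [], []
    lo, hi = min(combined), max(combined)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / bins or 1.0

    def bucket(values):
        counts = [0] * bins
        for v in values:
            # The maximum value belongs in the last bucket, not a new one.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return counts

    edges = [lo + i * width for i in range(bins + 1)]
    return edges, bucket(base_values), bucket(sel_values)


# Illustrative mysql_dur values only.
base_durations = [10, 12, 15, 20, 22]
sel_durations = [80, 90, 95, 100]
edges, base_counts, sel_counts = shared_bins(base_durations, sel_durations, bins=5)
print("edges:", edges)
print("baseline:", base_counts)
print("selection:", sel_counts)
```

When the two lists of counts pile up in different buckets, as they do here, the measure behaves differently in the selection than in the baseline.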

Tooltips

Screenshot

A tooltip is displayed when you hover your mouse over a pair of histogram bars, displaying the field value they represent. Hovering over the top bar reveals additional information about the number of events with that field. Click on the “actions” menu below the tooltip to create a new query that filters or breaks down by the field. In this case, the user_id with value 20109 is in 67% of the selection, and just 2% of the baseline.

Screenshot

Hovering over the top bar displays a tooltip that shows the complete title of the field. The field availability_zone appears in 61% of the selection, and just 10% of the baseline. In other words, most baseline events do not have an availability_zone.
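The percentages in this tooltip describe how often a field is present at all, regardless of its value. A minimal sketch of that calculation follows, assuming events are dictionaries and that a missing key or a `None` value means the field is absent. `presence_fraction` is a hypothetical helper, not part of any product API.

```python
def presence_fraction(events, column):
    """Fraction of events in which `column` has a non-empty value."""
    if not events:
        return 0.0
    present = sum(1 for e in events if e.get(column) is not None)
    return present / len(events)


# Illustrative data only.
selection = [
    {"availability_zone": "us-east-1a"},
    {"availability_zone": "us-east-1b"},
    {"availability_zone": "us-east-1a"},
    {"hostname": "app-1"},
]
baseline = [
    {"availability_zone": "us-east-1a"},
    {"hostname": "app-2"},
    {"hostname": "app-3", "availability_zone": None},
    {"hostname": "app-4"},
]
sel_share = presence_fraction(selection, "availability_zone")
base_share = presence_fraction(baseline, "availability_zone")
print(f"selection: {sel_share:.0%}, baseline: {base_share:.0%}")
```

A large gap between the two fractions suggests that the mere presence of the field distinguishes the selection from the baseline.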

Tips and Tricks for Using BubbleUp

Screenshot

Troubleshooting