RATE aggregates in a query allow you to visualize the rate of a field, that is, its ongoing change over time.
This visualization shows how quickly a field is changing, and it is particularly helpful for understanding metric data that is collected as a counter or sum.
RATE_MAX(<numeric_field_name>) - Displays the difference between subsequent field values after applying the MAX aggregation.
RATE_AVG(<numeric_field_name>) - Displays the difference between subsequent field values after applying the AVG aggregation.
RATE_SUM(<numeric_field_name>) - Displays the difference between subsequent field values after applying the SUM aggregation.
Metrics datasets often contain “counter” style fields. The value of such a field represents the number of times that a particular action has occurred since a specific point in time.
For example, the metric system.network.dropped is frequently a counter that represents the number of network packets that were dropped or discarded since the machine booted up.
With this type of data, a visualization of AVG(system.network.dropped) would yield a graph that is mostly a straight line, trending up and to the right.
The slope of this line will change occasionally as network characteristics fluctuate over time.
In fact, the slope of this line is the most useful piece of information provided by this visualization, as it represents the number of network packets being dropped at any given point in time rather than the total number of dropped packets since the system started up.
In this example, using RATE_AVG(system.network.dropped) in Query Builder would allow you to view a graph of this slope as it changes over time.
RATE operations are all calculated according to the same process.
In the simplest case, imagine a dataset field that contains one value every 10 seconds.
A rate query is designed to get the difference between the value of my.field for a given time bucket and its value for the previous time bucket.
A RATE query with a 10-second granularity would return that result, as expected.
(Because each bucket holds a single value, RATE_MAX(my.field), RATE_SUM(my.field), and RATE_AVG(my.field) would all work.)
|Time bucket||Rate of my.field|
|12:30:10||150 - 100 = 50|
|12:30:20||210 - 150 = 60|
|12:30:30||265 - 210 = 55|
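The simple case above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the query engine is implemented. The function name `simple_rate` is hypothetical, and the 12:30:00 reading of 100 is inferred from the first subtraction in the table.

```python
# One counter reading per 10-second bucket. The values come from the
# subtractions in the table above; the 12:30:00 reading is inferred.
readings = [
    ("12:30:00", 100),
    ("12:30:10", 150),
    ("12:30:20", 210),
    ("12:30:30", 265),
]


def simple_rate(points):
    """Return (timestamp, current - previous) for every point after the first."""
    return [
        (ts, value - prev_value)
        for (_, prev_value), (ts, value) in zip(points, points[1:])
    ]


rates = simple_rate(readings)
for ts, rate in rates:
    print(ts, rate)
```

The first reading produces no output row because it has no previous value to compare against.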
However, if an individual bucket contained more than one data point, we would need to apply the aggregation (MAX, SUM, or AVG) against all data points in that bucket to coalesce them into a single value.
Additionally, if there is a gap between a given value and the previous value, RATE_xxx(field) will linearly interpolate the difference between the values across the gap, dividing the difference by the number of buckets the gap spans.
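For example, with a dataset containing events whose my.field values fall into the following 10-second buckets (the 12:30:40 bucket contains no events):
|12:30:10||0, 3, 6|
|12:30:20||10, 11, 24|
|12:30:30||27, 29, 34|
|12:30:50||39, 44, 49|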
Given the above events, querying the RATE aggregations against my.field with a granularity of 10 seconds would result in a graph of the following values:
|Time bucket||RATE_MAX(my.field)||RATE_SUM(my.field)||RATE_AVG(my.field)|
|12:30:10||no data (nothing to compare)||no data||no data|
|12:30:20||MAX(10,11,24) - MAX(0,3,6) = 18||SUM(10,11,24) - SUM(0,3,6) = 36||AVG(10,11,24) - AVG(0,3,6) = 12|
|12:30:30||MAX(27,29,34) - MAX(10,11,24) = 10||SUM(27,29,34) - SUM(10,11,24) = 45||AVG(27,29,34) - AVG(10,11,24) = 15|
|12:30:40||no data (no underlying data)||no data||no data|
|12:30:50||(MAX(39,44,49) - MAX(27,29,34)) / 2 = 7.5||(SUM(39,44,49) - SUM(27,29,34)) / 2 = 21||(AVG(39,44,49) - AVG(27,29,34)) / 2 = 7|
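As a rough sketch of the process described above, the following Python aggregates each bucket, subtracts the previous non-empty bucket, and spreads the difference across any gap. It reproduces the numbers in the table, but it is an illustration under assumptions and not the actual implementation. The `rate` function and the bucket keys (seconds past 12:30:00) are hypothetical.

```python
from statistics import mean

GRANULARITY = 10  # seconds per bucket

# Bucket start (seconds past 12:30:00) -> my.field values in that bucket.
# Values are taken from the table above; the 12:30:40 bucket has no events.
buckets = {
    10: [0, 3, 6],
    20: [10, 11, 24],
    30: [27, 29, 34],
    50: [39, 44, 49],
}


def rate(buckets, aggregate, granularity=GRANULARITY):
    """Aggregate each bucket, then compute the change from the previous
    non-empty bucket, divided by the number of buckets that change spans."""
    result = {}
    prev_start, prev_value = None, None
    for start in sorted(buckets):
        value = aggregate(buckets[start])
        if prev_start is not None:
            # 1 for adjacent buckets, larger when there is a gap
            steps = (start - prev_start) // granularity
            result[start] = (value - prev_value) / steps
        prev_start, prev_value = start, value
    return result


rate_max = rate(buckets, max)
rate_sum = rate(buckets, sum)
rate_avg = rate(buckets, mean)
print(rate_max, rate_sum, rate_avg, sep="\n")
```

The first bucket (12:30:10) and the empty bucket (12:30:40) do not appear in the results, which corresponds to the "no data" rows in the table.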
Query granularity has an important effect on the results of RATE aggregations, since it determines the number of data points that end up aggregated together in any specific bucket.
RATE_SUM, in particular, is sensitive to granularity choice.
If the number of data points that fit into a bucket varies, the result of the SUM aggregation will in turn be variable, and the resulting calculated rate of change between each bucket will reflect that variability.
For instance, imagine a metrics counter, such as system.cpu.time.user, with a value that is captured and stored in an event every 10 seconds.
A RATE_SUM() aggregation over this counter with a granularity set to a multiple of 10 seconds (for example, a 30-second granularity) will result in a consistent number of data points for each bucket.
In our example, we would expect to see 3 data points per host in each bucket.
If the granularity of this aggregation is set to a value that is not a multiple of 10 seconds (for example, a 15-second granularity), the number of data points will differ from bucket to bucket, and the aggregate result will be inconsistent. In this example, buckets would alternate between 2 data points per host and 1 data point per host.
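The effect of granularity on bucket contents can be checked with a small sketch. The helper `points_per_bucket` and the event timestamps are hypothetical, and the sketch assumes that buckets are aligned to multiples of the granularity.

```python
from collections import Counter


def points_per_bucket(event_times, granularity):
    """Count how many events land in each bucket of `granularity` seconds."""
    counts = Counter((t // granularity) * granularity for t in event_times)
    return [counts[start] for start in sorted(counts)]


# One host reporting every 10 seconds for one minute (hypothetical timestamps,
# in seconds): 0, 10, 20, 30, 40, 50
event_times = range(0, 60, 10)

per_30s = points_per_bucket(event_times, 30)
per_15s = points_per_bucket(event_times, 15)
print(per_30s)  # [3, 3]
print(per_15s)  # [2, 1, 2, 1]
```

With uneven bucket sizes, SUM over a steadily rising counter jumps up and down from bucket to bucket, so RATE_SUM reports changes that come from the bucketing rather than from the counter.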
It is possible to create triggers for RATE aggregations.
Since query granularity is not available in triggers, RATE aggregations are calculated over the trigger duration.
For instance, a trigger for RATE_MAX(system.cpu.time.user) with a duration of 15 minutes would calculate the difference between the most recent 15 minutes of data and the preceding 15 minutes of data.
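One way to picture the trigger calculation is as two adjacent windows, each as long as the trigger duration. The sketch below assumes that the aggregation (here MAX) is applied to each window separately and that the two results are then subtracted. The function `trigger_rate` and the sample data are hypothetical, and the real evaluation may differ in detail.

```python
def trigger_rate(events, now, duration, aggregate):
    """events is a list of (timestamp_seconds, value) pairs.

    Returns aggregate(most recent window) - aggregate(preceding window),
    or None if either window has no data.
    """
    recent = [v for t, v in events if now - duration <= t < now]
    previous = [v for t, v in events if now - 2 * duration <= t < now - duration]
    if not recent or not previous:
        return None
    return aggregate(recent) - aggregate(previous)


# Hypothetical counter sampled once a minute for 30 minutes,
# increasing by 2 each minute.
events = [(minute * 60, minute * 2) for minute in range(30)]

result = trigger_rate(events, now=30 * 60, duration=15 * 60, aggregate=max)
print(result)  # 30
```

Here the maximum in the most recent window is 58 and the maximum in the preceding window is 28, so the value compared against the trigger threshold would be 30.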