The `RATE_MAX`, `RATE_AVG`, and `RATE_SUM` aggregates in a query allow you to visualize the rate of a field: its ongoing change over time. This visualization is especially helpful for understanding metric data that is collected as a counter or sum.
- `RATE_MAX(<numeric_field_name>)` - Displays the difference between subsequent field values after applying the `MAX` operator
- `RATE_AVG(<numeric_field_name>)` - Displays the difference between subsequent field values after applying the `AVG` operator
- `RATE_SUM(<numeric_field_name>)` - Displays the difference between subsequent field values after applying the `SUM` operator

Metrics datasets often contain “counter” style fields, whose values represent the number of times that a particular action has occurred since a specific point in time.
For example, the metric `system.network.dropped` is frequently a counter that represents the number of network packets that have been dropped or discarded since the machine booted up. With this type of data, a visualization of `AVG(system.network.dropped)` would yield a graph that is mostly a straight line, trending up and to the right. The slope of this line changes occasionally as network characteristics fluctuate over time. In fact, the slope is the most useful piece of information in this visualization: it represents the number of network packets being dropped at any given point in time, rather than the total number dropped since the system started up. In this example, using `RATE_AVG(system.network.dropped)` in Query Builder allows you to view a graph of this slope as it changes over time, as the sketch below illustrates.
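To make the counter behavior concrete, here is a minimal Python sketch with made-up numbers: the cumulative counter always rises, while differencing successive values recovers the per-interval change that a rate aggregation surfaces.

```python
# Hypothetical numbers: a counter only ever grows, so plotting its value
# trends up and to the right; differencing successive values recovers the
# per-interval change that RATE_AVG visualizes.
drops_per_interval = [5, 7, 6, 12, 4]  # packets dropped in each 10 s window

counter, total = [], 0  # cumulative counter, as the metric is stored
for d in drops_per_interval:
    total += d
    counter.append(total)

print(counter)  # [5, 12, 18, 30, 34] -- monotonically rising
print([b - a for a, b in zip(counter, counter[1:])])  # [7, 6, 12, 4] -- the slope
```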
The `RATE` operations are all calculated according to the same process. In the simplest case, imagine a dataset field that contains one value every 10 seconds.
| Timestamp | my.field |
| --- | --- |
| 12:30:00 | 100 |
| 12:30:10 | 150 |
| 12:30:20 | 210 |
| 12:30:30 | 265 |
A rate query is designed to get the difference between the value of `my.field` for a given time bucket and the previous time bucket. A rate query with a ten-second granularity would return that result, as expected. (With one value per bucket, `RATE_MAX(my.field)`, `RATE_SUM(my.field)`, and `RATE_AVG(my.field)` would all return the same result.)
| Timestamp | RATE_xxx(my.field) |
| --- | --- |
| 12:30:00 | no data |
| 12:30:10 | 150 - 100 = 50 |
| 12:30:20 | 210 - 150 = 60 |
| 12:30:30 | 265 - 210 = 55 |
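As a rough illustration (a sketch, not the product's implementation), the one-value-per-bucket case reduces to differencing consecutive values:

```python
# A minimal sketch of the single-value-per-bucket case: the rate is just
# the difference between each bucket's value and the previous bucket's.
values = [100, 150, 210, 265]  # my.field, one value per 10 s bucket

rates = [None]  # the first bucket has nothing to compare against
for prev, curr in zip(values, values[1:]):
    rates.append(curr - prev)

print(rates)  # [None, 50, 60, 55]
```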
However, if there is more than one data point in an individual bucket, the aggregation (`MAX`, `SUM`, or `AVG`) is applied to all data points in that bucket to coalesce them into a single value. Additionally, if there is a gap between the given bucket and the previous bucket with data, `RATE_xxx(field)` will linearly interpolate the difference between the values across the gap.
For example, with a dataset containing the following events:
| Timestamp | my.field |
| --- | --- |
| 12:30:10 | 0 |
| 12:30:13 | 3 |
| 12:30:15 | 6 |
| 12:30:20 | 10 |
| 12:30:23 | 11 |
| 12:30:25 | 24 |
| 12:30:30 | 27 |
| 12:30:33 | 29 |
| 12:30:35 | 34 |
| 12:30:50 | 39 |
| 12:30:53 | 44 |
| 12:30:55 | 49 |
Given the above events, querying the RATE aggregations against `my.field` with a granularity of 10 seconds would result in a graph of the following values:
| Time | RATE_MAX(my.field) | RATE_SUM(my.field) | RATE_AVG(my.field) |
| --- | --- | --- | --- |
| 12:30:10 | no data (nothing to compare) | no data | no data |
| 12:30:20 | MAX(10,11,24) - MAX(0,3,6) = 18 | SUM(10,11,24) - SUM(0,3,6) = 36 | AVG(10,11,24) - AVG(0,3,6) = 12 |
| 12:30:30 | MAX(27,29,34) - MAX(10,11,24) = 10 | SUM(27,29,34) - SUM(10,11,24) = 45 | AVG(27,29,34) - AVG(10,11,24) = 15 |
| 12:30:40 | no data (no underlying data) | no data | no data |
| 12:30:50 | (MAX(39,44,49) - MAX(27,29,34)) / 2 = 7.5 | (SUM(39,44,49) - SUM(27,29,34)) / 2 = 21 | (AVG(39,44,49) - AVG(27,29,34)) / 2 = 7 |
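Putting these rules together, here is a sketch of the overall process in Python, under the assumption that buckets span `[start, start + granularity)` and are labeled by their start time: bucket the events, aggregate each bucket, then difference against the previous non-empty bucket, dividing by the number of buckets spanned to interpolate across gaps.

```python
# A sketch of the RATE process described above (assumed details: buckets
# are [start, start + granularity), labeled by start time).
from collections import defaultdict

GRANULARITY = 10  # seconds

# (seconds past 12:30:00, my.field) -- the example dataset from the tables
events = [
    (10, 0), (13, 3), (15, 6), (20, 10), (23, 11), (25, 24),
    (30, 27), (33, 29), (35, 34), (50, 39), (53, 44), (55, 49),
]

def rate(agg):
    """Difference of per-bucket aggregates, spread across any empty buckets."""
    buckets = defaultdict(list)
    for ts, value in events:
        buckets[ts // GRANULARITY].append(value)

    results, prev = {}, None  # prev = (bucket index, aggregated value)
    for idx in sorted(buckets):
        agg_value = agg(buckets[idx])
        if prev is not None:
            prev_idx, prev_value = prev
            # Dividing by the bucket gap linearly interpolates across
            # empty buckets (e.g. the 12:30:40 bucket with no data).
            results[idx * GRANULARITY] = (agg_value - prev_value) / (idx - prev_idx)
        prev = (idx, agg_value)
    return results

print(rate(max))                           # {20: 18.0, 30: 10.0, 50: 7.5}
print(rate(sum))                           # {20: 36.0, 30: 45.0, 50: 21.0}
print(rate(lambda xs: sum(xs) / len(xs)))  # {20: 12.0, 30: 15.0, 50: 7.0}
```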
Query granularity has an important effect on the results of RATE aggregations, since it determines the number of data points that end up aggregated together in any specific bucket. `RATE_SUM`, in particular, is sensitive to granularity choice. If the number of data points that fit into a bucket varies, the result of the `SUM` aggregation will in turn be variable, and the calculated rate of change between buckets will reflect that variability.
For instance, imagine a metrics counter, such as `system.cpu.time.user`, with a value that is captured and stored in an event every 10 seconds. A `RATE_SUM()` aggregation over this counter with a granularity set to a multiple of 10 seconds (for example, a 30 second granularity) will result in a consistent number of data points in each bucket: in this example, 3 data points per host. If the granularity is set to a value that is not a multiple of 10 seconds (for example, a 15 second granularity), each bucket will contain a different number of data points, and thus an inconsistent aggregate result. In this example, buckets would alternate between 2 data points per host and 1 data point per host, as the sketch below shows.
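A small sketch (with illustrative timestamps) verifies the alternating bucket counts:

```python
# Sketch: samples arrive every 10 s; count how many land in each bucket
# for a 30 s versus a 15 s granularity (timestamps are illustrative).
samples = range(0, 120, 10)  # two minutes of 10-second samples

for granularity in (30, 15):
    counts = {}
    for ts in samples:
        bucket = ts // granularity
        counts[bucket] = counts.get(bucket, 0) + 1
    print(granularity, [counts[b] for b in sorted(counts)])

# 30 [3, 3, 3, 3]             -- every bucket aggregates 3 samples
# 15 [2, 1, 2, 1, 2, 1, 2, 1] -- bucket sizes alternate, so SUM fluctuates
```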
It is possible to create triggers for `RATE_MAX`, `RATE_AVG`, and `RATE_SUM` queries. Since query granularity is not available in triggers, `RATE` aggregations are calculated over the trigger duration. For instance, a trigger for `RATE_MAX(system.cpu.time.user)` with a duration of 15 minutes would calculate the difference between the most recent 15 minutes of data and the preceding 15 minutes of data.
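As a sketch of that behavior, assuming the trigger compares two adjacent windows of the trigger duration as described above (the helper below is hypothetical, not the product's implementation):

```python
# Sketch of a RATE trigger evaluation under the stated assumption that the
# aggregation runs over two adjacent windows of the trigger duration.
def trigger_rate(events, now, duration, agg=max):
    """events: iterable of (timestamp_seconds, value) pairs."""
    recent = [v for ts, v in events if now - duration <= ts < now]
    previous = [v for ts, v in events if now - 2 * duration <= ts < now - duration]
    if not recent or not previous:
        return None  # not enough data to compare
    return agg(recent) - agg(previous)

# Hypothetical counter sampled every 5 minutes over the last half hour:
events = [(0, 100), (300, 130), (600, 155), (900, 190), (1200, 220), (1500, 240)]
print(trigger_rate(events, now=1800, duration=900))  # 240 - 155 = 85
```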