New to running queries in Honeycomb? Check out the introduction to building queries!
Derived columns allow you to run queries based on the value of an expression that’s derived from the columns in an event, without having to modify the structure of the event itself. They can be used in any clause of the query builder (“Visualize,” “Where,” “Group By,” etc.), as well as in most other places where you are asked to input a field name.
For example, let’s say you have events from a web server. You want to compare latency for small requests and large requests, where a request is “small” if the value of its content_length
field is less than 1000. Adding content_length
to the “Group By” clause of a query would create a separate time series for every single value of content_length
. That’s not ideal.
Instead, you can use a derived column. Derived columns are similar to functions in a spreadsheet. In this case, we’ll want to group by the expression
LTE($content_length, 1000)
which evaluates to true
if the value of the content_length
column is less than 1000, and false
otherwise.
You can add a derived column by clicking “Create derived column” in the dropdown of the Group By clause:
That will open a modal dialog where you can craft your expression:
and see sample values computed from recent data. After you click “create column”, the query builder will update to include your new derived column:
You can now run the updated query to get the breakdown you wanted:
After you create a derived column, it’ll remain available for reuse in later queries. You can view the list of derived columns, add new ones, and delete old derived columns by going to your dataset’s schema page (from the query page, click “Overview”, then “Schema”).
Derived columns are composed of functions with some number of arguments enclosed in parenthesis. All function names are all-capital, case-sensitive:
Arguments can be any of the following:
$
, e.g. $column_name
, which will be populated with values from the given column. If the column name has spaces in it, the name must be surrounded by double quotes, e.g. $“column name”.FOO(c, d)
"foo"
). Escaped special characters such as “\n” will be interpreted.`bar`
). This is useful for regular expressions which include sequences such as \d
.true
and false
null
, which is equivalent to an empty (missing) value.A list of all functions is at the Derived Column Function Reference
To access all of your derived columns, click on Dataset Settings and select the Schema tab. The list of derived columns is below the list of columns. From here, you can create, edit and delete your derived columns.
There are many ways to access Dataset Settings. One way is shown below: from the query view, click the Details tab on the right hand side panel and select Go to Dataset Settings
Click the Edit button to edit a derived column.
If this derived column is used in a board, a trigger or an SLO, you will get a warning that this derived column is in use.
You can click Continue to proceed with the edits. The objects dependent on that derived column will be updated to use the new definition.
Sometimes, you may not have created the objects that are dependent on this derived column and may not know if it is safe to proceed with editing. In this case, you have a couple of options:
Click the Remove button to delete a derived column. Note that only team owners and the derived column creator have permission to remove it.
If this derived column is used in a board, trigger or SLO, Honeycomb will not allow you to delete the derived column. The objects dependent on the derived column could stop functioning should you remove it without resolving the dependencies.
The titles of each object are hyperlinked to make it easy for you to go to the object and remove the dependency. As you resolve the dependencies, click the Refresh button to reload the list. Once there are no dependencies, Honeycomb will allow you to remove the derived column.
Many derived column functions are sensitive to column types. In confirming that a derived column function is working the way you expect, check the types of the inputs and outputs. It can be helpful to coerce column types to the expected type: GTE(INT($duration_ms), 50)
will definitely do the expected behavior; GTE($duration_ms, 50)
fails if $duration_ms
is a string column.
The derived column creator accepts constant input; it can be useful to get to know functions by entering sample values: GTE(23, 24)
.
When using Secure Tenancy, note that some Derived Column functions are not available. Please check the guide to derived columns in Secure Tenancy