Calculate with Derived Columns | Honeycomb

We use cookies or similar technologies to personalize your online experience & tailor marketing to you. Many of our product features require cookies to function properly.

Read our privacy policy I accept cookies from this site

Calculate with Derived Columns

New to running queries in Honeycomb? Check out the introduction to building queries!

Derived columns allow you to run queries based on the value of an expression that is derived from the columns in an event, without having to modify the structure of the event itself. They can be used in any clause of the query builder (“Visualize,” “Where,” “Group By,"), as well as in most other places where you are asked to input a field name.

For example, let us say you have events from a web server. You want to compare latency for small requests and large requests, where a request is “small” if the value of its content_length field is less than 1000. Adding content_length to the “Group By” clause of a query would create a separate time series for every single value of content_length. That is not ideal.

Instead, you can use a derived column. Derived columns are similar to functions in a spreadsheet. In this case, we will want to group by the expression

LTE($content_length, 1000)

which evaluates to true if the value of the content_length column is less than 1000, and false otherwise.

You can add a derived column by clicking “Create derived column” in the dropdown of the Group By clause:

Derived column selector

That will open a modal dialog where you can craft your expression:

Modal for creating a derived column

and see sample values computed from recent data. After you click “create column”, the query builder will update to include your new derived column:

Query builder after adding a derived column

You can now run the updated query to get the breakdown you wanted:

Query result with derived column

After you create a derived column, it will remain available for reuse in later queries. You can view the list of derived columns, add new ones, and delete old derived columns by going to your dataset’s schema page (from the query page, click “Overview”, then “Schema”).

Derived Column Syntax 

Derived columns are composed of functions with some number of arguments enclosed in parenthesis. All function names are all-capital and case-sensitive.

Arguments can be any of the following:

  • A column name preceded by a $, $column_name for example, which will be populated with values from the given column. If the column name has spaces in it, the name must be surrounded by double quotes, $“column name”.
  • A nested function, FOO(c, d)
  • A string literal enclosed in double quotes ("foo"). Escaped special characters such as “\n” will be interpreted.
  • A raw string literal enclosed in backticks (`bar`). This is useful for regular expressions which include sequences such as \d.
  • An integer or floating-point literal
  • The boolean literals true and false
  • null, which is equivalent to an empty (missing) value.

A list of all functions is at the Derived Column Function Reference

Accessing Derived Columns 

To access all of your derived columns, click on Dataset Settings and select the Schema tab. The list of derived columns is below the list of columns. From here, you can create, edit and delete your derived columns.

There are many ways to access Dataset Settings. One way is shown below: from the query view, click the Details tab on the right hand side panel and select Go to Dataset Settings Dataset Settings

Editing Derived Columns 

Click the Edit button to edit a derived column. Edit Derived Column

If this derived column is used in a board, a trigger or an SLO, you will get a warning that this derived column is in use.

You can click Continue to proceed with the edits. The objects dependent on that derived column will be updated to use the new definition.

Sometimes, you may not have created the objects that are dependent on this derived column and may not know if it is safe to proceed with editing. In this case, you have a couple of options:

  • Each object includes the most recent editor and the creator of the object so that you can reach out to them and verify that changing the expression is acceptable.
  • You can opt to Clone the derived column - which will create a new, duplicate derived column that you can then edit.

Edit Derived Column

Deleting Derived Columns 

Click the Remove button to delete a derived column. Note that only team owners and the derived column creator have permission to remove it.

If this derived column is used in a board, trigger or SLO, Honeycomb will not allow you to delete the derived column. The objects dependent on the derived column could stop functioning should you remove it without resolving the dependencies.

The titles of each object are hyperlinked to make it easy for you to go to the object and remove the dependency. As you resolve the dependencies, click the Refresh button to reload the list. Once there are no dependencies, Honeycomb will allow you to remove the derived column.

Troubleshooting Derived Columns 

Many derived column functions are sensitive to column types. In confirming that a derived column function is working the way you expect, check the types of the inputs and outputs. It can be helpful to coerce column types to the expected type: GTE(INT($duration_ms), 50) will definitely do the expected behavior; GTE($duration_ms, 50) fails if $duration_ms is a string column.

The derived column creator accepts constant input; it can be useful to get to know functions by entering sample values: GTE(23, 24).

When using Secure Tenancy, note that some Derived Column functions are not available. Please check the guide to derived columns in Secure Tenancy

Did you find what you were looking for?