Calculate with Derived Columns | Honeycomb

Calculate with Derived Columns

Derived columns allow you to define custom fields in a Dataset or an Environment. The field’s value is the result of an expression that applies functions and mathematical or logical operations to other fields’ values, similar to a formula in a spreadsheet.

Derived columns behave like any other field in your data. Use them in Query Builder, boards, triggers, and Service Level Objectives (SLOs).

Use derived columns to express:

  • an event detail in a more human-readable way
  • business measures that may change over time
  • a common field across services with different implementations (see Derived Column Example - Multiple Datasets)

Access a Derived Column 

Derived columns exist in the schema of a Dataset or an Environment.

To reach the schema of a Dataset:

  1. In Data Settings, select a Dataset.

    The Dataset Settings page displays several tabs.

  2. Select the Schema tab.

  3. Select Derived Columns to expand.

    The schema displays a button to create a new derived column, a search box, and a list of any existing derived columns.

    Navigating through the UI to the schema of a Dataset

To reach the schema of an Environment:

  1. In the left navigation bar under the Honeycomb logo, select the Environments banner. A menu appears with Manage Environments and a list of existing Environments.

  2. Select Manage Environments. The next screen displays a list of Environments and details about each Environment.

  3. In the row of the target Environment, select the target Environment’s name. The next screen displays the Environment Settings page with several tabs.

  4. Select the Schema tab.

    The schema displays a button to create a new derived column, a search box, and a list of any existing derived columns.

    Navigating through the UI to the schema of an Environment

Create a Derived Column 

Any team owner or member can create a derived column in the Honeycomb UI.

Create derived columns from the Dataset’s schema, from the Environment’s schema, or from Query Builder. To create one from Query Builder:

  1. In Query Builder, select the GROUP BY clause.

    A list of suggested fields appears when you select the clause with a mouse or trackpad, or when you press the down arrow on the keyboard.

  2. Select Create Derived Column.

    The “Create a Derived Column” editor appears.

    Derived column selector

Edit a Derived Column 

Any team owner or member can edit a derived column in the Honeycomb UI.

Edit derived columns by accessing the Dataset’s schema or the Environment’s schema.

  1. Select Edit for the derived column.

    Edit derived column

    Changes to a derived column may affect boards, triggers, or SLOs.

    If a dependency exists, an “Edit Derived Column” modal appears first and presents the choice to clone or continue with editing the derived column. If you are unsure whether to clone or edit the column, reach out to the most recent editor of the derived column.

    Edit a derived column modal with a dependency
  2. Select Clone or Continue.

    Clone creates an unsaved copy of the derived column with a new name in the editor. Continue opens the existing derived column in the editor.

Clone a Derived Column 

Any team owner or member can clone a derived column in the Honeycomb UI.

Clone derived columns by accessing the Dataset’s schema or the Environment’s schema.

Clone a derived column from the table
  1. Select Clone next to the target derived column. The derived column editor appears.

  2. An unsaved copy of the target derived column populates in the editor.

  3. Modify the derived column name and formula as needed.

  4. Select Continue to save.

Derived Column Editor 

Note
If you are creating or editing a derived column for an Environment, a banner at the top of the editor informs you that this derived column will be shared across all datasets in your Environment.

The editor displays the following elements:

Display Name
The display name defines an alias that Query Builder treats as a field. This alias is unique within its dataset and within the environment in which that dataset exists.
Tip
Derived column names must be unique. Honeycomb tries to prevent situations where a derived column and a regular field share the same name, but name collisions can still occur. See the Troubleshoot Derived Columns section for more information.
Description
Additional information that gives context for the derived column or states its purpose.
Function
Define your derived column expression here. Refer to the Derived Column Reference for syntax and a list of available functions. Syntax errors in the expression appear underlined in red or marked with a red triangle. Hover over an error marker to display details about the error, or refer to the error message displayed by the editor.
Derived column syntax error for a function
Preview Data
Preview the results of the function below the expression editor. The sample data shown here helps verify the expression before it is saved.
Derived column preview data for a function
Save
After you select Save, the derived column appears in the schema.

Delete a Derived Column 

Only the derived column’s creator or a Team Owner can delete a derived column in the Honeycomb UI.

Delete derived columns by accessing the Dataset’s schema or the Environment’s schema.

  1. Select Delete for the derived column.

    Delete derived column

    Queries that use a derived column continue to work, but the alias no longer appears in Query Builder. Deleting a derived column may affect boards, triggers, or SLOs. If any dependency exists, a “Delete Derived Column” modal presents a list of its dependencies.

    Delete a derived column modal with dependent objects shown
  2. To continue deletion, remove any objects dependent on the derived column.

    Each object links directly to its respective editor. If you are unsure whether to remove the dependencies, reach out to the most recent editor of the dependent object.

  3. Select Refresh Dependencies to reload the list after removing all dependencies.

    A modal appears to confirm deletion of the derived column.

  4. Select Delete to confirm deletion of the derived column.

    Delete a derived column confirmation modal
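
The deletion steps above amount to a dependency gate: the column can only be removed once nothing depends on it. The following is a minimal Python sketch of that rule, not Honeycomb's implementation. The `DerivedColumn` class, the `delete_derived_column` function, and the dependent labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DerivedColumn:
    alias: str
    # Hypothetical labels for boards, triggers, or SLOs that use this column.
    dependents: list = field(default_factory=list)


def delete_derived_column(schema, alias):
    """Delete `alias` from `schema` unless dependents remain.

    Returns the blocking dependents. An empty list means the column was deleted.
    """
    column = schema[alias]
    if column.dependents:
        # Mirrors the modal that lists dependencies instead of deleting.
        return list(column.dependents)
    del schema[alias]
    return []


schema = {
    "smallRequest": DerivedColumn(
        "smallRequest",
        ["board: Request sizes", "trigger: Large request spike"],
    )
}

# First attempt: blocked, because two objects still depend on the column.
blocking = delete_derived_column(schema, "smallRequest")
print(blocking)

# After the dependencies are removed, the same call succeeds.
schema["smallRequest"].dependents.clear()
remaining = delete_derived_column(schema, "smallRequest")
print(remaining, "smallRequest" in schema)
```

The second call stands in for selecting Refresh Dependencies and then confirming the deletion.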

Derived Column Example - Query 

A web server handles requests with varying sizes of content.

A query with GROUP BY content_length returns a time series graph that displays the count for each value of content_length.

The operations team wants to compare small requests to all other requests. The team internally categorizes a small request as having a content_length of less than or equal to 1000.

The team creates a derived column named smallRequest with the function LTE($content_length, 1000).

A query with GROUP BY smallRequest returns a time series graph that displays the small requests alongside the other requests.

Derived Column Example Scenario Query
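
To make the scenario concrete, here is a minimal Python sketch that models what the derived column and the GROUP BY do to a handful of events. It is a local illustration only: the event values are invented, and `lte` is a stand-in written for this sketch, not Honeycomb's evaluator.

```python
from collections import Counter


def lte(left, right):
    """Local stand-in for LTE(a, b): true when a is less than or equal to b."""
    return left <= right


# Hypothetical events; only the content_length field name comes from the example.
events = [
    {"content_length": 120},
    {"content_length": 1000},
    {"content_length": 1001},
    {"content_length": 48213},
    {"content_length": 512},
]

# Derived column: smallRequest = LTE($content_length, 1000)
for event in events:
    event["smallRequest"] = lte(event["content_length"], 1000)

# GROUP BY smallRequest with a COUNT: one bucket per distinct value.
counts = Counter(event["smallRequest"] for event in events)
print("small:", counts[True], "other:", counts[False])
```

Grouping by `content_length` directly would produce five groups here, one per distinct value. Grouping by `smallRequest` collapses them into two, which is the comparison the operations team wants.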

Derived Column Example - Multiple Datasets 

An Environment named Production contains three Datasets named PaymentService, CheckoutService, and SupportService. The datasets all define a derived column with the same alias.

Dataset          | Derived Column Alias | Expression
PaymentService   | total_amount_in_usd  | COALESCE($usd_total, ADD(MUL($tax_rate, $item_total), $fees))
CheckoutService  | total_amount_in_usd  | MUL($cart_total, $tax_rate)
SupportService   | total_amount_in_usd  | SUM(0)

A query across the Production Environment returns data for each Dataset, with total_amount_in_usd calculated by that Dataset’s own expression:

  • PaymentService values use the PaymentService expression.
  • CheckoutService values use the CheckoutService expression.
  • SupportService values use the SupportService expression.

The Production Environment cannot define a derived column with the alias total_amount_in_usd because the alias is already used by a derived column in one or more datasets.
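
The sketch below models this scenario in Python: each Dataset evaluates the shared alias with its own expression, and an Environment-level alias is refused when any Dataset already uses it. The helper functions, the event values, and the `environment_alias_allowed` check are hypothetical and written only for this illustration. In particular, having `add` and `mul` return `None` for a missing input is a simplification of this sketch, not documented Honeycomb behavior.

```python
def coalesce(*values):
    """Stand-in for COALESCE: the first argument that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


def add(a, b):
    # Assumption of this sketch: a missing input makes the result None.
    return None if a is None or b is None else a + b


def mul(a, b):
    # Same assumption as add().
    return None if a is None or b is None else a * b


# One expression per Dataset, all published under the alias total_amount_in_usd.
EXPRESSIONS = {
    "PaymentService": lambda e: coalesce(
        e.get("usd_total"),
        add(mul(e.get("tax_rate"), e.get("item_total")), e.get("fees")),
    ),
    "CheckoutService": lambda e: mul(e.get("cart_total"), e.get("tax_rate")),
    "SupportService": lambda e: sum([0]),  # SUM(0)
}

# Hypothetical events, tagged with the Dataset they belong to.
events = [
    ("PaymentService", {"usd_total": 25.0}),
    ("PaymentService", {"tax_rate": 1.25, "item_total": 80, "fees": 2.5}),
    ("CheckoutService", {"cart_total": 40, "tax_rate": 1.25}),
    ("SupportService", {"ticket_id": "T-1"}),
]

# An Environment-wide query picks the expression by each event's Dataset.
results = [(dataset, EXPRESSIONS[dataset](event)) for dataset, event in events]
for dataset, total in results:
    print(dataset, total)


def environment_alias_allowed(alias, aliases_by_dataset):
    """An Environment alias is allowed only if no Dataset already uses it."""
    return not any(alias in aliases for aliases in aliases_by_dataset.values())


aliases_by_dataset = {name: {"total_amount_in_usd"} for name in EXPRESSIONS}
print(environment_alias_allowed("total_amount_in_usd", aliases_by_dataset))
```

The first PaymentService event already carries `usd_total`, so `coalesce` returns it unchanged. The second event falls through to the computed total.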

Troubleshoot Derived Columns 

Null or Unexpected Value 

When a derived column displays a null value or an unexpected value, check for these causes:

  • A value type mismatch. Many functions are sensitive to the input’s value type. Review the field types in the schema to ensure they match the inputs that the functions expect. If they do not, coerce the fields with a cast operator function.

  • A name shared with event data. If an event sent to Honeycomb contains a field with the same name as a derived column, the derived column takes priority over the data sent to Honeycomb. We recommend that you rename either the field in the instrumentation or the derived column.
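
Both causes can be reproduced in a small local model. The Python sketch below is an illustration under stated assumptions, not Honeycomb's evaluator: `lte`, `to_int`, and `resolve` are hypothetical helpers, and treating a type mismatch as `None` is an assumption chosen to mimic a null result. Check the Derived Column Reference for the actual cast functions and their behavior.

```python
def to_int(value):
    """Cast to int, returning None when the value cannot be parsed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def lte(left, right):
    # Assumption of this sketch: non-numeric input yields None, not an error.
    if not isinstance(left, (int, float)) or not isinstance(right, (int, float)):
        return None
    return left <= right


def resolve(name, event, derived_columns):
    """Look up a name. A derived column takes priority over an event field."""
    if name in derived_columns:
        return derived_columns[name](event)
    return event.get(name)


# content_length arrives as a string, and the event also carries a field
# whose name matches the derived column.
event = {"content_length": "512", "smallRequest": "sent-by-instrumentation"}

uncast = {"smallRequest": lambda e: lte(e.get("content_length"), 1000)}
cast = {"smallRequest": lambda e: lte(to_int(e.get("content_length")), 1000)}

print(resolve("smallRequest", event, uncast))  # type mismatch in this model
print(resolve("smallRequest", event, cast))    # cast applied before comparing
print(resolve("smallRequest", event, {}))      # no derived column: raw field
```

In this model, the value sent by the instrumentation only becomes visible when no derived column of the same name exists, which is why renaming one of the two removes the ambiguity.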

Derived Column Name Collision 

Honeycomb does not allow a user to create a derived column that has the same name as a regular field. However, if a derived column already exists and Honeycomb then receives an event with a field of the same name, Honeycomb allows both the derived column and the field to exist, and a warning message displays in the Dataset Settings > Schema tab.

When a name collision does occur, it may cause confusing behavior in Honeycomb. For example, links that use a duplicate field name may link to the wrong location.

When a name collision is detected, try to resolve it:

  • Change the name of the derived column as soon as possible to a unique name that is not already used by a field.

  • If changing the name of the derived column is not possible, change your instrumentation to rename the duplicate-named field. After the instrumentation change, delete the duplicate-named field in Honeycomb. Two options exist to delete this field:

    1. Use the Column Deletion API. This prevents the column from being queried and removes its data from new query results. It does not remove permalinks to existing queries that may contain this data.
    2. For Pro and Enterprise users, contact Support, via support.honeycomb.io or email at support@honeycomb.io, for assistance in deletion.
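
If you keep a list of field names and derived column aliases outside Honeycomb, for example in an instrumentation review, a collision check is a set intersection. The Python sketch below shows one possible approach. The function names, the sample names, and the `_derived` suffix are hypothetical, and the sketch does not call any Honeycomb API.

```python
def find_name_collisions(field_names, derived_column_aliases):
    """Return the names used by both a regular field and a derived column."""
    return sorted(set(field_names) & set(derived_column_aliases))


def suggest_unique_alias(alias, taken_names, suffix="_derived"):
    """Append a suffix until the alias no longer matches a taken name."""
    candidate = alias
    while candidate in taken_names:
        candidate += suffix
    return candidate


# Hypothetical schema contents.
field_names = ["content_length", "smallRequest", "duration_ms"]
derived_column_aliases = ["smallRequest", "total_amount_in_usd"]

collisions = find_name_collisions(field_names, derived_column_aliases)
print(collisions)

taken = set(field_names) | set(derived_column_aliases)
new_alias = suggest_unique_alias("smallRequest", taken)
print(new_alias)
```

Renaming the derived column, as in `suggest_unique_alias`, corresponds to the first resolution option above. Renaming the field in the instrumentation is the fallback when the alias cannot change.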