Define Dataset Schema

Define Schema 

The Schema tab lets you choose whether nested JSON is unpacked automatically, and it lists the Dataset’s Unique Fields and Derived Columns.

Unique Fields lists all the unique fields seen in events from this dataset. You can modify a field’s data type here if needed.

Tip
If you change a field’s type on the Schema page and later send an event whose value does not match that type, Honeycomb tries to coerce the value. If Honeycomb cannot coerce the value, it sets the value to zero (0) rather than dropping it.
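As a rough illustration of the fallback described in this tip, the following Python sketch tries to convert a value and stores zero (0) when the conversion fails. The coerce function, the type names, and the use of Python's own converters are assumptions made for the example; Honeycomb's actual coercion rules are not described here and may differ.

```python
from typing import Any

# Hypothetical mapping from schema type names to Python converters.
CONVERTERS = {"integer": int, "float": float, "string": str}


def coerce(value: Any, target_type: str) -> Any:
    """Try to convert value to target_type; fall back to 0 if that fails."""
    try:
        return CONVERTERS[target_type](value)
    except (TypeError, ValueError):
        # Conversion failed: keep the field, but store zero instead.
        return 0


print(coerce("42", "integer"))   # 42
print(coerce("abc", "integer"))  # 0
```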

Custom Fields 

Each Dataset allows you to define unique custom fields for use within it.

What are Custom Fields? 

Otherwise known as Derived Columns, custom fields are properties whose values are computed from a formula.

The field’s value is based on the result of a formula, which can contain functions, mathematical and/or logical operations on other field values, and constants and literals, similar to expressions defined in a spreadsheet. To learn more about syntax and available functions, visit Derived Column Formula Reference.

Uses 

Though custom fields are functionally different from other fields, you can use them in Honeycomb just as you would use other fields in your data, such as when building queries or defining Triggers and Service Level Objectives (SLOs). For example, you can use custom fields to express:

  • an event detail in a more human-readable way
  • business measures that may change over time
  • a common field across services with different implementations

Some examples include:

Use a Custom Field in a Query to Compare Values

Example scenario: Your team manages a web server that handles requests with varying sizes of content. Querying with GROUP BY content_length returns a time series graph that displays the count for each value of content_length. You want to compare small requests to all other requests. Your team internally categorizes a small request as having a content_length of less than or equal to 1000.

Solution: Create a Derived Column named smallRequest with the function LTE($content_length, 1000).

When you query in Honeycomb with a GROUP BY clause of smallRequest, you receive a time series graph that displays the small requests alongside the other requests.
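To make the grouping concrete, here is a minimal Python sketch of what the smallRequest Derived Column computes for each event. The lte helper, the count_by_small_request function, and the sample events are hypothetical stand-ins for illustration, not Honeycomb code.

```python
from collections import Counter


def lte(left, right):
    """Stand-in for the LTE function: True when left <= right."""
    return left <= right


def count_by_small_request(events):
    """Count events per smallRequest value, like COUNT with GROUP BY smallRequest."""
    return Counter(lte(event["content_length"], 1000) for event in events)


# Hypothetical sample events.
events = [
    {"content_length": 250},
    {"content_length": 1000},
    {"content_length": 1001},
    {"content_length": 48000},
]

counts = count_by_small_request(events)
print(counts[True], counts[False])  # 2 2
```

Every distinct content_length value collapses into one of two groups, which is why the resulting graph shows two series instead of one per value.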

Use a Custom Field across Multiple Datasets

Example scenario: Your team has an Environment named “Production”, which contains Datasets named “PaymentService”, “CheckoutService”, and “SupportService”. You have defined a derived column with the same name in each Dataset:

Dataset          Derived Column Name   Formula
PaymentService   total_amount_in_usd   COALESCE($usd_total, ADD(MUL($tax_rate, $item_total), $fees))
CheckoutService  total_amount_in_usd   MUL($cart_total, $tax_rate)
SupportService   total_amount_in_usd   SUM(0)

When you query across the Production Environment in Honeycomb, you receive:

  • data for the PaymentService with the total_amount_in_usd values calculated with its expression
  • data for the CheckoutService with the total_amount_in_usd values calculated with its expression
  • data for the SupportService with the total_amount_in_usd values calculated with its expression
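The per-Dataset behavior in the list above can be sketched in Python as follows. The helper functions, the treatment of missing fields as None, and the sample events are assumptions made for the example; they mimic the three formulas but are not how Honeycomb evaluates them internally.

```python
def coalesce(*values):
    """Return the first argument that is not None, like COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def mul(a, b):
    """Multiply, yielding None if either input is missing (an assumption)."""
    return None if a is None or b is None else a * b


def add(a, b):
    """Add, yielding None if either input is missing (an assumption)."""
    return None if a is None or b is None else a + b


# One formula per Dataset, all sharing the name total_amount_in_usd.
FORMULAS = {
    "PaymentService": lambda e: coalesce(
        e.get("usd_total"),
        add(mul(e.get("tax_rate"), e.get("item_total")), e.get("fees")),
    ),
    "CheckoutService": lambda e: mul(e.get("cart_total"), e.get("tax_rate")),
    "SupportService": lambda e: 0,  # SUM(0)
}


def total_amount_in_usd(dataset, event):
    """Evaluate the formula that belongs to the event's own Dataset."""
    return FORMULAS[dataset](event)


print(total_amount_in_usd("PaymentService", {"usd_total": 25.0}))  # 25.0
print(total_amount_in_usd(
    "PaymentService", {"tax_rate": 1.25, "item_total": 40.0, "fees": 2.0}
))  # 52.0
print(total_amount_in_usd(
    "CheckoutService", {"cart_total": 80.0, "tax_rate": 1.25}
))  # 100.0
print(total_amount_in_usd("SupportService", {}))  # 0
```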

You cannot define an Environment-level Derived Column for “Production” with the name total_amount_in_usd because the name is already used by a Dataset-level Derived Column in one or more datasets.

Solution: Name the Environment-level Derived Column something unique to your Environment or change your instrumentation to rename the Derived Columns in the Datasets.
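Before choosing a name for an Environment-level Derived Column, it can help to check which Datasets already use it. The sketch below is a hypothetical helper that works on a hand-written mapping of Dataset names to field and Derived Column names; it does not call Honeycomb, and the find_name_conflicts function and the candidate name env_total_usd are invented for the example.

```python
def find_name_conflicts(candidate, environment):
    """Return the Datasets whose field or Derived Column names include candidate."""
    return sorted(
        dataset for dataset, names in environment.items() if candidate in names
    )


# Hypothetical snapshot of names in the "Production" Environment.
production = {
    "PaymentService": {
        "usd_total", "tax_rate", "item_total", "fees", "total_amount_in_usd",
    },
    "CheckoutService": {"cart_total", "tax_rate", "total_amount_in_usd"},
    "SupportService": {"total_amount_in_usd"},
}

print(find_name_conflicts("total_amount_in_usd", production))
# ['CheckoutService', 'PaymentService', 'SupportService']
print(find_name_conflicts("env_total_usd", production))
# []
```

An empty result suggests the name is free to use at the Environment level in this snapshot.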

Create Derived Column 

To create a Derived Column:

  1. Log in to the Honeycomb UI.

  2. Select the Environments label on the top-left, then select the Environment that contains the Dataset to which you would like to add a Derived Column.

  3. In the left navigation menu, select Manage Data.

  4. In the list, locate and select Datasets.

  5. In the list, locate the Dataset to which you would like to add a Derived Column, and select its name to view the available settings.

  6. Navigate to the Define Schema view.

  7. Select Add new Derived Column.

  8. In the modal, enter a Display Name, which will show as the field name in the Query Builder.

    Tip

    Enter a display name that is unique across the Dataset and its containing Environment. Your Derived Column name should not match the name of any other Derived Column or any other field in any Dataset contained within the Environment.

    Although Honeycomb tries to prevent duplicate field names, they can still occur. To learn more about behaviors related to name collision and solutions, visit Common Issues with Queries: Derived Columns.

  9. In the editor, define the formula for your Derived Column. To learn more about syntax and available functions, and to explore some example formulas, visit Derived Column Formula Reference.

    Hover over any syntax errors (red underlines or red triangles) for assistance with correcting them.

  10. Select Save.

Change Derived Column 

Warning
Changes to a Derived Column may affect Boards, Triggers, or SLOs.

To change a Derived Column:

  1. Log in to the Honeycomb UI.

  2. Select the Environments label on the top-left, then select the Environment that contains the Dataset for which you would like to change a Derived Column.

  3. In the left navigation menu, select Manage Data.

  4. In the list, locate and select Datasets.

  5. In the list, locate the Dataset for which you would like to change a Derived Column, and select its name to view the available settings.

  6. Navigate to the Define Schema view.

  7. In the list, locate the Derived Column that you want to edit, and select Edit.

    If any dependencies exist, Honeycomb will prompt you to choose whether to clone or continue editing the Derived Column. If you are unsure about whether to clone or edit, reach out to the Derived Column’s most recent editor.

  8. In the modal, modify the pre-populated name and formula as needed. To learn more about syntax and available functions, and to explore some example formulas, visit Derived Column Formula Reference.

  9. Select Save.

Clone Derived Column 

If you have a Derived Column that is similar to one you want to create, you can clone it and use it as the base for a new Derived Column. To clone a Derived Column:

  1. Log in to the Honeycomb UI.

  2. Select the Environments label on the top-left, then select the Environment that contains the Dataset for which you would like to clone a Derived Column.

  3. In the left navigation menu, select Manage Data.

  4. In the list, locate and select Datasets.

  5. In the list, locate the Dataset that contains the Derived Column you want to clone, and select its name to view the available settings.

  6. Navigate to the Define Schema view.

  7. In the list, locate the Derived Column that you want to clone, and select Clone.

  8. In the modal, modify the pre-populated name and formula as needed. To learn more about syntax and available functions, and to explore some example formulas, visit Derived Column Formula Reference.

  9. Select Save.

Delete Derived Column 

Warning

Deleting a Derived Column may affect Boards, Triggers, or SLOs.

Only the Derived Column’s creator or a Team Owner can delete a Derived Column in the Honeycomb UI. To delete a Derived Column:

  1. Log in to the Honeycomb UI.

  2. Select the Environments label on the top-left, then select the Environment that contains the Dataset for which you want to delete a Derived Column.

  3. In the left navigation menu, select Manage Data.

  4. In the list, locate and select Datasets.

  5. In the list, locate the Dataset that contains the Derived Column you want to delete, and select its name to view the available settings.

  6. Navigate to the Define Schema view.

  7. In the list, locate the Derived Column that you want to delete, and select Delete.

    If any dependencies exist, you will see a list of dependencies that you must remove to continue. If you are unsure about whether to remove the dependencies, reach out to each dependency’s most recent editor. After removing all dependencies, select Refresh Dependencies.

  8. From the modal, select Delete.

    Queries that use your Derived Column will continue to work, but the alias will no longer appear in the Query Builder.