Parse with Regex


Note
This feature is available as an add-on for the Honeycomb Enterprise plan. Please contact your Honeycomb account team for details.
Supported telemetry types: Metrics, Logs, Traces
Telemetry Pipeline Agent: v1.36.0+

Description 

The Parse with Regex Processor extracts and transforms telemetry data (logs, metrics, and traces) using regular expressions (regex) with named capture groups. Users define regex patterns that parse and restructure data from different source fields, and each named capture group labels the value it extracts.

Use 

This processor is useful when users need to extract and categorize specific elements from unstructured or semi-structured data. Each named capture group in the regex pattern assigns a name to the value it extracts, so the value can be referenced directly in later analysis, monitoring, or alerting.
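As a rough illustration of how named capture groups turn a semi-structured line into labeled fields, the sketch below uses Python's `re` module rather than the processor itself. The log line is the sample message used later on this page; the group names `date`, `time`, `level`, and `text` are assumptions chosen for the example.

```python
import re

# Sample message from the example on this page.
LINE = "2023-06-20 14:32:10 Error: An error occurred. ErrorCode: ER1023"

# Each (?P<name>...) group becomes a key in the parsed result.
# Group names other than errorCode are illustrative assumptions.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>\w+): (?P<text>.*) ErrorCode: (?P<errorCode>\w+)"
)

def parse_line(line: str) -> dict:
    """Return the named groups as a dict, or an empty dict if nothing matches."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}

fields = parse_line(LINE)
print(fields["level"], fields["errorCode"])  # Error ER1023
```

Lines that do not match the pattern yield an empty result here; how the processor treats non-matching entries is not covered by this sketch.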

Configuration 

Field: Description

  • Telemetry Types: The types of telemetry to apply the processor to.
  • Condition: The condition, expressed in OTTL, that must be met for the processor to be applied. Allows users to apply specific criteria to select the data entries to be processed.
  • Source Field Type: The type of the source field where the regex will be applied. It can be Resource, Attribute, Body, or Custom for logs; Resource, Attribute, or Custom for metrics and traces.
  • Source Field: The field where the regex is applied, within the selected Source Field Type.
  • Target Field Type: The type of the target field where the parsed data will be stored. It can be Resource, Attribute, Body, or Custom for logs; Resource, Attribute, or Custom for metrics and traces.
  • Regex Pattern: The regex pattern, containing at least one named capture group, used to extract or transform specific elements of the telemetry data.
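Because the Regex Pattern depends on named capture groups, it can help to check a pattern locally before entering it. The following is a minimal sketch using Python's `re` module, which accepts the same `(?P<name>...)` syntax shown on this page; the helper `named_groups` is hypothetical, and whether the processor itself rejects patterns without named groups is not stated here.

```python
import re

def named_groups(pattern: str) -> list:
    """Return the named capture groups in a pattern, in order of appearance."""
    compiled = re.compile(pattern)  # raises re.error if the pattern is invalid
    # groupindex maps group name -> group number; sort names by group number.
    return sorted(compiled.groupindex, key=compiled.groupindex.get)

print(named_groups(r"ErrorCode: (?P<errorCode>\w+)"))  # ['errorCode']
print(named_groups(r"ErrorCode: (\w+)"))               # []
```

An empty list indicates that the pattern has only unnamed groups (or none), so no named value would be available from it.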

Example Configurations 

Extract Error Codes from Log Messages 

In this example, the Parse with Regex Processor is configured to extract error codes embedded in log messages. Because these messages are unstructured, a regex pattern with a named capture group is used to locate the error code and give it a name.


Here is a sample log entry divided into body and attributes:

Body:

{
  "message": "2023-06-20 14:32:10 Error: An error occurred. ErrorCode: ER1023"
}

Attributes:

{
  "timestamp": "2023-06-20 14:32:10"
}

The objective is to extract the error code “ER1023” and assign it to a new attribute for enhanced analysis. The configuration for the Parse with Regex Processor is as follows:

  • Condition: "body contains 'ErrorCode:'"
  • Source Field Type: Body
  • Source Field: message
  • Target Field Type: Attribute
  • Regex Pattern: "ErrorCode: (?P<errorCode>\w+)"

With this setup, the named capture group “errorCode” supplies the name under which the extracted error code is stored. After processing, the attributes section of the log entry is updated as follows:

Attributes After Processing:

{
  "timestamp": "2023-06-20 14:32:10",
  "errorCode": "ER1023"
}

The error code is now stored in the “errorCode” attribute, so log entries can be filtered and analyzed by error code without searching the message text. This makes it easier to monitor and troubleshoot specific error codes.
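The transformation in this example can be approximated outside the pipeline to test a pattern against sample data. The sketch below is not the processor's implementation: the function `parse_with_regex` and the dict-based record layout are assumptions, and the Condition is approximated with a plain substring check instead of OTTL.

```python
import re

def parse_with_regex(record: dict, source_field: str, pattern: str) -> dict:
    """Illustrative stand-in for the processor: body field -> attributes."""
    value = record["body"].get(source_field, "")
    # Stand-in for the Condition: only process entries containing 'ErrorCode:'.
    if "ErrorCode:" not in value:
        return record
    match = re.search(pattern, value)
    if match is None:
        return record
    # Add each named capture group to the attributes
    # (a group overwrites an existing attribute of the same name).
    attributes = {**record["attributes"], **match.groupdict()}
    return {**record, "attributes": attributes}

record = {
    "body": {"message": "2023-06-20 14:32:10 Error: An error occurred. ErrorCode: ER1023"},
    "attributes": {"timestamp": "2023-06-20 14:32:10"},
}

result = parse_with_regex(record, "message", r"ErrorCode: (?P<errorCode>\w+)")
print(result["attributes"])
# {'timestamp': '2023-06-20 14:32:10', 'errorCode': 'ER1023'}
```

Entries whose message lacks "ErrorCode:" are returned unchanged in this sketch, which mirrors the role of the Condition field in selecting which entries are processed.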