Handle Sensitive Information with the OpenTelemetry Collector

The OpenTelemetry Collector can filter and redact sensitive data when monitoring in production. Sensitive data, such as Personally Identifiable Information (PII), credit card information, and email addresses, can be helpful in incident diagnosis and troubleshooting, but security considerations may require you to filter it.

Data filtering can be done at the OpenTelemetry Collector level. To customize your telemetry data to meet specific requirements and comply with data privacy regulations, define filtering and redaction rules in the OpenTelemetry Collector configuration file.

Note that filtering sensitive data can impact the ability to diagnose and troubleshoot problems. Carefully consider the trade-offs between security and usability when configuring sensitive data filtering.

Filter Data in the OpenTelemetry Collector 

To filter data using the OpenTelemetry Collector:

  1. Add the required processors to the OpenTelemetry Collector’s configuration file. The processors let you delete, update (redact), or hash specific attributes.
  2. Activate each processor by adding it to the appropriate pipelines in the service section (service > pipelines).
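
The two steps above can be sketched as a minimal, complete configuration. The otlp receiver, the debug exporter, and the processor name attributes/scrub are assumptions chosen for illustration; substitute the components your deployment already uses.

```yaml
# Minimal sketch: receiver and exporter choices are assumptions.
receivers:
    otlp:
        protocols:
            grpc:

exporters:
    debug:

# Step 1: define the processor.
processors:
    attributes/scrub:
        actions:
            - key: account_password
              action: delete

# Step 2: activate it by listing it in a pipeline.
service:
    pipelines:
        traces:
            receivers: [otlp]
            processors: [attributes/scrub]
            exporters: [debug]
```

A processor that is defined but not listed in any pipeline has no effect, which is why both steps are needed.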

The processors available for filtering sensitive data are:

  • attributes - to access individual attributes within a span
  • redaction - to mask or block the attributes’ values for security
  • transform - to transform the values within the spans

Attributes Processor 

Choose the attributes processor over the other two processors (redaction and transform) when you need to act on individual, named attributes within a span.

Use the attributes processor to:

  • Add, modify, or remove attributes from your telemetry data, such as specific keys like credit card information, passwords, and other sensitive values
  • Filter and match input data to determine if they should be included or excluded for specified actions
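
As a rough sketch of the second point, include and exclude blocks can narrow which spans the actions apply to. The service name checkout and the redact_trace attribute are hypothetical and not part of the original example.

```yaml
processors:
    attributes/selective:
        # Only spans from these services are considered (hypothetical service name).
        include:
            match_type: strict
            services: ["checkout"]
        # Spans carrying this attribute/value pair are skipped (hypothetical attribute).
        exclude:
            match_type: strict
            attributes:
                - key: redact_trace
                  value: false
        # Applied only to spans that pass the include/exclude checks.
        actions:
            - key: cc_number
              action: delete
```

Scoping the actions this way leaves telemetry from other services untouched, which limits how much diagnostic detail is lost.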

Attributes Example 

In this example, the configuration overwrites the value of the cc_number attribute with the string redacted, deletes the account_password attribute, and hashes the value of the account_email attribute.

processors:
    attributes/update:
        actions:
            - key: cc_number
              value: redacted
              action: update
            - key: account_password
              action: delete
            - key: account_email
              action: hash
...
service:
    pipelines:
        traces:
            processors: [..., attributes/update, ...]
        metrics:
            processors: [..., attributes/update, ...]
        logs:
            processors: [..., attributes/update, ...]

Redaction Processor 

Choose the redaction processor over the other two processors (attributes and transform) when you need to mask attribute values or remove attributes that are not explicitly allowed.

Use the redaction processor to:

  • Remove or mask sensitive information from your telemetry data. This is useful for compliance or security purposes to ensure sensitive information is not leaked
  • Delete span attributes that do not match a list of allowed span attributes
  • Mask span attribute values that match a blocked value list. Span attributes not on the allowed list are removed before value checks are done

Redaction Examples 

Remove 

In this example, use the redaction processor to remove all attributes except description, group, and id:

processors:
  redaction/update:
    allow_all_keys: false
    allowed_keys:
      - description
      - group
      - id

Block 

In this example, use regex to block credit card numbers for Visa, Amex, and Mastercard. Additionally, use regex to block IP addresses.

processors:
  redaction/update:
    allow_all_keys: true
    blocked_values:
      - "^4[0-9]{12}(?:[0-9]{3})?$" ## Visa
      - "^3[47][0-9]{13}$"       ## Amex
      - "^(5[1-5][0-9]{14}|2(22[1-9][0-9]{12}|2[3-9][0-9]{13}|[3-6][0-9]{14}|7[0-1][0-9]{13}|720[0-9]{12}))$"       ## MasterCard
      - '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' ## IP address (single quotes keep the backslashes literal in YAML)
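
The two redaction examples can also be combined and activated in a pipeline. The following is a sketch rather than a recommended configuration: it assumes the summary option is available in your Collector version, and it follows this page's convention of using ... for the processors already present in the pipeline.

```yaml
processors:
  redaction/update:
    # Attributes not listed here are removed first.
    allow_all_keys: false
    allowed_keys:
      - description
      - group
      - id
    # Values of the remaining attributes are masked if they match.
    blocked_values:
      - "^4[0-9]{12}(?:[0-9]{3})?$" ## Visa
    # Assumption: records what was redacted, which helps when tuning rules.
    summary: debug
service:
  pipelines:
    traces:
      processors: [..., redaction/update, ...]
```

Because key filtering runs before value checks, a blocked value is only masked when its attribute is on the allowed list; otherwise the whole attribute is already gone.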

Transform Processor 

Choose the transform processor over the other two processors (attributes and redaction) when you need to rewrite values within spans.

Use the transform processor to modify existing attributes or add new ones before the data is exported. It covers requirements such as renaming attributes, adding or removing tags, or changing the data structure.

OTel Transformation Language (OTTL) 

The OpenTelemetry Transformation Language (OTTL) is the language the transform processor uses to manipulate and transform telemetry data. Using OTTL statements, you can process telemetry data in real time as it passes through the Collector and reshape it to suit your analytics and monitoring needs. To learn more about OTTL, see OpenTelemetry’s GitHub resources.

Transform Example 

In this example, keep_keys removes every resource attribute except service.name, service.namespace, cloud.region, and process.command_line. The replace_pattern statement then masks a password that appears on the command line, such as $env password=mysecret username=myusername python run-my-app.py.

processors:
    transform/update:
        trace_statements:
        - context: resource
          statements:
            - keep_keys(attributes, "service.name", "service.namespace", "cloud.region", "process.command_line")
            - replace_pattern(attributes["process.command_line"], "password\\=[^\\s]*", "password=***")
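
The same kind of scrubbing can be applied to span attributes. The sketch below is illustrative only: the processor name transform/scrub and the email pattern are assumptions, the attribute names are borrowed from the attributes example, and ... stands for the processors already in your pipeline.

```yaml
processors:
    transform/scrub:
        trace_statements:
        - context: span
          statements:
            # Drop the attribute entirely.
            - delete_key(attributes, "account_password")
            # Overwrite the value only when the attribute is present.
            - set(attributes["cc_number"], "redacted") where attributes["cc_number"] != nil
            # Mask anything that looks like an email address in any attribute value.
            - replace_all_patterns(attributes, "value", "[\\w.+-]+@[\\w-]+\\.[\\w.-]+", "***")
service:
    pipelines:
        traces:
            processors: [..., transform/scrub, ...]
```

The where clause keeps set from creating cc_number on spans that never had it, so the statement only redacts existing values.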