Visualize Data Flow


Note
This feature is available as part of the Honeycomb Enterprise plan. Honeycomb Classic users must migrate first to use this feature.

Service Map in Honeycomb visually represents how traffic flows through your system in any given environment. It displays known services, service dependencies, and communication between services. Interact with the Service Map to examine and investigate smaller sets of services in detail.

Service Map generates a view of:

  • Services in your architecture and the request volume that each service receives
  • Services that communicate with one another and the frequency of communication between services
  • The p95 (95th percentile) request duration for each service and for service-to-service communication. At the p95 value, 95% of the sampled requests have durations lower than that threshold and 5% have durations higher.
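As a rough illustration of what a p95 duration means, the sketch below computes a 95th percentile with the nearest-rank method. The helper name `p95` and the sample durations are assumptions for illustration; this page does not state which percentile method Honeycomb uses, so treat the result as approximate.

```python
def p95(durations_ms):
    """Return the nearest-rank 95th percentile of a list of durations."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    # 1-based nearest rank: ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical sample: 100 requests taking 1 ms, 2 ms, ..., 100 ms.
sample = list(range(1, 101))
print(p95(sample))  # 95
```

In this sample, 95 of the 100 requests finished in 95 ms or less, and the remaining 5 took longer.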

Use cases for Service Map include, but are not limited to:

  • understanding what and how often services communicate with one another
  • identifying services with slow response times
  • filtering to visualize how requests traverse your services
  • finding sample traces for specific request flows
Note
Want to see a working example of a Service Map at Honeycomb? Check out this interactive demo that requires no setup!

Access Service Map 

Service Map is an Enterprise feature that is only available for teams sending data to new environments. Enterprise Honeycomb Classic teams are encouraged to migrate to Environments in order to use Service Map.

Access Service Map from the left navigation bar:

Screenshot of Service Map icon

Service Map Generation 

Service Map is built with distributed tracing data that you send to Honeycomb.

Spans within traces must contain several key pieces of data to construct the Service Map:

  • Service name - the name of the instrumented service
  • Timestamp - the date and time at which the span started
  • Span duration - how long, in milliseconds, the span took to complete
  • Span ID - the unique ID for the span
  • Trace ID - the ID of the trace that the span belongs to
  • Parent span ID - the ID of the span’s parent span, that is, the span from which the current span was called

These fields can be configured in each dataset’s Definitions.
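The required fields can be checked with a short sketch like the one below before data is sent. The field names (`service.name`, `duration_ms`, `trace.span_id`, and so on) and the `missing_fields` helper are assumptions for illustration; the actual names depend on how each dataset's Definitions are configured. The sketch also assumes that a root span legitimately has no parent span ID.

```python
# Assumed field names; adjust to match your dataset's Definitions.
REQUIRED_FIELDS = (
    "service.name",
    "timestamp",
    "duration_ms",
    "trace.span_id",
    "trace.trace_id",
)
PARENT_FIELD = "trace.parent_id"


def missing_fields(span, is_root=False):
    """List the required fields that are absent or empty on a span."""
    missing = [f for f in REQUIRED_FIELDS if span.get(f) in (None, "")]
    # Assumption: only non-root spans need a parent span ID.
    if not is_root and span.get(PARENT_FIELD) in (None, ""):
        missing.append(PARENT_FIELD)
    return missing


# Hypothetical span for illustration.
span = {
    "service.name": "checkout",
    "timestamp": "2024-01-01T00:00:00Z",
    "duration_ms": 12.5,
    "trace.span_id": "b7ad6b7169203331",
    "trace.trace_id": "0af7651916cd43dd8448eb211c80319c",
    "trace.parent_id": "00f067aa0ba902b7",
}
print(missing_fields(span))  # []
```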

Note
If required dataset fields change or are deprecated, ensure that Dataset Definitions are updated to reflect the new fields. Otherwise, a Service Map may not be generated.

Service Map Instrumentation 

For Service Map to be generated, teams must, at minimum, send tracing data to an environment. The following fields must be defined on the spans in your traces for a map to be generated automatically from your tracing datasets:

  • Service name - the name of the instrumented service
  • Timestamp - the date and time at which the span started
  • Span duration - how long, in milliseconds, the span took to complete
  • Span ID - the unique ID for the span
  • Trace ID - the ID of the trace that the span belongs to
  • Parent span ID - the ID of the span’s parent span, that is, the span from which the current span was called

To draw a meaningful map, the traces should span multiple services.

Instrument for Gateways 

By default, Service Map treats each distinct service.name value in spans as a separate service.

For infrastructure with gateways, this treatment can be problematic because it can obscure true service-to-service relationships. If multiple services communicate through a gateway, the map may look as if some services speak exclusively to the gateway, or as if some services only receive traffic from the gateway. For example, if Service A communicates with Service B through a gateway foo, the automatically generated Service Map will show Service A communicating with foo and foo communicating with Service B.
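To make the gateway example concrete, the sketch below derives service-to-service edges from a few hypothetical spans by joining each span to its parent. The span dictionaries, their key names, and the `service_edges` helper are assumptions for illustration and do not describe how Honeycomb builds the map internally.

```python
from collections import Counter

# Hypothetical trace: Service A calls Service B through the gateway "foo".
spans = [
    {"span_id": "1", "parent_id": None, "service": "service-a"},
    {"span_id": "2", "parent_id": "1", "service": "foo"},
    {"span_id": "3", "parent_id": "2", "service": "service-b"},
]


def service_edges(spans):
    """Count caller -> callee edges between different services."""
    by_id = {s["span_id"]: s for s in spans}
    edges = Counter()
    for span in spans:
        parent = by_id.get(span["parent_id"])  # None for root spans
        if parent is not None and parent["service"] != span["service"]:
            edges[(parent["service"], span["service"])] += 1
    return dict(edges)


for (caller, callee), count in service_edges(spans).items():
    print(f"{caller} -> {callee}: {count}")
```

With only parent and child relationships to go on, the result contains a service-a to foo edge and a foo to service-b edge, but no direct edge between service-a and service-b.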

To ensure the relationship is not hidden by this representation, users can adjust their instrumentation so Honeycomb can represent gateways accurately.

No semantic convention exists for designating services on spans as gateways. We recommend the following methods for enabling Gateway visuals on Service Map. These recommendations are subject to change, pending the creation of an OpenTelemetry standard for Gateways.

To flag services as gateways and thus change their representation on the map:

Istio Service Meshes and Gateways 

With this option, component.proxy is automatically added to each span. No additional manual instrumentation is required.

All Other Meshes and Gateways 

We recommend manually adding the attribute net.component: proxy to spans from your gateway service(s), using one of two options:

  1. When enabling tracing for a service mesh or gateway, modify the meshConfig to include custom_tags:

    spec:
      meshConfig:
        enableTracing: true
        defaultConfig:
          tracing:
            custom_tags:
              net.component:
                literal:
                  value: proxy
            zipkin:
              address: otel-collector.default:9411
    
  2. When sending telemetry to an OpenTelemetry Collector, use the Attributes Processor to add specific attributes for the pipeline that contains the spans from the mesh:

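    processors:
      batch:
      attributes:
        actions:
          # Tag spans in this pipeline as coming from a proxy/gateway.
          - key: "net.component"
            value: "proxy"
            action: insert  # adds the key only if it does not already exist
    # Also list `attributes` under the processors of the traces pipeline
    # that receives the mesh spans, or the processor will not run.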
    

Sampling in Service Map 

Service Map automatically samples traces in a way that favors representing higher-volume services, so we recommend using it to understand overall request traffic. As a result, Service Map may not show services or edges with very low traffic in the selected time range.

Use Cases for Service Map 

Use Service Map in the following situations:

  • When onboarding engineers into a large, complex architecture, ask and identify:

    • What services exist in the system?
    • What are the busiest services?
    • Which services have the most dependencies?
    • How often do two services communicate with one another, relative to other services?
    • What are the slowest or fastest services in the system?
    • What does this map look like for specific requests?
  • During debugging or Root Cause Analysis for an issue with one or more services, ask and identify:

    • What downstream services are impacted?
    • What upstream services may be causing this?
    • What are some sample traces to inspect and to find the source of the problem?
  • When validating system observability, or instrumentation quality assurance, ask and identify:

    • Are the services instrumented correctly?
    • Is instrumentation missing for part of the architecture?
    • Are there unexpected instrumented components?

Troubleshooting 

To explore common issues when working with Service Map, visit Common Issues with Visualization: Service Map.