Service Map | Honeycomb

Service Map

Note
This feature is available as part of the Honeycomb Enterprise plan. Honeycomb Classic users must migrate first to use this feature.

Service Map in Honeycomb visually represents how traffic flows through your system in any given environment. It displays known services, service dependencies, and communication between services. Interact with the Service Map to examine and investigate smaller sets of services in detail.

Service Map generates a view of:

  • Services in your architecture and the request volume that each service receives
  • Services that communicate with one another and the frequency of communication between services
  • The p95 (95th percentile) request duration for each service and for service-to-service communication. A p95 duration means that 5% of the sampled requests took longer than that threshold and 95% took less time.

Use cases for Service Map include, but are not limited to:

  • understanding what and how often services communicate with one another
  • identifying services with slow response times
  • filtering to visualize how requests traverse your services
  • finding sample traces for specific request flows

Note
Want to see a working example of a Service Map at Honeycomb? Check out this interactive demo that requires no setup!

Access Service Map 

Service Map is an Enterprise feature that is only available for teams sending data to new environments. Enterprise Honeycomb Classic teams are encouraged to migrate to Environments in order to use Service Map.

Access Service Map from the left navigation bar:

Screenshot of Service Map Icon

Service Map Generation 

Service Map is built with distributed tracing data that you send to Honeycomb.

Spans within traces must contain several key pieces of data to construct the Service Map:

  • Service name - the name of the instrumented service
  • timestamp - the date and time at which the span started
  • Span duration - how long the span took to complete, in milliseconds
  • Span ID - the unique ID for the span
  • Trace ID - the ID of the trace that the span belongs to
  • Parent span ID - the ID of the span’s parent span, or the call location that the current span was called from

These fields can be configured in each dataset’s Definitions.
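As a rough illustration of the fields listed above, the following sketch checks a span event for the required data. The field names used here (service.name, duration_ms, trace.span_id, trace.trace_id, trace.parent_id) are assumptions for the example; substitute the names that your dataset's Definitions map to. A root span has no parent, so the parent span ID is allowed to be absent.

```python
# Hypothetical field names; substitute the names configured in your
# dataset's Definitions.
REQUIRED_FIELDS = (
    "service.name",
    "timestamp",
    "duration_ms",
    "trace.span_id",
    "trace.trace_id",
)
PARENT_FIELD = "trace.parent_id"  # expected to be absent on root spans


def missing_fields(span: dict) -> list:
    """Return the required fields that are absent or empty in a span."""
    return [f for f in REQUIRED_FIELDS if span.get(f) in (None, "")]


def is_root(span: dict) -> bool:
    """Treat a span without a parent span ID as the root of its trace."""
    return span.get(PARENT_FIELD) in (None, "")


example_span = {
    "service.name": "cart",
    "timestamp": "2024-01-01T00:00:00Z",
    "duration_ms": 12.5,
    "trace.span_id": "b2",
    "trace.trace_id": "t1",
    "trace.parent_id": "a1",
}

print(missing_fields(example_span))            # no required field is missing
print(missing_fields({"service.name": "cart"}))  # lists the four absent fields
```

A check like this can run in a test suite or a pipeline step before data is sent, so that a renamed field is caught before it affects the map.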

Note
If required dataset fields change or are deprecated, ensure that Dataset Definitions are updated to reflect the new fields. Otherwise, a Service Map may not be generated.

Interact with Service Map 

At the top, the Service Map summarizes how many services are displayed.

Service map header to show service count and time picker

A label indicates when the Service Map last regenerated. Use the time picker to modify the selected timespan. Use a preset time range or a custom time range.

Navigate your selection history with the left and right arrows.

The Service Map displays a network of services connected by edges.

Service map overview

The Gateways and Entry Services buttons in the top right corner highlight their respective service type.

Use your mouse or trackpad to magnify and view parts of the map in detail. Use the Recenter button in the top right corner to reset the map display to an overview magnification level.

Select the Legend icon in the bottom left corner to see what each symbol means.

Hover over a service to trigger a hover box with service name and p95 duration information. Select Isolate in the hover box to activate Isolate Mode. Hovering a service also highlights that service’s dependencies within the Service Map.

The right side panel displays details about services, lists a sample of related traces, and provides the ability to filter and highlight.

Service map right side panel on overview with filters collapsed

Use the collapsible right side panel to:

  • view details about the entire map in Overview, or about a selected service or edge
  • filter services
  • highlight traces
  • access a sample of traces

Service 

Each Service is represented by a circular node. The size of the node represents the relative volume of requests that the service receives compared to other services in your environment. Services in purple have the highest number of dependencies, or combined incoming and outgoing services. Each service’s label displays its service.name value and its p95 duration.

Disconnected services display separately from the main set of connected services. Disconnected services are services that do not communicate with or reference other services.

Select a service to populate its details in the right side panel.

Edge 

An Edge, or line, represents communication between two services. The thickness of the edge represents the relative volume of requests between the two services compared to other pairs of services. When it appears, the edge’s label displays its p95 duration.

Select an edge to populate its details in the right side panel.
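To make the relationship between spans, services, and edges concrete, here is a minimal sketch of how a service graph could be derived from trace data. It is not Honeycomb's implementation. It assumes that an edge exists whenever a span's parent span belongs to a different service, and it uses span counts as a stand-in for request volume. The dictionary keys (trace_id, span_id, parent_id, service) are illustrative names.

```python
from collections import Counter


def build_service_graph(spans):
    """Derive per-service span counts and service-to-service edge counts.

    Assumption: an edge A -> B exists when a span in service B has a
    parent span in a different service A within the same trace.
    """
    by_id = {(s["trace_id"], s["span_id"]): s for s in spans}
    volume = Counter()
    edges = Counter()
    for span in spans:
        volume[span["service"]] += 1
        parent = by_id.get((span["trace_id"], span.get("parent_id")))
        if parent and parent["service"] != span["service"]:
            edges[(parent["service"], span["service"])] += 1
    # Services that never appear on an edge are "disconnected".
    connected = {svc for edge in edges for svc in edge}
    disconnected = sorted(set(volume) - connected)
    return volume, edges, disconnected


spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service": "frontend"},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service": "cart"},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service": "db"},
    {"trace_id": "t2", "span_id": "a", "parent_id": None, "service": "frontend"},
    {"trace_id": "t2", "span_id": "b", "parent_id": "a", "service": "cart"},
    {"trace_id": "t3", "span_id": "a", "parent_id": None, "service": "cron"},
]

volume, edges, disconnected = build_service_graph(spans)
print(dict(edges))
print(disconnected)
```

In this sample, the frontend to cart edge carries two requests and would be drawn thicker than the cart to db edge, while cron has no edges and would display as a disconnected service.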

Right Side Panel 

The right side panel displays details about services, lists a sample of related traces, and provides the ability to filter and highlight.

The panel’s title indicates if it summarizes the Overall Service Map, a specific service by its name, or a selected edge.

The panel displays an Overall view when the map initially loads, or when no service or edge is selected. All services are displayed in an alphabetized list with their p95 durations. (Disconnected) appears in the Services list with the total number of disconnected services present. Disconnected services are services that do not communicate with or reference other services. Any disconnected service name appears in the Services list with parentheses around it.

When a Service is selected, the side panel displays:

  • Service Name
  • Service p95 Latency: The p95 duration for this service to respond to requests
  • Incoming Services: a list of services that send requests to this service, and the p95 duration between the selected service and the incoming service
  • Outgoing Services: a list of services that this service sends requests to, and the p95 duration between the selected service and the outgoing service

Service map with a service selected with the side panel

When an Edge is selected, the side panel displays:

  • Edge p95 Latency: The p95 duration for the receiving service to respond to the requesting service
  • The requesting service and its overall p95 latency
  • The receiving service and its overall p95 latency

Service map with an edge selected with the side panel

Filters 

Use the collapsible Filters section to modify the Service Map display based on criteria.

Service map filters

In Display Services, enter a field-value expression to display services that meet the filter criteria.

In Highlight Traces, enter a field-value expression to highlight traces on the Service Map that have at least one span that matches the filter.

Note
When using filters, very low volume traces may not appear on the Service Map. Refer to Troubleshooting for more information.

Traces 

In the right side panel, Service Map provides sample traces that correspond to the Service Map. When selecting a service or edge, the right side panel updates to display a list of the top five slowest traces that contain the selected service(s).

Service map traces with see query

Select See Query for a longer list of traces in Query Builder and the ability to explore related traces in detail. For a service, See Query creates a query with a filter where service.name = <service>. For an edge, See Query creates a query with a filter where <service 1> calls <service 2> = true. This unique filter includes a custom-created Honeycomb derived column, specific for this query, to isolate traces where the requesting service (<service 1>) calls the receiving service (<service 2>).

Note
If any filter is applied to Service Map, See Query creates a query as described above and includes an additional derived column-based filter where the applied filter’s conditions are also met.

Isolate Mode 

Isolate Mode focuses the map to display a single service and its immediate dependencies. To activate, hover over the target service in the map and select Isolate in the hover box.

Service map isolate map enabled and disabled for a service

Then, “Isolate Mode” appears at the top of the Service Map display to indicate its status. The map updates to show only the target service, the Incoming Services that send requests to the target service, and the Outgoing Services that the target service sends requests to.

Select Show full map at the top left to leave Isolate Mode and return to the overall map.
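Conceptually, Isolate Mode keeps the target service plus its one-hop neighbors. The sketch below shows that selection over a list of (requesting service, receiving service) edges. The function name and the edge representation are illustrative assumptions, not part of Honeycomb.

```python
def isolate(edges, target):
    """Return the incoming and outgoing services that stay visible for `target`."""
    incoming = sorted({src for src, dst in edges if dst == target})
    outgoing = sorted({dst for src, dst in edges if src == target})
    return incoming, outgoing


edges = [
    ("frontend", "cart"),
    ("cart", "db"),
    ("cart", "payments"),
    ("frontend", "search"),
]

incoming, outgoing = isolate(edges, "cart")
print(incoming)  # services that send requests to cart
print(outgoing)  # services that cart sends requests to
```

Here the search service is not shown when cart is isolated, because it neither sends requests to cart nor receives requests from it.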

Gateways 

Use Gateways to illustrate which services communicate with one another through a known gateway.

Service map toggling gateways

When toggled on, any Gateway appears as a blue square on an edge. To learn more about instrumenting gateways, refer to Instrument for Gateways.

Entry Services 

Use Entry Services to highlight which services are in the root span of any trace.

Service map entry services highlighted

Toggle Entry Services in the top right corner of the Service Map to modify the Service Map display. When activated, any Entry Service appears with an additional dashed circle around its circular node.
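As a small sketch of the idea, an entry service can be found by collecting the services of spans that have no parent span. The key names below (parent_id, service) are assumptions for the example.

```python
def entry_services(spans):
    """Return the services that appear in the root span of at least one trace."""
    return sorted({s["service"] for s in spans if s.get("parent_id") is None})


spans = [
    {"span_id": "a", "parent_id": None, "service": "frontend"},
    {"span_id": "b", "parent_id": "a", "service": "cart"},
    {"span_id": "c", "parent_id": None, "service": "batch-worker"},
]

print(entry_services(spans))
```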

Service Map Instrumentation 

In order for Service Map to be generated, at minimum, teams need to send tracing datasets to specific environments. The following defined fields for spans in traces are required for a map to be automatically generated from your tracing datasets:

  • Service name - the name of the instrumented service
  • timestamp - the date and time at which the span started
  • Span duration - how long the span took to complete, in milliseconds
  • Span ID - the unique ID for the span
  • Trace ID - the ID of the trace that the span belongs to
  • Parent span ID - the ID of the span’s parent span, or the call location that the current span was called from

To draw a meaningful map, the traces should span multiple services.

Instrument for Gateways 

By default, Service Map treats each distinct service.name value in spans as a distinct service.

For infrastructure with gateways, this treatment can be problematic because it can obscure true service-to-service relationships. If multiple services communicate through a gateway, the visual may appear as if some services speak exclusively to the gateway, or as if some services only receive traffic from the gateway. For example, if Service A communicates to Service B through a gateway foo, the automatically generated Service Map will show Service A communicating to foo and foo communicating to Service B.

To ensure the relationship is not hidden by this representation, users can adjust their instrumentation so Honeycomb can represent gateways accurately.
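The Service A, foo, and Service B example above can be sketched as a rewrite of the edge list: once foo is known to be a gateway, the two hops collapse into one edge that carries the gateway as a label. This is only an illustration of the idea under simplified assumptions, not Honeycomb's algorithm. In particular, working from edges alone cannot tell which caller reached which receiver when several services share one gateway, and chains of multiple gateways are not handled.

```python
def collapse_gateways(edges, gateways):
    """Rewrite A -> gateway -> B as one (A, B, gateway) edge.

    Edges that do not involve a gateway are kept with a label of None.
    """
    collapsed = []
    for src, dst in edges:
        if dst in gateways:
            # Follow the gateway's outgoing edges to find the real receivers.
            for gw_src, gw_dst in edges:
                if gw_src == dst:
                    collapsed.append((src, gw_dst, dst))
        elif src not in gateways:
            collapsed.append((src, dst, None))
    return collapsed


edges = [
    ("Service A", "foo"),
    ("foo", "Service B"),
    ("Service A", "Service C"),
]

for edge in collapse_gateways(edges, gateways={"foo"}):
    print(edge)
```

With foo flagged, the result contains a direct Service A to Service B edge labeled with foo, which corresponds to the Gateway square that Service Map draws on an edge.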

No semantic convention exists for designating services on spans as gateways. We recommend the following methods for enabling Gateway visuals on Service Map. These recommendations are subject to change, pending the creation of an OpenTelemetry standard for Gateways.

To flag services as gateways and thus change their representation on the map:

Istio Service Meshes and Gateways 

Istio automatically adds component: proxy to each span. No additional manual instrumentation is required.

All Other Meshes and Gateways 

We recommend manually instrumenting net.component: proxy for your gateway service(s) with one of two options:

  1. When enabling tracing for a service mesh or gateway, modify the meshConfig to include custom_tags:
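# Tag every span emitted by the mesh with net.component: proxy
spec:
  meshConfig:
    enableTracing: true
    defaultConfig:
      tracing:
        custom_tags:
          net.component:
            literal:
              value: proxy
        zipkin:
          # Zipkin-compatible endpoint that receives the spans
          address: otel-collector.default:9411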
  2. When sending telemetry to an OpenTelemetry Collector, use the Attributes Processor to add specific attributes for the pipeline that contains the spans from the mesh:
processors:
  batch:
  attributes:
    actions:
      # insert adds the attribute only when the key is not already present
      - key: "net.component"
        value: "proxy"
        action: insert
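The effect of the insert action in the Collector configuration above can be pictured with a short sketch. It mimics the documented behavior of insert (add the attribute only when the key does not already exist) on a plain dictionary of span attributes. The function is a hypothetical helper for illustration and is not part of the OpenTelemetry Collector.

```python
def apply_insert(attributes: dict, key: str, value: str) -> dict:
    """Mimic the `insert` action: add the key only if it is not already set."""
    updated = dict(attributes)  # leave the caller's dictionary untouched
    updated.setdefault(key, value)
    return updated


gateway_span = {"service.name": "foo"}
already_tagged = {"service.name": "foo", "net.component": "sidecar"}

print(apply_insert(gateway_span, "net.component", "proxy"))
print(apply_insert(already_tagged, "net.component", "proxy"))
```

Because insert does not overwrite, a span that already carries a different net.component value keeps it. If that value must be replaced, the Attributes Processor offers other actions for that purpose.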

Sampling in Service Map 

We recommend using Service Map to understand overall request traffic, as Service Map automatically samples traces to represent higher volume services. This means that Service Map may not represent services or edges with very low traffic in a selected time range.

Use Cases for Service Map 

Use Service Map in the following situations:

  • Quickly onboard engineers into a large complex architecture. Ask and identify:

    • What services exist in the system?
    • What are the busiest services?
    • Which services have the most dependencies?
    • How often do two services communicate with one another, relative to other services?
    • What are the slowest or fastest services in the system?
    • What does this map look like for specific requests?
  • During debugging or Root Cause Analysis for an issue with one or more services, ask and identify:

    • What downstream services are impacted?
    • What upstream services may be causing this?
    • What are some sample traces to inspect and to find the source of the problem?
  • When validating system observability, or instrumentation quality assurance, ask and identify:

    • Are the services instrumented correctly?
    • Is instrumentation missing for part of the architecture?
    • Are there unexpected instrumented components?

Troubleshooting 

The map is empty when I expect data present 

First, confirm that your Dataset fields in Dataset Definitions are defined with Field name values. Service Map generates automatically from trace data sent to Honeycomb, and with the exception of timestamp, is based on these defined Tracing fields:

  • Service name - the name of the instrumented service
  • timestamp - the date and time at which the span started
  • Span duration - how long the span took to complete, in milliseconds
  • Span ID - the unique ID for the span
  • Trace ID - the ID of the trace that the span belongs to
  • Parent span ID - the ID of the span’s parent span, or the call location that the current span was called from

If any of this data is missing or undefined, the map will not display.

Second, the Service Map may take up to a few minutes to generate after starting to send data to Honeycomb. Check back in a few minutes to confirm if the Service Map has generated.

I cannot find a service on the map 

First, confirm that traces include the service name. Service Map generates services from trace data that includes the service name. To confirm, run a VISUALIZE COUNT WHERE service.name = <service name> query in Query Builder where service.name is the service name field.

The query returns a count of events associated with the service, which helps to validate whether the service exists in that time range. If the query returns 0, no requests were made to that service and Service Map is accurate. If COUNT returns a result greater than zero, the service may not be visible on the map because:

  • Recently sent requests to this service may take a few minutes to process and display on the map

  • The service is instrumented with component: proxy or net.component: proxy. Thus, Honeycomb treats the service as a Gateway, which visually transforms the service into a Gateway square on an edge as opposed to a Service node. Confirm by running the following query to see if your service is instrumented with gateway attributes:

    • VISUALIZE COUNT
    • WHERE service.name = <service name>
    • GROUP BY component, net.component
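If you prefer to build this check programmatically, the query can be expressed as a JSON document. The sketch below assumes the general shape of a Honeycomb query specification (calculations, filters, breakdowns, and a time_range in seconds) and groups by the component and net.component fields; confirm the exact keys against the API reference before relying on it. The function name and the default time range are illustrative.

```python
import json


def gateway_check_query(service_name: str, time_range_seconds: int = 7200) -> dict:
    """Build a COUNT query for one service, grouped by the gateway attributes."""
    return {
        "time_range": time_range_seconds,
        "calculations": [{"op": "COUNT"}],
        "filters": [
            {"column": "service.name", "op": "=", "value": service_name}
        ],
        # Assumed gateway attribute fields; a value of "proxy" in either
        # column suggests the service is being treated as a gateway.
        "breakdowns": ["component", "net.component"],
    }


query = gateway_check_query("foo")
print(json.dumps(query, indent=2))
```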

Second, use the time picker in the top right to select a larger range of time. Service Map automatically samples traces to represent higher volume services. This means that Service Map may not represent services or edges with very low traffic in a selected time range. Expand the date range to view a larger sample, and the missing service may appear.

The map contains only disconnected services 

Service Map generates based on trace data sent to Honeycomb. If traces only include one service, that single service displays as one disconnected node. Send traces that include more than one service to visualize edge connections between services on your map.

All services point to the gateway service 

Follow the instrumentation guide to flag gateways, so Service Map can represent them differently.

My map is dense and hard to read 

Use Display Services in the right side panel to narrow down your map to a specific set of services. After narrowing down to a smaller diagram, hover over a specific service on the map and select Isolate to display all dependencies, or its incoming and outgoing services, for the selected service.

When I apply filters, an expected service or highlighted path does not display 

When filter parameters are applied to the Service Map, the results return a maximum of 10,000 traces. These traces are then highlighted on the Service Map when using Highlight Traces and/or selectively displayed when using Display Services in the right side panel. Because of the limit applied, there is a possibility that very low volume traces are not represented in the sample returned.

p95 latency for services and edges in Service Map does not match p95(duration_ms) in Query Builder 

This difference is expected. Service Map calculates the p95 duration for services and edges differently from the p95(duration_ms) query in Query Builder.

Service Map latency calculations include only synchronous spans, which communicate between services. Query Builder latency calculations include both synchronous spans and asynchronous, or internal, spans.
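A small worked example shows why the two numbers can diverge. The sketch below uses a nearest-rank percentile and a hypothetical kind field to separate service-to-service spans from internal spans. Honeycomb's own percentile calculation and span classification may differ, so treat the figures as illustrative only.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-p * len(ordered) // 100))  # integer ceiling of p * n / 100
    return ordered[rank - 1]


# Twenty spans that cross a service boundary: 10 ms, 20 ms, ..., 200 ms.
server_spans = [{"kind": "server", "duration_ms": d} for d in range(10, 210, 10)]
# Twenty fast internal spans inside a single service.
internal_spans = [{"kind": "internal", "duration_ms": 1} for _ in range(20)]
all_spans = server_spans + internal_spans

p95_service_to_service = percentile([s["duration_ms"] for s in server_spans], 95)
p95_all_spans = percentile([s["duration_ms"] for s in all_spans], 95)

print(p95_service_to_service)  # 190
print(p95_all_spans)           # 180
```

The many short internal spans pull the overall p95 down, so a p95(duration_ms) query over every span reports a lower value than a calculation restricted to spans between services.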

I instrumented Gateways, but they do not appear on my map 

If you have instrumented your gateways and they still do not appear on your map, it is possible that your gateway requests do not occur between known services. Service Map only displays gateways on an edge, which is formed when one service sends requests to another service. If an edge does not exist, either because a service does not send requests to another service or because the service sends requests to another service through multiple gateways, then gateways may not display on your map.

If either of these scenarios is common in your architecture, join #discuss-service-map in our Pollinators Community Slack and let us know.

A “No Service Map data found” error appears when I filter my map, but I found a trace that matches the filter set 

Both Display Services and Highlight Traces find traces with at least one span that matches all of the filters you have entered. For example, if the filters app.cart.items > 5 AND app.user.currency = USD are set in Highlight Traces, then it displays traces that have at least one span that matches both of those filters. If the fields used in your filters exist on different spans in a trace, no result is returned. If possible, try propagating context throughout your traces so that filters can return matching results.
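The matching rule described above can be sketched in a few lines: a trace matches only when a single span satisfies every filter. The traces below are made-up examples that reuse the app.cart.items and app.user.currency fields, and the function is an illustration of the rule rather than Honeycomb's implementation.

```python
def trace_matches(trace, predicates):
    """True when at least one span in the trace satisfies every predicate."""
    return any(all(pred(span) for pred in predicates) for span in trace)


predicates = [
    lambda span: span.get("app.cart.items", 0) > 5,
    lambda span: span.get("app.user.currency") == "USD",
]

# Both fields are on the same span, so the trace matches.
fields_on_one_span = [
    {"name": "checkout", "app.cart.items": 7, "app.user.currency": "USD"},
]

# The fields are spread across two spans, so no single span matches both filters.
fields_on_different_spans = [
    {"name": "cart", "app.cart.items": 7},
    {"name": "payment", "app.user.currency": "USD"},
]

print(trace_matches(fields_on_one_span, predicates))         # True
print(trace_matches(fields_on_different_spans, predicates))  # False
```

Propagating both fields onto the same span, for example onto every span in the request, makes the second trace match as well.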