An extremely common use case for large language models (LLMs) is converting text to JSON, where the application:
  • has the LLM generate a JSON object from user input
  • parses that object
  • validates its contents
  • uses the content elsewhere in the application
Collecting observability data for this process is a great fit as clear and measurable outcomes are present for each step. This data answers questions like:
  • Do some user inputs result in higher latency or more errors?
  • Was the JSON object structurally valid as per the JSON schema?
  • Did the JSON object contain expected fields?
  • Did the new version of our prompt result in more invalid JSON objects?
  • Are there particular patterns in the user input that result in invalid JSON objects?
Adding telemetry to your LLM application and then analyzing it with Honeycomb provides a flexible way to measure and improve the overall user experience. Use our Quick Start to instrument your LLM application.

Before You Begin

Before instrumenting your LLM application, you’ll need to do a few things:
  1. Create a Honeycomb Account. Signup is free!
  2. Create a Honeycomb Team. Complete your account creation by giving us a team name. Honeycomb uses teams to organize groups of users, grant them access to data, and create a shared work history in Honeycomb.
    We recommend using your company or organization name as your Honeycomb team name.
  3. Get Your Honeycomb API Key. To send data to Honeycomb, you’ll need your Honeycomb API Key. Once you create your team, you will be able to view or copy your API key. Make note of it; you will need it later! You can also find your Honeycomb API Key any time in your Environment Settings.

Send Telemetry Data to Honeycomb

Once you have your Honeycomb API key and your LLM application to instrument, it’s time to send telemetry data to Honeycomb! To instrument your LLM, you will add automatic instrumentation to your code for standard trace data telemetry, and then add custom instrumentation specifically for your LLM.

Add Automatic Instrumentation to Your Code

The quickest way to start seeing your trace data in Honeycomb is to use OpenTelemetry, an open-source collection of tools, APIs, and SDKs, to automatically inject instrumentation code into your application without requiring explicit changes to your codebase.
Automatic instrumentation works slightly differently within each language, but the general idea is that it attaches hooks into popular tools and frameworks and “watches” for certain functions to be called. When they’re called, the instrumentation automatically starts and completes trace spans on behalf of your application.
When you add automatic instrumentation to your code, OpenTelemetry will inject spans, which represent units of work or operations within your application that you want to capture and analyze for observability purposes.
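To make the idea of "hooks" concrete, here is a conceptual sketch in TypeScript of wrapping a function so that a span-like record is started before the call and completed after it. This is not how OpenTelemetry's instrumentation packages are implemented; `SpanRecord`, `recorded`, and `withSpan` are hypothetical names used only for illustration.

```typescript
// Conceptual sketch only: real OpenTelemetry instrumentation is more involved.
// SpanRecord, recorded, and withSpan are hypothetical names for illustration.
interface SpanRecord {
  name: string;
  startMs: number;
  endMs: number;
  error: string | undefined;
}

const recorded: SpanRecord[] = [];

// Wrap a synchronous function so that every call produces a span-like
// record, whether the call returns normally or throws.
function withSpan<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => R
): (...args: A) => R {
  return (...args: A): R => {
    const startMs = Date.now();
    let error: string | undefined;
    try {
      return fn(...args);
    } catch (ex) {
      error = ex instanceof Error ? ex.message : String(ex);
      throw ex; // the caller still sees the original failure
    } finally {
      recorded.push({ name, startMs, endMs: Date.now(), error });
    }
  };
}

// Usage: the wrapped function behaves like the original,
// but each call leaves a record behind.
const parse = withSpan("json.parse", (text: string) => JSON.parse(text));
parse('{"ok": true}');
```

The wrapped function keeps its original behavior, which is the property that lets automatic instrumentation work without changes to your own code.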
This Quick Start uses the npm dependency manager. For instructions with yarn or if using TypeScript, read our OpenTelemetry Node.js documentation.

Acquire Dependencies

Open your terminal, navigate to the location of your project on your drive, and install OpenTelemetry’s automatic instrumentation meta package and OpenTelemetry’s Node.js SDK package:
npm install --save \
    @opentelemetry/auto-instrumentations-node \
    @opentelemetry/sdk-node
Module | Description
auto-instrumentations-node | OpenTelemetry’s meta package that provides a way to add automatic instrumentation to any Node application to capture telemetry data from a number of popular libraries and frameworks, like express, dns, http, and more.
sdk-node | OpenTelemetry’s Node.js distribution package that streamlines configuration and allows you to instrument as quickly and easily as possible.
Alternatively, install individual instrumentation packages.
If using TypeScript, install ts-node to run the code:
npm install --save-dev ts-node

Initialize

Create an initialization file, commonly known as the tracing.js file:
// Example filename: tracing.js
'use strict';

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  instrumentations: [
    getNodeAutoInstrumentations(),
  ],
});

sdk.start();

Configure the OpenTelemetry SDK

Use environment variables to configure the OpenTelemetry SDK:
export OTEL_SERVICE_NAME="your-service-name"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://api.honeycomb.io:443" # US instance
#export OTEL_EXPORTER_OTLP_ENDPOINT="https://api.eu1.honeycomb.io:443" # EU instance
export OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=your-api-key"
Variable | Description
OTEL_SERVICE_NAME | Service name. When you send data, Honeycomb creates a dataset in which to store your data and uses this as the name. Can be any string.
OTEL_EXPORTER_OTLP_PROTOCOL | The data format that the SDK uses to send telemetry to Honeycomb. For more on data format configuration options, read Choosing between gRPC and HTTP.
OTEL_EXPORTER_OTLP_ENDPOINT | Honeycomb endpoint to which you want to send your data.
OTEL_EXPORTER_OTLP_HEADERS | Adds your Honeycomb API Key to the exported telemetry headers for authorization. Learn how to find your Honeycomb API Key.
If you use Honeycomb Classic, you must also specify the Dataset using the x-honeycomb-dataset header.
export OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=your-api-key,x-honeycomb-dataset=your-dataset"
If you are sending data directly to Honeycomb, you must configure the API key and service name. If you are using an OpenTelemetry Collector, configure your API key at the Collector level instead.
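A missing variable usually shows up only as an absence of data, so it can help to check the settings before starting the SDK. The following is a minimal sketch for the case where you send data directly to Honeycomb; `parseHeaders` and `checkOtelEnv` are hypothetical helpers, not part of OpenTelemetry or Honeycomb, and they take a plain object so they can be tested without a real environment.

```typescript
// Hypothetical helpers for illustration; not part of OpenTelemetry or Honeycomb.
type Env = Record<string, string | undefined>;

// Parse "k1=v1,k2=v2", the format used for OTEL_EXPORTER_OTLP_HEADERS above.
function parseHeaders(raw: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const pair of raw.split(",")) {
    const idx = pair.indexOf("=");
    if (idx <= 0) continue; // skip empty or malformed entries
    headers[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
  }
  return headers;
}

// Return a list of human-readable problems. An empty list means the
// settings needed to send directly to Honeycomb are all present.
function checkOtelEnv(env: Env): string[] {
  const problems: string[] = [];
  if (!env["OTEL_SERVICE_NAME"]) {
    problems.push("OTEL_SERVICE_NAME is not set");
  }
  if (!env["OTEL_EXPORTER_OTLP_ENDPOINT"]) {
    problems.push("OTEL_EXPORTER_OTLP_ENDPOINT is not set");
  }
  const headers = parseHeaders(env["OTEL_EXPORTER_OTLP_HEADERS"] ?? "");
  if (!headers["x-honeycomb-team"]) {
    problems.push("x-honeycomb-team is missing from OTEL_EXPORTER_OTLP_HEADERS");
  }
  return problems;
}
```

In a Node.js application you could call `checkOtelEnv(process.env)` before `sdk.start()` and log any problems it returns. If you use an OpenTelemetry Collector, the API key check does not apply, because the key is configured at the Collector level.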

Run Your Application

Run the Node.js app and include the initialization file you created:
node -r ./tracing.js YOUR_APPLICATION_NAME.js
Be sure to replace YOUR_APPLICATION_NAME with the name of your application’s main file.
Alternatively, you can import the initialization file as the first step in your application lifecycle.
In Honeycomb’s UI, you should now see your application’s incoming requests and outgoing HTTP calls generate traces.

Generate Automated Data

Now that you have added automatic instrumentation to your application and have it running in your development environment, interact with your application by making a few requests. Making requests to your service will generate telemetry data and send it to Honeycomb where it will appear in the Honeycomb UI within seconds.

Add Custom Instrumentation for LLMs

With OpenTelemetry’s automatic instrumentation now installed in your application, trace data is being sent to Honeycomb. The next step is to add custom instrumentation, which OpenTelemetry calls manual instrumentation, using the OpenTelemetry APIs. Create a single span that tracks all relevant information related to your LLM feature. At a minimum, track:
  • User ID
  • Prompt version
  • User input
  • Full prompt text
  • Full LLM response
  • Any error including parsing JSON and/or validating it
  • Error message
  • Token count
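One way to keep these fields consistent across call sites is to collect them in a single typed record and convert it to span attributes in one place. The sketch below uses the attribute names from this guide's examples and queries; `LlmCallRecord` and `toSpanAttributes` are hypothetical names introduced for illustration.

```typescript
// Hypothetical type and helper for illustration. The attribute names match
// those used in this guide's examples and queries.
interface LlmCallRecord {
  userId: string;
  promptVersion: string;
  userInput: string;
  promptText: string;
  response?: string;      // absent if the LLM call failed
  tokenCount?: number;    // absent if the LLM call failed
  errorMessage?: string;  // set when parsing or validation fails
}

type AttributeValue = string | number | boolean;

// Convert the record to a flat attribute map, omitting fields that are unset.
function toSpanAttributes(record: LlmCallRecord): Record<string, AttributeValue> {
  const attrs: Record<string, AttributeValue> = {
    "app.user_id": record.userId,
    "app.llm.prompt_version": record.promptVersion,
    "app.llm.user_input": record.userInput,
    "app.llm.prompt_text": record.promptText,
  };
  if (record.response !== undefined) {
    attrs["app.llm.response"] = record.response;
  }
  if (record.tokenCount !== undefined) {
    attrs["app.llm.token_count"] = record.tokenCount;
  }
  if (record.errorMessage !== undefined) {
    attrs["error.message"] = record.errorMessage;
  }
  return attrs;
}
```

The resulting object can be passed to `span.setAttributes(...)` from `@opentelemetry/api`, which sets several attributes in one call.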

Examples

The following code examples show how to capture the correct information on an OpenTelemetry span:
  import { trace, Span, SpanStatusCode } from "@opentelemetry/api";

  const tracer = trace.getTracer("llm.tracer");

  function getJsonFromText(
    userInput: string,
    userId: string,
    promptTemplate: string,
    promptVersion: string
  ) {
    return tracer.startActiveSpan("app.get_json_from_text", (span: Span) => {
      span.setAttribute("app.user_id", userId);
      span.setAttribute("app.llm.prompt_version", promptVersion);
      span.setAttribute("app.llm.user_input", userInput);

      try {
          // Programmatically build the full prompt.
          // The output is the entire prompt you'd send to the LLM,
          // after RAG or any other context-building operations.
          const fullPrompt = buildFullPrompt(promptTemplate, userInput);

          span.setAttribute("app.llm.prompt_text", fullPrompt);

          // Call the LLM and get back the text of the result
          // and the number of tokens used.
          const { response, tokenCount } = callLLM(fullPrompt);

          span.setAttribute("app.llm.response", response);
          span.setAttribute("app.llm.token_count", tokenCount);

          // Parse the JSON object and validate it,
          // capturing any errors you might encounter.
          const result = parseAndValidateResponse(response);

          return result;
      } catch (ex) {
          // Track any unexpected errors.
          span.setStatus({ code: SpanStatusCode.ERROR });
          // In strict TypeScript, a caught value is `unknown`, so cast it.
          span.recordException(ex as Error);
      } finally {
          span.end();
      }
    });
  }

  function parseAndValidateResponse(llmResult: string) {
    // getActiveSpan() returns undefined when no span is active,
    // so use optional chaining when recording on it.
    const currentSpan = trace.getActiveSpan();

    // Extract and parse the JSON object from the LLM response.
    const { extracted, result, extractionError } = extractAndParseJson(llmResult);
    if (!extracted) {
      currentSpan?.setAttribute("error.message", extractionError);
      currentSpan?.setStatus({ code: SpanStatusCode.ERROR });
      return null;
    }

    // Validate the structure of the result, capturing
    // any validation errors you might encounter.
    const { validated, validationError } = validateResult(result);
    if (!validated) {
      currentSpan?.setAttribute("error.message", validationError);
      // Mark the span as an error so that validation failures
      // are flagged the same way as parsing failures.
      currentSpan?.setStatus({ code: SpanStatusCode.ERROR });
    }

    return result;
  }
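The example above calls `extractAndParseJson` and `validateResult` without defining them. The following is one possible sketch of those helpers, matching the return shapes the example destructures. The extraction strategy (take the text between the first `{` and the last `}`) and the required field names `title` and `summary` are assumptions for illustration; substitute your own schema check.

```typescript
// Hypothetical implementations of helpers the example calls but does not define.
interface ExtractionOutcome {
  extracted: boolean;
  result: unknown;
  extractionError: string;
}

function extractAndParseJson(llmResult: string): ExtractionOutcome {
  // LLM responses often wrap JSON in prose, so take the text
  // between the first "{" and the last "}".
  const start = llmResult.indexOf("{");
  const end = llmResult.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { extracted: false, result: null, extractionError: "no JSON object found in response" };
  }
  try {
    const result: unknown = JSON.parse(llmResult.slice(start, end + 1));
    return { extracted: true, result, extractionError: "" };
  } catch (ex) {
    const message = ex instanceof Error ? ex.message : String(ex);
    return { extracted: false, result: null, extractionError: "invalid JSON: " + message };
  }
}

interface ValidationOutcome {
  validated: boolean;
  validationError: string;
}

// Assumption: a valid result is an object with these string fields.
const REQUIRED_FIELDS = ["title", "summary"];

function validateResult(
  result: unknown,
  requiredFields: string[] = REQUIRED_FIELDS
): ValidationOutcome {
  if (typeof result !== "object" || result === null || Array.isArray(result)) {
    return { validated: false, validationError: "result is not a JSON object" };
  }
  const obj = result as Record<string, unknown>;
  const missing = requiredFields.filter((field) => typeof obj[field] !== "string");
  if (missing.length > 0) {
    return {
      validated: false,
      validationError: "missing or non-string fields: " + missing.join(", "),
    };
  }
  return { validated: true, validationError: "" };
}
```

Returning an error string instead of throwing keeps expected failures, such as malformed LLM output, separate from unexpected exceptions, so each can be recorded on the span with a specific message.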

Explore Your Data

With your app running and telemetry being sent to Honeycomb, it’s time to explore your data.

Create a Board

For quick reference over time, you should create a Board to show LLM-specific queries of interest. We recommend creating a Board first before trying queries, so you can save with ease later. To create a Board:
  1. In the Honeycomb UI’s left navigation menu, select Boards. When the left navigation menu is compact, only the icon appears.
  2. Select New Board.
  3. In the modal that appears, name your new board, such as “LLM Dashboard.” Optionally, give your Board a description to help others find and use it. Determine the board’s Sharing setting - Public to the Team or Limited to Collaborators.
  4. Select Create to finish. Your new board appears next.

Next Steps

  1. Select Add Query to go to the Query Builder display.
  2. Use the example queries in the next section to populate your LLM Board.
  3. Follow the directions to add queries to an existing Board.

Create Queries

Now it’s time to create your first queries for LLMs! Use the query examples below to explore the performance and behavior of your LLM application. The specific attributes should exist in your data and environment if you added custom instrumentation for LLMs in the previous step. Enter each example query using the Query Builder. These example queries use two to three of the VISUALIZE, WHERE, and GROUP BY clauses, located at the top of the Query Builder.
  • VISUALIZE - Performs a calculation and displays a corresponding graph over time. Most VISUALIZE queries return a line graph, while the HEATMAP visualization shows the distribution of data over time.
  • WHERE - Filters based on attribute parameter(s)
  • GROUP BY - Groups fields by attribute parameter(s)
Screenshot of Visualize, Where, and Group by clauses in Query Builder

Track Overall Latency

This query tracks overall latency of all LLM-related operations and the slowest requests.
VISUALIZE: HEATMAP(duration_ms), MAX(duration_ms)
WHERE: name = app.get_json_from_text
Use this query to identify any spikes in latency, or to see whether latency is increasing over time. In the event of a spike, you can investigate what happened by using BubbleUp to find outliers.

Track Invalid JSON Objects

This query shows each instance that a user input led to a bad JSON object, whether that was because of a parsing error or a validation error.
VISUALIZE: COUNT
WHERE: name = app.get_json_from_text, error exists
GROUP BY: app.llm.user_input, app.llm.response, error.message
Use this query to identify exactly which inputs lead to bad behavior, which makes it easier to identify specific bugs to solve.

Track all User Inputs Grouped by Response and Errors

This query shows groups of all inputs and LLM outputs that succeeded.
VISUALIZE: COUNT
WHERE: name = app.get_json_from_text, error does-not-exist
GROUP BY: app.llm.user_input, app.llm.response
Use this query to understand general user behavior, and to identify any patterns in user input that lead to a particularly useful response. It’s just as helpful to understand what’s working as it is to understand what isn’t.

Show Token Usage Over Time

This query tracks token use over time, grouped by user ID.
VISUALIZE: HEATMAP(app.llm.token_count)
WHERE: name = app.get_json_from_text
GROUP BY: app.user_id
Use this query to understand how many tokens are being used over time, and to attribute that usage to specific users, as often a small number of users is responsible for the majority of usage.
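To illustrate the kind of per-user breakdown this query helps you look for, the sketch below totals token counts per user from a list of records and computes each user's share of overall usage. It is a local illustration only, not how Honeycomb evaluates queries; `TokenEvent` and `tokenShareByUser` are hypothetical names.

```typescript
// Hypothetical record shape; the fields mirror the app.user_id and
// app.llm.token_count span attributes used in this guide.
interface TokenEvent {
  userId: string;
  tokenCount: number;
}

interface UserShare {
  userId: string;
  tokens: number;
  share: number; // fraction of all tokens, between 0 and 1
}

// Sum tokens per user, then sort so the heaviest users come first.
function tokenShareByUser(events: TokenEvent[]): UserShare[] {
  const totals: Record<string, number> = {};
  let grandTotal = 0;
  for (const event of events) {
    totals[event.userId] = (totals[event.userId] ?? 0) + event.tokenCount;
    grandTotal += event.tokenCount;
  }
  return Object.keys(totals)
    .map((userId) => {
      const tokens = totals[userId] ?? 0;
      return { userId, tokens, share: grandTotal === 0 ? 0 : tokens / grandTotal };
    })
    .sort((a, b) => b.tokens - a.tokens);
}
```

If the first entry or two account for most of the total, that matches the pattern described above, where a few users drive most of the token usage.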

Investigate Specific Traces

The queries on our LLM Board act as a starting point. If you are curious about specific behavior, you can view a specific trace that represents one request. Select any point on a graph, and in the menu that appears, select View trace. The next screen displays a trace detail view that lets you see what happened step by step.

Next Steps

The queries on our LLM board act as a referenceable starting point. If you’re curious about specific behavior(s), start with any query and:
  1. Add fields to the GROUP BY clause to slice your data and reveal interesting field values.
  2. Use BubbleUp to find outlier behavior and identify its contributing characteristics.
  3. Select a specific trace that represents one request to see what happened step-by-step.