If you need to troubleshoot sampling, explore these solutions to common issues.
Troubleshoot issues related to Honeycomb Refinery.
The default logging level is warn
.
The debug
level emits too much data to be used in production, but contains excellent information in a pre-production environment.
Setting the logging level to debug
during initial configuration will help understand what is working and what is not, but when traffic volumes increase it should be set to warn
.
Refinery does not yet buffer traces or sampling decisions to disk. When you restart the process, all in-flight traces will be flushed and sent upstream to Honeycomb, but you will lose the record of past trace decisions. When started back up, Refinery will start with a clean slate.
Configuration file formats (TOML and YAML) can be confusing to read and write.
There is an option to check the loaded configuration by using one of the /query
endpoints from the command line, from a server that can access a refinery host.
The /query
endpoints are protected and can be enabled by specifying QueryAuthToken
in the configuration file or specifying REFINERY_QUERY_AUTH_TOKEN
in Refinery’s environment.
All requests to any /query
endpoint must include the header X-Honeycomb-Refinery-Query
set to the value of the specified token.
Retrieve the entire rules configuration in the desired format from Refinery:
curl --get $REFINERY_HOST/query/allrules/$FORMAT --header "x-honeycomb-refinery-query: my-local-token"
where:
$REFINERY_HOST
should be the url of your refinery.$FORMAT
can be one of json
, yaml
, or toml
.Retrieve the rule set that Refinery uses for the environment (or dataset in Classic Mode) defined in the variable $ENVIRON
:
curl --get $REFINERY_HOST/query/rules/$FORMAT/$ENVIRON --header "x-honeycomb-refinery-query: my-local-token"
where:
$REFINERY_HOST
should be the URL of your refinery.$FORMAT
can be one of json
, yaml
, or toml
.$ENVIRON
is the name of the environment (or dataset, in Classic Mode).The response contains a map of the sampler type to its rule set.
Retrieve information about the configurations currently in use, including the timestamp when the configuration was last loaded:
curl --include --get $REFINERY_HOST/query/configmetadata --header "x-honeycomb-refinery-query: my-local-token"
where:
$REFINERY_HOST
should be the URL of your refinery.The response contains a JSON blob of information about Refinery’s configurations. It will look something like this:
[
{
"type": "config",
"id": "tools/loadtest/config.yaml",
"hash": "1047bb6140b487ecdb0745f3335b6bc3",
"loaded_at": "2022-11-08T22:24:18-05:00"
},
{
"type": "rules",
"id": "tools/loadtest/rules.yaml",
"hash": "2d88389e1ff6530fba53466973e591e0",
"loaded_at": "2022-11-08T22:24:18-05:00"
}
]
For file-based configurations (the only type currently supported), the hash
value is identical to the value generated by the md5sum
command available in major operating systems.
Refinery can send telemetry that includes information that can help debug the sampling decisions that are made.
To enable it, in the configuration file, set AddRuleReasonToTrace
to true
.
This will cause traces that are sent to Honeycomb to include a field meta.refinery.reason
.
This field will contain text indicating which rule was evaluated that caused the trace to be included.
The rules comparisons in Refinery’s Rules-Based Sampler take the datatype of the fields into account.
In particular, a rule that compares status_code
to 200
(an integer) will fail if the status code is actually "200"
(a string), and vice-versa.
In a mixed environment where either datatype may be included in the telemetry, you should create a separate rule for each case.
This situation can be hard to diagnose, because Honeycomb’s backend converts all the values of a given field to the datatype specified in the dataset schema. Inspection of the data in Honeycomb will not give any indication that this has happened. If you see rules that appear to not execute when they should have, please consider this possibility of incorrect datatype.
Use health check API endpoints to determine if an instance is bootstrapped. The Refinery cluster machines respond to two different health check endpoints via HTTP:
/alive
This /alive
API call will return a 200
JSON response.
It does not perform any checks beyond the web server’s response to requests.
{
"source": "refinery",
"alive": "yes"
}
/x/alive
This /x/alive
API call will return a 200
JSON response that has been proxied from the Honeycomb API.
This can be used to determine if the instance is able to communicate with Honeycomb.
{
"alive": "yes"
}
If gRPC is configured, Refinery also responds to a standard gRPC Health Probe.