NGINX is one of the most popular web servers today. In a world driven by the web and connected APIs, its logs are a great candidate for surfacing a bird's-eye view of activity in your service.
To see an example of the NGINX integration in action, try out the Honeytail-NGINX Example App.
Capturing web logs for Honeycomb requires:
honeytail
Download and install the latest honeytail by running:
# Download and install the AMD64 debian package
wget -q https://honeycomb.io/download/honeytail/v1.6.2/honeytail_1.6.2_amd64.deb && \
echo '620e189973c8930de22d24dc7d568ac5b2a41af681f03bace69d9c6eba3c0a15 honeytail_1.6.2_amd64.deb' | sha256sum -c && \
sudo dpkg -i honeytail_1.6.2_amd64.deb
# Download and install the ARM64 debian package
wget -q https://honeycomb.io/download/honeytail/v1.6.2/honeytail_1.6.2_arm64.deb && \
echo 'c2c844f51b9f29f6809b63b2554bbe9a045a8ff1b3e745ae050a46408244fa06 honeytail_1.6.2_arm64.deb' | sha256sum -c && \
sudo dpkg -i honeytail_1.6.2_arm64.deb
# Download and install the rpm package
wget -q https://honeycomb.io/download/honeytail/v1.6.2/honeytail-1.6.2-1.x86_64.rpm && \
echo 'c41bb62a97c0dd3af12cdc6bf2982aec82e58889e35479eb2b6c2b2106a33179 honeytail-1.6.2-1.x86_64.rpm' | sha256sum -c && \
sudo rpm -i honeytail-1.6.2-1.x86_64.rpm
# Download the Linux AMD64 binary and make it executable
wget -q -O honeytail https://honeycomb.io/download/honeytail/v1.6.2/honeytail-linux-amd64 && \
echo '6476024603b308e54469552b9f17161b5847a30bfe2137ed88ee5a9e7f6204fa honeytail' | sha256sum -c && \
chmod 755 ./honeytail
# Download the Linux ARM64 binary and make it executable
wget -q -O honeytail https://honeycomb.io/download/honeytail/v1.6.2/honeytail-linux-arm64 && \
echo '208843f6a01b94a848e192744be09c364e1d49e73447cb441f23ea2e5709f68c honeytail' | sha256sum -c && \
chmod 755 ./honeytail
# Download the macOS (darwin) AMD64 binary and make it executable
wget -q -O honeytail https://honeycomb.io/download/honeytail/v1.6.2/honeytail-darwin-amd64 && \
echo 'bfd74588062fd2333f04c0878103d94fb542e4b91456ae5c8c10e6ad309286c7 honeytail' | shasum -a 256 -c && \
chmod 755 ./honeytail
# Build from latest source after setting up go
git clone https://github.com/honeycombio/honeytail
cd honeytail && go install
The packages install honeytail, its config file /etc/honeytail/honeytail.conf, and some start scripts. Build honeytail from source if you need it in an unpackaged form or for ad-hoc use.
You should modify the config file and uncomment and set:
ParserName to nginx
WriteKey to your API key, available from the account page
LogFiles to the path for the log file you want to ingest. For NGINX, this is typically /var/log/nginx/access.log.
Dataset to the name of the dataset you wish to create with this log file.
Make sure to run through Optional Configuration below before running honeytail, in order to get the richest metadata out of your web traffic and into your logs.
In addition to the standard configuration captured in /etc/honeytail/honeytail.conf, you will want to set the two options in the Nginx Parser Options section:
ConfigFile: the path to your NGINX config file (whichever file contains the definition for the log format)
LogFormatName: the name of the log format used to produce the NGINX access log file
For example, if your nginx config file is at /etc/nginx/nginx.conf and has the following snippet:
access_log /var/log/nginx/access.log my_favorite_format;
log_format my_favorite_format '$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent';
… then ConfigFile should be set to /etc/nginx/nginx.conf and your LogFormatName value should be set to my_favorite_format.
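Put together, the relevant parts of the config file might look like the sketch below. The option names come from the steps above. The [Required Options] section header, the ; comment syntax, and the placeholder values (YOUR_API_KEY and the dataset name) are assumptions, so compare against the file the package installed rather than copying this verbatim.

```ini
; /etc/honeytail/honeytail.conf (excerpt; section layout is an assumption)
[Required Options]
ParserName = nginx
; Placeholder: replace with your own API key
WriteKey = YOUR_API_KEY
LogFiles = /var/log/nginx/access.log
; Placeholder dataset name
Dataset = nginx-access-logs

[Nginx Parser Options]
ConfigFile = /etc/nginx/nginx.conf
LogFormatName = my_favorite_format
```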
Start up a honeytail process using upstart or systemd or by launching the process by hand.
$ sudo initctl start honeytail
$ sudo systemctl start honeytail
$ honeytail -c /etc/honeytail/honeytail.conf
In addition to getting current logs flowing, you can backfill old logs into Honeycomb to kickstart your dataset.
By running honeytail from the command line, you can import old logs separately from tailing your current logs.
Adding the --backfill flag to honeytail adjusts a number of settings to make it appropriate for backfilling old data, such as stopping when it gets to the end of the log file instead of the default behavior of waiting for new content (like tail).
The specific locations on your system may vary from ours, but once you fill in your system’s values instead of our examples, you can backfill using this command:
honeytail --writekey=YOUR_API_KEY --dataset="nginx API logs" --parser=nginx \
--file=/var/log/nginx/access.16.log \
--nginx.conf=/etc/nginx/nginx.conf \
--nginx.format=api_fmt \
--backfill
This command can be used at any point to backfill from archived log files. You can read more about our agent honeytail or its backfill behavior here.
Note: honeytail does not unzip log files, so you will need to do this before backfilling.
The easiest way is to pipe the decompressed logs to STDIN:
zcat *.gz | honeytail --file - --backfill --all-the-other-flags
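If your rotated logs are a mix of compressed and uncompressed files, a small wrapper can feed both into the same pipe. The following is a minimal sketch, not part of honeytail: stream_logs is a hypothetical helper name, and the file paths in the usage comment are placeholders.

```shell
# Hypothetical helper (not part of honeytail): write each log file to stdout,
# decompressing the ones that end in .gz and passing the rest through as-is.
stream_logs() {
  for f in "$@"; do
    case "$f" in
      *.gz) gzip -dc -- "$f" ;;
      *)    cat -- "$f" ;;
    esac
  done
}

# Example use (paths and flags are placeholders, as in the command above):
#   stream_logs /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.1 | \
#     honeytail --file - --backfill --all-the-other-flags
```

Files are emitted in the order given, so list the oldest file first if you want events sent in chronological order.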
First, check out honeytail Troubleshooting for general debugging tips.
log_format <format> not found in given config
Make sure the file referenced by --nginx.conf contains your log format definitions.
The log format definition should look something like the example below, and should contain whatever format name you are passing to --nginx.format:
log_format combined '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
… which defines the output log format for the log format name “combined.”
Note that, in more advanced nginx setups, it is possible for the log format to be defined in an overall nginx.conf file, while a different config file (maybe under, say, /etc/nginx/sites-enabled/api.conf) tells nginx how to output the access_log and which format to use.
In this case, make sure to pass the config file containing the log_format line as the --nginx.conf argument.
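One way to find out which file actually defines a format is to search the nginx config directory for the log_format directive. The sketch below assumes your configs live under /etc/nginx and uses a hypothetical helper name, find_log_format; it is not part of honeytail.

```shell
# Hypothetical helper (not part of honeytail): list the config files under a
# directory that contain a log_format directive for the given format name.
find_log_format() {
  name="$1"
  dir="${2:-/etc/nginx}"   # assumed default config location
  grep -rlE "^[[:space:]]*log_format[[:space:]]+${name}([[:space:]]|\$)" "$dir"
}

# Example use:
#   find_log_format combined
#   find_log_format api_fmt /etc/nginx
```

Whichever path it prints is the one to pass as --nginx.conf. If it prints nothing, the format name is not defined anywhere under that directory.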
--debug reveals failed to parse nginx log line messages
If your log format has fields that are likely to have spaces in them, make sure to surround those fields with double quotes.
For example, if $my_upstream_var is likely to contain spaces, you will want to change this:
log_format main '$remote_addr $host $my_upstream_var $request $other_field';
to a log_format with quotes:
log_format main '$remote_addr $host "$my_upstream_var" $request $other_field';
You can make sure that your quotes had the right effect by peeking at the nginx logs your server is outputting to make sure that the $my_upstream_var value is correctly surrounded by quotes.
It is good practice to put any variable that comes from an HTTP header in double quotes, because you are depending on whoever is sending you traffic to put only one string in the header.
Some headers also default to multiple words.
For example, the $http_authorization variable is represented by a - if the header is absent and is two words (Basic abcdef123456) when present.
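To see why the quotes matter, compare how the same request splits on whitespace with and without them. The sketch below uses awk purely as a stand-in for a reader that expects one value per space-separated slot; it is an illustration, not how honeytail parses lines, and the sample values are made up.

```shell
# Sample values are made up for illustration.
auth='Basic abcdef123456'   # a present Authorization header: two words

# Log line for the format: $remote_addr $http_authorization $status
unquoted="10.0.0.1 $auth 200"
# Log line for the format: $remote_addr "$http_authorization" $status
quoted="10.0.0.1 \"$auth\" 200"

# Count space-separated tokens in a line.
count_tokens() { printf '%s\n' "$1" | awk '{ print NF }'; }

# The format has 3 fields, but the unquoted line splits into 4 tokens.
count_tokens "$unquoted"   # prints 4

# With quotes, the header can be recovered as one value by splitting on '"'.
printf '%s\n' "$quoted" | awk -F'"' '{ print $2 }'   # prints Basic abcdef123456
```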
We are happy to help—send us a message via chat anytime!
Nginx logs can be an incredibly powerful, high-level view of your system—especially so if they are configured correctly and enriched with custom, application-specific information about each request. Below are two simple ways to pack those logs with more useful metadata.
Nginx comes with some fairly powerful optional log fields that are not included by default.
This is the log_format we recommend for any configuration file (note the extra quotes around some fields):
access_log /var/log/nginx/access.log combined;
log_format combined '$remote_addr - $remote_user [$time_local] $host '
'"$request" $status $bytes_sent $body_bytes_sent $request_time '
'"$http_referer" "$http_user_agent" $request_length "$http_authorization" '
'"$http_x_forwarded_proto" "$http_x_forwarded_for" $server_name';
You may already have an access_log line, but by defining a log_format (combined, in the example above) and specifying the format name (--nginx.format=combined), you will be able to take advantage of all of these additional fields.
Make sure that all fields that start with $http_ are quoted in your log_format:
$bytes_sent: the size of the response sent back to the client, including headers
$host: the requested Host header, identifying how your server was addressed
$http_authorization: authorization headers, for associating logs with individual users (must be quoted)
$http_referer: the referring site, if the client followed a link to your site (must be quoted)
$http_user_agent: the User-Agent header, useful in identifying your clients (must be quoted)
$http_x_forwarded_for: the origin IP address, if running behind a load balancer (must be quoted)
$http_x_forwarded_proto: the origin protocol, if terminating TLS in front of nginx (must be quoted)
$remote_addr: the IP address of the host making the connection to nginx
$remote_user: the user name supplied when/if using basic authentication
$request_id: an nginx-generated unique ID for every request (only available in nginx version 1.11 and later)
$request_length: the length of the client’s request, including headers and body
$request_time: the time the server took to respond to the request (in seconds, with millisecond resolution)
$request: the HTTP method, request path, and protocol version
$server_name: the hostname of the machine accepting the request
$status: the HTTP status code returned for this request
Nginx can also be configured to extract custom request and response headers. Of the two, response headers are the more powerful in this case: they can carry application-specific IDs or timers back through to the nginx log. Having all of the information pertinent to a single request available in a single log line can be an incredibly powerful tool in diagnosing the origin of a problem in your system.
To include a specific response header in your access.log, add an $upstream_http_ variable to your log_format. The response header values will be written out and ingested by our nginx parser!
Make sure to put quotes around these variables to capture any embedded spaces.
For example, an X-RateLimit-Remaining header can be output by adding $upstream_http_x_ratelimit_remaining to the log_format line.
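As a sketch, a format that logs that header might look like the following. The format name ratelimit_fmt is a made-up placeholder, and the other fields are just a trimmed-down selection from the recommended format above.

```nginx
# Hypothetical format name. The quoted $upstream_http_* variable carries the
# X-RateLimit-Remaining response header sent by the upstream application.
log_format ratelimit_fmt '$remote_addr - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent '
                         '"$upstream_http_x_ratelimit_remaining"';
access_log /var/log/nginx/access.log ratelimit_fmt;
```

With a format like this in place, you would pass --nginx.format=ratelimit_fmt to honeytail.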
See the nginx docs for more about extracting metadata from the HTTP response or request.
As with other fields which may output strings (for example, $http_user_agent), be careful when logging strings: add double quotes around values which might contain spaces, in order to ensure correct parsing.
A final trick: sometimes, response headers may be set for logging that should not be exposed back to the user.
In this case, the proxy_hide_header directive may be used to strip out specific headers by name:
access_log /var/log/nginx/access.log combined;
log_format combined '... "$upstream_http_x_internal_top_secret" ...'; # Wrap string values with double quotes
location / {
    proxy_pass http://127.0.0.1:8080; # Forward requests to the application on port 8080
    proxy_hide_header X-Internal-Top-Secret; # Strip from the response sent to the client
}
While we believe strongly in the value of being able to track down the precise query causing a problem, we understand the concerns of exporting log data which may contain sensitive user information.
With that in mind, we recommend using honeytail’s nginx parser, but adding a --scrub_field=sensitive_field_name flag to hash the concrete sensitive_field_name value, or --drop_field=sensitive_field_name to drop it altogether and prevent it being sent to Honeycomb’s servers.
More information about dropping or scrubbing sensitive fields can be found here.
honeytail can break URLs up into their component parts, storing extra information in additional columns.
This behavior is turned on by default for the request field on nginx datasets, but can become more useful with a little bit of guidance from you.
See honeytail’s documentation for details on configuring our agent to parse URL strings.
Honeytail is open source and Apache 2.0 licensed.