Honeycomb provides honeycloudfront to support ingestion of AWS CloudFront access logs.
These logs are useful for answering questions such as “How many cache misses are happening in CloudFront?” or “How much bandwidth is CloudFront saving us?”
The source is available on GitHub, and instructions for getting started are provided here.
This implementation supports Web Distributions. RTMP distributions are not supported. If you require RTMP support, please file an issue and CC the maintainers.
honeycloudfront assumes access to an AWS access key ID and AWS secret access key with the proper permissions.
It will attempt to obtain these via the default profile in ~/.aws/config, by the proper environment variables, or by an IAM EC2 instance profile.
See the AWS guide on providing credentials for more details.
See the provided IAM policy JSON in the honeyaws repository for one example of a policy which has the proper permissions.
This can be scoped down to more specific resources if desired.
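When debugging credential problems, it can help to check which local credential source is present before running the tool. The following is a minimal sketch, not part of honeycloudfront: the function name aws_credential_source is hypothetical, and it assumes the standard AWS environment variable names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and the standard shared files under ~/.aws. It only checks for presence and does not validate the credentials or their permissions.

```shell
# Hypothetical helper: report which local AWS credential source appears to exist.
# It checks presence only; it does not contact AWS or validate anything.
aws_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "environment"
  elif [ -f "${HOME}/.aws/config" ] || [ -f "${HOME}/.aws/credentials" ]; then
    echo "shared-config"
  else
    # On EC2, an IAM instance profile may still supply credentials at runtime.
    echo "none-local"
  fi
}
```

Calling aws_credential_source prints one of environment, shared-config, or none-local. A result of none-local on an EC2 instance is not necessarily a problem, since an instance profile may still apply.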
honeycloudfront is available as part of the Honeycomb AWS Bundle or as a standalone binary.
To install honeycloudfront, use the instructions for your platform:

Debian package (amd64):
wget -q https://honeycomb.io/download/honeyaws/v1.4.3/honeyaws_1.4.3_amd64.deb && \
echo 'c2dd79fec9e2346568de562dd790fd1e474a62fed3a6d3d21e8f1b3472b03418 honeyaws_1.4.3_amd64.deb' | sha256sum -c && \
sudo dpkg -i honeyaws_1.4.3_amd64.deb

RPM package (x86_64):
wget -q https://honeycomb.io/download/honeyaws/v1.4.3/honeyaws-1.4.3-1.x86_64.rpm && \
echo '98d60d2b898f4b7a03abed013b21c6f2d3a86014cc08d2dbbd78e51c3b8d9dcd honeyaws-1.4.3-1.x86_64.rpm' | sha256sum -c && \
sudo rpm -i honeyaws-1.4.3-1.x86_64.rpm

Standalone binary, Linux (amd64):
wget -q -O honeycloudfront https://honeycomb.io/download/honeyaws/v1.4.3/honeycloudfront-linux-amd64 && \
echo 'ac73f2df441d623bbd81adeac5c6d2ad2e254bc0166bfebb53b170a62d61d8b1 honeycloudfront' | sha256sum -c && \
chmod 755 ./honeycloudfront

Standalone binary, Linux (arm64):
wget -q -O honeycloudfront https://honeycomb.io/download/honeyaws/v1.4.3/honeycloudfront-linux-arm64 && \
echo '9e4c86e3bd2cd787f58a9c31b44a19c66fd963590b9468ad8f13ad5d0d4c0cc2 honeycloudfront' | sha256sum -c && \
chmod 755 ./honeycloudfront

Standalone binary, macOS (amd64):
wget -q -O honeycloudfront https://honeycomb.io/download/honeyaws/v1.4.3/honeycloudfront-darwin-amd64 && \
echo 'd1307902bd2370ee9180873565cb1bc9064dd9eb14d1705e95d5b1b4358c910c honeycloudfront' | shasum -a 256 -c && \
chmod 755 ./honeycloudfront
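The commands above use sha256sum on Linux and shasum -a 256 on macOS to check the download. If you script installs across both platforms, a small helper can pick whichever tool is available. This is a sketch only; the function name verify_sha256 is hypothetical and not part of the Honeycomb tooling.

```shell
# Hypothetical helper: verify_sha256 FILE EXPECTED_HEX
# Uses sha256sum when available, otherwise falls back to shasum -a 256.
verify_sha256() {
  local file=$1 expected=$2 actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}
```

For example, after downloading the Linux amd64 binary, verify_sha256 honeycloudfront ac73f2df441d623bbd81adeac5c6d2ad2e254bc0166bfebb53b170a62d61d8b1 should print an OK line, and a non-zero return status means the file should not be installed.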
Use honeycloudfront interactively (for initial exploration or for debugging credential management) or as a daemon.
Try running some commands interactively at first to get a feel for the tool, then configure it to run as a system service when you are ready to ingest continuously.
To show all distributions, invoke honeycloudfront ls:
$ honeycloudfront ls
EVDBLY2TVIYCVB
D11111ABCDEF8Q
S11A16G5KZMEQD
To ingest access logs from a distribution, use honeycloudfront ingest with one or more distribution IDs.
Set your Honeycomb write key with the --writekey flag.
By default, events are sent to a dataset called aws-cloudfront-access.
Note: If access logs are not configured for a distribution, honeycloudfront will report an error. See the documentation on CloudFront access logs to learn how to enable this feature.
For example, to ingest logs from one distribution with ID S11A16G5KZMEQD:
$ honeycloudfront --writekey=YOUR_API_KEY ingest S11A16G5KZMEQD
...
Ingesting logs from multiple specific distributions:
$ honeycloudfront --writekey=YOUR_API_KEY ingest EVDBLY2TVIYCVB D11111ABCDEF8Q S11A16G5KZMEQD
...
honeycloudfront ingest without any arguments will use all available (“described”) distributions in your configured AWS region.
With arguments, it will ingest logs only for the specified distribution IDs.
$ honeycloudfront --writekey=YOUR_API_KEY ingest
...ingesting logs from all distributions...
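If you want something between "one named distribution" and "all distributions", the output of honeycloudfront ls can be filtered and passed to ingest. The sketch below assumes that ls prints one distribution ID per line, as in the sample output above. The function name ingest_matching and the HONEYCOMB_WRITEKEY environment variable are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: ingest only the distributions whose ID matches a regex.
# Assumes `honeycloudfront ls` prints one distribution ID per line.
ingest_matching() {
  local pattern=$1
  local ids
  ids=$(honeycloudfront ls | grep -E "$pattern") || {
    echo "no distributions match: $pattern" >&2
    return 1
  }
  # $ids is intentionally unquoted so that each ID becomes its own argument.
  honeycloudfront --writekey="$HONEYCOMB_WRITEKEY" ingest $ids
}
```

With the three sample distributions listed earlier, ingest_matching '^(E|S)' would run ingest for EVDBLY2TVIYCVB and S11A16G5KZMEQD only.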
By default, the agent writes state files (used to avoid sending duplicate events) to the current working directory where it is invoked.
To change where these files are kept, use the --statedir flag.
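Keeping state files in a fixed directory means a restart from a different working directory is less likely to resend events. The following is a sketch of a wrapper that does this. The function name run_ingest, the HONEYCOMB_WRITEKEY variable, and the example directory are assumptions made for illustration; only the --writekey and --statedir flags come from the tool itself.

```shell
# Hypothetical wrapper: run ingest with state files in a dedicated directory.
# Usage: run_ingest STATE_DIR [DISTRIBUTION_ID...]
run_ingest() {
  local statedir=$1
  shift
  if [ -z "${HONEYCOMB_WRITEKEY:-}" ]; then
    echo "HONEYCOMB_WRITEKEY is not set" >&2
    return 1
  fi
  # Create the directory up front so the agent can write its state files there.
  mkdir -p "$statedir" || return 1
  honeycloudfront --writekey="$HONEYCOMB_WRITEKEY" --statedir="$statedir" ingest "$@"
}
```

For example, run_ingest /var/lib/honeycloudfront S11A16G5KZMEQD would ingest one distribution and keep its state under /var/lib/honeycloudfront (an example path, not one the package is known to create).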
Sampling is a great way to send fewer events (thereby keeping more history and reducing costs) while still preserving most relevant information.
To set a sample rate while using one of the Honeycomb AWS tools, use the --samplerate flag.
While the tools run, this base rate will be automatically adjusted by the Honeycomb AWS tools using dynamic sampling to keep more interesting traffic at a higher rate.
For instance, setting the sample flag to 20 will send 1 out of every 20 requests processed to Honeycomb by default.
Fields such as the HTTP status code (elb_status_code in honeyelb, for example) are used to sample rarer, but relevant, events such as HTTP 500-level errors less aggressively.
honeycloudfront --samplerate 20 ... ingest ...
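To get a rough sense of event volume at a given base rate, divide the number of processed requests by the sample rate. The sketch below shows only that fixed-rate arithmetic. It does not model dynamic sampling, which adjusts the rate per kind of traffic, so real counts will differ. The function name estimate_sent is hypothetical.

```shell
# Hypothetical helper: estimate events sent at a fixed base sample rate.
# Usage: estimate_sent PROCESSED_REQUESTS SAMPLE_RATE
estimate_sent() {
  local processed=$1 rate=$2
  # Integer division: 1 out of every $rate requests is sent.
  echo $(( processed / rate ))
}

estimate_sent 100000 20   # prints 5000
```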
honeycloudfront, while supporting an interactive workflow for initial discovery and experimentation, is meant to be invoked as a long-running process by a system service manager.
To do this, edit the system init files (Upstart and systemd are supported) installed by the package manager to add the API key.
Once you receive data from honeycloudfront, you will want to explore it.
Descriptions of the sent fields are available in the AWS documentation for Web Distribution access logs.
Here are some suggestions for things to try:
- GROUP BY x_edge_result_type and COUNT to see which and how many requests were served with cache status Hit, Miss, LimitExceeded, or other values
- GROUP BY cs_uri_stem and MAX(time_taken) to see which URIs took the longest to serve
- GROUP BY cs_uri_stem and MAX(sc_bytes) to see when the largest requests were served and how big they were in bytes