Amazon’s Relational Database Service (RDS) lets you use a number of databases without having to administer them yourself. The Honeycomb RDS connector gives you access to the same data as if you were running PostgreSQL on your own server.
Honeycomb allows you to calculate metrics and statistics on the fly while retaining the full-resolution log lines (and the original query that started it all).
Note: Run the following commands from any Linux host with the appropriate AWS credentials to access the RDS API.
Before running the RDS connector, configure your RDS PostgreSQL instance to output queries in its log file. Refer to Amazon’s documentation on setting Parameter Groups to get started.
Set the following option in the Parameter Group:

log_min_duration_statement = 0

Note: log_statement indicates which types of queries are logged, but it is superseded by setting log_min_duration_statement to 0, as this effectively logs all queries. Setting log_statement to any other value will change the format of the query logs in a way that isn’t currently supported by the Honeycomb PostgreSQL parser.
If you switch to a new Parameter Group when you make this change, make sure you restart the database.
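If you manage Parameter Groups from the command line, the same change can be applied with the AWS CLI. This sketch assumes the parameter being set is log_min_duration_statement (the PostgreSQL setting where a value of 0 logs every statement); the group and instance names are placeholders.

```shell
# Placeholder names: replace my-pg-params and my-instance with your own.
aws rds modify-db-parameter-group \
  --db-parameter-group-name my-pg-params \
  --parameters "ParameterName=log_min_duration_statement,ParameterValue=0,ApplyMethod=immediate"

# Only needed if you attached a new Parameter Group to the instance:
aws rds reboot-db-instance --db-instance-identifier my-instance
```

Because this is a dynamic parameter, ApplyMethod=immediate takes effect without a reboot when you modify an already-attached group.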
Once you’ve made the change, verify that you are receiving RDS logs via the RDS Console.
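You can also verify from the command line that query logs are being written, using the AWS CLI (the instance name below is a placeholder):

```shell
# List the instance's available log files:
aws rds describe-db-log-files --db-instance-identifier my-instance

# Spot-check the tail of a log file for query lines, using a
# log file name taken from the describe-db-log-files output:
aws rds download-db-log-file-portion \
  --db-instance-identifier my-instance \
  --log-file-name <log-file-name-from-describe-output> \
  --output text
```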
rdslogs will stream the current PostgreSQL query log from RDS or download older log files. You can view the rdslogs source here.
Get and verify the current Linux version of rdslogs:
wget -q https://honeycomb.io/download/rdslogs/rdslogs_1.108_amd64.deb && \
  echo 'f289b871552170a88e8f5a545d4587343acda1e208a73ef92ae4dda8aa477a5d  rdslogs_1.108_amd64.deb' | sha256sum -c && \
  sudo dpkg -i rdslogs_1.108_amd64.deb
wget -q https://honeycomb.io/download/rdslogs/rdslogs-1.108-1.x86_64.rpm && \
  echo 'e48cdf912d803f97716ffe0dc89b61b9ac75b3d4489d22a022362323fecc93ab  rdslogs-1.108-1.x86_64.rpm' | sha256sum -c && \
  sudo rpm -i rdslogs-1.108-1.x86_64.rpm
wget -q -O rdslogs https://honeycomb.io/download/rdslogs/1.108 && \
  echo '0ec262bb22b5b3f26e17237c9f16d880684346cd2f28d0a505572714eb7c0489  rdslogs' | sha256sum -c && \
  chmod 755 ./rdslogs
Use the rdslogs command with the --output flag set to honeycomb to connect to RDS and send data from the current log to Honeycomb.
You will need the following information:
rdslogs \
  -i <instance-identifier> \
  --region=<region-code> \
  --output=honeycomb \
  --writekey=YOUR_API_KEY \
  --dataset='RDS PostgreSQL' \
  --dbtype=postgresql
Use the --sample_rate flag to send a subset (1/N of log lines; defaults to N=1, meaning all lines are sent) of your data. Sampling in Honeycomb is described in detail in Sampling high volume data.
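The 1/N idea can be sketched in plain shell. This keeps every Nth line deterministically; rdslogs' actual selection of which lines to send may differ (for example, random sampling), so this is only a conceptual stand-in.

```shell
# Keep 1 of every N lines (here N=4) as a conceptual stand-in
# for what a sample rate of 4 means: 1/4 of lines are sent.
seq 1 12 | awk -v n=4 'NR % n == 0'
```

With N=4, the 12 input lines reduce to 3 (lines 4, 8, and 12).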
We believe strongly in the value of being able to track down the precise query causing a problem, but we also understand the concerns of exporting log data that may contain sensitive user information. For this reason, you have the option of hashing the contents of the data returned by a query.
To hash the concrete query, add the query-scrubbing flag. The normalized_query attribute will still be representative of the shape of the query, and identifying patterns (including specific queries) will still be possible, but the sensitive information will be completely obscured before leaving your servers.
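The effect of this can be illustrated in shell. The hash step below is the general idea (a one-way digest of the concrete query); the sed-based normalization is only a rough stand-in for the normalized_query field, which Honeycomb's own parser produces, and the query text is made up.

```shell
query="SELECT * FROM users WHERE email='alice@example.com'"

# Hash the full query text so sensitive literals never leave the host:
hashed=$(printf '%s' "$query" | sha256sum | cut -d' ' -f1)

# Rough normalization (string literals -> ?) keeps the query's shape;
# this sed is illustrative, not Honeycomb's actual normalizer:
normalized=$(printf '%s' "$query" | sed "s/'[^']*'/?/g")

echo "$hashed"
echo "$normalized"
```

The normalized form still groups identical query shapes together, while the concrete query is reduced to an opaque 64-character digest.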
For more information about dropping or scrubbing sensitive fields, see “Dropping or scrubbing fields” in the Agent documentation section.
If you’re getting started with Honeycomb, you can load the past 24 hours of logs into Honeycomb to start finding interesting things right away. Launch this command to run in the background (it will take some time) while you hook up the live stream. (However, if you just now enabled the slow query log, you won’t have the past 24 hours of logs. You can skip this step and go straight to streaming.)
The following commands will download all available slow query logs to a newly created slow_logs directory and then start up honeytail to send the parsed events to Honeycomb. You’ll need your RDS instance identifier (from the instances page of the RDS Console) and your Honeycomb API key (from your Honeycomb account page).
mkdir slow_logs && \
  rdslogs \
    -i <instance-identifier> \
    --download \
    --download_dir=slow_logs \
    --dbtype=postgresql && \
  honeytail \
    --writekey=YOUR_API_KEY \
    --dataset='RDS PostgreSQL' \
    --parser=postgresql \
    --postgresql.log_line_prefix="%t:%r:%u@%d:[%p]:" \
    --file='slow_logs/*' \
    --backfill
Once you’ve finished backfilling your old logs, we recommend transitioning to the default streaming behavior to stream current logs.
As an alternative to using the rdslogs CLI tool, you can configure your RDS instance to mirror its logs to CloudWatch Logs, then install the Agentless Integration for PostgreSQL Logs. This integration is a Lambda function subscribed to your instance’s RDS Log Group. It parses log events as they arrive and submits them to Honeycomb. Note that configuring your RDS instance to send its logs to CloudWatch will incur additional AWS costs; see the CloudWatch Pricing docs for details.
Before installing the integration, configure your RDS PostgreSQL instance to output queries in its log file. Refer to Amazon’s documentation on setting Parameter Groups to get started.
Set the following option in the Parameter Group:

log_min_duration_statement = 0
If you switch to a new Parameter Group when you make these changes, make sure you restart the database.
Next, enable publishing of PostgreSQL slow query logs to AWS CloudWatch Logs. You can do this in the RDS console in the instance configuration. See the AWS docs for full details. After the change, verify that logs are being received by CloudWatch Logs via the CloudWatch Logs Console.
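Log export can also be enabled with the AWS CLI. The instance name below is a placeholder, and the log group name assumes RDS's usual /aws/rds/instance/<name>/postgresql naming convention.

```shell
# Enable publishing of the postgresql log type to CloudWatch Logs:
aws rds modify-db-instance \
  --db-instance-identifier my-instance \
  --cloudwatch-logs-export-configuration '{"EnableLogTypes":["postgresql"]}' \
  --apply-immediately

# Confirm log streams are arriving in the instance's log group:
aws logs describe-log-streams \
  --log-group-name /aws/rds/instance/my-instance/postgresql
```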
The PostgreSQL integration exists as an AWS Lambda function deployed in your AWS account. It subscribes to the CloudWatch Log Group created by RDS, parses log lines, and submits them as events to Honeycomb. You can view the source here.
To install the integration, you will need:
To get started, click this AWS quick-create link. This will launch the CloudFormation Stack creation wizard and prompt you for a few key inputs:
One of these inputs is your instance’s PostgreSQL log_line_prefix format, for example %t [%p-%l] %u@%d. See the Postgres Docs for more info.
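To make the prefix concrete, the snippet below builds a sample log line the way a %t [%p-%l] %u@%d prefix would render it. The escape meanings (%t timestamp, %p backend PID, %l session line number, %u user, %d database) come from the Postgres docs; the values and query are made up.

```shell
# Construct a sample log line in the %t [%p-%l] %u@%d shape.
# All field values below are hypothetical.
printf '%s [%s-%s] %s@%s LOG:  duration: 12.345 ms  statement: SELECT 1\n' \
  '2024-01-01 00:00:00 UTC' 12345 1 app_user app_db
```

Matching the prefix you supply here to the prefix your instance actually uses is what lets the parser split the prefix fields out of each line.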