String operators manipulate and inspect strings.
CONCAT
Concatenates the string representations of all arguments into a single string result. Non-string arguments are converted to strings; empty arguments are ignored.
# Usage: CONCAT(arg1, arg2, ...)
# Examples
CONCAT($api_version, $sdk)
IF($is_batch, CONCAT($url, "-batch"), $url)
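The conversion and skipping rules can be sketched outside the expression language. The Python below is only an illustration of the behavior described above, not the actual implementation: `concat` is a hypothetical helper, "empty" is assumed to mean a missing value or an empty string, and Python's own `str()` formatting of numbers and booleans may differ from what CONCAT produces.

```python
def concat(*args):
    """Hypothetical stand-in for CONCAT as described above."""
    parts = []
    for arg in args:
        # Assumption: an "empty" argument is None or the empty string.
        if arg is None or arg == "":
            continue
        # Non-string arguments are converted to their string form.
        parts.append(str(arg))
    return "".join(parts)


print(concat("v2", None, "-", 7))  # v2-7
print(concat("/checkout", "-batch"))  # /checkout-batch
```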
TO_LOWER
Converts an input string to all lowercase.
# Usage: TO_LOWER(string)
# Examples
TO_LOWER($service.name)
IF(CONTAINS(TO_LOWER($app.user.name), "bob"), "bob!", "not bob!")
STARTS_WITH
Returns true if the first argument starts with the second argument. Returns false if either argument is not a string.
# Usage: STARTS_WITH(string, prefix)
# Examples
STARTS_WITH($url, "https")
STARTS_WITH($user_agent, "ELB-")
CONTAINS
Returns true if the first argument contains the second argument. Returns false if either argument is not a string.
# Usage: CONTAINS(string, substr)
# Examples
CONTAINS($email, "@honeycomb.io")
CONTAINS($header_accept_encoding, "gzip")
IF(CONTAINS($url, "/v1/"), "api_v1", "api_v2")
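Both STARTS_WITH and CONTAINS are described as returning false, rather than failing, when an argument is not a string. A minimal Python sketch of that rule follows; `starts_with` and `contains` are hypothetical helpers written for illustration and are not part of any library.

```python
def starts_with(value, prefix):
    """Hypothetical stand-in for STARTS_WITH as described above."""
    # Either argument not a string: false, not an error.
    if not isinstance(value, str) or not isinstance(prefix, str):
        return False
    return value.startswith(prefix)


def contains(value, substr):
    """Hypothetical stand-in for CONTAINS as described above."""
    if not isinstance(value, str) or not isinstance(substr, str):
        return False
    return substr in value


print(starts_with("https://example.com", "https"))  # True
print(starts_with(404, "4"))  # False: first argument is a number
print(contains("gzip, deflate", "gzip"))  # True
print(contains(None, "gzip"))  # False: field is missing
```

The practical consequence is that a missing or numeric field falls through to the false branch of an enclosing IF.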
REG_MATCH
Returns true if the first argument matches the second argument, which must be a defined regular expression.
Returns false if the first argument is not a string or is empty.
The provided regex must be a string literal containing a valid regular expression.
If the regex uses character classes such as \s, \d, or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_MATCH(string, regex)
# Examples
REG_MATCH($error_msg, `^[a-z]+\[[0-9]+\]$`)
REG_MATCH($referrer, `[\w-_]+\.(s3\.)?amazonaws.com`)
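The raw-literal requirement and the false-on-empty rule can be illustrated in a general-purpose language. The Python sketch below is an analogy only: `reg_match` is a hypothetical helper, Python's `re` syntax may differ in details from the regex engine used for derived columns, and matching anywhere in the string unless anchored is an assumption based on the explicit `^` and `$` in the first example.

```python
import re


def reg_match(value, pattern):
    """Hypothetical stand-in for REG_MATCH as described above."""
    # Non-string or empty first argument: false, per the description.
    if not isinstance(value, str) or value == "":
        return False
    # Assumption: the pattern may match anywhere unless anchored.
    return re.search(pattern, value) is not None


# r"..." is Python's raw string literal and plays the role of backticks:
# backslashes reach the regex engine unchanged.
pattern = r"^[a-z]+\[[0-9]+\]$"

print(reg_match("nginx[1234]", pattern))  # True
print(reg_match("nginx[1234] died", pattern))  # False: trailing text breaks the $ anchor
print(reg_match("", r".*"))  # False: empty input never matches
print(reg_match(500, r"\d+"))  # False: not a string
```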
REG_VALUE
Evaluates to the first regex submatch found in the first argument.
Evaluates to an empty value if the first argument contains no matches or is not a string.
The provided regex must be a string literal containing a valid regular expression.
If the regex uses character classes such as \s, \d, or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_VALUE(string, regex)
# Examples
REG_VALUE($user_agent, `Chrome/[\d.]+`)
REG_VALUE($source, `^(ui-\d+|log|app-\d+)`)
The first example above yields a string like Chrome/1.2.3, and the second could be any one of ui-123, log, or app-456.
REG_VALUE is most effective when combined with other functions.
As an example, the honeytail agent sets its User-Agent header to a string like libhoney-go/1.3.0 honeytail/1.378 (nginx), but there are also User-Agents like "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36...".
To extract only the name of the parser used and not get caught up with other things in parentheses (such as the Macintosh... bit), we use this as a derived column:
IF(CONTAINS($user_agent, "honeytail"), REG_VALUE($user_agent, `\([a-z]+\)`), null)
This results in fields that contain (nginx), (mysql), and so on.
Combining CONTAINS or REG_MATCH with REG_VALUE narrows the set of strings the regex is applied to, so that you extract only the values you expect.
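The guard-then-extract pattern can be sketched in Python. This is an illustration under stated assumptions, not the actual implementation: `reg_value` and `parser_name` are hypothetical helpers, the browser User-Agent is a shortened sample value, and the rule "return the first capture group if the pattern has one, otherwise the whole match" is inferred from the examples above rather than stated by the reference.

```python
import re


def reg_value(value, pattern):
    """Hypothetical stand-in for REG_VALUE as described above."""
    if not isinstance(value, str):
        return None
    match = re.search(pattern, value)
    if match is None:
        return None
    # Assumption: first capture group when present, else the whole match.
    return match.group(1) if match.re.groups else match.group(0)


def parser_name(user_agent):
    # Mirrors IF(CONTAINS($user_agent, "honeytail"), REG_VALUE(...), null).
    if isinstance(user_agent, str) and "honeytail" in user_agent:
        return reg_value(user_agent, r"\([a-z]+\)")
    return None


honeytail_ua = "libhoney-go/1.3.0 honeytail/1.378 (nginx)"
browser_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36"

print(parser_name(honeytail_ua))  # (nginx)
print(parser_name(browser_ua))  # None
# With a capture group, the parentheses themselves can be dropped.
print(reg_value(honeytail_ua, r"\(([a-z]+)\)"))  # nginx
```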
REG_COUNT
Returns the number of non-overlapping successive matches yielded by the provided regex.
Returns 0 if the first argument contains no matches or is not a string.
The provided regex must be a string literal containing a valid regular expression.
If the regex uses character classes such as \s, \d, or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_COUNT(string, regex)
# Examples
REG_COUNT($sql, `JOIN`)
REG_COUNT($ip, `19.`)
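Two details affect the count: matches do not overlap, and an unescaped `.` matches any character, so the pattern `19.` counts more than literal "19." sequences. The Python sketch below illustrates both; `reg_count` is a hypothetical helper, and Python's `re` module is used as an approximation of the regex engine.

```python
import re


def reg_count(value, pattern):
    """Hypothetical stand-in for REG_COUNT as described above."""
    if not isinstance(value, str):
        return 0
    # finditer yields successive non-overlapping matches, left to right.
    return sum(1 for _ in re.finditer(pattern, value))


print(reg_count("a JOIN b JOIN c", r"JOIN"))  # 2
print(reg_count("aaaa", r"aa"))  # 2: matches do not overlap
print(reg_count("192.198.19.1", r"19."))  # 3: "192", "198", and "19."
print(reg_count("192.198.19.1", r"19\."))  # 1: only the literal "19."
print(reg_count(None, r"JOIN"))  # 0: not a string
```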
LENGTH
Returns the length of a string in either bytes or user-perceived characters.
The second argument must be either "bytes" or "chars".
Returns 0 if the first argument is not a string, or if the first argument is not valid UTF-8 when the second argument is "chars".
# Usage: LENGTH(string[, "bytes" | "chars"])
# Examples
LENGTH($hostname, "bytes") # returns the number of bytes that make up the string.
LENGTH($hostname, "chars") # returns the number of user-perceived characters that make up the string.
To show the difference between the two units, consider the single user-perceived character 🏳️‍🌈 (Unicode rainbow flag) in the example below:
LENGTH("🏳️‍🌈", "bytes") # == 14
LENGTH("🏳️‍🌈", "chars") # == 1
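The figure of 14 bytes follows from how the rainbow flag is encoded: it is a sequence of four code points (white flag, variation selector, zero-width joiner, rainbow) that renders as one character. The Python sketch below spells the sequence out with escapes to show the arithmetic. It covers only the byte count; Python's standard library does not segment user-perceived characters, so the "chars" result is not reproduced here.

```python
# Rainbow flag: U+1F3F3 (white flag), U+FE0F (variation selector-16),
# U+200D (zero-width joiner), U+1F308 (rainbow).
rainbow_flag = "\U0001F3F3\uFE0F\u200D\U0001F308"

# UTF-8 size of each code point: 4 + 3 + 3 + 4 bytes.
per_code_point = [len(ch.encode("utf-8")) for ch in rainbow_flag]
byte_length = len(rainbow_flag.encode("utf-8"))

print(per_code_point)  # [4, 3, 3, 4]
print(byte_length)  # 14, matching the "bytes" result
print(len(rainbow_flag))  # 4 code points, still one user-perceived character
```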