String operators build, transform, and test string values.
CONCAT Concatenates the string representations of all arguments into a single string result. Non-string arguments are converted to strings; empty arguments are ignored.
# Usage: CONCAT(arg1, arg2, ...)
# Examples
CONCAT($api_version, $sdk)
IF($is_batch, CONCAT($url, "-batch"), $url)
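The conversion and skipping rules described for CONCAT can be sketched in Python. This is only an approximation: the helper name `concat` is hypothetical, "empty" is assumed to mean a missing value or an empty string, and Python's `str()` may format numbers and booleans differently from the real operator.

```python
def concat(*args):
    """Join the string form of every argument, skipping empty ones."""
    parts = []
    for arg in args:
        # Assumption: "empty" means a missing value (None) or an empty string.
        if arg is None or arg == "":
            continue
        # Non-string arguments are converted to their string representation.
        parts.append(arg if isinstance(arg, str) else str(arg))
    return "".join(parts)


print(concat("v", 2, None, "-sdk"))  # prints v2-sdk
```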
TO_LOWER Converts the input string to lowercase.
# Usage: TO_LOWER(string)
# Examples
TO_LOWER($service.name)
IF(CONTAINS(TO_LOWER($app.user.name), "bob"), "bob!", "not bob!")
STARTS_WITH Returns true if the first argument starts with the second argument. Returns false if either argument is not a string.
# Usage: STARTS_WITH(string, prefix)
# Examples
STARTS_WITH($url, "https")
STARTS_WITH($user_agent, "ELB-")
ENDS_WITH Returns true if the first argument ends with the second argument. Returns false if either argument is not a string.
# Usage: ENDS_WITH(string, suffix)
# Examples
ENDS_WITH($filename, ".json")
CONTAINS Returns true if the first argument contains the second argument. Returns false if either argument is not a string.
# Usage: CONTAINS(string, substr)
# Examples
CONTAINS($email, "@honeycomb.io")
CONTAINS($header_accept_encoding, "gzip")
IF(CONTAINS($url, "/v1/"), "api_v1", "api_v2")
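STARTS_WITH, ENDS_WITH, and CONTAINS share one rule: a non-string argument produces false rather than an error. A minimal Python sketch of that behavior follows; the function names are hypothetical stand-ins, not part of the documented language.

```python
def _both_strings(value, other):
    # The operators return false unless both arguments are strings.
    return isinstance(value, str) and isinstance(other, str)


def starts_with(value, prefix):
    return _both_strings(value, prefix) and value.startswith(prefix)


def ends_with(value, suffix):
    return _both_strings(value, suffix) and value.endswith(suffix)


def contains(value, substr):
    return _both_strings(value, substr) and substr in value


print(starts_with("https://example.test", "https"))  # prints True
print(ends_with(404, "4"))                           # prints False
print(contains("/v1/users", "/v1/"))                 # prints True
```

Because a type mismatch yields false, these predicates are safe to use inside IF even when a field is missing or numeric on some events.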
REG_MATCH Returns true if the first argument matches the regular expression given as the second argument.
Returns false if the first argument is not a string or is empty.
The provided regex must be a string literal containing a valid regular expression.
To use character classes such as \s, \d or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_MATCH(string, regex)
# Examples
REG_MATCH($error_msg, `^[a-z]+\[[0-9]+\]$`)
REG_MATCH($referrer, `[\w-_]+\.(s3\.)?amazonaws.com`)
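The reason for raw string literals is easiest to see by analogy. In Python, an ordinary string literal needs every backslash doubled, while a raw literal (`r"..."`) lets you write the regex as is. The sketch below also approximates REG_MATCH with a hypothetical `reg_match` helper; it assumes the pattern may match anywhere in the string unless anchored, and Python's `re` engine may differ in detail from the one behind REG_MATCH.

```python
import re

escaped = "^\\d+\\s\\w+$"  # ordinary literal: each backslash must be doubled
raw = r"^\d+\s\w+$"        # raw literal: the regex is written exactly as intended

print(escaped == raw)  # prints True


def reg_match(value, pattern):
    # Non-string or empty input never matches.
    if not isinstance(value, str) or value == "":
        return False
    # Assumption: an unanchored pattern may match anywhere in the string.
    return re.search(pattern, value) is not None


print(reg_match("nginx[1234]", r"^[a-z]+\[[0-9]+\]$"))  # prints True
```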
REG_VALUE Evaluates to the first regex submatch found in the first argument.
Evaluates to an empty value if the first argument contains no matches or is not a string.
The provided regex must be a string literal containing a valid regular expression.
To use character classes such as \s, \d or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_VALUE(string, regex)
# Examples
REG_VALUE($user_agent, `Chrome/[\d.]+`)
REG_VALUE($source, `^(ui-\d+|log|app-\d+)`)
The first example above yields a string like Chrome/1.2.3 and the second could be any one of ui-123, log, or app-456.
REG_VALUE is most effective when combined with other functions.
For example, the honeytail agent sets its User-Agent header to a string like libhoney-go/1.3.0 honeytail/1.378 (nginx), but other clients send User-Agents like "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36...".
To extract only the name of the parser used, and skip other parenthesized text (such as the Macintosh... part), use this as a calculated field:
IF(CONTAINS($user_agent, "honeytail"), REG_VALUE($user_agent, `\([a-z]+\)`), null)
This results in fields that contain (nginx), (mysql), and so on.
Combining CONTAINS or REG_MATCH with REG_VALUE narrows the set of strings the regex is applied to, so you extract only the values you expect.
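The guard-then-extract pattern from the honeytail example can be sketched in Python as follows. The function name `parser_name` and the second User-Agent string are hypothetical, chosen to show a value the regex alone would wrongly accept; returning the whole match mirrors the (nginx) result described above.

```python
import re

# Same pattern as the calculated field: a lowercase word in parentheses.
PARSER = re.compile(r"\([a-z]+\)")


def parser_name(user_agent):
    # Guard first: only honeytail User-Agents are considered at all.
    if not isinstance(user_agent, str) or "honeytail" not in user_agent:
        return None
    match = PARSER.search(user_agent)
    return match.group(0) if match else None


print(parser_name("libhoney-go/1.3.0 honeytail/1.378 (nginx)"))  # prints (nginx)
# Hypothetical non-honeytail agent: the regex would match "(linux)",
# but the guard rejects the string before the regex runs.
print(parser_name("ExampleBot/1.0 (linux)"))  # prints None
```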
REG_COUNT Returns the number of non-overlapping successive matches yielded by the provided regex.
Returns 0 if the first argument contains no matches or is not a string.
The provided regex must be a string literal containing a valid regular expression.
To use character classes such as \s, \d or \w, enclose the regular expression in `backticks` so that it is treated as a raw string literal.
# Usage: REG_COUNT(string, regex)
# Examples
REG_COUNT($sql, `JOIN`)
REG_COUNT($ip, `19.`)
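Two details of counting are worth seeing concretely: matches do not overlap, and an unescaped `.` matches any character, not only a literal dot. The Python sketch below uses a hypothetical `reg_count` helper as an approximation of REG_COUNT; the sample inputs are made up for illustration.

```python
import re


def reg_count(value, pattern):
    # Non-string input counts as zero matches.
    if not isinstance(value, str):
        return 0
    # finditer yields successive non-overlapping matches, left to right.
    return sum(1 for _ in re.finditer(pattern, value))


print(reg_count("aaaa", r"aa"))            # prints 2 (not 3: matches do not overlap)
print(reg_count("192.168.19.1", r"19."))   # prints 2 ("192" and "19.")
print(reg_count("192.168.19.1", r"19\."))  # prints 1 (only the literal "19.")
```

If only a literal dot should count, escape it as `\.` inside the backtick-quoted regex.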
LENGTH Returns the length of a string in either bytes or user-perceived characters.
The second argument must be either "bytes" or "chars".
Returns 0 if the first argument is not a string, or if the second argument is "chars" and the first argument is not valid UTF-8.
# Usage: LENGTH(string[, "bytes" | "chars"])
# Examples
LENGTH($hostname, "bytes") # returns the number of bytes that make up the string.
LENGTH($hostname, "chars") # returns the number of user-perceived characters that make up the string.
To show the difference between the two units, refer to the single character 🏳️🌈 (unicode rainbow flag) in the example below:
LENGTH("🏳️🌈", "bytes") # == 14
LENGTH("🏳️🌈", "chars") # == 1
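The byte count of 14 can be checked by hand. The rainbow flag is a sequence of four code points (U+1F3F3, U+FE0F, U+200D, U+1F308) that occupy 4 + 3 + 3 + 4 bytes in UTF-8. The Python sketch below confirms this; it is an illustration only, and it stops at code points because counting user-perceived characters requires a Unicode segmentation library that Python's standard library does not include.

```python
# Rainbow flag written as explicit code points:
# white flag, variation selector-16, zero-width joiner, rainbow.
flag = "\U0001F3F3\uFE0F\u200D\U0001F308"

byte_length = len(flag.encode("utf-8"))  # UTF-8 bytes, as with "bytes"
code_points = len(flag)                  # Unicode code points, NOT "chars"

print(byte_length)  # prints 14
print(code_points)  # prints 4
```

Neither number equals the "chars" result of 1, which is why the unit matters whenever a field may contain emoji or combining characters.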