VRL example reference
Here you’ll find a comprehensive list of all VRL program examples. These examples demonstrate the breadth of the language and its observability-focused facilities.
Try using the VRL subcommand
You can run these examples using the vector vrl subcommand with --input (a newline-delimited JSON file representing a list of events) and --program (a VRL program) to pass in the example input and program, as well as --print-object to show the modified object. The examples below show pretty-printed JSON for the input events, so collapse each event to a single line when passing it in via --input.
For example: vector vrl --input input.json --program program.vrl --print-object. This closely matches how VRL will receive the input in a running Vector instance.
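As a rough illustration of collapsing a pretty-printed event into the newline-delimited form, here is a small Python sketch. The helper name to_ndjson is hypothetical and is not part of Vector; it only shows one way to prepare an input file.

```python
import json

def to_ndjson(events):
    """Serialize a list of event dicts as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(event, separators=(",", ":")) for event in events)

# A pretty-printed event, in the style used by the examples on this page.
pretty = """
{
  "message": "key1=value1 key2=value2"
}
"""

ndjson = to_ndjson([json.loads(pretty)])
print(ndjson)
```

The resulting text could then be saved to a file such as input.json and passed via --input.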
VRL REPL
Additionally, if vector vrl is run without any arguments, it will spawn a REPL (read–eval–print loop).
Assuming you have Vector installed, you can run vector vrl to start the REPL. From there, you can type help and press return to get further help.
The REPL behaves nearly identically to the programs you write for your Vector configuration, and can be used to test individual snippets of complex programs before you commit them to your production configuration.
The vector vrl command has many other capabilities. Run vector vrl --help to see more information.
Real world examples
Parse Syslog logs
{
"message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
}
. |= parse_syslog!(.message)
{
"log": {
"appname": "su",
"facility": "ntp",
"hostname": "vector-user.biz",
"message": "Something went wrong",
"msgid": "ID389",
"procid": 2666,
"severity": "info",
"timestamp": "2020-12-22T15:22:31.111Z",
"version": 1
}
}
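The facility and severity in the output above come from the <102> prefix of the message (written as \u003c102\u003e in the JSON input). In the syslog format, that priority value is facility * 8 + severity. The Python sketch below only illustrates the arithmetic; the helper name split_priority is hypothetical and this is not how Vector implements parse_syslog.

```python
def split_priority(pri):
    """Split a syslog priority value into (facility_code, severity_code)."""
    return divmod(pri, 8)

facility_code, severity_code = split_priority(102)
print(facility_code, severity_code)  # 12 6
```

Facility code 12 corresponds to the ntp keyword and severity code 6 to info, which matches the parsed output.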
Parse key/value (logfmt) logs
{
"message": "@timestamp=\"Sun Jan 10 16:47:39 EST 2021\" level=info msg=\"Stopping all fetchers\" tag#production=stopping_fetchers id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager"
}
. = parse_key_value!(.message)
{
"log": {
"@timestamp": "Sun Jan 10 16:47:39 EST 2021",
"id": "ConsumerFetcherManager-1382721708341",
"level": "info",
"module": "kafka.consumer.ConsumerFetcherManager",
"msg": "Stopping all fetchers",
"tag#production": "stopping_fetchers"
}
}
Parse custom logs
{
"message": "2021/01/20 06:39:15 +0000 [error] 17755#17755: *3569904 open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory), client: xxx.xxx.xxx.xxx, server: localhost, request: \"GET /test.php HTTP/1.1\", host: \"yyy.yyy.yyy.yyy\""
}
. |= parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+ \+\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?: \*(?P<connid>\d+))? (?P<message>.*)$')
# Coerce parsed fields
.timestamp = parse_timestamp(.timestamp, "%Y/%m/%d %H:%M:%S %z") ?? now()
.pid = to_int!(.pid)
.tid = to_int!(.tid)
# Extract structured data
message_parts = split(.message, ", ", limit: 2)
structured = parse_key_value(message_parts[1], key_value_delimiter: ":", field_delimiter: ",") ?? {}
.message = message_parts[0]
. = merge(., structured)
{
"log": {
"client": "xxx.xxx.xxx.xxx",
"connid": "3569904",
"host": "yyy.yyy.yyy.yyy",
"message": "open() \"/usr/share/nginx/html/test.php\" failed (2: No such file or directory)",
"pid": 17755,
"request": "GET /test.php HTTP/1.1",
"server": "localhost",
"severity": "error",
"tid": 17755,
"timestamp": "2021-01-20T06:39:15Z"
}
}
Multiple parsing strategies
{
"message": "\u003c102\u003e1 2020-12-22T15:22:31.111Z vector-user.biz su 2666 ID389 - Something went wrong"
}
structured =
parse_syslog(.message) ??
parse_common_log(.message) ??
parse_regex!(.message, r'^(?P<timestamp>\d+/\d+/\d+ \d+:\d+:\d+) \[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+):(?: \*(?P<connid>\d+))? (?P<message>.*)$')
. = merge(., structured)
{
"log": {
"appname": "su",
"facility": "ntp",
"hostname": "vector-user.biz",
"message": "Something went wrong",
"msgid": "ID389",
"procid": 2666,
"severity": "info",
"timestamp": "2020-12-22T15:22:31.111Z",
"version": 1
}
}
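The program above tries each parser in turn and keeps the first result that succeeds. The following Python sketch mimics that fallback idea with exceptions. The parsers here are hypothetical stand-ins, not the VRL functions, and the sketch does not reproduce VRL's error-handling rules.

```python
def first_successful(message, parsers):
    """Return the result of the first parser that does not raise ValueError."""
    last_error = None
    for parse in parsers:
        try:
            return parse(message)
        except ValueError as err:
            last_error = err
    raise ValueError(f"no parser matched: {last_error}")

# Hypothetical stand-in parsers, for illustration only.
def parse_as_int(message):
    return {"number": int(message)}

def parse_as_pair(message):
    key, sep, value = message.partition("=")
    if not sep:
        raise ValueError("not a key=value pair")
    return {key: value}

result = first_successful("level=info", [parse_as_int, parse_as_pair])
print(result)  # {'level': 'info'}
```

As in the VRL program, the last strategy has no fallback, so a message that matches none of the parsers ends in an error.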
Modify metric tags
{
"counter": {
"value": 102
},
"kind": "incremental",
"name": "user_login_total",
"tags": {
"email": "vic@vector.dev",
"host": "my.host.com",
"instance_id": "abcd1234"
}
}
.tags.environment = get_env_var!("ENV") # add
.tags.hostname = del(.tags.host) # rename
del(.tags.email)
{
"metric": {
"counter": {
"value": 102
},
"kind": "incremental",
"name": "user_login_total",
"tags": {
"environment": "production",
"hostname": "my.host.com",
"instance_id": "abcd1234"
}
}
}
Emitting multiple logs from JSON
{
"message": "[{\"message\": \"first_log\"}, {\"message\": \"second_log\"}]"
}
. = parse_json!(.message) # sets `.` to an array of objects
[
{
"log": {
"message": "first_log"
}
},
{
"log": {
"message": "second_log"
}
}
]
Emitting multiple non-object logs from JSON
{
"message": "[5, true, \"hello\"]"
}
. = parse_json!(.message) # sets `.` to an array
[
{
"log": {
"message": 5
}
},
{
"log": {
"message": true
}
},
{
"log": {
"message": "hello"
}
}
]
Invalid argument type
{
"not_a_string": 1
}
upcase(42)
error[E110]: invalid argument type
┌─ :1:8
│
1 │ upcase(42)
│ ^^
│ │
│ this expression resolves to the exact type integer
│ but the parameter "value" expects the exact type string
│
= try: ensuring an appropriate type at runtime
=
= 42 = string!(42)
= upcase(42)
=
= try: coercing to an appropriate type and specifying a default value as a fallback in case coercion fails
=
= 42 = to_string(42) ?? "default"
= upcase(42)
=
= see documentation about error handling at https://errors.vrl.dev/#handling
= learn more about error code 110 at https://errors.vrl.dev/110
= see language documentation at https://vrl.dev
= try your code in the VRL REPL, learn more at https://vrl.dev/examples
Unhandled fallible assignment
{
"message": "key1=value1 key2=value2"
}
structured = parse_key_value(.message)
error[E103]: unhandled fallible assignment
┌─ :1:14
│
1 │ structured = parse_key_value(.message)
│ ------------ ^^^^^^^^^^^^^^^^^^^^^^^^^
│ │ │
│ │ this expression is fallible because at least one argument's type cannot be verified to be valid
│ │ update the expression to be infallible by adding a `!`: `parse_key_value!(.message)`
│ │ `.message` argument type is `any` and this function expected a parameter `value` of type `string`
│ or change this to an infallible assignment:
│ structured, err = parse_key_value(.message)
│
= see documentation about error handling at https://errors.vrl.dev/#handling
= see functions characteristics documentation at https://vrl.dev/expressions/#function-call-characteristics
= learn more about error code 103 at https://errors.vrl.dev/103
= see language documentation at https://vrl.dev
= try your code in the VRL REPL, learn more at https://vrl.dev/examples
Array examples
zip
Iterate over several arrays in parallel, producing a new array containing arrays of items from each source.
The resulting array will be as long as the shortest input array, with all the remaining elements dropped.
This function is modeled on the zip function in Python, but similar methods can be found in Ruby and Rust.
If a single parameter is given, it must contain an array of all the input arrays.
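Since the function is described as modeled on Python's zip, a short Python sketch can show the truncation behavior. This is Python, not VRL, and the sample arrays are made up for illustration.

```python
# One outer array that contains all the input arrays (the single-parameter form).
arrays = [[1, 2, 3], ["a", "b"], [True, False, None]]

# Python's zip yields tuples, so each one is converted to a list here.
zipped = [list(items) for items in zip(*arrays)]
print(zipped)  # [[1, 'a', True], [2, 'b', False]]
```

The shortest input has two elements, so the result has two entries and the remaining elements of the longer arrays are dropped.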
Codec examples
decode_base16
Decodes the value (a Base16 string) into its original string.
Decode Base16 data
decode_base16!("796f752068617665207375636365737366756c6c79206465636f646564206d65")
you have successfully decoded me
decode_base64
Decodes the value (a Base64 string) into its original string.
Decode Base64 data (default)
decode_base64!("eW91IGhhdmUgc3VjY2Vzc2Z1bGx5IGRlY29kZWQgbWU=")
you have successfully decoded me
Decode Base64 data (URL safe)
decode_base64!("eW91IGNhbid0IG1ha2UgeW91ciBoZWFydCBmZWVsIHNvbWV0aGluZyBpdCB3b24ndA==", charset: "url_safe")
you can't make your heart feel something it won't
decode_charset
Decodes the value (a non-UTF8 string) to a UTF8 string using the specified character set.
Decode EUC-KR string
decode_charset!(decode_base64!("vsiz58fPvLy/5A=="), "euc-kr")
안녕하세요
Decode EUC-JP string
decode_charset!(decode_base64!("pLOk86TLpMGkzw=="), "euc-jp")
こんにちは
Decode GB2312 string
decode_charset!(decode_base64!("xOO6ww=="), "gb2312")
你好
decode_gzip
Decodes the value (a Gzip string) into its original string.
Decode Gzip data
encoded_text = decode_base64!("H4sIAHEAymMAA6vML1XISCxLVSguTU5OLS5OK83JqVRISU3OT0lNUchNBQD7BGDaIAAAAA==")
decode_gzip!(encoded_text)
you have successfully decoded me
decode_mime_q
Replaces encoded-word substrings in the value with their original string.
Decode single encoded-word
decode_mime_q!("=?utf-8?b?SGVsbG8sIFdvcmxkIQ==?=")
Hello, World!
Embedded
decode_mime_q!("From: =?utf-8?b?SGVsbG8sIFdvcmxkIQ==?= <=?utf-8?q?hello=5Fworld=40example=2ecom?=>")
From: Hello, World! <hello_world@example.com>
Without charset
decode_mime_q!("?b?SGVsbG8sIFdvcmxkIQ==")
Hello, World!
decode_percent
Decodes a percent-encoded value like a URL.
Percent decode a value
decode_percent("foo%20bar%3F")
foo bar?
decode_punycode
Decodes a punycode-encoded value, such as an internationalized domain name (IDN). This function assumes that the value passed is meant to be used in IDN context and that it is either a domain name or a part of it.
Decode a punycode encoded internationalized domain name
decode_punycode!("www.xn--caf-dma.com")
www.café.com
Decode an ASCII only string
decode_punycode!("www.cafe.com")
www.cafe.com
Ignore validation
decode_punycode!("xn--8hbb.xn--fiba.xn--8hbf.xn--eib.", validate: false)
١٠.٦٦.٣٠.٥.
decode_snappy
Decodes the value (a Snappy string) into its original string.
Decode Snappy data
encoded_text = decode_base64!("LKxUaGUgcXVpY2sgYnJvd24gZm94IGp1bXBzIG92ZXIgMTMgbGF6eSBkb2dzLg==")
decode_snappy!(encoded_text)
The quick brown fox jumps over 13 lazy dogs.
decode_zlib
Decodes the value (a Zlib string) into its original string.
Decode Zlib data
encoded_text = decode_base64!("eJwNy4ENwCAIBMCNXIlQ/KqplUSgCdvXAS41qPMHshCB2R1zJlWIVlR6UURX2+wx2YcuK3kAb9C1wd6dn7Fa+QH9gRxr")
decode_zlib!(encoded_text)
you_have_successfully_decoded_me.congratulations.you_are_breathtaking.
decode_zstd
Decodes the value (a Zstandard string) into its original string.
Decode Zstd data
encoded_text = decode_base64!("KLUv/QBY/QEAYsQOFKClbQBedqXsb96EWDax/f/F/z+gNU4ZTInaUeAj82KqPFjUzKqhcfDqAIsLvAsnY1bI/N2mHzDixRQA")
decode_zstd!(encoded_text)
you_have_successfully_decoded_me.congratulations.you_are_breathtaking.
encode_base16
Encodes the value to Base16.
Encode to Base16
encode_base16("please encode me")
706c6561736520656e636f6465206d65
encode_base64
Encodes the value to Base64.
Encode to Base64 (default)
encode_base64("please encode me")
cGxlYXNlIGVuY29kZSBtZQ==
Encode to Base64 (without padding)
encode_base64("please encode me, no padding though", padding: false)
cGxlYXNlIGVuY29kZSBtZSwgbm8gcGFkZGluZyB0aG91Z2g
Encode to Base64 (URL safe)
encode_base64("please encode me, but safe for URLs", charset: "url_safe")
cGxlYXNlIGVuY29kZSBtZSwgYnV0IHNhZmUgZm9yIFVSTHM=
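To make the three variants concrete, here is a Python sketch using the standard base64 module. It illustrates the encodings themselves, not VRL's implementation; the two-byte sample is chosen only because its encoding contains the characters that differ between the standard and URL-safe alphabets.

```python
import base64

data = b"please encode me"

standard = base64.b64encode(data).decode("ascii")
unpadded = standard.rstrip("=")  # same encoding without the trailing padding

# These two bytes encode to "+" and "/" in the standard alphabet,
# which the URL-safe alphabet replaces with "-" and "_".
tricky = b"\xfb\xff"
tricky_standard = base64.b64encode(tricky).decode("ascii")
tricky_url_safe = base64.urlsafe_b64encode(tricky).decode("ascii")

print(standard)         # cGxlYXNlIGVuY29kZSBtZQ==
print(unpadded)         # cGxlYXNlIGVuY29kZSBtZQ
print(tricky_standard)  # +/8=
print(tricky_url_safe)  # -_8=
```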
encode_charset
Encodes the value (a UTF8 string) to a non-UTF8 string using the specified character set.
Encode UTF8 string to EUC-KR
encode_base64(encode_charset!("안녕하세요", "euc-kr"))
vsiz58fPvLy/5A==
Encode UTF8 string to EUC-JP
encode_base64(encode_charset!("こんにちは", "euc-jp"))
pLOk86TLpMGkzw==
Encode UTF8 string to GB2312
encode_base64(encode_charset!("你好", "gb2312"))
xOO6ww==
encode_gzip
Encodes the value to Gzip.
Encode to Gzip
encoded_text = encode_gzip("please encode me")
encode_base64(encoded_text)
H4sIAAAAAAAA/yvISU0sTlVIzUvOT0lVyE0FAI4R4vcQAAAA
encode_json
Encodes the value to JSON.
Encode to JSON
.payload = encode_json({"hello": "world"})
{"hello":"world"}
encode_key_value
Encodes the value into key-value format with customizable delimiters. Default delimiters match the logfmt format.
Encode with default delimiters (no ordering)
encode_key_value({"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info"})
lvl=info msg="This is a message" ts=2021-06-05T17:20:00Z
Encode with default delimiters (fields ordering)
encode_key_value!({"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info", "log_id": 12345}, ["ts", "lvl", "msg"])
ts=2021-06-05T17:20:00Z lvl=info msg="This is a message" log_id=12345
Encode with default delimiters (nested fields)
encode_key_value({"agent": {"name": "foo"}, "log": {"file": {"path": "my.log"}}, "event": "log"})
agent.name=foo event=log log.file.path=my.log
Encode with default delimiters (nested fields ordering)
encode_key_value!({"agent": {"name": "foo"}, "log": {"file": {"path": "my.log"}}, "event": "log"}, ["event", "log.file.path", "agent.name"])
event=log log.file.path=my.log agent.name=foo
Encode with custom delimiters (no ordering)
encode_key_value(
{"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info"},
field_delimiter: ",",
key_value_delimiter: ":"
)
lvl:info,msg:"This is a message",ts:2021-06-05T17:20:00Z
Encode with custom delimiters and flatten boolean
encode_key_value(
{"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info", "beta": true, "dropped": false},
field_delimiter: ",",
key_value_delimiter: ":",
flatten_boolean: true
)
beta,lvl:info,msg:"This is a message",ts:2021-06-05T17:20:00Z
encode_logfmt
Encodes the value to logfmt.
Encode to logfmt (no ordering)
encode_logfmt({"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info"})
lvl=info msg="This is a message" ts=2021-06-05T17:20:00Z
Encode to logfmt (fields ordering)
encode_logfmt!({"ts": "2021-06-05T17:20:00Z", "msg": "This is a message", "lvl": "info", "log_id": 12345}, ["ts", "lvl", "msg"])
ts=2021-06-05T17:20:00Z lvl=info msg="This is a message" log_id=12345
Encode to logfmt (nested fields)
encode_logfmt({"agent": {"name": "foo"}, "log": {"file": {"path": "my.log"}}, "event": "log"})
agent.name=foo event=log log.file.path=my.log
Encode to logfmt (nested fields ordering)
encode_logfmt!({"agent": {"name": "foo"}, "log": {"file": {"path": "my.log"}}, "event": "log"}, ["event", "log.file.path", "agent.name"])
event=log log.file.path=my.log agent.name=foo
encode_percent
Encodes a value with percent encoding to safely be used in URLs.
Percent encode all non-alphanumeric characters (default)
encode_percent("foo bar?")
foo%20bar%3F
Percent encode only control characters
encode_percent("foo \tbar", ascii_set: "CONTROLS")
foo %09bar
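For comparison, the same round trip can be sketched with Python's urllib.parse. This is only an analogy: Python's quote always leaves letters, digits, and the characters _.-~ unencoded, so its character set may not match VRL's default exactly.

```python
from urllib.parse import quote, unquote

# safe="" asks quote to also encode "/" (which it leaves alone by default).
encoded = quote("foo bar?", safe="")
decoded = unquote("foo%20bar%3F")

print(encoded)  # foo%20bar%3F
print(decoded)  # foo bar?
```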
encode_proto
Encodes the value into a protocol buffer payload.
Encode to proto
.payload = encode_base64(encode_proto!({"name": "someone", "phones": [{"number": "123456"}]}, "resources/protobuf_descriptor_set.desc", "test_protobuf.Person"))
Cgdzb21lb25lIggKBjEyMzQ1Ng==
encode_punycode
Encodes a value to punycode. Useful for internationalized domain names (IDN). This function assumes that the value passed is meant to be used in IDN context and that it is either a domain name or a part of it.
Encode an internationalized domain name
encode_punycode!("www.café.com")
www.xn--caf-dma.com
Encode an internationalized domain name with mixed case
encode_punycode!("www.CAFé.com")
www.xn--caf-dma.com
Encode an ASCII only string
encode_punycode!("www.cafe.com")
www.cafe.com
Ignore validation
encode_punycode!("xn--8hbb.xn--fiba.xn--8hbf.xn--eib.", validate: false)
xn--8hbb.xn--fiba.xn--8hbf.xn--eib.
encode_snappy
Encodes the value to Snappy.
Encode to Snappy
encoded_text = encode_snappy!("The quick brown fox jumps over 13 lazy dogs.")
encode_base64(encoded_text)
LKxUaGUgcXVpY2sgYnJvd24gZm94IGp1bXBzIG92ZXIgMTMgbGF6eSBkb2dzLg==
encode_zlib
Encodes the value to Zlib.
Encode to Zlib
encoded_text = encode_zlib("please encode me")
encode_base64(encoded_text)
eJwryElNLE5VSM1Lzk9JVchNBQA0RQX7
encode_zstd
Encodes the value to Zstandard.
Encode to Zstd
encoded_text = encode_zstd("please encode me")
encode_base64(encoded_text)
KLUv/QBYgQAAcGxlYXNlIGVuY29kZSBtZQ==
Coerce examples
Convert examples
from_unix_timestamp
Converts the value integer from a Unix timestamp to a VRL timestamp.
Converts from the number of seconds since the Unix epoch by default. To convert from milliseconds or nanoseconds, set the unit argument to milliseconds or nanoseconds.
Convert from a Unix timestamp (seconds)
from_unix_timestamp!(5)
1970-01-01T00:00:05Z
Convert from a Unix timestamp (milliseconds)
from_unix_timestamp!(5000, unit: "milliseconds")
1970-01-01T00:00:05Z
Convert from a Unix timestamp (nanoseconds)
from_unix_timestamp!(5000, unit: "nanoseconds")
1970-01-01T00:00:00.000005Z
to_syslog_facility
Converts the value, a Syslog facility code, into its corresponding Syslog keyword. For example, 0 into "kern", 1 into "user", etc.
Coerce to a Syslog facility
to_syslog_facility!(4)
auth
to_syslog_level
Converts the value, a Syslog severity level, into its corresponding keyword, for example 0 into "emerg", 1 into "alert", etc.
Coerce to a Syslog level
to_syslog_level!(5)
notice
to_syslog_severity
Coerce to Syslog severity
to_syslog_severity!("alert")
1
to_unix_timestamp
Converts the value timestamp into a Unix timestamp.
Returns the number of seconds since the Unix epoch by default. To return the number in milliseconds or nanoseconds, set the unit argument to milliseconds or nanoseconds.
Convert to a Unix timestamp (seconds)
to_unix_timestamp(t'2021-01-01T00:00:00+00:00')
1609459200
Convert to a Unix timestamp (milliseconds)
to_unix_timestamp(t'2021-01-01T00:00:00Z', unit: "milliseconds")
1609459200000
Convert to a Unix timestamp (nanoseconds)
to_unix_timestamp(t'2021-01-01T00:00:00Z', unit: "nanoseconds")
1609459200000000000
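The relationship between the three units can be checked with a few lines of Python. This is a sketch using the standard datetime module, not VRL; it reproduces the values from the examples above.

```python
from datetime import datetime, timezone

ts = datetime(2021, 1, 1, tzinfo=timezone.utc)

seconds = int(ts.timestamp())
milliseconds = seconds * 1_000
nanoseconds = seconds * 1_000_000_000

print(seconds)       # 1609459200
print(milliseconds)  # 1609459200000
print(nanoseconds)   # 1609459200000000000

# The reverse direction: 5 seconds after the Unix epoch.
five = datetime.fromtimestamp(5, tz=timezone.utc)
print(five.isoformat())  # 1970-01-01T00:00:05+00:00
```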
Debug examples
assert
Asserts the condition, which must be a Boolean expression. The program is aborted with message if the condition evaluates to false.
assert_eq
Asserts that two expressions, left and right, have the same value. The program is aborted with message if they do not have the same value.
Unsuccessful assertion with custom log message
assert_eq!(1, 0, message: "Unequal integers")
Enrichment examples
find_enrichment_table_records
Searches an enrichment table for rows that match the provided condition.
For file enrichment tables, this condition needs to be a VRL object in which the key-value pairs indicate a field to search mapped to a value to search in that field. This function returns the rows that match the provided condition(s). All fields need to match for rows to be returned; if any fields do not match, then no rows are returned.
There are currently two forms of search criteria:
- Exact match search. The given field must match the value exactly. Case sensitivity can be specified using the case_sensitive argument. An exact match search can use an index directly into the dataset, which should make this search fairly “cheap” from a performance perspective.
- Date range search. The given field must be greater than or equal to the from date and less than or equal to the to date. A date range search involves sequentially scanning through the rows that have been located using any exact match criteria. This can be an expensive operation if there are many rows returned by any exact match criteria. Therefore, use date ranges as the only criteria when the enrichment data set is very small.
For geoip and mmdb enrichment tables, this condition needs to be a VRL object with a single key-value pair whose value needs to be a valid IP address. Example: {"ip": .ip }. If a return field is expected and without a value, null is used. This table can return the following fields:
ISP databases:
autonomous_system_number
autonomous_system_organization
isp
organization
City databases:
city_name
continent_code
country_code
country_name
region_code
region_name
metro_code
latitude
longitude
postal_code
timezone
Connection-Type databases:
connection_type
To use this function, you need to update your configuration to include an enrichment_tables parameter.
Exact match
find_enrichment_table_records!("test",
{
"surname": "smith",
},
case_sensitive: false)
[{"firstname":"Bob","id":1,"surname":"Smith"},{"firstname":"Fred","id":2,"surname":"Smith"}]
Date range search
find_enrichment_table_records!("test",
{
"surname": "Smith",
"date_of_birth": {
"from": t'1985-01-01T00:00:00Z',
"to": t'1985-12-31T00:00:00Z'
}
})
[{"firstname":"Bob","id":1,"surname":"Smith"},{"firstname":"Fred","id":2,"surname":"Smith"}]
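The matching rules described above (every field must match, exact matches may ignore case, date ranges are inclusive) can be sketched in Python. The function find_records and the sample rows, including their date_of_birth values, are hypothetical. The sketch scans all rows linearly and does not model the indexing that Vector can use for exact matches.

```python
from datetime import datetime, timezone

def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)

def find_records(rows, condition, case_sensitive=True):
    """Return every row for which all fields in `condition` match."""
    def matches(row, field, expected):
        actual = row.get(field)
        if isinstance(expected, dict):  # date range: {"from": ..., "to": ...}
            return actual is not None and expected["from"] <= actual <= expected["to"]
        if not case_sensitive and isinstance(actual, str) and isinstance(expected, str):
            return actual.lower() == expected.lower()
        return actual == expected

    return [row for row in rows
            if all(matches(row, field, expected) for field, expected in condition.items())]

# Hypothetical table contents, for illustration only.
rows = [
    {"id": 1, "firstname": "Bob", "surname": "Smith", "date_of_birth": utc(1985, 3, 14)},
    {"id": 2, "firstname": "Fred", "surname": "Smith", "date_of_birth": utc(1990, 7, 2)},
    {"id": 3, "firstname": "Ann", "surname": "Jones", "date_of_birth": utc(1985, 6, 1)},
]

by_name = find_records(rows, {"surname": "smith"}, case_sensitive=False)
by_range = find_records(rows, {
    "surname": "Smith",
    "date_of_birth": {"from": utc(1985, 1, 1), "to": utc(1985, 12, 31)},
})
print([row["id"] for row in by_name])   # [1, 2]
print([row["id"] for row in by_range])  # [1]
```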
get_enrichment_table_record
Searches an enrichment table for a row that matches the provided condition. A single row must be matched. If no rows are found or more than one row is found, an error is returned.
For file enrichment tables, this condition needs to be a VRL object in which the key-value pairs indicate a field to search mapped to a value to search in that field. This function returns the rows that match the provided condition(s). All fields need to match for rows to be returned; if any fields do not match, then no rows are returned.
There are currently two forms of search criteria:
- Exact match search. The given field must match the value exactly. Case sensitivity can be specified using the case_sensitive argument. An exact match search can use an index directly into the dataset, which should make this search fairly “cheap” from a performance perspective.
- Date range search. The given field must be greater than or equal to the from date and less than or equal to the to date. A date range search involves sequentially scanning through the rows that have been located using any exact match criteria. This can be an expensive operation if there are many rows returned by any exact match criteria. Therefore, use date ranges as the only criteria when the enrichment data set is very small.
For geoip and mmdb enrichment tables, this condition needs to be a VRL object with a single key-value pair whose value needs to be a valid IP address. Example: {"ip": .ip }. If a return field is expected and without a value, null is used. This table can return the following fields:
ISP databases:
autonomous_system_number
autonomous_system_organization
isp
organization
City databases:
city_name
continent_code
country_code
country_name
region_code
region_name
metro_code
latitude
longitude
postal_code
timezone
Connection-Type databases:
connection_type
To use this function, you need to update your configuration to include an enrichment_tables parameter.
Exact match
get_enrichment_table_record!("test",
{
"surname": "bob",
"firstname": "John"
},
case_sensitive: false)
{
"firstname": "Bob",
"id": 1,
"surname": "Smith"
}
Date range search
get_enrichment_table_record!("test",
{
"surname": "Smith",
"date_of_birth": {
"from": t'1985-01-01T00:00:00Z',
"to": t'1985-12-31T00:00:00Z'
}
})
{
"firstname": "Bob",
"id": 1,
"surname": "Smith"
}
Enumerate examples
compact
Compacts the value by removing empty values, where empty values are defined using the available parameters.
Compact an array
compact(["foo", "bar", "", null, [], "buzz"], string: true, array: true, null: true)
["foo","bar","buzz"]
Compact an object
compact({"field1": 1, "field2": "", "field3": [], "field4": null}, string: true, array: true, null: true)
{
"field1": 1
}
filter
Filter elements from a collection.
This function currently does not support recursive iteration.
The function uses the function closure syntax to allow reading the key-value or index-value combination for each item in the collection.
The same scoping rules apply to closure blocks as they do for regular blocks. This means that any variable defined in parent scopes is accessible, and mutations to those variables are preserved, but any new variables instantiated in the closure block are unavailable outside of the block.
See the examples below to learn about the closure syntax.
Filter elements
filter(array!(.tags)) -> |_index, value| {
# keep any elements that aren't equal to "foo"
value != "foo"
}
["bar","baz"]
flatten
Flattens the value into a single-level representation.
Flatten array
flatten([1, [2, 3, 4], [5, [6, 7], 8], 9])
[1,2,3,4,5,6,7,8,9]
Flatten object
flatten({
"parent1": {
"child1": 1,
"child2": 2
},
"parent2": {
"child3": 3
}
})
{
"parent1.child1": 1,
"parent1.child2": 2,
"parent2.child3": 3
}
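The object case can be sketched in a few lines of Python. The function flatten_object is a hypothetical helper that joins nested keys with a dot, as in the output above. It is not VRL's implementation and makes no claim about edge cases; in this sketch, for example, empty nested objects simply produce no keys.

```python
def flatten_object(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with '.'."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_object(value, path))
        else:
            flat[path] = value
    return flat

nested = {
    "parent1": {"child1": 1, "child2": 2},
    "parent2": {"child3": 3},
}
flat = flatten_object(nested)
print(flat)
```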
for_each
Iterate over a collection.
This function currently does not support recursive iteration.
The function uses the “function closure syntax” to allow reading the key/value or index/value combination for each item in the collection.
The same scoping rules apply to closure blocks as they do for regular blocks. This means that any variable defined in parent scopes is accessible, and mutations to those variables are preserved, but any new variables instantiated in the closure block are unavailable outside of the block.
See the examples below to learn about the closure syntax.
Tally elements
tally = {}
for_each(array!(.tags)) -> |_index, value| {
# Get the current tally for the `value`, or
# set to `0`.
count = int(get!(tally, [value])) ?? 0
# Increment the tally for the value by `1`.
tally = set!(tally, [value], count + 1)
}
tally
{
"bar": 1,
"baz": 1,
"foo": 2
}
includes
Determines whether the value array includes the specified item.
Array includes
includes(["apple", "orange", "banana"], "banana")
true
keys
Get keys from the object
keys({"key1": "val1", "key2": "val2"})
["key1","key2"]
length
Returns the length of the value.
- If value is an array, returns the number of elements.
- If value is an object, returns the number of top-level keys.
- If value is a string, returns the number of bytes in the string. If you want the number of characters, see strlen.
Length (object)
length({
"portland": "Trail Blazers",
"seattle": "Supersonics"
})
2
Length (nested object)
length({
"home": {
"city": "Portland",
"state": "Oregon"
},
"name": "Trail Blazers",
"mascot": {
"name": "Blaze the Trail Cat"
}
})
3
Length (array)
length(["Trail Blazers", "Supersonics", "Grizzlies"])
3
Length (string)
length("The Planet of the Apes Musical")
30
map_keys
Map the keys within an object.
If recursive is enabled, the function iterates into nested objects, using the following rules:
- Iteration starts at the root.
- For every nested object type:
  - First return the key of the object type itself.
  - Then recurse into the object, and loop back to item (1) in this list.
  - Any mutations done on a nested object before recursing into it are preserved.
- For every nested array type:
  - First return the key of the array type itself.
  - Then find all objects within the array, and apply item (2) to each individual object.
The above rules mean that map_keys with recursive enabled finds all keys in the target, regardless of whether nested objects are nested inside arrays.
The function uses the function closure syntax to allow reading the key for each item in the object.
The same scoping rules apply to closure blocks as they do for regular blocks. This means that any variable defined in parent scopes is accessible, and mutations to those variables are preserved, but any new variables instantiated in the closure block are unavailable outside of the block.
See the examples below to learn about the closure syntax.
Upcase keys
map_keys(.) -> |key| { upcase(key) }
{
"BAR": "bar",
"FOO": "foo"
}
De-dot keys
map_keys(., recursive: true) -> |key| { replace(key, ".", "_") }
{
"labels": {
"app_kubernetes_io/name": "mysql"
}
}
map_values
Map the values within a collection.
If recursive is enabled, the function iterates into nested collections, using the following rules:
- Iteration starts at the root.
- For every nested collection type:
  - First return the collection type itself.
  - Then recurse into the collection, and loop back to item (1) in the list.
  - Any mutations done on a collection before recursing into it are preserved.
The function uses the function closure syntax to allow mutating the value for each item in the collection.
The same scoping rules apply to closure blocks as they do for regular blocks. This means that any variable defined in parent scopes is accessible, and mutations to those variables are preserved, but any new variables instantiated in the closure block are unavailable outside of the block.
See the examples below to learn about the closure syntax.
Upcase values
map_values(.) -> |value| { upcase!(value) }
{
"bar": "BAR",
"foo": "FOO"
}
match_array
Determines whether the elements in the value array match the pattern. By default, it checks that at least one element matches, but can be set to determine if all the elements match.
Match at least one element
match_array(["foobar", "bazqux"], r'foo')
true
Match all elements
match_array(["foo", "foobar", "barfoo"], r'foo', all: true)
true
Not all elements match
match_array(["foo", "foobar", "baz"], r'foo', all: true)
false
strlen
Returns the number of UTF-8 characters in value. This differs from length, which counts the number of bytes of a string.
Note: This is the count of Unicode scalar values which can sometimes differ from Unicode code points.
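The difference between counting bytes and counting characters shows up as soon as a string contains non-ASCII text. The Python sketch below is an analogy only: Python's len on a str counts code points, which for ordinary well-formed text gives the same number as counting Unicode scalar values.

```python
text = "na\u00efve caf\u00e9"  # "naïve café", written with escapes to avoid ambiguity

byte_count = len(text.encode("utf-8"))  # analogous to length on a string
char_count = len(text)                  # analogous to strlen

print(byte_count)  # 12
print(char_count)  # 10

# For pure ASCII text the two counts agree.
ascii_text = "The Planet of the Apes Musical"
print(len(ascii_text.encode("utf-8")), len(ascii_text))  # 30 30
```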
unflatten
Unflattens the value into a nested representation.
Unflatten
unflatten({
"foo.bar.baz": true,
"foo.bar.qux": false,
"foo.quux": 42
})
{
"foo": {
"bar": {
"baz": true,
"qux": false
},
"quux": 42
}
}
Unflatten recursively
unflatten({
"flattened.parent": {
"foo.bar": true,
"foo.baz": false
}
})
{
"flattened": {
"parent": {
"foo": {
"bar": true,
"baz": false
}
}
}
}
Unflatten non-recursively
unflatten({
"flattened.parent": {
"foo.bar": true,
"foo.baz": false
}
}, recursive: false)
{
"flattened": {
"parent": {
"foo.bar": true,
"foo.baz": false
}
}
}
Ignore inconsistent keys values
unflatten({
"a": 3,
"a.b": 2,
"a.c": 4
})
{
"a": {
"b": 2,
"c": 4
}
}
unique
Returns the unique values for an array.
The first occurrence of each element is kept.
Unique
unique(["foo", "bar", "foo", "baz"])
["foo","bar","baz"]
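Keeping the first occurrence of each element while preserving order can be sketched in Python as follows. This is an illustration of the behavior described above, not VRL's implementation, and it assumes the elements are hashable.

```python
def unique(items):
    """Return items with duplicates removed, keeping the first occurrence of each."""
    # dict preserves insertion order, so the first occurrence of each item wins.
    return list(dict.fromkeys(items))

print(unique(["foo", "bar", "foo", "baz"]))  # ['foo', 'bar', 'baz']
```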
values
Get values from the object
values({"key1": "val1", "key2": "val2"})
["val1","val2"]
Event examples
get_secret
Get the Datadog API key from the event metadata
get_secret("datadog_api_key")
secret value
remove_secret
Removes the Datadog API key from the event
remove_secret("datadog_api_key")
set_secret
Set the Datadog API key to the given value
set_secret("datadog_api_key", "abc122")
set_semantic_meaning
Sets custom field semantic meaning
set_semantic_meaning(.foo, "bar")
Path examples
del
Removes the field specified by the static path from the target.
For dynamic path deletion, see the remove function.
exists
Checks whether the path exists for the target.
This function distinguishes between a missing path and a path with a null value. A regular path lookup, such as .foo, cannot distinguish between the two cases since it always returns null if the path doesn’t exist.
get
Dynamically get the value of a given path.
If you know the path you want to look up, use static paths such as .foo.bar[1] to get the value of that path. However, if you do not know the path names, use the dynamic get function to get the requested value.
single-segment top-level field
get!(value: { "foo": "bar" }, path: ["foo"])
bar
multi-segment nested field
get!(value: { "foo": { "bar": "baz" } }, path: ["foo", "bar"])
baz
array indexing
get!(value: ["foo", "bar", "baz"], path: [-2])
bar
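A dynamic lookup of this kind amounts to walking the path one segment at a time. The Python sketch below uses a hypothetical helper, get_path. Returning None for a missing path is an assumption of the sketch rather than a statement about VRL's behavior.

```python
def get_path(value, path):
    """Follow `path` (a list of keys and indexes) through nested dicts and lists."""
    current = value
    for segment in path:
        try:
            current = current[segment]
        except (KeyError, IndexError, TypeError):
            return None  # assumption: a missing path yields None in this sketch
    return current

print(get_path({"foo": "bar"}, ["foo"]))                  # bar
print(get_path({"foo": {"bar": "baz"}}, ["foo", "bar"]))  # baz
print(get_path(["foo", "bar", "baz"], [-2]))              # bar
```

As in the array indexing example above, a negative index counts from the end of the array.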
remove
Dynamically remove the value for a given path.
If you know the path you want to remove, use the del function and static paths such as del(.foo.bar[1]) to remove the value at that path. The del function returns the deleted value, and is more performant than remove.
However, if you do not know the path names, use the dynamic remove function to remove the value at the provided path.
single-segment top-level field
remove!(value: { "foo": "bar" }, path: ["foo"])
{}
multi-segment nested field
remove!(value: { "foo": { "bar": "baz" } }, path: ["foo", "bar"])
{
"foo": {}
}
array indexing
remove!(value: ["foo", "bar", "baz"], path: [-2])
["foo","baz"]
compaction
remove!(value: { "foo": { "bar": [42], "baz": true } }, path: ["foo", "bar", 0], compact: true)
{
"foo": {
"baz": true
}
}
set
Dynamically insert data into the path of a given object or array.
If you know the path you want to assign a value to, use static path assignments such as .foo.bar[1] = true for improved performance and readability. However, if you do not know the path names, use the dynamic set function to insert the data into the object or array.
single-segment top-level field
set!(value: { "foo": "bar" }, path: ["foo"], data: "baz")
{
"foo": "baz"
}
multi-segment nested field
set!(value: { "foo": { "bar": "baz" } }, path: ["foo", "bar"], data: "qux")
{
"foo": {
"bar": "qux"
}
}
array
set!(value: ["foo", "bar", "baz"], path: [-2], data: 42)
["foo",42,"baz"]
Cryptography examples
decrypt
Decrypts a string with a symmetric encryption algorithm.
Supported Algorithms:
- AES-256-CFB (key = 32 bytes, iv = 16 bytes)
- AES-192-CFB (key = 24 bytes, iv = 16 bytes)
- AES-128-CFB (key = 16 bytes, iv = 16 bytes)
- AES-256-OFB (key = 32 bytes, iv = 16 bytes)
- AES-192-OFB (key = 24 bytes, iv = 16 bytes)
- AES-128-OFB (key = 16 bytes, iv = 16 bytes)
- AES-128-SIV (key = 32 bytes, iv = 16 bytes)
- AES-256-SIV (key = 64 bytes, iv = 16 bytes)
- Deprecated - AES-256-CTR (key = 32 bytes, iv = 16 bytes)
- Deprecated - AES-192-CTR (key = 24 bytes, iv = 16 bytes)
- Deprecated - AES-128-CTR (key = 16 bytes, iv = 16 bytes)
- AES-256-CTR-LE (key = 32 bytes, iv = 16 bytes)
- AES-192-CTR-LE (key = 24 bytes, iv = 16 bytes)
- AES-128-CTR-LE (key = 16 bytes, iv = 16 bytes)
- AES-256-CTR-BE (key = 32 bytes, iv = 16 bytes)
- AES-192-CTR-BE (key = 24 bytes, iv = 16 bytes)
- AES-128-CTR-BE (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-PKCS7 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-PKCS7 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-PKCS7 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ANSIX923 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ANSIX923 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ANSIX923 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ISO7816 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ISO7816 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ISO7816 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ISO10126 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ISO10126 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ISO10126 (key = 16 bytes, iv = 16 bytes)
- CHACHA20-POLY1305 (key = 32 bytes, iv = 12 bytes)
- XCHACHA20-POLY1305 (key = 32 bytes, iv = 24 bytes)
- XSALSA20-POLY1305 (key = 32 bytes, iv = 24 bytes)
Decrypt value
ciphertext = decode_base64!("5fLGcu1VHdzsPcGNDio7asLqE1P43QrVfPfmP4i4zOU=")
iv = decode_base64!("fVEIRkIiczCRWNxaarsyxA==")
key = "16_byte_keyxxxxx"
decrypt!(ciphertext, "AES-128-CBC-PKCS7", key, iv: iv)
super_secret_message
encrypt
Encrypts a string with a symmetric encryption algorithm.
Supported Algorithms:
- AES-256-CFB (key = 32 bytes, iv = 16 bytes)
- AES-192-CFB (key = 24 bytes, iv = 16 bytes)
- AES-128-CFB (key = 16 bytes, iv = 16 bytes)
- AES-256-OFB (key = 32 bytes, iv = 16 bytes)
- AES-192-OFB (key = 24 bytes, iv = 16 bytes)
- AES-128-OFB (key = 16 bytes, iv = 16 bytes)
- AES-128-SIV (key = 32 bytes, iv = 16 bytes)
- AES-256-SIV (key = 64 bytes, iv = 16 bytes)
- Deprecated - AES-256-CTR (key = 32 bytes, iv = 16 bytes)
- Deprecated - AES-192-CTR (key = 24 bytes, iv = 16 bytes)
- Deprecated - AES-128-CTR (key = 16 bytes, iv = 16 bytes)
- AES-256-CTR-LE (key = 32 bytes, iv = 16 bytes)
- AES-192-CTR-LE (key = 24 bytes, iv = 16 bytes)
- AES-128-CTR-LE (key = 16 bytes, iv = 16 bytes)
- AES-256-CTR-BE (key = 32 bytes, iv = 16 bytes)
- AES-192-CTR-BE (key = 24 bytes, iv = 16 bytes)
- AES-128-CTR-BE (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-PKCS7 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-PKCS7 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-PKCS7 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ANSIX923 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ANSIX923 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ANSIX923 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ISO7816 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ISO7816 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ISO7816 (key = 16 bytes, iv = 16 bytes)
- AES-256-CBC-ISO10126 (key = 32 bytes, iv = 16 bytes)
- AES-192-CBC-ISO10126 (key = 24 bytes, iv = 16 bytes)
- AES-128-CBC-ISO10126 (key = 16 bytes, iv = 16 bytes)
- CHACHA20-POLY1305 (key = 32 bytes, iv = 12 bytes)
- XCHACHA20-POLY1305 (key = 32 bytes, iv = 24 bytes)
- XSALSA20-POLY1305 (key = 32 bytes, iv = 24 bytes)
Encrypt value
plaintext = "super secret message"
iv = "1234567890123456" # typically you would call random_bytes(16)
key = "16_byte_keyxxxxx"
encrypted_message = encrypt!(plaintext, "AES-128-CBC-PKCS7", key, iv: iv)
encode_base64(encrypted_message)
GBw8Mu00v0Kc38+/PvsVtGgWuUJ+ZNLgF8Opy8ohIYE=
hmac
Calculates an HMAC of the value using the given key.
The hashing algorithm used can be optionally specified.
For most use cases, the resulting bytestream should be encoded into a hex or base64 string using either encode_base16 or encode_base64.
This function is infallible if either the default algorithm value or a recognized-valid compile-time algorithm string literal is used. Otherwise, it is fallible.
Calculate message HMAC (defaults: SHA-256), encoding to a base64 string
encode_base64(hmac("Hello there", "super-secret-key"))
eLGE8YMviv85NPXgISRUZxstBNSU47JQdcXkUWcClmI=
Calculate message HMAC using SHA-224, encoding to a hex-encoded string
encode_base16(hmac("Hello there", "super-secret-key", algorithm: "SHA-224"))
42fccbc2b7d22a143b92f265a8046187558a94d11ddbb30622207e90
Calculate message HMAC using a variable hash algorithm
.hash_algo = "SHA-256"
hmac_bytes, err = hmac("Hello there", "super-secret-key", algorithm: .hash_algo)
if err == null {
.hmac = encode_base16(hmac_bytes)
}
78b184f1832f8aff3934f5e0212454671b2d04d494e3b25075c5e45167029662
md5
Calculates an MD5 hash of the value.
Create md5 hash
md5("foo")
acbd18db4cc2f85cedef654fccc4a4d8
seahash
Calculates a SeaHash hash of the value.
Note: Due to limitations in the underlying VRL data types, this function converts the unsigned 64-bit integer SeaHash result to a signed 64-bit integer. Results higher than the signed 64-bit integer maximum value wrap around to negative values.
sha1
Calculates a SHA-1 hash of the value.
Calculate sha1 hash
sha1("foo")
0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33
IP examples
ip_aton
Converts IPv4 address in numbers-and-dots notation into network-order bytes represented as an integer.
This behavior mimics inet_aton.
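A minimal illustration, with an arbitrarily chosen address: 1.2.3.4 maps to 1·2^24 + 2·2^16 + 3·2^8 + 4 = 16909060. The call uses the `!` form because an invalid address makes the function fail.

```vrl
# Convert a dotted IPv4 address to its integer representation.
ip_aton!("1.2.3.4")
# 16909060
```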
ip_cidr_contains
Determines whether the ip is contained in the block referenced by the cidr.
IPv4 contains CIDR
ip_cidr_contains!("192.168.0.0/16", "192.168.10.32")
true
IPv6 contains CIDR
ip_cidr_contains!("2001:4f8:4:ba::/64", "2001:4f8:4:ba:2e0:81ff:fe22:d1f1")
true
ip_ntoa
Converts numeric representation of IPv4 address in network-order bytes to numbers-and-dots notation.
This behavior mimics inet_ntoa.
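A minimal illustration, using the integer that corresponds to the arbitrarily chosen address 1.2.3.4 (1·2^24 + 2·2^16 + 3·2^8 + 4):

```vrl
# Convert an integer back to dotted IPv4 notation.
ip_ntoa!(16909060)
# 1.2.3.4
```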
ip_ntop
Converts IPv4 and IPv6 addresses from binary to text form.
This behavior mimics inet_ntop.
Convert IPv4 address from bytes after decoding from Base64
ip_ntop!(decode_base64!("wKgAAQ=="))
192.168.0.1
Convert IPv6 address from bytes after decoding from Base64
ip_ntop!(decode_base64!("IAENuIWjAAAAAIouA3BzNA=="))
2001:db8:85a3::8a2e:370:7334
ip_pton
Converts IPv4 and IPv6 addresses from text to binary form.
- The binary form of IPv4 addresses is 4 bytes (32 bits) long.
- The binary form of IPv6 addresses is 16 bytes (128 bits) long.
This behavior mimics inet_pton.
Convert IPv4 address to bytes and encode to Base64
encode_base64(ip_pton!("192.168.0.1"))
wKgAAQ==
Convert IPv6 address to bytes and encode to Base64
encode_base64(ip_pton!("2001:db8:85a3::8a2e:370:7334"))
IAENuIWjAAAAAIouA3BzNA==
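Because ip_pton and ip_ntop are inverses of each other, a round trip should return the original text form. The sketch below reuses the IPv4 address from the example above.

```vrl
# Text -> binary -> text round trip.
ip_ntop!(ip_pton!("192.168.0.1"))
# 192.168.0.1
```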
ip_to_ipv6
Converts the ip to an IPv6 address.
IPv4 to IPv6
ip_to_ipv6!("192.168.10.32")
::ffff:192.168.10.32
ipv6_to_ipv4
Converts the ip to an IPv4 address. ip is returned unchanged if it’s already an IPv4 address. If ip is currently an IPv6 address then it needs to be IPv4 compatible, otherwise an error is thrown.
IPv6 to IPv4
ipv6_to_ipv4!("::ffff:192.168.0.1")
192.168.0.1
is_ipv4
Check if the string is a valid IPv4 address or not.
An IPv4-mapped or IPv4-compatible IPv6 address (see RFC 6890, https://datatracker.ietf.org/doc/html/rfc6890) is not considered valid for the purpose of this function.
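A few illustrative calls, with sample addresses chosen only for demonstration. The last one reflects the rule stated above about IPv4-mapped IPv6 addresses.

```vrl
is_ipv4("10.0.102.37")
# true

is_ipv4("2001:0db8:85a3:0000:0000:8a2e:0370:7334")
# false

is_ipv4("foobar")
# false

# An IPv4-mapped IPv6 address does not count as IPv4.
is_ipv4("::ffff:192.168.0.1")
# false
```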
Number examples
format_int
Formats the integer value into a string representation using the given base/radix.
Format as a hexadecimal integer
format_int!(42, 16)
2a
Format as a negative hexadecimal integer
format_int!(-42, 16)
-2a
format_number
Formats the value into a string representation of the number.
Format a number (3 decimals)
format_number(1234567.89, 3, decimal_separator: ".", grouping_separator: ",")
1,234,567.890
Object examples
match_datadog_query
OR query
match_datadog_query({"message": "contains this and that"}, "this OR that")
true
AND query
match_datadog_query({"message": "contains only this"}, "this AND that")
false
Attribute wildcard
match_datadog_query({"name": "foobar"}, "@name:foo*")
true
Tag range
match_datadog_query({"tags": ["a:x", "b:y", "c:z"]}, s'b:["x" TO "z"]')
true
merge
Merges the from object into the to object.
Object merge (shallow)
merge(
{
"parent1": {
"child1": 1,
"child2": 2
},
"parent2": {
"child3": 3
}
},
{
"parent1": {
"child2": 4,
"child5": 5
}
}
)
{
"parent1": {
"child2": 4,
"child5": 5
},
"parent2": {
"child3": 3
}
}
Object merge (deep)
merge(
{
"parent1": {
"child1": 1,
"child2": 2
},
"parent2": {
"child3": 3
}
},
{
"parent1": {
"child2": 4,
"child5": 5
}
},
deep: true
)
{
"parent1": {
"child1": 1,
"child2": 4,
"child5": 5
},
"parent2": {
"child3": 3
}
}
object_from_array
Iterate over either one array of arrays or a pair of arrays and create an object out of all the key-value pairs contained in them.
With one array of arrays, any entries with no value use null instead.
Any keys that are null skip the corresponding value.
If a single parameter is given, it must contain an array of all the input arrays.
Create an object from one array
object_from_array([["one", 1], [null, 2], ["two", 3]])
{
"one": 1,
"two": 3
}
Create an object from separate key and value arrays
object_from_array([1, 2, 3], keys: ["one", null, "two"])
{
"one": 1,
"two": 3
}
unnest
Unnest an array field from an object to create an array of objects using that field, keeping all other fields.
Assigning the array result of this to . results in multiple events being emitted from remap. See the remap transform docs for more details.
This is also referred to as explode in some languages.
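A small sketch of the behavior described above. The input event, shown in the comments, is assumed for illustration.

```vrl
# Input event (hypothetical):
# { "hostname": "localhost", "messages": ["message 1", "message 2"] }

# Replace the event with one object per element of .messages.
. = unnest!(.messages)

# Result:
# [
#   { "hostname": "localhost", "messages": "message 1" },
#   { "hostname": "localhost", "messages": "message 2" }
# ]
```

Each resulting object keeps the other fields (here hostname) and carries a single element of the original array.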
Parse examples
parse_apache_log
Parses Apache access and error log lines. Lines can be in common, combined, or the default error format.
Parse using Apache log format (common)
parse_apache_log!("127.0.0.1 bob frank [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326", format: "common")
{
"host": "127.0.0.1",
"identity": "bob",
"message": "GET /apache_pb.gif HTTP/1.0",
"method": "GET",
"path": "/apache_pb.gif",
"protocol": "HTTP/1.0",
"size": 2326,
"status": 200,
"timestamp": "2000-10-10T20:55:36Z",
"user": "frank"
}
Parse using Apache log format (combined)
parse_apache_log!(
s'127.0.0.1 bob frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.seniorinfomediaries.com/vertical/channels/front-end/bandwidth" "Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/1945-10-12 Firefox/37.0"',
"combined",
)
{
"agent": "Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/1945-10-12 Firefox/37.0",
"host": "127.0.0.1",
"identity": "bob",
"message": "GET /apache_pb.gif HTTP/1.0",
"method": "GET",
"path": "/apache_pb.gif",
"protocol": "HTTP/1.0",
"referrer": "http://www.seniorinfomediaries.com/vertical/channels/front-end/bandwidth",
"size": 2326,
"status": 200,
"timestamp": "2000-10-10T20:55:36Z",
"user": "frank"
}
Parse using Apache log format (error)
parse_apache_log!(
s'[01/Mar/2021:12:00:19 +0000] [ab:alert] [pid 4803:tid 3814] [client 147.159.108.175:24259] I will bypass the haptic COM bandwidth, that should matrix the CSS driver!',
"error"
)
{
"client": "147.159.108.175",
"message": "I will bypass the haptic COM bandwidth, that should matrix the CSS driver!",
"module": "ab",
"pid": 4803,
"port": 24259,
"severity": "alert",
"thread": "3814",
"timestamp": "2021-03-01T12:00:19Z"
}
parse_aws_alb_log
Parses the value in the Elastic Load Balancer Access format.
Parse AWS ALB log
parse_aws_alb_log!(
"http 2018-11-30T22:23:00.186641Z app/my-loadbalancer/50dc6c495c0c9188 192.168.131.39:2817 - 0.000 0.001 0.000 200 200 34 366 \"GET http://www.example.com:80/ HTTP/1.1\" \"curl/7.46.0\" - - arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 \"Root=1-58337364-23a8c76965a2ef7629b185e3\" \"-\" \"-\" 0 2018-11-30T22:22:48.364000Z \"forward\" \"-\" \"-\" \"-\" \"-\" \"-\" \"-\""
)
{
"actions_executed": "forward",
"chosen_cert_arn": null,
"classification": null,
"classification_reason": null,
"client_host": "192.168.131.39:2817",
"domain_name": null,
"elb": "app/my-loadbalancer/50dc6c495c0c9188",
"elb_status_code": "200",
"error_reason": null,
"matched_rule_priority": "0",
"received_bytes": 34,
"redirect_url": null,
"request_creation_time": "2018-11-30T22:22:48.364000Z",
"request_method": "GET",
"request_processing_time": 0,
"request_protocol": "HTTP/1.1",
"request_url": "http://www.example.com:80/",
"response_processing_time": 0,
"sent_bytes": 366,
"ssl_cipher": null,
"ssl_protocol": null,
"target_group_arn": "arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067",
"target_host": null,
"target_port_list": [],
"target_processing_time": 0.001,
"target_status_code": "200",
"target_status_code_list": [],
"timestamp": "2018-11-30T22:23:00.186641Z",
"trace_id": "Root=1-58337364-23a8c76965a2ef7629b185e3",
"traceability_id": null,
"type": "http",
"user_agent": "curl/7.46.0"
}
parse_aws_cloudwatch_log_subscription_message
Parses AWS CloudWatch Logs subscription messages, such as those received from the aws_kinesis_firehose source.
Parse AWS Cloudwatch Log subscription message
parse_aws_cloudwatch_log_subscription_message!(.message)
{
"log_events": [
{
"id": "35683658089614582423604394983260738922885519999578275840",
"message": "{\"bytes\":26780,\"datetime\":\"14/Sep/2020:11:45:41 -0400\",\"host\":\"157.130.216.193\",\"method\":\"PUT\",\"protocol\":\"HTTP/1.0\",\"referer\":\"https://www.principalcross-platform.io/markets/ubiquitous\",\"request\":\"/expedite/convergence\",\"source_type\":\"stdin\",\"status\":301,\"user-identifier\":\"-\"}",
"timestamp": "2020-09-14T19:09:29.039Z"
}
],
"log_group": "test",
"log_stream": "test",
"message_type": "DATA_MESSAGE",
"owner": "111111111111",
"subscription_filters": [
"Destination"
]
}
parse_aws_vpc_flow_log
Parses the value in the VPC Flow Logs format.
Parse AWS VPC Flow log (default format)
parse_aws_vpc_flow_log!("2 123456789010 eni-1235b8ca123456789 - - - - - - - 1431280876 1431280934 - NODATA")
{
"account_id": "123456789010",
"action": null,
"bytes": null,
"dstaddr": null,
"dstport": null,
"end": 1431280934,
"interface_id": "eni-1235b8ca123456789",
"log_status": "NODATA",
"packets": null,
"protocol": null,
"srcaddr": null,
"srcport": null,
"start": 1431280876,
"version": 2
}
Parse AWS VPC Flow log (custom format)
parse_aws_vpc_flow_log!(
"- eni-1235b8ca123456789 10.0.1.5 10.0.0.220 10.0.1.5 203.0.113.5",
"instance_id interface_id srcaddr dstaddr pkt_srcaddr pkt_dstaddr"
)
{
"dstaddr": "10.0.0.220",
"instance_id": null,
"interface_id": "eni-1235b8ca123456789",
"pkt_dstaddr": "203.0.113.5",
"pkt_srcaddr": "10.0.1.5",
"srcaddr": "10.0.1.5"
}
Parse AWS VPC Flow log including v5 fields
parse_aws_vpc_flow_log!("5 52.95.128.179 10.0.0.71 80 34210 6 1616729292 1616729349 IPv4 14 15044 123456789012 vpc-abcdefab012345678 subnet-aaaaaaaa012345678 i-0c50d5961bcb2d47b eni-1235b8ca123456789 ap-southeast-2 apse2-az3 - - ACCEPT 19 52.95.128.179 10.0.0.71 S3 - - ingress OK",
format: "version srcaddr dstaddr srcport dstport protocol start end type packets bytes account_id vpc_id subnet_id instance_id interface_id region az_id sublocation_type sublocation_id action tcp_flags pkt_srcaddr pkt_dstaddr pkt_src_aws_service pkt_dst_aws_service traffic_path flow_direction log_status")
{
"account_id": "123456789012",
"action": "ACCEPT",
"az_id": "apse2-az3",
"bytes": 15044,
"dstaddr": "10.0.0.71",
"dstport": 34210,
"end": 1616729349,
"flow_direction": "ingress",
"instance_id": "i-0c50d5961bcb2d47b",
"interface_id": "eni-1235b8ca123456789",
"log_status": "OK",
"packets": 14,
"pkt_dst_aws_service": null,
"pkt_dstaddr": "10.0.0.71",
"pkt_src_aws_service": "S3",
"pkt_srcaddr": "52.95.128.179",
"protocol": 6,
"region": "ap-southeast-2",
"srcaddr": "52.95.128.179",
"srcport": 80,
"start": 1616729292,
"sublocation_id": null,
"sublocation_type": null,
"subnet_id": "subnet-aaaaaaaa012345678",
"tcp_flags": 19,
"traffic_path": null,
"type": "IPv4",
"version": 5,
"vpc_id": "vpc-abcdefab012345678"
}
parse_bytes
Parses the value into a human-readable bytes format specified by unit and base.
Parse bytes (kilobytes)
parse_bytes!("1024KiB", unit: "MiB")
1
Parse bytes in SI unit (terabytes)
parse_bytes!("4TB", unit: "MB", base: "10")
4000000
Parse bytes in ambiguous unit (gigabytes)
parse_bytes!("1GB", unit: "B", base: "2")
1073741824
parse_cef
Parses the value in CEF (Common Event Format) format. Ignores everything up to the CEF header. Empty values are returned as empty strings. Surrounding quotes are removed from values.
Parse output generated by PTA
parse_cef!(
"CEF:0|CyberArk|PTA|12.6|1|Suspected credentials theft|8|suser=mike2@prod1.domain.com shost=prod1.domain.com src=1.1.1.1 duser=andy@dev1.domain.com dhost=dev1.domain.com dst=2.2.2.2 cs1Label=ExtraData cs1=None cs2Label=EventID cs2=52b06812ec3500ed864c461e deviceCustomDate1Label=detectionDate deviceCustomDate1=1388577900000 cs3Label=PTAlink cs3=https://1.1.1.1/incidents/52b06812ec3500ed864c461e cs4Label=ExternalLink cs4=None"
)
{
"cefVersion": "0",
"cs1": "None",
"cs1Label": "ExtraData",
"cs2": "52b06812ec3500ed864c461e",
"cs2Label": "EventID",
"cs3": "https://1.1.1.1/incidents/52b06812ec3500ed864c461e",
"cs3Label": "PTAlink",
"cs4": "None",
"cs4Label": "ExternalLink",
"deviceCustomDate1": "1388577900000",
"deviceCustomDate1Label": "detectionDate",
"deviceEventClassId": "1",
"deviceProduct": "PTA",
"deviceVendor": "CyberArk",
"deviceVersion": "12.6",
"dhost": "dev1.domain.com",
"dst": "2.2.2.2",
"duser": "andy@dev1.domain.com",
"name": "Suspected credentials theft",
"severity": "8",
"shost": "prod1.domain.com",
"src": "1.1.1.1",
"suser": "mike2@prod1.domain.com"
}
Ignore syslog header
parse_cef!(
"Sep 29 08:26:10 host CEF:1|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232"
)
{
"cefVersion": "1",
"deviceEventClassId": "100",
"deviceProduct": "threatmanager",
"deviceVendor": "Security",
"deviceVersion": "1.0",
"dst": "2.1.2.2",
"name": "worm successfully stopped",
"severity": "10",
"spt": "1232",
"src": "10.0.0.1"
}
Translate custom fields
parse_cef!(
"CEF:0|Dev|firewall|2.2|1|Connection denied|5|c6a1=2345:0425:2CA1:0000:0000:0567:5673:23b5 c6a1Label=Device IPv6 Address",
translate_custom_fields: true
)
{
"Device IPv6 Address": "2345:0425:2CA1:0000:0000:0567:5673:23b5",
"cefVersion": "0",
"deviceEventClassId": "1",
"deviceProduct": "firewall",
"deviceVendor": "Dev",
"deviceVersion": "2.2",
"name": "Connection denied",
"severity": "5"
}
parse_common_log
Parses the value using the Common Log Format (CLF).
Parse using Common Log Format (with default timestamp format)
parse_common_log!("127.0.0.1 bob frank [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326")
{
"host": "127.0.0.1",
"identity": "bob",
"message": "GET /apache_pb.gif HTTP/1.0",
"method": "GET",
"path": "/apache_pb.gif",
"protocol": "HTTP/1.0",
"size": 2326,
"status": 200,
"timestamp": "2000-10-10T20:55:36Z",
"user": "frank"
}
Parse using Common Log Format (with custom timestamp format)
parse_common_log!(
"127.0.0.1 bob frank [2000-10-10T20:55:36Z] \"GET /apache_pb.gif HTTP/1.0\" 200 2326",
"%+"
)
{
"host": "127.0.0.1",
"identity": "bob",
"message": "GET /apache_pb.gif HTTP/1.0",
"method": "GET",
"path": "/apache_pb.gif",
"protocol": "HTTP/1.0",
"size": 2326,
"status": 200,
"timestamp": "2000-10-10T20:55:36Z",
"user": "frank"
}
parse_csv
Parse a single CSV formatted row
parse_csv!("foo,bar,\"foo \"\", bar\"")
["foo","bar","foo \", bar"]
Parse a single CSV formatted row with custom delimiter
parse_csv!("foo bar", delimiter: " ")
["foo","bar"]
parse_dnstap
Parses the value as base64 encoded DNSTAP data.
Parse dnstap query message
parse_dnstap!("ChVqYW1lcy1WaXJ0dWFsLU1hY2hpbmUSC0JJTkQgOS4xNi4zGgBy5wEIAxACGAEiEAAAAAAAAAAAAAAAAAAAAAAqECABBQJwlAAAAAAAAAAAADAw8+0CODVA7+zq9wVNMU3WNlI2kwIAAAABAAAAAAABCWZhY2Vib29rMQNjb20AAAEAAQAAKQIAAACAAAAMAAoACOxjCAG9zVgzWgUDY29tAGAAbQAAAAByZLM4AAAAAQAAAAAAAQJoNQdleGFtcGxlA2NvbQAABgABAAApBNABAUAAADkADwA1AAlubyBTRVAgbWF0Y2hpbmcgdGhlIERTIGZvdW5kIGZvciBkbnNzZWMtZmFpbGVkLm9yZy54AQ==")
{
"dataType": "Message",
"dataTypeId": 1,
"extraInfo": "",
"messageType": "ResolverQuery",
"messageTypeId": 3,
"queryZone": "com.",
"requestData": {
"fullRcode": 0,
"header": {
"aa": false,
"ad": false,
"anCount": 0,
"arCount": 1,
"cd": false,
"id": 37634,
"nsCount": 0,
"opcode": 0,
"qdCount": 1,
"qr": 0,
"ra": false,
"rcode": 0,
"rd": false,
"tc": false
},
"opt": {
"do": true,
"ednsVersion": 0,
"extendedRcode": 0,
"options": [
{
"optCode": 10,
"optName": "Cookie",
"optValue": "7GMIAb3NWDM="
}
],
"udpPayloadSize": 512
},
"question": [
{
"class": "IN",
"domainName": "facebook1.com.",
"questionType": "A",
"questionTypeId": 1
}
],
"rcodeName": "NoError"
},
"responseAddress": "2001:502:7094::30",
"responseData": {
"fullRcode": 16,
"header": {
"aa": false,
"ad": false,
"anCount": 0,
"arCount": 1,
"cd": false,
"id": 45880,
"nsCount": 0,
"opcode": 0,
"qdCount": 1,
"qr": 0,
"ra": false,
"rcode": 16,
"rd": false,
"tc": false
},
"opt": {
"do": false,
"ede": [
{
"extraText": "no SEP matching the DS found for dnssec-failed.org.",
"infoCode": 9,
"purpose": "DNSKEY Missing"
}
],
"ednsVersion": 1,
"extendedRcode": 1,
"udpPayloadSize": 1232
},
"question": [
{
"class": "IN",
"domainName": "h5.example.com.",
"questionType": "SOA",
"questionTypeId": 6
}
],
"rcodeName": "BADSIG"
},
"responsePort": 53,
"serverId": "james-Virtual-Machine",
"serverVersion": "BIND 9.16.3",
"socketFamily": "INET6",
"socketProtocol": "UDP",
"sourceAddress": "::",
"sourcePort": 46835,
"time": 1593489007920014000,
"timePrecision": "ns",
"timestamp": "2020-06-30T03:50:07.920014129Z"
}
parse_duration
Parses the value into a human-readable duration format specified by unit.
Parse duration (milliseconds)
parse_duration!("1005ms", unit: "s")
1.005
Parse multiple durations (seconds & milliseconds)
parse_duration!("1s 1ms", unit: "ms")
1001
parse_etld
Parses the eTLD from the value representing a domain name.
Parse eTLD
parse_etld!("sub.sussex.ac.uk")
{
"etld": "ac.uk",
"etld_plus": "ac.uk",
"known_suffix": true
}
Parse eTLD+1
parse_etld!("sub.sussex.ac.uk", plus_parts: 1)
{
"etld": "ac.uk",
"etld_plus": "sussex.ac.uk",
"known_suffix": true
}
Parse eTLD with unknown suffix
parse_etld!("vector.acmecorp")
{
"etld": "acmecorp",
"etld_plus": "acmecorp",
"known_suffix": false
}
Parse eTLD with custom PSL
parse_etld!("vector.acmecorp", psl: "resources/public_suffix_list.dat")
{
"etld": "acmecorp",
"etld_plus": "acmecorp",
"known_suffix": false
}
parse_glog
Parses the value using the glog (Google Logging Library) format.
Parse using glog
parse_glog!("I20210131 14:48:54.411655 15520 main.c++:9] Hello world!")
{
"file": "main.c++",
"id": 15520,
"level": "info",
"line": 9,
"message": "Hello world!",
"timestamp": "2021-01-31T14:48:54.411655Z"
}
parse_grok
Parse using Grok
parse_grok!(
"2020-10-02T23:22:12.223222Z info Hello world",
"%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}"
)
{
"level": "info",
"message": "Hello world",
"timestamp": "2020-10-02T23:22:12.223222Z"
}
parse_groks
Parse using multiple Grok patterns
parse_groks!(
"2020-10-02T23:22:12.223222Z info Hello world",
patterns: [
"%{common_prefix} %{_status} %{_message}",
"%{common_prefix} %{_message}",
],
aliases: {
"common_prefix": "%{_timestamp} %{_loglevel}",
"_timestamp": "%{TIMESTAMP_ISO8601:timestamp}",
"_loglevel": "%{LOGLEVEL:level}",
"_status": "%{POSINT:status}",
"_message": "%{GREEDYDATA:message}"
}
)
{
"level": "info",
"message": "Hello world",
"timestamp": "2020-10-02T23:22:12.223222Z"
}
parse_influxdb
Parses the value as an InfluxDB line protocol string, producing a list of Vector-compatible metrics.
Parse InfluxDB line protocol
parse_influxdb!("cpu,host=A,region=us-west usage_system=64i,usage_user=10u,temperature=50.5,on=true,sleep=false 1590488773254420000")
[{"gauge":{"value":64},"kind":"absolute","name":"cpu_usage_system","tags":{"host":"A","region":"us-west"},"timestamp":"2020-05-26T10:26:13.254420Z"},{"gauge":{"value":10},"kind":"absolute","name":"cpu_usage_user","tags":{"host":"A","region":"us-west"},"timestamp":"2020-05-26T10:26:13.254420Z"},{"gauge":{"value":50.5},"kind":"absolute","name":"cpu_temperature","tags":{"host":"A","region":"us-west"},"timestamp":"2020-05-26T10:26:13.254420Z"},{"gauge":{"value":1},"kind":"absolute","name":"cpu_on","tags":{"host":"A","region":"us-west"},"timestamp":"2020-05-26T10:26:13.254420Z"},{"gauge":{"value":0},"kind":"absolute","name":"cpu_sleep","tags":{"host":"A","region":"us-west"},"timestamp":"2020-05-26T10:26:13.254420Z"}]
parse_json
Parses the value as JSON.
Parse JSON
parse_json!("{\"key\": \"val\"}")
{
"key": "val"
}
Parse JSON with max_depth
parse_json!("{\"top_level\":{\"key\": \"val\"}}", max_depth: 1)
{
"top_level": "{\"key\": \"val\"}"
}
parse_key_value
Parses the value in key-value format. Also known as logfmt.
- Keys and values can be wrapped with ".
- " characters can be escaped using \.
Parse logfmt log
parse_key_value!(
"@timestamp=\"Sun Jan 10 16:47:39 EST 2021\" level=info msg=\"Stopping all fetchers\" tag#production=stopping_fetchers id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager"
)
{
"@timestamp": "Sun Jan 10 16:47:39 EST 2021",
"id": "ConsumerFetcherManager-1382721708341",
"level": "info",
"module": "kafka.consumer.ConsumerFetcherManager",
"msg": "Stopping all fetchers",
"tag#production": "stopping_fetchers"
}
Parse comma delimited log
parse_key_value!(
"path:\"/cart_link\", host:store.app.com, fwd: \"102.30.171.16\", dyno: web.1, connect:0ms, service:87ms, status:304, bytes:632, protocol:https",
field_delimiter: ",",
key_value_delimiter: ":"
)
{
"bytes": "632",
"connect": "0ms",
"dyno": "web.1",
"fwd": "102.30.171.16",
"host": "store.app.com",
"path": "/cart_link",
"protocol": "https",
"service": "87ms",
"status": "304"
}
Parse comma delimited log with standalone keys
parse_key_value!(
"env:prod,service:backend,region:eu-east1,beta",
field_delimiter: ",",
key_value_delimiter: ":",
)
{
"beta": true,
"env": "prod",
"region": "eu-east1",
"service": "backend"
}
Parse duplicate keys
parse_key_value!(
"at=info,method=GET,path=\"/index\",status=200,tags=dev,tags=dummy",
field_delimiter: ",",
key_value_delimiter: "=",
)
{
"at": "info",
"method": "GET",
"path": "/index",
"status": "200",
"tags": [
"dev",
"dummy"
]
}
parse_klog
Parses the value using the klog format used by Kubernetes components.
Parse using klog
parse_klog!("I0505 17:59:40.692994 28133 klog.go:70] hello from klog")
{
"file": "klog.go",
"id": 28133,
"level": "info",
"line": 70,
"message": "hello from klog",
"timestamp": "2025-05-05T17:59:40.692994Z"
}
parse_linux_authorization
Parses Linux authorization logs, usually found under either /var/log/auth.log (for Debian-based systems) or /var/log/secure (for RedHat-based systems), according to Syslog format.
Parse Linux authorization event
parse_linux_authorization!(
s'Mar 23 01:49:58 localhost sshd[1111]: Accepted publickey for eng from 10.1.1.1 port 8888 ssh2: RSA SHA256:foobar'
)
{
"appname": "sshd",
"hostname": "localhost",
"message": "Accepted publickey for eng from 10.1.1.1 port 8888 ssh2: RSA SHA256:foobar",
"procid": 1111,
"timestamp": "2025-03-23T01:49:58Z"
}
parse_logfmt
Parses the value in logfmt.
- Keys and values can be wrapped using the " character.
- " characters can be escaped by the \ character.
- As per this logfmt specification, the parse_logfmt function accepts standalone keys and assigns them a Boolean value of true.
Parse logfmt log
parse_logfmt!(
"@timestamp=\"Sun Jan 10 16:47:39 EST 2021\" level=info msg=\"Stopping all fetchers\" tag#production=stopping_fetchers id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager"
)
{
"@timestamp": "Sun Jan 10 16:47:39 EST 2021",
"id": "ConsumerFetcherManager-1382721708341",
"level": "info",
"module": "kafka.consumer.ConsumerFetcherManager",
"msg": "Stopping all fetchers",
"tag#production": "stopping_fetchers"
}
parse_nginx_log
Parses Nginx access and error log lines. Lines can be in [`combined`](https://nginx.org/en/docs/http/ngx_http_log_module.html),
[`ingress_upstreaminfo`](https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/log-format/), [`main`](https://hg.nginx.org/pkg-oss/file/tip/debian/debian/nginx.conf) or [`error`](https://github.com/nginx/nginx/blob/branches/stable-1.18/src/core/ngx_log.c#L102) format.
Parse via Nginx log format (combined)
parse_nginx_log!(
s'172.17.0.1 - alice [01/Apr/2021:12:02:31 +0000] "POST /not-found HTTP/1.1" 404 153 "http://localhost/somewhere" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"',
"combined",
)
{
"agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36",
"client": "172.17.0.1",
"compression": "2.75",
"referer": "http://localhost/somewhere",
"request": "POST /not-found HTTP/1.1",
"size": 153,
"status": 404,
"timestamp": "2021-04-01T12:02:31Z",
"user": "alice"
}
Parse via Nginx log format (error)
parse_nginx_log!(
s'2021/04/01 13:02:31 [error] 31#31: *1 open() "/usr/share/nginx/html/not-found" failed (2: No such file or directory), client: 172.17.0.1, server: localhost, request: "POST /not-found HTTP/1.1", host: "localhost:8081"',
"error"
)
{
"cid": 1,
"client": "172.17.0.1",
"host": "localhost:8081",
"message": "open() \"/usr/share/nginx/html/not-found\" failed (2: No such file or directory)",
"pid": 31,
"request": "POST /not-found HTTP/1.1",
"server": "localhost",
"severity": "error",
"tid": 31,
"timestamp": "2021-04-01T13:02:31Z"
}
Parse via Nginx log format (ingress_upstreaminfo)
parse_nginx_log!(
s'0.0.0.0 - bob [18/Mar/2023:15:00:00 +0000] "GET /some/path HTTP/2.0" 200 12312 "https://10.0.0.1/some/referer" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" 462 0.050 [some-upstream-service-9000] [some-other-upstream-5000] 10.0.50.80:9000 19437 0.049 200 752178adb17130b291aefd8c386279e7',
"ingress_upstreaminfo"
)
{
"body_bytes_size": 12312,
"http_referer": "https://10.0.0.1/some/referer",
"http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36",
"proxy_alternative_upstream_name": "some-other-upstream-5000",
"proxy_upstream_name": "some-upstream-service-9000",
"remote_addr": "0.0.0.0",
"remote_user": "bob",
"req_id": "752178adb17130b291aefd8c386279e7",
"request": "GET /some/path HTTP/2.0",
"request_length": 462,
"request_time": 0.05,
"status": 200,
"timestamp": "2023-03-18T15:00:00Z",
"upstream_addr": "10.0.50.80:9000",
"upstream_response_length": 19437,
"upstream_response_time": 0.049,
"upstream_status": 200
}
Parse via Nginx log format (main)
parse_nginx_log!(
s'172.24.0.3 - alice [31/Dec/2024:17:32:06 +0000] "GET / HTTP/1.1" 200 615 "https://domain.tld/path" "curl/8.11.1" "1.2.3.4, 10.10.1.1"',
"main"
)
{
"body_bytes_size": 615,
"http_referer": "https://domain.tld/path",
"http_user_agent": "curl/8.11.1",
"http_x_forwarded_for": "1.2.3.4, 10.10.1.1",
"remote_addr": "172.24.0.3",
"remote_user": "alice",
"request": "GET / HTTP/1.1",
"status": 200,
"timestamp": "2024-12-31T17:32:06Z"
}
parse_proto
Parses the value as a protocol buffer payload.
Parse proto
parse_proto!(decode_base64!("Cgdzb21lb25lIggKBjEyMzQ1Ng=="), "resources/protobuf_descriptor_set.desc", "test_protobuf.Person")
{
"name": "someone",
"phones": [
{
"number": "123456"
}
]
}
parse_query_string
Parses the value as a query string.
Parse query string
parse_query_string("foo=%2B1&bar=2&bar=3&xyz")
{
"bar": [
"2",
"3"
],
"foo": "+1",
"xyz": ""
}
Parse Ruby on Rails’ query string
parse_query_string("?foo%5b%5d=1&foo%5b%5d=2")
{
"foo[]": [
"1",
"2"
]
}
parse_regex
Parses the value using the provided Regex pattern.
This function differs from the parse_regex_all function in that it returns only the first match.
Parse using Regex (with capture groups)
parse_regex!("first group and second group.", r'(?P<number>.*?) group')
{
"number": "first"
}
Parse using Regex (without capture groups)
parse_regex!("first group and second group.", r'(\w+) group', numeric_groups: true)
{
"0": "first group",
"1": "first"
}
parse_regex_all
Parses the value using the provided Regex pattern.
This function differs from the parse_regex function in that it returns all matches, not just the first.
Parse using Regex (all matches)
parse_regex_all!("first group and second group.", r'(?P<number>\w+) group', numeric_groups: true)
[{"0":"first group","1":"first","number":"first"},{"0":"second group","1":"second","number":"second"}]
parse_ruby_hash
Parses the value as a Ruby hash.
Parse Ruby hash
parse_ruby_hash!(s'{ "test" => "value", "testNum" => 0.2, "testObj" => { "testBool" => true, "testNull" => nil } }')
{
"test": "value",
"testNum": 0.2,
"testObj": {
"testBool": true,
"testNull": null
}
}
parse_syslog
Parses the value in Syslog format.
Parse Syslog log (5424)
parse_syslog!(
s'<13>1 2020-03-13T20:45:38.119Z dynamicwireless.name non 2426 ID931 [exampleSDID@32473 iut="3" eventSource= "Application" eventID="1011"] Try to override the THX port, maybe it will reboot the neural interface!'
)
{
"appname": "non",
"exampleSDID@32473": {
"eventID": "1011",
"eventSource": "Application",
"iut": "3"
},
"facility": "user",
"hostname": "dynamicwireless.name",
"message": "Try to override the THX port, maybe it will reboot the neural interface!",
"msgid": "ID931",
"procid": 2426,
"severity": "notice",
"timestamp": "2020-03-13T20:45:38.119Z",
"version": 1
}
parse_timestamp
Parse timestamp
parse_timestamp!("10-Oct-2020 16:00+00:00", format: "%v %R %:z")
2020-10-10T16:00:00Z
Parse timestamp with timezone
parse_timestamp!("16/10/2019 12:00:00", format: "%d/%m/%Y %H:%M:%S", timezone: "Asia/Taipei")
2019-10-16T04:00:00Z
parse_tokens
Parses the value in token format. A token is considered to be one of the following:
- A word surrounded by whitespace.
- Text delimited by double quotes: "..". Quotes can be included in the token if they are escaped by a backslash (\).
- Text delimited by square brackets: [..]. Closing square brackets can be included in the token if they are escaped by a backslash (\).
Parse tokens
parse_tokens(
"A sentence \"with \\\"a\\\" sentence inside\" and [some brackets]"
)
["A","sentence","with \\\"a\\\" sentence inside","and","some brackets"]
parse_url
Parses the value in URL format.
Parse URL
parse_url!("ftp://foo:bar@example.com:4343/foobar?hello=world#123")
{
"fragment": "123",
"host": "example.com",
"password": "bar",
"path": "/foobar",
"port": 4343,
"query": {
"hello": "world"
},
"scheme": "ftp",
"username": "foo"
}
Parse URL with default port
parse_url!("https://example.com", default_known_ports: true)
{
"fragment": null,
"host": "example.com",
"password": "",
"path": "/",
"port": 443,
"query": {},
"scheme": "https",
"username": ""
}
Parse URL with internationalized domain name
parse_url!("https://www.café.com")
{
"fragment": null,
"host": "www.xn--caf-dma.com",
"password": "",
"path": "/",
"port": null,
"query": {},
"scheme": "https",
"username": ""
}
Parse URL with mixed case internationalized domain name
parse_url!("https://www.CAFé.com")
{
"fragment": null,
"host": "www.xn--caf-dma.com",
"password": "",
"path": "/",
"port": null,
"query": {},
"scheme": "https",
"username": ""
}
parse_user_agent
Parses the value as a user agent string, which has a loosely defined format, so this parser only provides a best-effort guarantee.
Fast mode
parse_user_agent(
"Mozilla Firefox 1.0.1 Mozilla/5.0 (X11; U; Linux i686; de-DE; rv:1.7.6) Gecko/20050223 Firefox/1.0.1"
)
{
"browser": {
"family": "Firefox",
"version": "1.0.1"
},
"device": {
"category": "pc"
},
"os": {
"family": "Linux",
"version": null
}
}
Reliable mode
parse_user_agent(
"Mozilla/4.0 (compatible; MSIE 7.66; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
mode: "reliable"
)
{
"browser": {
"family": "Internet Explorer",
"version": "7.66"
},
"device": {
"category": "pc"
},
"os": {
"family": "Windows XP",
"version": "NT 5.1"
}
}
Enriched mode
parse_user_agent(
"Opera/9.80 (J2ME/MIDP; Opera Mini/4.3.24214; iPhone; CPU iPhone OS 4_2_1 like Mac OS X; AppleWebKit/24.783; U; en) Presto/2.5.25 Version/10.54",
mode: "enriched"
)
{
"browser": {
"family": "Opera Mini",
"major": "4",
"minor": "3",
"patch": "24214",
"version": "10.54"
},
"device": {
"brand": "Apple",
"category": "smartphone",
"family": "iPhone",
"model": "iPhone"
},
"os": {
"family": "iOS",
"major": "4",
"minor": "2",
"patch": "1",
"patch_minor": null,
"version": "4.2.1"
}
}
parse_xml
Parses the value as XML.
Parse XML
value = s'<book category="CHILDREN"><title lang="en">Harry Potter</title><author>J K. Rowling</author><year>2005</year></book>';
parse_xml!(value, text_key: "value", parse_number: false)
{
"book": {
"@category": "CHILDREN",
"author": "J K. Rowling",
"title": {
"@lang": "en",
"value": "Harry Potter"
},
"year": "2005"
}
}
Random examples
random_bytes
Generate random base 64 encoded bytes
encode_base64(random_bytes(16))
LNu0BBgUbh7XAlXbjSOomQ==
random_float
Random float from 0.0 to 10.0, not including 10.0
f = random_float(0.0, 10.0)
f >= 0 && f < 10
true
random_int
Random integer from 0 to 10, not including 10
i = random_int(0, 10)
i >= 0 && i < 10
true
uuid_from_friendly_id
Convert a Friendly ID to a UUID
uuid_from_friendly_id!("3s87yEvnmkiPBMHsj8bwwc")
7f41deed-d5e2-8b5e-7a13-ab4ff93cfad2
uuid_v4
Create a UUIDv4
uuid_v4()
1d262f4f-199b-458d-879f-05fd0a5f0683
uuid_v7
Create a UUIDv7 with implicit now()
uuid_v7()
06338364-8305-7b74-8000-de4963503139
Create a UUIDv7 with explicit now()
uuid_v7(now())
018e29b3-0bea-7f78-8af3-d32ccb1b93c1
Create a UUIDv7 with custom timestamp
uuid_v7(t'2020-12-30T22:20:53.824727Z')
0176b5bd-5d19-7394-bb60-c21028c6152b
String examples
camelcase
Takes the value string, and turns it into camelCase. Optionally, you can pass in the existing case of the string; otherwise, an attempt is made to determine the case automatically.
community_id
TCP
community_id!(source_ip: "1.2.3.4", destination_ip: "5.6.7.8", source_port: 1122, destination_port: 3344, protocol: 6)
1:wCb3OG7yAFWelaUydu0D+125CLM=
contains_all
Determines whether the value string contains all the specified substrings.
String contains all
contains_all("The Needle In The Haystack", ["Needle", "Haystack"])
true
String contains all (case sensitive)
contains_all("the NEEDLE in the haystack", ["needle", "haystack"])
false
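By default the match is case sensitive, so the lowercase substrings in the example above are not found in the mixed-case input. A minimal sketch of the case-insensitive form, assuming contains_all accepts the same case_sensitive argument that this reference shows for ends_with and starts_with:

```vrl
# Assumption: contains_all supports `case_sensitive`, as ends_with and starts_with do.
contains_all("the NEEDLE in the haystack", ["needle", "haystack"], case_sensitive: false)
```

If that assumption holds, the call should return true, since both substrings occur once case is ignored.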
downcase
Downcases the value string, where downcase is defined according to the Unicode Derived Core Property Lowercase.
Downcase a string
downcase("Hello, World!")
hello, world!
ends_with
Determines whether the value string ends with the specified substring.
String ends with (case sensitive)
ends_with("The Needle In The Haystack", "The Haystack")
true
String ends with (case insensitive)
ends_with("The Needle In The Haystack", "the haystack", case_sensitive: false)
true
find
Determines from left to right the start position of the first found element in value that matches pattern. Returns -1 if not found.
join
Joins each string in the value array into a single string, with items optionally separated from one another by a separator.
kebabcase
Takes the value string, and turns it into kebab-case. Optionally, you can pass in the existing case of the string; otherwise, an attempt is made to determine the case automatically.
match_any
Determines whether value matches any of the given patterns. All patterns are checked in a single pass over the target string, giving this function a potential performance advantage over multiple calls to the match function.
Regex match on a string
match_any("I'm a little teapot", [r'frying pan', r'teapot'])
true
parse_float
Parses the string value representing a floating point number in base 10 to a float.
Parse a float
parse_float!("42.38")
42.38
pascalcase
Takes the value string, and turns it into PascalCase. Optionally, you can pass in the existing case of the string; otherwise, an attempt is made to determine the case automatically.
PascalCase a string
pascalcase("input-string")
InputString
PascalCase a string
pascalcase("input-string", "kebab-case")
InputString
redact
Redact sensitive data in value such as:
- US social security card numbers
- Other forms of personally identifiable information with custom patterns
This can help achieve compliance by ensuring sensitive data does not leave your network.
Replace text using a regex
redact("my id is 123456", filters: [r'\d+'])
my id is [REDACTED]
Replace US social security numbers in any field
redact({ "name": "John Doe", "ssn": "123-12-1234"}, filters: ["us_social_security_number"])
{
"name": "John Doe",
"ssn": "[REDACTED]"
}
Replace with custom text
redact("my id is 123456", filters: [r'\d+'], redactor: {"type": "text", "replacement": "***"})
my id is ***
Replace with SHA-2 hash
redact("my id is 123456", filters: [r'\d+'], redactor: "sha2")
my id is GEtTedW1p6tC094dDKH+3B8P+xSnZz69AmpjaXRd63I=
Replace with SHA-3 hash
redact("my id is 123456", filters: [r'\d+'], redactor: "sha3")
my id is ZNCdmTDI7PeeUTFnpYjLdUObdizo+bIupZdl8yqnTKGdLx6X3JIqPUlUWUoFBikX+yTR+OcvLtAqWO11NPlNJw==
Replace with SHA-256 hash using hex encoding
redact("my id is 123456", filters: [r'\d+'], redactor: {"type": "sha2", "variant": "SHA-256", "encoding": "base16"})
my id is 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
replace
Replaces all matching instances of pattern in value.
The pattern argument accepts regular expression capture groups.
Note when using capture groups:
- You will need to escape the $ by using $$ to avoid Vector interpreting it as an environment variable when loading configuration.
- If you want a literal $ in the replacement pattern, you will also need to escape this with $$. When combined with environment variable interpolation in config files, this means you will need to use $$$$ to have a literal $ in the replacement pattern.
Replace literal text
replace("Apples and Bananas", "and", "not")
Apples not Bananas
Replace using regular expression
replace("Apples and Bananas", r'(?i)bananas', "Pineapples")
Apples and Pineapples
Replace first instance
replace("Bananas and Bananas", "Bananas", "Pineapples", count: 1)
Pineapples and Bananas
Replace with capture groups (Note: Use $$num in config files)
replace("foo123bar", r'foo(?P<num>\d+)bar', "$num")
123
replace_with
Replaces all matching instances of pattern using a closure.
The pattern argument accepts a regular expression that can use capture groups.
The function uses the function closure syntax to compute the replacement values.
The closure takes a single parameter describing the match. As the examples below show, its string field contains the entire text that matched pattern, its captures field is an array of the capture groups in order, and named capture groups are also available as fields by name. If a capture group is optional, the value may be null if it didn’t match.
The value returned by the closure must be a string and will replace the section of the input that was matched.
This returns a new string with the replacements; the original string is not mutated.
Capitalize words
replace_with("apples and bananas", r'\b(\w)(\w*)') -> |match| {
upcase!(match.captures[0]) + string!(match.captures[1])
}
Apples And Bananas
Replace with hash
replace_with("email from test@example.com", r'\w+@example.com') -> |match| {
sha2(match.string, variant: "SHA-512/224")
}
email from adf6e1bc4415d24912bd93072ad34ef825a7b6eb3bf53f68def1fc17
Replace first instance
replace_with("Apples and Apples", r'(?i)apples|cones', count: 1) -> |match| {
"Pine" + downcase(match.string)
}
Pineapples and Apples
Named capture group
replace_with("level=error A message", r'level=(?P<level>\w+)') -> |match| {
lvl = upcase!(match.level)
"[{{lvl}}]"
}
[ERROR] A message
screamingsnakecase
Takes the value string, and turns it into SCREAMING_SNAKE case. Optionally, you can pass in the existing case of the string; otherwise, an attempt is made to determine the case automatically.
SCREAMING_SNAKE a string
screamingsnakecase("input-string")
INPUT_STRING
SCREAMING_SNAKE a string
screamingsnakecase("input-string", "kebab-case")
INPUT_STRING
sieve
Keeps only matches of pattern in value.
This can be used to define patterns that are allowed in the string and remove everything else.
Sieve with regex
sieve("test123%456.فوائد.net.", r'[a-z0-9.]')
test123456..net.
Custom replacements
sieve("test123%456.فوائد.net.", r'[a-z.0-9]', replace_single: "X", replace_repeated: "<REMOVED>")
test123X456.<REMOVED>.net.
slice
Returns a slice of value between the start and end positions.
If the start and end parameters are negative, they refer to positions counting from the right of the string or array. If end refers to a position that is greater than the length of the string or array, a slice up to the end of the string or array is returned.
Slice a string (positive index)
slice!("Supercalifragilisticexpialidocious", start: 5, end: 13)
califrag
Slice a string (negative index)
slice!("Supercalifragilisticexpialidocious", start: 5, end: -14)
califragilistic
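The rules for negative and out-of-range positions can be sketched with a few more calls. These inputs are illustrative rather than taken from the upstream reference, and the array call relies on the statement above that slice accepts an array as well as a string.

```vrl
# A negative start counts from the right: the last five characters.
slice!("Supercalifragilisticexpialidocious", start: -5)

# An end past the length yields a slice up to the end of the string.
slice!("Supercalifragilisticexpialidocious", start: 5, end: 100)

# The same positions apply to arrays.
slice!([0, 1, 2, 3], start: 1, end: 3)
```

Following the description above, these would be expected to yield cious, califragilisticexpialidocious, and [1,2] respectively.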
snakecase
Takes the value string, and turns it into snake_case. Optionally, you can pass in the existing case of the string; otherwise, an attempt is made to determine the case automatically.
split
Splits the value string using pattern.
Split a string (no limit)
split("apples and pears and bananas", " and ")
["apples","pears","bananas"]
Split a string (with a limit)
split("apples and pears and bananas", " and ", limit: 2)
["apples","pears and bananas"]
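The pattern can also be a regular expression, which helps when the input mixes delimiters. A minimal sketch with a made-up input string, assuming split accepts a regex literal for pattern in the same way replace does:

```vrl
# Split on a comma or semicolon followed by optional whitespace.
split("apples, pears;bananas", r'[,;]\s*')
```

Under that assumption, the call would be expected to return ["apples","pears","bananas"].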
starts_with
Determines whether value begins with substring.
String starts with (case sensitive)
starts_with("The Needle In The Haystack", "The Needle")
true
String starts with (case insensitive)
starts_with("The Needle In The Haystack", "the needle", case_sensitive: false)
true
strip_ansi_escape_codes
Strips ANSI escape codes from value.
Strip ANSI escape codes
strip_ansi_escape_codes("\e[46mfoo\e[0m bar")
foo bar
strip_whitespace
Strips whitespace from the start and end of value, where whitespace is defined by the Unicode White_Space property.
Strip whitespace
strip_whitespace(" A sentence. ")
A sentence.
truncate
Truncates the value string up to the limit number of characters.
Truncate a string
truncate("A rather long sentence.", limit: 11, suffix: "...")
A rather lo...
Truncate a string
truncate("A rather long sentence.", limit: 11, suffix: "[TRUNCATED]")
A rather lo[TRUNCATED]
upcase
Upcases value, where upcase is defined according to the Unicode Derived Core Property Uppercase.
Upcase a string
upcase("Hello, World!")
HELLO, WORLD!
System examples
get_env_var
Returns the value of the environment variable specified by name.
Get an environment variable
get_env_var!("HOME")
/root
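get_env_var is fallible, which is why the example above uses the ! form and aborts when the variable cannot be read. A hedged sketch of supplying a default instead, using VRL's error-coalescing operator (??); the variable name APP_REGION and the fallback value are hypothetical:

```vrl
# APP_REGION is a hypothetical variable; fall back to "unknown" when it cannot be read.
.region = get_env_var("APP_REGION") ?? "unknown"
```

This keeps the program running on hosts where the variable is not defined, at the cost of hiding the misconfiguration unless the fallback value is monitored.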
get_timezone_name
Returns the name of the timezone in the Vector configuration. If the configuration is set to local, then it attempts to determine the name of the timezone from the host OS. If this is not possible, then it returns the fixed offset of the local timezone for the current time in the format "[+-]HH:MM", for example, "+02:00".
Get the IANA name of Vector’s timezone
.vector_timezone = get_timezone_name!()
Timestamp examples
format_timestamp
Formats value into a string representation of the timestamp.
Format a timestamp (ISO8601/RFC 3339)
format_timestamp!(t'2020-10-21T16:00:00Z', format: "%+")
2020-10-21T16:00:00+00:00
Format a timestamp (custom)
format_timestamp!(t'2020-10-21T16:00:00Z', format: "%v %R")
21-Oct-2020 16:00
now
Generate a current timestamp
now()
2021-03-04T10:51:15.928937Z
Type examples
array
Returns value if it is an array, otherwise returns an error. This enables the type checker to guarantee that the returned value is an array and can be used in any function that expects an array.
bool
Returns value if it is a Boolean, otherwise returns an error. This enables the type checker to guarantee that the returned value is a Boolean and can be used in any function that expects a Boolean.
float
Returns value if it is a float, otherwise returns an error. This enables the type checker to guarantee that the returned value is a float and can be used in any function that expects a float.
int
Returns value if it is an integer, otherwise returns an error. This enables the type checker to guarantee that the returned value is an integer and can be used in any function that expects an integer.
is_json
Exact variant
is_json("{}", variant: "object")
true
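Without the variant argument, is_json checks only that the string is valid JSON of any type. The following sketch contrasts that with malformed input and with a variant mismatch; the inputs are illustrative:

```vrl
# Valid JSON of any type.
is_json("{}")

# Malformed JSON.
is_json("{")

# Valid JSON, but not the requested variant.
is_json("{}", variant: "array")
```

These would be expected to evaluate to true, false, and false respectively.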
is_nullish
Determines whether value is nullish, where nullish denotes the absence of a meaningful value.
Null detection (blank string)
is_nullish("")
true
Null detection (dash string)
is_nullish("-")
true
Null detection (whitespace)
is_nullish("
")
true
is_timestamp
Checks whether the value’s type is a timestamp.
Valid timestamp
is_timestamp(t'2021-03-26T16:00:00Z')
true
object
Returns value if it is an object, otherwise returns an error. This enables the type checker to guarantee that the returned value is an object and can be used in any function that expects an object.
Declare an object type
object!(.value)
{
"field1": "value1",
"field2": "value2"
}
string
Returns value if it is a string, otherwise returns an error. This enables the type checker to guarantee that the returned value is a string and can be used in any function that expects a string.
Declare a string type
string!(.message)
{"field": "value"}
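Because string returns an error when the value is not a string, the ! form above aborts the program on a mismatch. A minimal sketch of handling the error instead with the error-coalescing operator (??), assuming an empty string is an acceptable fallback for the event:

```vrl
# Use .message when it is a string; otherwise fall back to an empty string.
msg = string(.message) ?? ""

# msg is now known to be a string, so upcase can be called without error handling.
upcase(msg)
```

The same pattern applies to the other type functions in this section, with a fallback of the matching type.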
tag_types_externally
Adds type information to all (nested) scalar values in the provided value.
The type information is added externally, meaning that value has the form of "type": value after this transformation.
Tag types externally (scalar)
tag_types_externally(123)
{
"integer": 123
}
Tag types externally (object)
tag_types_externally({
"message": "Hello world",
"request": {
"duration_ms": 67.9
}
})
{
"message": {
"string": "Hello world"
},
"request": {
"duration_ms": {
"float": 67.9
}
}
}
Tag types externally (array)
tag_types_externally(["foo", "bar"])
[{"string":"foo"},{"string":"bar"}]
Tag types externally (null)
tag_types_externally(null)
null
timestamp
Returns value if it is a timestamp, otherwise returns an error. This enables the type checker to guarantee that the returned value is a timestamp and can be used in any function that expects a timestamp.
Declare a timestamp type
timestamp(t'2020-10-10T16:00:00Z')
2020-10-10T16:00:00Z