Merge Transform

The Vector merge transform accepts and outputs log events, allowing you to merge partial log events into a single event.

Configuration

vector.toml
[transforms.my_transform_id]
type = "merge" # required
inputs = ["my-source-or-transform-id"] # required
merge_fields = ["message"] # optional, default
partial_event_marker_field = "_partial" # optional, default
stream_discriminant_fields = [] # optional, default
  • merge_fields ([string], common, optional)

    Fields to merge. The values of these fields are merged into the first partial event. Fields not listed here are not merged and keep the values from the first partial event. The merging process takes the first partial event as the base, then merges in the fields from each successive partial event until a non-partial event arrives. Finally, the fields of the non-partial event are merged in, producing the resulting merged event. See Field Notation Syntax for more info.

    • Default: ["message"]
  • partial_event_marker_field (string, common, optional)

    The field that indicates that the event is partial. A consecutive sequence of partial events, together with the first non-partial event that follows it, is merged into a single event. See Field Notation Syntax for more info.

    • Default: "_partial"
  • stream_discriminant_fields ([string], common, optional)

    An ordered list of fields to distinguish streams by. Each stream has its own partial event merging state. Use this option to keep events from unrelated sources from being merged together, which would otherwise corrupt partial event processing. See Field Notation Syntax for more info.

    • Default: []
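
As a rough sketch of how the three options fit together, the configuration below keeps a separate merge state per stream. The transform ID, the input ID, and the host and container_id field names are hypothetical; substitute whatever fields identify a stream in your own events. Only the option names come from the reference above.

```toml
[transforms.merge_per_stream]             # hypothetical transform ID
type = "merge"
inputs = ["my-source-or-transform-id"]    # hypothetical upstream component ID
merge_fields = ["message"]                # same as the default
partial_event_marker_field = "_partial"   # same as the default
# Assumed field names: events are grouped into streams by these fields, in order,
# and each stream has its own partial event merging state.
stream_discriminant_fields = ["host", "container_id"]
```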

Examples

Given the following default configuration:

vector.toml
[transforms.merge_events]
type = "merge"
inputs = [...]

And these three log events, the first two of which are marked as partial:

first log event
{
  "message": "First",
  "_partial": true,
  "custom_string_field": "value1",
  "custom_int_field": 1
}

and

second log event
{
  "message": "Second",
  "_partial": true,
  "custom_string_field": "value2",
  "custom_int_field": 2
}

and

third log event
{
  "message": "Third",
  "custom_string_field": "value3",
  "custom_int_field": 3
}

A single merged log event will be produced:

{
  "message": "FirstSecondThird",
  "custom_string_field": "value1",
  "custom_int_field": 1
}

Notice that custom_string_field and custom_int_field were not overridden. This is because they were not listed in the merge_fields option.
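
To have a field such as custom_string_field take part in merging, it would need to be listed in merge_fields. The following is a minimal sketch of such a configuration, with a placeholder input ID. The merged value is deliberately not shown, since it depends on how Vector merges values of that field's type.

```toml
[transforms.merge_events]
type = "merge"
inputs = ["my-source-or-transform-id"]   # placeholder upstream component ID
# List every field that should be merged; fields left out of this list
# keep the value from the first partial event.
merge_fields = ["message", "custom_string_field"]
```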

How It Works

Complex Processing

If you encounter limitations with the merge transform, we recommend using a runtime transform. These transforms are designed for complex processing and give you the power of a full programming runtime.

Environment Variables

Environment variables are supported throughout Vector's configuration. Add ${MY_ENV_VAR} anywhere in your Vector configuration file and the variable will be replaced with its value before the configuration is evaluated.

You can learn more in the Environment Variables section.
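
For illustration, the marker field name could be supplied from the environment. PARTIAL_MARKER_FIELD is a made-up variable name, and this sketch assumes it is set in the environment before Vector starts.

```toml
[transforms.my_transform_id]
type = "merge"
inputs = ["my-source-or-transform-id"]
# ${PARTIAL_MARKER_FIELD} is a hypothetical variable. Its value (for example
# "_partial") is substituted before the configuration is evaluated.
partial_event_marker_field = "${PARTIAL_MARKER_FIELD}"
```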

Field Notation Syntax

The merge_fields, partial_event_marker_field, and stream_discriminant_fields options support Vector's field notation syntax, enabling access to root-level, nested, and array field values. For example:

vector.toml
[transforms.my_merge_transform_id]
# ...
merge_fields = ["message"] # root-level field only
# merge_fields = ["message", "parent.child"] # alternative: root-level and nested fields
# ...

You can learn more about Vector's field notation in the field notation reference.

When to use this transform

Where possible, Vector handles event merging at the source level. For example, the file source has a message_start_indicator option and the docker source has an auto_partial_merge option. Both of these options should be used instead of this transform. Unfortunately, merging logs is not always this straightforward, and this transform is intended for exactly those edge cases.

If you're using this transform for a common use case, please consider opening an issue to let us know.