Merge Transform

The Vector merge transform merges partial log events into a single event.

Configuration

[transforms.my_transform_id]
type = "merge" # required
inputs = ["my-source-or-transform-id", "prefix-*"] # required
fields = ["message"] # optional, default
partial_event_marker_field = "_partial" # optional, default
stream_discriminant_fields = [] # optional, default
  • common optional [string]

    fields

    Fields to merge. The values of these fields are merged into the first partial event; fields not specified here are ignored. The merging process takes the first partial event as the base, then merges in the listed fields from each successive partial event until a non-partial event arrives. Finally, the fields of the non-partial event are merged in, producing the resulting merged event.

    • Default: ["message"]
  • common optional string

    partial_event_marker_field

    The field that indicates that the event is partial. A consecutive run of partial events, together with the first non-partial event that follows it, is merged into a single event.

    • Syntax: literal
    • Default: "_partial"
  • common optional [string]

    stream_discriminant_fields

    An ordered list of fields used to distinguish streams. Each stream keeps its own partial event merging state. Use this option to prevent events from unrelated sources from being merged together.

    • Default: []
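
As a rough illustration of the options above, the fragment below sketches a merge transform that keeps a separate merging state per stream. The component IDs (my_source_id, merge_partials) and the discriminant field names host and file are assumptions chosen for the example; substitute whatever fields identify a stream in your own events.

```toml
[transforms.merge_partials]                    # hypothetical transform ID
type = "merge"
inputs = ["my_source_id"]                      # hypothetical upstream component ID
fields = ["message"]                           # the default, spelled out
partial_event_marker_field = "_partial"        # the default, spelled out
stream_discriminant_fields = ["host", "file"]  # assumed field names
```

With a configuration along these lines, partial events that differ in host or file would be merged independently instead of being mixed into one event.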

Telemetry

This component provides the following metrics that can be retrieved through the internal_metrics source. See the metrics section in the monitoring page for more info.

  • counter

    events_in_total

    The number of events accepted by this component, either from tagged origins such as file and uri, or cumulatively from other origins. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • container_name - The name of the container from which the event originates.

    • file - The file from which the event originates.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

    • mode - The connection mode used by the component.

    • peer_addr - The IP from which the event originates.

    • peer_path - The pathname from which the event originates.

    • pod_name - The name of the pod from which the event originates.

    • uri - The sanitized uri from which the event originates.

  • counter

    processed_events_total

    The total number of events processed by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • file - The file that produced the error

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    events_out_total

    The total number of events emitted by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    processed_bytes_total

    The number of bytes processed by the component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • container_name - The name of the container from which the bytes originate.

    • file - The file from which the bytes originate.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

    • mode - The connection mode used by the component.

    • peer_addr - The IP from which the bytes originate.

    • peer_path - The pathname from which the bytes originate.

    • pod_name - The name of the pod from which the bytes originate.

    • uri - The sanitized uri from which the bytes originate.
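
One possible way to inspect the metrics listed above is to route the internal_metrics source to a sink. The fragment below is a minimal sketch, not a recommended setup: the component IDs are hypothetical, and the console sink with JSON encoding is an assumption whose exact options may differ between Vector versions.

```toml
[sources.vector_metrics]       # hypothetical source ID
type = "internal_metrics"

[sinks.metrics_console]        # hypothetical sink ID
type = "console"
inputs = ["vector_metrics"]
encoding.codec = "json"        # assumed encoding option; check your Vector version
```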

Examples

Given the following Vector log events:

[
  {
    "message": "First",
    "_partial": true,
    "custom_string_field": "value1",
    "custom_int_field": 1
  },
  {
    "message": "Second",
    "_partial": true,
    "custom_string_field": "value2",
    "custom_int_field": 2
  },
  {
    "message": "Third",
    "custom_string_field": "value3",
    "custom_int_field": 3
  }
]

And the following configuration:

vector.toml
[transforms.merge]
type = "merge"
inputs = ["my-source-or-transform-id"] # required; placeholder ID

The following Vector log event will be output:

{
  "message": "FirstSecondThird",
  "custom_string_field": "value1",
  "custom_int_field": 1
}

How It Works

State

This component is stateful, meaning its behavior changes based on previous inputs (events). State is not preserved across restarts, so any partially merged events held in memory are lost when Vector restarts, and state-dependent behavior depends only on the inputs (events) received since the most recent restart.