Sample
Sample events from an event stream based on supplied criteria and at a configurable rate.
Configuration
Example configurations
{
  "transforms": {
    "my_transform_id": {
      "type": "sample",
      "inputs": [
        "my-source-or-transform-id"
      ]
    }
  }
}
[transforms.my_transform_id]
type = "sample"
inputs = [ "my-source-or-transform-id" ]
transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
{
  "transforms": {
    "my_transform_id": {
      "type": "sample",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "group_by": "{{ service }}",
      "key_field": "message",
      "rate": 1500,
      "ratio": 0.13,
      "sample_rate_key": "sample_rate"
    }
  }
}
[transforms.my_transform_id]
type = "sample"
inputs = [ "my-source-or-transform-id" ]
group_by = "{{ service }}"
key_field = "message"
rate = 1_500
ratio = 0.13
sample_rate_key = "sample_rate"
transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    group_by: "{{ service }}"
    key_field: message
    rate: 1500
    ratio: 0.13
    sample_rate_key: sample_rate
exclude
optional condition
Available syntaxes
Syntax | Description | Example |
---|---|---|
vrl | A Vector Remap Language (VRL) Boolean expression. | .status_code != 200 && !includes(["info", "debug"], .severity) |
datadog_search | A Datadog Search query string. | *stack |
is_log | Whether the incoming event is a log. | |
is_metric | Whether the incoming event is a metric. | |
is_trace | Whether the incoming event is a trace. | |
Shorthand for VRL
If you opt for the vrl syntax for this condition, you can set the condition as a string via the condition parameter, without needing to specify both a source and a type. The table below shows some examples:
Config format | Example |
---|---|
YAML | condition: .status == 200 |
TOML | condition = ".status == 200" |
JSON | "condition": ".status == 200" |
Condition config examples
Standard VRL
exclude:
  type: "vrl"
  source: ".status == 500"
exclude = { type = "vrl", source = ".status == 500" }
"exclude": {
  "type": "vrl",
  "source": ".status == 500"
}
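As a sketch of how an exclude condition fits into a complete transform, the fragment below reuses the placeholder IDs from the example configurations. It assumes that events matching exclude bypass sampling and are forwarded as they are; the .status field and the rate value are illustrative only.

```yaml
transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    # Illustrative rate: forward roughly 1 in 10 of the sampled events.
    rate: 10
    # Assumption: events matching this condition are excluded from sampling
    # and passed through. `.status` is a hypothetical field.
    exclude:
      type: vrl
      source: ".status == 500"
```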
graph
optional object
Extra graph configuration
Configure output for the component when generated with the graph command.
graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
graph.node_attributes.*
required string literal
group_by
optional string template
The value to group events into separate buckets to be sampled independently.
If left unspecified, or if the event doesn’t have group_by, then the event is not sampled separately.
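A minimal sketch of group_by, assuming events carry a service field as in the example configurations. The rate value is illustrative.

```yaml
transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    # Illustrative rate applied within each bucket.
    rate: 10
    # Events are bucketed by the rendered template value, and each
    # bucket is sampled independently of the others.
    group_by: "{{ service }}"
```

The intent, as described above, is that each service value is sampled on its own rather than together with all other events on the stream.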
inputs
required [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
key_field
optional string literal
The name of the field whose value is hashed to determine if the event should be sampled.
Each unique value for the key creates a bucket of related events to be sampled together, and the rate is applied to the buckets themselves to sample 1/N buckets. The overall rate of sampling may differ from the configured one if values in the field are not uniformly distributed. If left unspecified, or if the event doesn’t have key_field, then the event is sampled independently.
This can be useful, for example, to ensure that all logs for a given transaction are sampled together, while overall 1/N transactions are sampled.
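The transaction case above could be configured roughly as follows. The transaction_id field name is hypothetical and the rate value is illustrative; substitute whichever field identifies a transaction in your events.

```yaml
transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    # Hypothetical field name; events sharing a value are kept or
    # dropped together.
    key_field: transaction_id
    # Roughly 1 in 10 distinct transaction_id values is sampled.
    rate: 10
```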
rate
optional uint
The rate at which events are forwarded, expressed as 1/N.
For example, rate = 1500 means 1 out of every 1500 events is forwarded and the rest are dropped. This differs from ratio, which allows more precise control over the number of events retained and values greater than 1/2. It is an error to provide a value for both rate and ratio.
ratio
optional float
The rate at which events are forwarded, expressed as a percentage.
For example, ratio = .13 means that 13% of all events on the stream are forwarded and the rest are dropped. This differs from rate by allowing a higher-precision value and by making it possible to retain more than 50% of all events. It is an error to provide a value for both rate and ratio.
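To make the difference between rate and ratio concrete, here is a sketch with two separate transforms, since the two options cannot be combined in one. The transform names sample_by_rate and sample_by_ratio and both values are illustrative.

```yaml
transforms:
  # Forwards roughly 1 in 10 events.
  sample_by_rate:
    type: sample
    inputs:
      - my-source-or-transform-id
    rate: 10
  # Forwards roughly 75% of events, which rate cannot express
  # because 1/N never exceeds 1/2 for N >= 2.
  sample_by_ratio:
    type: sample
    inputs:
      - my-source-or-transform-id
    ratio: 0.75
```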
sample_rate_key
optional string literal
default: sample_rate
Outputs
<component_id>
Telemetry
Metrics
component_discarded_events_total
counter
filter transform, or false if due to an error.
component_errors_total
counter
component_received_event_bytes_total
counter
component_received_events_count
histogram
A histogram of the number of events passed in each internal batch in Vector’s internal topology.
Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector due to small internal batches.