Configuration

This section covers configuring Vector and creating pipelines like the example below. Vector's configuration uses the TOML syntax, and the configuration file must be passed via the --config flag when starting Vector:

vector --config /etc/vector/vector.toml

Example

vector.toml
# Set global options
data_dir = "/var/lib/vector"
# Ingest data by tailing one or more files
[sources.apache_logs]
type = "file"
include = ["/var/log/apache2/*.log"] # supports globbing
ignore_older = 86400 # 1 day
# Structure and parse the data
[transforms.apache_parser]
inputs = ["apache_logs"]
type = "regex_parser" # fast/powerful regex
regex = '^(?P<host>[\w.]+) - (?P<user>[\w]+) (?P<bytes_in>[\d]+) \[(?P<timestamp>.*)\] "(?P<method>[\w]+) (?P<path>.*)" (?P<status>[\d]+) (?P<bytes_out>[\d]+)$'
# Sample the data to save on cost
[transforms.apache_sampler]
inputs = ["apache_parser"]
type = "sampler"
rate = 50 # forward 1 out of every 50 events (rate is expressed as 1/N)
# Send structured data to a short-term storage
[sinks.es_cluster]
inputs = ["apache_sampler"] # only take sampled data
type = "elasticsearch"
host = "http://79.12.221.222:9200" # local or external host
index = "vector-%Y-%m-%d" # daily indices
# Send structured data to a cost-effective long-term storage
[sinks.s3_archives]
inputs = ["apache_parser"] # don't sample for S3
type = "aws_s3"
region = "us-east-1"
bucket = "my-log-archives"
key_prefix = "date=%Y-%m-%d" # daily partitions, hive friendly format
compression = "gzip" # compress final objects
encoding = "ndjson" # new line delimited JSON
[sinks.s3_archives.batch]
max_size = 10000000 # 10 MB uncompressed

The key thing to notice above is the inputs option, which connects Vector's components to one another to form a pipeline. For a simple introduction, please refer to the Getting Started Guide.
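
To make the wiring concrete, here is a minimal Python sketch that derives the pipeline edges from the inputs options in the example above. It is an illustration only, not how Vector itself builds its topology: the build_edges helper is hypothetical, and the components are flattened into one dictionary for simplicity, whereas the real file groups them under sources, transforms, and sinks.

```python
def build_edges(components):
    """Return (upstream, downstream) pairs derived from each component's inputs."""
    edges = []
    for name, options in components.items():
        for upstream in options.get("inputs", []):
            # Assumption: an input naming an undefined component is an error.
            if upstream not in components:
                raise ValueError(f"{name!r} lists unknown input {upstream!r}")
            edges.append((upstream, name))
    return edges


# Component names and types mirror the vector.toml example above.
components = {
    "apache_logs": {"type": "file"},
    "apache_parser": {"type": "regex_parser", "inputs": ["apache_logs"]},
    "apache_sampler": {"type": "sampler", "inputs": ["apache_parser"]},
    "es_cluster": {"type": "elasticsearch", "inputs": ["apache_sampler"]},
    "s3_archives": {"type": "aws_s3", "inputs": ["apache_parser"]},
}

edges = build_edges(components)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```

The printed edges show the fan-out: apache_parser feeds both apache_sampler and s3_archives, so the S3 archive receives unsampled data while Elasticsearch receives only the sampled stream.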

Reference

Vector provides a full reference that you can use to build your configuration files.

Sources
Transforms
Sinks

And for more advanced techniques:

Env Vars
Global options
Templating
Tests

How It Works

Config File Location

The location of your Vector configuration file depends on your installation method. On most Linux-based systems, the file can be found at /etc/vector/vector.toml.

Environment Variables

Vector will interpolate environment variables within your configuration file with the following syntax:

vector.toml
[transforms.add_host]
type = "add_fields"
[transforms.add_host.fields]
host = "${HOSTNAME}"
environment = "${ENV:-development}" # default value when not present

Please refer to the environment variables reference for more info.
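
The substitution rules for the two forms shown above can be sketched in Python. This is a rough model of the syntax, not Vector's actual parser: the interpolate helper is hypothetical, and treating an unset variable without a default as an empty string is an assumption.

```python
import re

# Matches ${NAME} and ${NAME:-default}.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def interpolate(text, env):
    """Replace ${NAME} and ${NAME:-default} in text using the env mapping."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        # Assumption: an unset variable with no default becomes an empty string.
        return default if default is not None else ""
    return _VAR.sub(substitute, text)


config = 'host = "${HOSTNAME}"\nenvironment = "${ENV:-development}"'

# HOSTNAME is set, ENV is not, so ENV falls back to its default.
rendered = interpolate(config, {"HOSTNAME": "web-01"})
print(rendered)
```

In practice you would pass os.environ as the mapping; a plain dictionary is used here so the result does not depend on the machine running the sketch.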

Multiple Config Files

You can pass multiple configuration files when starting Vector:

vector --config vector1.toml --config vector2.toml

Or use a globbing syntax:

vector --config /etc/vector/*.toml

Syntax

The Vector configuration file follows the TOML syntax for its simplicity, explicitness, and relaxed white-space parsing. For more information, please refer to the TOML documentation.

Templating

Select configuration options support Vector's templating syntax, which produces dynamic values derived from each event's data. Options that support templating accept two syntaxes:

  1. Strptime specifiers. Ex: date=%Y/%m/%d
  2. Event fields. Ex: {{ field_name }}

For example:

vector.toml
[sinks.es_cluster]
type = "elasticsearch"
index = "user-{{ user_id }}-%Y-%m-%d"

The above index value will be calculated for each event. For example, given the following event:

{
"timestamp": "2019-05-02T00:23:22Z",
"message": "message",
"user_id": 2
}

The index value will result in:

index = "user-2-2019-05-02"

Learn more in the templating reference.
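
The per-event calculation above can be reproduced with a short Python sketch. It only models the two syntaxes described in this section and is not Vector's implementation: the render helper is hypothetical, and it assumes the strptime specifiers are resolved against the event's timestamp field and that a missing field raises an error.

```python
import re
from datetime import datetime, timezone

# Matches {{ field_name }} with optional surrounding spaces.
_FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, event):
    """Expand strptime specifiers and {{ field }} references for one event."""
    # Assumption: time specifiers are resolved from the event's timestamp field.
    ts = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)
    # Expand the date specifiers first so that a "%" inside a field value
    # is never interpreted as a specifier.
    with_dates = ts.strftime(template)
    # Assumption: a field missing from the event raises KeyError.
    return _FIELD.sub(lambda m: str(event[m.group(1)]), with_dates)


event = {
    "timestamp": "2019-05-02T00:23:22Z",
    "message": "message",
    "user_id": 2,
}

index = render("user-{{ user_id }}-%Y-%m-%d", event)
print(index)
```

Because the template is rendered once per event, events with different user_id values or dates end up in different indices.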

Types

All TOML value types are supported. For convenience, this includes: