To understand Vector, you first need to understand its fundamental concepts. The concepts below are ordered progressively, starting with the individual unit of data (events) and broadening out to Vector's deployment models (pipelines).


"Events" represent the individual units of data in Vector. They must fit into one of the following types.

Data model


A "log" event is a generic key/value representation of an event.

Log events
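As a rough illustration of "generic key/value representation", a log event can be pictured as a plain mapping from field names to values. The sketch below is not Vector's internal type, and the field names (`timestamp`, `host`, `message`, `status`) are assumptions chosen for the example rather than a required schema.

```python
from datetime import datetime, timezone

# Hypothetical log event: a mapping of field names to values.
# The field names are illustrative, not a schema required by Vector.
log_event = {
    "timestamp": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    "host": "web-01",
    "message": "GET /health 200",
}

# Because the event is generic key/value data, new fields can be
# attached later, for example after parsing the message.
log_event["status"] = 200
```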


A "metric" event is a first-class representation of a numerical operation performed on a time series. Vector's metric events are fully interoperable.

Metric events
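One way to read "a numerical operation performed on a time series" is that a metric event carries both a value and the operation to apply to the series it names. The sketch below models that idea with a hypothetical `MetricEvent` class and an `apply` helper; the class, its fields, and the two operation names are assumptions for illustration and do not mirror Vector's actual metric schema.

```python
from dataclasses import dataclass, field


@dataclass
class MetricEvent:
    name: str                      # time series the operation targets
    operation: str                 # "increment" or "set" (illustrative)
    value: float
    tags: dict = field(default_factory=dict)


def apply(series: dict, event: MetricEvent) -> None:
    """Apply the event's operation to an in-memory store of time series."""
    # A series is identified by its name plus its sorted tag pairs.
    key = (event.name, tuple(sorted(event.tags.items())))
    if event.operation == "increment":
        series[key] = series.get(key, 0.0) + event.value
    elif event.operation == "set":
        series[key] = event.value
    else:
        raise ValueError(f"unknown operation: {event.operation}")
```

Treating the operation as part of the event is what distinguishes a metric from a log line that merely contains a number: a downstream consumer can combine metric events without parsing text.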


"Component" is the generic term we use for sources, transforms, and sinks. Components ingest, transform, and route events. You compose components to create topologies.



Vector wouldn't be very useful if it couldn't ingest data. A "source" defines where Vector should pull data from, or how it should receive data pushed to it. A topology can have any number of sources, and as they ingest data they normalize it into events (described above). This sets the stage for easy and consistent processing of your data. Examples of sources include file, syslog, StatsD, and stdin.



A "transform" is responsible for mutating events as they are transported by Vector. This might involve parsing, filtering, sampling, or aggregating. A pipeline can contain any number of transforms, and how you compose them is up to you.



A "sink" is a destination for events. Each sink's design and transmission method are dictated by the downstream service it interacts with. For example, the socket sink will stream individual events, while the aws_s3 sink will buffer and flush data.
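The three component kinds can be sketched as a chain of small Python functions and a class. This is a conceptual model only: every name here (`line_source`, `drop_debug`, `parse_level`, `BufferedSink`) is hypothetical, and the buffering logic only imitates the "buffer and flush" behavior described for batch-oriented sinks, not how any real Vector sink is implemented.

```python
from typing import Iterable, Iterator

Event = dict


def line_source(lines: Iterable[str]) -> Iterator[Event]:
    """Source: normalize raw input lines into events."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def drop_debug(events: Iterable[Event]) -> Iterator[Event]:
    """Transform (filter): emits zero events for DEBUG lines."""
    for event in events:
        if not event["message"].startswith("DEBUG"):
            yield event


def parse_level(events: Iterable[Event]) -> Iterator[Event]:
    """Transform (parse): split the leading level off the message."""
    for event in events:
        level, _, rest = event["message"].partition(" ")
        yield {**event, "level": level, "message": rest}


class BufferedSink:
    """Sink: collect events and flush them downstream in batches."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        self.buffer: list = []
        self.flushed: list = []  # stands in for the downstream service

    def send(self, event: Event) -> None:
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.flushed.append(self.buffer)
            self.buffer = []


raw = ["INFO started\n", "DEBUG tick\n", "ERROR failed\n"]
sink = BufferedSink(batch_size=2)
for event in parse_level(drop_debug(line_source(raw))):
    sink.send(event)
sink.flush()  # flush whatever remains at shutdown
```

A streaming sink, by contrast, would forward each event from `send` immediately instead of holding it in a buffer.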



A "pipeline" is a directed acyclic graph of components. Each component is a node on the graph with directed edges. Data must flow in one direction, from sources to sinks. Components can produce zero or more events.

Pipeline model
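The directed acyclic graph property can be checked mechanically. The sketch below describes a pipeline as a mapping from each component to the components it reads from, then uses the standard library's `graphlib.TopologicalSorter` to produce an order in which data can flow. The component names and the `execution_order` helper are hypothetical, and this is not how Vector validates its configuration.

```python
from graphlib import CycleError, TopologicalSorter

# Each component maps to the set of upstream components it reads from.
# All names are hypothetical.
pipeline = {
    "app_logs": set(),        # source: no upstream components
    "parse": {"app_logs"},    # transform
    "sample": {"parse"},      # transform
    "archive": {"parse"},     # sink fed by the parsed stream
    "search": {"sample"},     # sink fed by the sampled stream
}


def execution_order(graph: dict) -> list:
    """Return components so that every upstream precedes its downstream.

    Raises graphlib.CycleError if the graph contains a cycle, meaning
    it is not a valid pipeline.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
```

A configuration in which two components read from each other, such as `{"a": {"b"}, "b": {"a"}}`, raises `CycleError`, which matches the rule that data must flow in one direction.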


A "role" refers to a deployment role that Vector fills in order to create end-to-end pipelines.

Deployment roles


The "agent" role is designed for deploying Vector to the edge, typically for data collection.

Agent role


The "aggregator" role is designed to collect and process data from multiple upstream sources. These upstream sources could be other Vector agents or non-Vector agents such as syslog-ng.

Aggregator role


A "topology" refers to the end result of deploying Vector into your infrastructure. A topology may be as simple as deploying Vector as an agent, or it may be as complex as deploying Vector as an agent and routing data through multiple Vector aggregators.

Deployment topologies