To understand Vector, you first need to understand its fundamental concepts. The concepts below are ordered progressively, starting with the individual unit of data (events) and broadening all the way to Vector’s deployment models (pipelines).


Events represent the individual units of data in Vector.


A log event is a generic key/value representation of an event.
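As a rough illustration of "generic key/value", a log event can be pictured as a plain mapping. This is a sketch only: the field names are hypothetical and are not a schema required by Vector.

```python
# Illustrative only: a log event modeled as a plain key/value mapping.
# All field names here are hypothetical, not a schema mandated by Vector.
log_event = {
    "timestamp": "2024-01-01T12:00:00Z",
    "host": "web-01",
    "message": "GET /health 200",
}

# Because the representation is generic, new keys can be attached freely.
log_event["status"] = 200
```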


A metric event represents a numerical operation performed on a time series. Vector’s metric events are fully interoperable.
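One way to picture "a numerical operation performed on a time series" is the small model below. It is a conceptual sketch, not Vector's internal data model: the `MetricEvent` class, the `apply_metric` helper, and the series names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetricEvent:
    name: str     # identifies the time series the operation targets
    kind: str     # "incremental" adds to the series, "absolute" sets it
    value: float  # operand of the operation


def apply_metric(series: dict, event: MetricEvent) -> None:
    """Apply one metric event to an in-memory table of series values."""
    if event.kind == "incremental":
        series[event.name] = series.get(event.name, 0.0) + event.value
    elif event.kind == "absolute":
        series[event.name] = event.value
    else:
        raise ValueError(f"unknown kind: {event.kind}")


series = {}
apply_metric(series, MetricEvent("requests_total", "incremental", 1.0))
apply_metric(series, MetricEvent("requests_total", "incremental", 2.0))
apply_metric(series, MetricEvent("memory_bytes", "absolute", 512.0))
```

After the three events are applied, `series` holds one accumulated counter and one value that was set directly.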


Component is the generic term for sources, transforms, and sinks. Components ingest, transform, and route events. You compose components to create topologies.


Vector wouldn’t be very useful if it couldn’t ingest data. A source defines where Vector should pull data from, or how it should receive data pushed to it. A topology can have any number of sources. As sources ingest data, they normalize it into events (described above), which sets the stage for easy and consistent processing of your data. Examples of sources include file, syslog, statsd, and stdin.


A transform is responsible for mutating events as they are transported by Vector. This might involve parsing, filtering, sampling, or aggregating. You can have any number of transforms in your pipeline, and how you compose them is up to you.
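The idea of freely composing transforms can be sketched with chained Python generators. This is a conceptual model under stated assumptions, not how Vector is configured or implemented: the function names, the "LEVEL text" message format, and the every-Nth sampling rule are hypothetical.

```python
def parse(events):
    """Parsing transform: split 'LEVEL text' messages into separate fields."""
    for event in events:
        level, _, text = event["message"].partition(" ")
        yield {**event, "level": level, "text": text}


def only_errors(events):
    """Filtering transform: drop every event that is not an error."""
    for event in events:
        if event["level"] == "ERROR":
            yield event


def sample(events, rate):
    """Sampling transform: keep one event out of every `rate` events."""
    for index, event in enumerate(events):
        if index % rate == 0:
            yield event


incoming = [
    {"message": "INFO started"},
    {"message": "ERROR disk full"},
    {"message": "ERROR timeout"},
    {"message": "INFO done"},
]

# Transforms compose in whatever order suits the data: parse, filter, sample.
result = list(sample(only_errors(parse(incoming)), rate=2))
```

Each transform consumes a stream of events and yields zero or more events, so the stages can be reordered or removed without changing the others.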


A sink is a destination for events. Each sink’s design and transmission method is dictated by the downstream service it interacts with. The socket sink, for example, streams individual events, while the aws_s3 sink buffers and flushes data.
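The difference between a sink that streams individual events and one that buffers and flushes can be sketched as follows. Both classes are hypothetical stand-ins written for illustration; they do not reflect the actual implementation or options of the socket or aws_s3 sinks.

```python
class StreamingSink:
    """Transmits every event on its own, as soon as it arrives."""

    def __init__(self):
        self.sent = []  # each entry represents one transmission

    def write(self, event):
        self.sent.append([event])


class BatchingSink:
    """Buffers events and transmits them in batches of `batch_size`."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.sent = []  # each entry represents one flushed batch

    def write(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered, even if the batch is not full.
        if self.buffer:
            self.sent.append(self.buffer)
            self.buffer = []


streaming = StreamingSink()
batching = BatchingSink(batch_size=2)
for n in range(5):
    streaming.write({"n": n})
    batching.write({"n": n})
batching.flush()  # flush the final partial batch
```

With five events, the streaming sketch performs five transmissions, while the batching sketch performs three (two full batches and one partial batch).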


A pipeline is a directed acyclic graph of components. Each component is a node in the graph with directed edges. Data must flow in one direction, from sources to sinks. Components can produce zero or more events.
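The acyclic requirement can be checked mechanically. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to order a hypothetical pipeline and to detect a cycle; the component names are made up for illustration, and this is not how Vector itself validates a configuration.

```python
from graphlib import CycleError, TopologicalSorter

# Each component maps to the components it reads from (its inputs).
# Component names are hypothetical.
pipeline = {
    "file_in": [],
    "parse": ["file_in"],
    "sample": ["parse"],
    "s3_out": ["sample"],
    "console_out": ["parse"],
}


def processing_order(graph):
    """Return components so that inputs precede consumers.

    Raises CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = processing_order(pipeline)

# A graph in which two components feed each other is not a valid pipeline.
cyclic = {"a": ["b"], "b": ["a"]}
try:
    processing_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
```

Note that `parse` fans out to two downstream components, which a directed acyclic graph allows; only a loop back to an upstream component is rejected.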


A role is the function that a Vector instance fills within a deployment in order to create end-to-end pipelines.


The agent role is designed for deploying Vector to the edge, typically for data collection.


The aggregator role is designed to collect and process data from multiple upstream sources. These upstream sources could be other Vector agents or non-Vector agents such as syslog-ng.


A topology is the end result of deploying Vector into your infrastructure. A topology may be as simple as deploying Vector as an agent, or it may be as complex as deploying Vector as an agent and routing data through multiple Vector aggregators.