Vector implements a concurrency model that scales naturally with incoming data volume as shown above. Each Vector source is responsible for defining the unit of concurrency and implementing it accordingly. This allows for a natural concurrency model that adapts to however Vector is being used, avoiding the need for tedious concurrency tuning and configuration.
For example, the file source implements concurrency across the number of files it’s tailing, and a connection-based source implements concurrency across the number of active open connections it’s maintaining.
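To make the idea of a source-defined unit of concurrency concrete, here is a minimal Python sketch. It is not Vector's implementation (Vector itself is written in Rust), and the names `tail_unit` and `run_source` are hypothetical. It assumes only that each unit (a file, a connection) gets its own worker and that all workers feed one shared output.

```python
import queue
import threading


def tail_unit(name, lines, out):
    """Worker for one unit of concurrency (for example, one file or one connection)."""
    for line in lines:
        out.put((name, line))


def run_source(units):
    """Spawn one worker per unit, so concurrency tracks the number of units."""
    out = queue.Queue()
    workers = [
        threading.Thread(target=tail_unit, args=(name, lines, out))
        for name, lines in units.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # All producers have finished, so draining the queue here is safe.
    events = []
    while not out.empty():
        events.append(out.get())
    return len(workers), events


# Two hypothetical files yield two workers; adding a file adds a worker.
worker_count, events = run_source({"a.log": ["1", "2"], "b.log": ["3"]})
```

The number of workers is derived from the input rather than from a configuration option, which is the property the paragraph above describes.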
Stateless function transforms
As covered in the pipeline model documentation, Vector’s concurrency relies on stateless function transforms that can be parallelized. Task transforms currently cannot be parallelized and can therefore become processing bottlenecks (we hope to improve this in the future).
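The difference between the two kinds of transform can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Vector's code: `to_upper`, `Dedupe`, and `run_parallel` are hypothetical names, and the deduplicating transform merely stands in for any transform that keeps state between events.

```python
from concurrent.futures import ThreadPoolExecutor


def to_upper(event):
    """Stateless function transform: the output depends only on the input event."""
    return {**event, "message": event["message"].upper()}


class Dedupe:
    """Stateful transform (hypothetical): the output depends on earlier events."""

    def __init__(self):
        self.seen = set()

    def process(self, events):
        out = []
        for event in events:
            if event["message"] not in self.seen:
                self.seen.add(event["message"])
                out.append(event)
        return out


def run_parallel(transform, events, workers=4):
    """Apply a stateless transform across a worker pool; map preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, events))


events = [{"message": m} for m in ["a", "b", "a", "c"]]

# Stateless: fanning out across workers gives the same result as a plain loop.
parallel = run_parallel(to_upper, events)
sequential = [to_upper(e) for e in events]

# Stateful: events must pass through one instance in order.
deduped = Dedupe().process(events)
```

Because `to_upper` shares nothing between calls, any number of workers can run it at once. `Dedupe` needs every event to pass through the same `seen` set, so it runs as a single sequential stage, which is where a bottleneck can form.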