Faults in distributed systems are like green Skittles: we all wish they'd never happen, but in reality the best we can do is understand and control the damage they cause. In event streaming pipelines, that means understanding delivery and stability guarantees and their implications for your overall system.
Vector aims to make clear which guarantees you can expect from it. We categorize all components by their targeted delivery guarantee and by their general stability. This helps you make the appropriate tradeoffs for your use case.
Here you can find an overview of delivery guarantee types and their meaning as well as how we label the stability of our components. Next, you can head over to the components page and use filters to see which components support specific guarantees.
The at-least-once delivery guarantee ensures that an event received by Vector will be delivered at least once to the configured destination(s). While rare, it is possible for an event to be delivered more than once (see the "Does Vector support exactly once delivery?" FAQ below).
The best-effort delivery guarantee means that Vector will make a best effort to deliver each event but cannot guarantee delivery. This is usually due to limitations of the underlying protocol, which are outside Vector's control. While data loss is possible, it is usually rare, and Vector does everything within its control to ensure data is not lost. For more info, see the "Do I need at least once delivery?" FAQ.
Note that this is not the same as at-most-once delivery, as it is still possible for Vector to introduce duplicates under extreme circumstances.
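To make the difference between the two guarantees concrete, here is a minimal Python sketch. It is not Vector's implementation: the `FlakySink` class, its `drop_attempts` and `lose_ack_attempts` parameters, and the two `send_*` helpers are hypothetical names invented for illustration. The sketch assumes a destination that acknowledges each write and a sender that either gives up after one attempt or retries until it sees an acknowledgement.

```python
class FlakySink:
    """Hypothetical destination that can drop a write or lose an acknowledgement."""

    def __init__(self, drop_attempts=(), lose_ack_attempts=()):
        self.received = []                                # everything the destination stored
        self.drop_attempts = set(drop_attempts)           # attempts where the write never arrives
        self.lose_ack_attempts = set(lose_ack_attempts)   # attempts where the write arrives but the ack is lost
        self.attempt = 0

    def write(self, event):
        """Return True only if the sender receives an acknowledgement."""
        self.attempt += 1
        if self.attempt in self.drop_attempts:
            return False                  # the event never reached the destination
        self.received.append(event)
        if self.attempt in self.lose_ack_attempts:
            return False                  # stored, but the sender cannot know that
        return True


def send_best_effort(sink, event):
    """Send once and move on; a failed write means the event is lost."""
    sink.write(event)


def send_at_least_once(sink, event, max_attempts=5):
    """Retry until acknowledged; a lost acknowledgement produces a duplicate."""
    for _ in range(max_attempts):
        if sink.write(event):
            return True
    return False


# Best effort: the only attempt is dropped, so the event is lost.
lossy = FlakySink(drop_attempts=[1])
send_best_effort(lossy, "event-a")
best_effort_result = list(lossy.received)

# At least once: the first attempt is dropped, the retry succeeds.
retried = FlakySink(drop_attempts=[1])
send_at_least_once(retried, "event-a")
retried_result = list(retried.received)

# At least once: the first write lands but its ack is lost, so the retry duplicates it.
dup = FlakySink(lose_ack_attempts=[1])
send_at_least_once(dup, "event-a")
duplicate_result = list(dup.received)
```

The sketch simplifies best-effort to a single attempt. As noted above, best-effort in Vector is not at-most-once, so real best-effort components can still produce duplicates in extreme circumstances.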
The prod-ready status is a subjective status defined by the Vector team, intended to give you a general idea of a feature's reliability for production environments. A feature is prod-ready if it meets the following criteria:
- A meaningful number of users (generally >50) have been using the feature in a production environment for sustained periods without issue.
- The feature has had sufficient time (generally >4 months) to be community tested.
- The feature API is stable and unlikely to change.
- There are no major open bugs for the feature.
The beta status means that a feature has not met the prod-ready criteria outlined above and should therefore be used with caution in production environments.
Do I need at least once delivery?
One of the unique advantages of the logging use case is that data is usually used for diagnostic purposes only. Therefore, losing the occasional event has little impact on your business. This affords you the opportunity to optimize your pipeline for performance, simplicity, and cost reduction. On the other hand, if you're using your data to perform business-critical functions, then data loss is not acceptable and you need at-least-once delivery.
To clarify, even though a source or sink is marked as "best effort", that does not mean Vector takes delivery lightly. In fact, once data is within the boundary of Vector, it will not be lost if you've configured on-disk buffers. Data loss for "best effort" sources and sinks is almost always due to the limitations of the underlying protocol.
Does Vector support exactly once delivery?
No, Vector does not support exactly once delivery. There are plans to partially support this for sources and sinks that support it (Kafka, for example), but it remains unclear whether Vector will ever be able to achieve this. We recommend subscribing to our mailing list, which will keep you in the loop if this ever changes.
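Because at-least-once delivery can produce duplicates, one common way to tolerate them is to make the destination idempotent. The Python sketch below illustrates that idea only. It assumes every event carries a unique `id` field, which is an assumption and not something Vector guarantees, and the `DedupingStore` class is a hypothetical name, not part of Vector.

```python
class DedupingStore:
    """Hypothetical downstream store that ignores events it has already applied."""

    def __init__(self):
        self.seen_ids = set()   # ids of events already applied
        self.events = []        # events applied exactly once

    def apply(self, event):
        """Apply the event once; return False if it was a duplicate."""
        if event["id"] in self.seen_ids:
            return False
        self.seen_ids.add(event["id"])
        self.events.append(event)
        return True


store = DedupingStore()
deliveries = [
    {"id": "1", "message": "user logged in"},
    {"id": "2", "message": "payment accepted"},
    {"id": "1", "message": "user logged in"},   # redelivered, e.g. after a lost acknowledgement
]
results = [store.apply(event) for event in deliveries]
stored_messages = [event["message"] for event in store.events]
```

In this sketch the set of seen ids grows without bound. A real deployment would typically limit it, for example by only remembering ids for a fixed time window.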
How can I find components that meet these guarantees?
Head over to the components section and use the guarantee filters.