Vector v0.31.0 release notes

The Vector team is pleased to announce version 0.31.0!

Be sure to check out the upgrade guide for breaking changes in this release.

In addition to the usual smaller enhancements and bug fixes, this release includes an opt-in beta of a new log event data model that we think will make it easier to process logs by moving event metadata out of the log event itself. We are looking for feedback on this new feature before we begin working towards making it the default and eventually removing the old log event data model.

For example, an event from the datadog_agent source currently looks like:

{
	"ddsource": "vector",
	"ddtags": "env:prod",
	"hostname": "alpha",
	"foo": "foo field",
	"service": "cernan",
	"source_type": "datadog_agent",
	"bar": "bar field",
	"status": "warning",
	"timestamp": "1970-02-14T20:44:57.570Z"
}

With the new data model, the same event will look like:

{
	"foo": "foo field",
	"bar": "bar field"
}

(just the event itself)

with additional buckets for source-added metadata:

{
	"ddsource": "vector",
	"ddtags": "env:prod",
	"hostname": "alpha",
	"service": "cernan",
	"status": "warning",
	"timestamp": "1970-02-14T20:44:57.570Z"
}

accessible via %datadog_agent.<field>, and Vector-added metadata:

{
	"source_type": "datadog_agent",
	"ingest_timestamp": "1970-02-14T20:44:58.236Z"
}

accessible via %vector.<field>.

We think this new organization will be easier for users to reason about and will avoid key conflicts between event fields and metadata.

You can opt into this feature by setting schema.log_namespace globally, or by setting the log_namespace option now available on each source. See the blog post for an expanded explanation and details. Let us know what you think on this issue.
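
As a rough sketch of the two opt-in routes described above, a configuration fragment might look like the following. The option names schema.log_namespace and log_namespace come from this release; the source name my_agent and the listen address are hypothetical placeholders, and the fragment is not a complete configuration (it has no sinks).

```toml
# Option A: opt in globally, for every source that supports it.
[schema]
log_namespace = true

# Option B: opt in on a single source only.
# "my_agent" and the address below are hypothetical placeholders.
[sources.my_agent]
type = "datadog_agent"
address = "0.0.0.0:8080"
log_namespace = true
```

Either option on its own should be enough to try the beta; setting it per source may be the more cautious route, since it limits the new data model to one pipeline at a time.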

Upgrading Vector
When upgrading, we recommend stepping through minor versions as these can each contain breaking changes while Vector is pre-1.0. These breaking changes are noted in their respective upgrade guides.

Highlights

0.31 Upgrade Guide

type: breaking change

Changelog

12 enhancements

  • The aws_s3 source now supports bucket notifications in SQS that originated as SNS messages. It still does not support receiving SNS messages directly. Thanks to sbalmos for contributing this change!
  • A from_unix_timestamp function was added to VRL to decode timestamp values from Unix timestamps. This deprecates the to_timestamp function, which will be removed in a future release.
  • The parse_nginx_log function now supports ingress_upstreaminfo as a format.
  • The format_timestamp function now supports an optional timezone argument to control the timezone of the encoded timestamp.
  • Vector’s graceful shutdown time limit is now configurable (via --graceful-shutdown-limit-secs) and able to be disabled (via --no-graceful-shutdown-limit). See the CLI docs for more.
  • Support for zstd compression was added to sinks that support compression. Thanks to akoshchiy for contributing this change!
  • The prometheus_remote_write sink now supports zstd and gzip compression in addition to snappy (the default). Thanks to zamazan4ik for contributing this change!
  • The journald source now supports a journal_namespace option to restrict the namespace of the units that the source consumes logs from.
  • The gelf, native_json, syslog, and json decoders (configurable as decoding.codec on sources) now have corresponding options for lossy UTF-8 decoding via decoding.<codec name>.lossy = true|false. This can be used to accept invalid UTF-8, with invalid characters replaced before decoding.
  • The aws_kinesis_firehose and aws_kinesis_streams sinks are now able to retry requests with partial failures by setting request_retry_partial to true. The default is false to avoid writing duplicate data if proper event idempotency is not in place. Thanks to dengmingtong for contributing this change!
  • The component_sent_event_bytes_total and component_sent_events_total metrics can now optionally have service and source tags added to them, driven from event data, via the newly added telemetry global config options. This can be used to break down processing volume by service and source.
  • The internal_metrics and internal_logs sources now shut down last in order to capture as much telemetry as possible during Vector shutdown.
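
To make a few of the new options above more concrete, here is a minimal configuration sketch. The option names journal_namespace, decoding.<codec name>.lossy, and request_retry_partial are taken from the entries above. Everything else is an assumption for illustration: the component names, namespace, address, stream name, and region are hypothetical, and the other options shown for each component should be checked against the component reference docs.

```toml
# Restrict the journald source to one journal namespace.
# "system_logs" and "my_namespace" are hypothetical.
[sources.system_logs]
type = "journald"
journal_namespace = "my_namespace"

# Accept invalid UTF-8 when decoding JSON, replacing invalid characters.
# "app_socket" and the address are hypothetical.
[sources.app_socket]
type = "socket"
mode = "tcp"
address = "0.0.0.0:9000"
decoding.codec = "json"
decoding.json.lossy = true

# Retry requests that only partially failed. The default is false,
# since retries can write duplicate data without event idempotency.
# "firehose", the stream name, and the region are hypothetical.
[sinks.firehose]
type = "aws_kinesis_firehose"
inputs = ["app_socket"]
stream_name = "my-stream"
region = "us-east-1"
encoding.codec = "json"
request_retry_partial = true
```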

13 bug fixes

  • The fluent source now correctly sends back message acknowledgements in msgpack rather than JSON. Previously fluentbit would fail to process them. Thanks to ChezBunch for contributing this change!
  • VRL now supports the \0 null byte escape sequence in strings.
  • The statsd sink now correctly encodes all counters as incremental, per the spec.
  • A disk buffer deadlock that occurred on start-up after certain crash conditions was fixed.
  • The http_client source no longer corrupts binary data by always trying to interpret it as UTF-8. Instead, options were added to decoders for lossy UTF-8 decoding (see the corresponding enhancement entry).
  • The Proxy-Authorization header is now added to HTTP requests from components that support HTTP proxies when authentication is used. Thanks to syedriko for contributing this change!
  • Vector now exits non-zero if the graceful shutdown time limit expires before Vector finishes shutting down.
  • The following components now log template render errors at the warning level rather than error and do not increment component_errors_total. This fixes a regression in v0.30.0 for the loki sink.

    • loki sink
    • papertrail sink
    • splunk_hec_logs sink
    • splunk_hec_metrics sink
    • throttle transform
    • log_to_metric transform
  • The datadog_metrics sink now incrementally encodes sketches. This avoids issues users have seen with sketch payloads exceeding the limits and being dropped.
  • The datadog_agent source's reporting of events and bytes received was fixed so that it no longer double counts incoming events.
  • log_schema global configuration fields can now appear in a different file than defined sources. Thanks to Hexta for contributing this change!
  • Vector now supports running more than 512 sources. Previously it would lock up if more than 512 file sources were defined. Thanks to honganan for contributing this change!
  • Internal metrics for the Adaptive Concurrency Request module are now correctly tagged with component metadata like other sink metrics (component_kind, component_id, component_type).

Download Version 0.31.0

  • Linux: deb, rpm
  • macOS: tar.gz
  • Windows: zip, msi