Vector v0.35.0 release notes

The Vector team is pleased to announce version 0.35.0!

Be sure to check out the upgrade guide for breaking changes in this release.

In addition to the usual enhancements and bug fixes, this release also includes:

  • The ability to use VRL to specify inputs for unit tests
  • A new avro decoder that can be used to decode AVRO data in sources

This is also the first release published only to the new apt.vector.dev and yum.vector.dev OS package repositories and not to the deprecated repositories.timber.io. As a reminder, the repositories.timber.io package repositories will be decommissioned on February 28th, 2024. Please see the release highlight for details about this change and instructions on how to migrate.

Upgrading Vector
When upgrading, we recommend stepping through minor versions as these can each contain breaking changes while Vector is pre-1.0. These breaking changes are noted in their respective upgrade guides.

Highlights

Changelog

11 enhancements

  • A new component-level internal metric, buffer_send_duration_max_seconds, was added to measure the time that a component spends waiting to push events to downstream components. This metric is useful for identifying back pressure in your topology.
  • For the throttle transform, the key tag added to events_discarded_total is now opt-in. This key can have unbounded cardinality, so opt in only if you are confident that its cardinality is bounded; otherwise memory use can grow without limit.

    See upgrade guide for details.

  • File-based components (file source, kubernetes_logs source, file sink) now include an internal_metrics.include_file_tag config option that determines whether the file tag is included on the component’s corresponding internal metrics. This config option defaults to false, as this tag is likely to be of high cardinality.

    See upgrade guide for details.

  • The file, aws_s3, and gcp_cloud_storage sinks now use the configured timezone when templating out timestamps as part of creating object key names. They use either the globally configured timezone option or the newly added timezone option on each of these sinks. Previously, timestamps were always templated in UTC. Thanks to kates for contributing this change!
  • Sinks with retries now add jitter to those retries to spread them out. This behavior can be disabled by setting request.retry_jitter_mode to none.
  • Sink request behavior was improved by:

    • Capping the retry duration at 30 seconds by default for faster recovery when downstream services recover, rather than the previous default of an hour. This can be configured via request.retry_max_duration_secs
    • Ensuring defaults are correctly applied as documented
    • Adding a request.max_concurrency_limit that can be used to cap the maximum number of concurrent requests when adaptive request concurrency is in use
  • HTTP server-based sources include a new keepalive.max_connection_age_secs configuration option, which defaults to 5 minutes (300 seconds). When enabled, this closes incoming TCP connections that reach the maximum age by sending a Connection: close header in the response. While this parameter is crucial for managing the lifespan of persistent, incoming connections to Vector and for effective load balancing, it can be disabled by setting keepalive.max_connection_age_secs to a large number like 100000000.
  • The splunk_hec_logs, splunk_hec_metrics, and humio sinks now allow accessing event metadata when specifying host_key and timestamp_key when log namespacing is enabled. Thanks to sbalmos for contributing this change!
  • The http_server source now allows a glob wildcard to be used when specifying the headers to capture as fields on received events. For example, setting headers to ["X-*"] will capture all headers starting with X- and add them as fields on the event (or in the metadata when log namespacing is enabled). Thanks to sonnens for contributing this change!
  • The datadog_logs, datadog_metrics, and datadog_traces sinks now default the values of the default_api_key and site configuration options to the values of environment variables DD_API_KEY and DD_SITE, respectively.
  • The jemalloc memory allocator, which Vector uses on Linux systems, is now also used there by any native dependencies, like librdkafka. This results in improved memory use by, for example, the kafka source and sink. Thanks to Ilmarii for contributing this change!
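
The retry changes above (jitter plus a 30-second cap) can be illustrated with a small Python sketch. This is a generic exponential backoff, not Vector's actual retry algorithm, which these notes do not describe. The function name, the base delay, and the "full" mode name are assumptions made for illustration; only the "none" mode and the 30-second default cap come from the notes.

```python
import random


def retry_delays(attempts, base_secs=1.0, max_duration_secs=30.0,
                 jitter_mode="full", rng=random):
    """Return one delay (in seconds) per retry attempt.

    Hypothetical helper: the doubling backoff and the "full" mode name
    are assumptions, not Vector's implementation.
    """
    delays = []
    for attempt in range(attempts):
        # Double the delay on each attempt, but never exceed the cap.
        capped = min(base_secs * 2 ** attempt, max_duration_secs)
        if jitter_mode == "full":
            # Pick a random delay in [0, capped] so that many clients
            # retrying at once do not all hit the service together.
            delay = rng.uniform(0.0, capped)
        elif jitter_mode == "none":
            delay = capped
        else:
            raise ValueError(f"unknown jitter mode: {jitter_mode!r}")
        delays.append(delay)
    return delays


if __name__ == "__main__":
    print(retry_delays(8, jitter_mode="none"))
    print(retry_delays(8, jitter_mode="full"))
```

Without jitter, every client that failed at the same moment retries at the same moment; with jitter, the retries are spread across the whole interval, which is the effect the enhancement is after.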

5 new features

  • The aws_cloudwatch_logs sink now allows for the log group retention to be configured for any log groups created by Vector via the new retention options. Thanks to AndrewChubatiuk for contributing this change!
  • The log_to_metric transform now has the ability to convert logs that have the same structure as metrics directly into metrics rather than only deriving metrics from logs. This “mode” can be enabled by setting the all_metrics configuration option. Incoming metrics should match the structure described by the native codec. Thanks to dygfloyd for contributing this change!
  • Vector configuration unit tests now have the ability to use VRL to specify the input to each test case rather than needing to specify the input as a structure directly in the configuration file (via log_fields). See unit tests for details. Thanks to MichaHoffmann for contributing this change!
  • VRL was updated to 0.9.1. This includes the following changes:

    • parse_regex_all pattern parameter can now be resolved from a variable
    • fixed parse_json data corruption issue for numbers greater than or equal to i64::MAX
    • support timestamp comparison using operators <, <=, >, >=
  • Support for decoding AVRO data in sources was added via a new codec configurable by setting decoding.codec to avro on components that support it. Additional AVRO-specific codec options are configurable via decoding.avro. Thanks to Ion-manden for contributing this change!
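
For readers unfamiliar with AVRO, the following Python sketch shows roughly what decoding a binary AVRO record involves. It is a simplified illustration of the AVRO binary encoding for two primitive types (long and string), not Vector's decoder. The field names and the way the schema is represented here are hypothetical.

```python
def read_long(buf, pos):
    """Decode a zigzag varint (AVRO long) starting at pos."""
    shift = 0
    raw = 0
    while True:
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift   # low 7 bits carry data
        if not (byte & 0x80):           # high bit clear: last byte
            break
        shift += 7
    # Undo zigzag encoding: 0, 1, 2, 3 -> 0, -1, 1, -2
    return (raw >> 1) ^ -(raw & 1), pos


def read_string(buf, pos):
    """Decode an AVRO string: a long length, then UTF-8 bytes."""
    length, pos = read_long(buf, pos)
    end = pos + length
    return buf[pos:end].decode("utf-8"), end


READERS = {"long": read_long, "string": read_string}


def decode_record(buf, fields):
    """Decode a record whose fields are stored back to back,
    in the order given by the (hypothetical) schema."""
    record = {}
    pos = 0
    for name, avro_type in fields:
        record[name], pos = READERS[avro_type](buf, pos)
    return record


# Hypothetical schema: a string field followed by a long field.
FIELDS = [("message", "string"), ("count", "long")]

if __name__ == "__main__":
    payload = b"\x0ahello\x06"  # length 5, "hello", then the long 3
    print(decode_record(payload, FIELDS))
    # prints {'message': 'hello', 'count': 3}
```

The point of the sketch is that AVRO data carries no field names or type tags of its own, so a decoder needs the schema to interpret the bytes.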

12 bug fixes

  • The kafka source and sink now add component tags to published Kafka consumer and producer metrics.
  • The heroku_logs, http_server, prometheus_remote_write, and splunk_hec sources now correctly report decompressed bytes, rather than compressed bytes, for the component_received_bytes_total internal metric.
  • Memory use by the elasticsearch sink was improved through reduced buffering.
  • The appsignal, datadog_metrics, greptimedb, gcp_stackdriver, honeycomb, and http sinks now correctly report uncompressed bytes, rather than compressed bytes, for the component_sent_bytes_total internal metric.
  • The kafka source and sink now correctly propagate the component-level tls.verify_certificate setting. Previously this was always set to true. Thanks to zjj for contributing this change!
  • vector tap now performs better by not recompiling glob matches on each fetch interval. Thanks to aholmberg for contributing this change!
  • The tag_cardinality_limit transform has improved performance in probabilistic mode via caching the count of entries in the bloom filter.
  • The remap transform no longer emits errors or increments component_discarded_events_total when reroute_dropped is true and events error during processing, because those events are not actually dropped but are instead routed to the dropped output.
  • The file source now emits logs with the correct offset field when aggregating multiline events. Thanks to jches for contributing this change!
  • The aws_kinesis_firehose sink now has a partition_key_field that can be used to configure a log event field to use as the Kinesis partition key. By default, Kinesis will use a unique identifier. Thanks to gromnsk for contributing this change!
  • The remap transform now filters out the source contents from error messages when the VRL program is read from a file. This removes the ability to use Vector to execute an attack to read files that the user wouldn’t otherwise have permissions to (e.g. /etc/passwd).
  • Running Vector with -v or -vv (to output debug and trace logs) or with -q or -qq (to output warn and fatal logs) now behaves the same as setting VECTOR_LOG to debug, trace, warn, or fatal, respectively. Previously, the CLI flags applied only to some of Vector’s internal modules and dependencies, unlike VECTOR_LOG, which applied to everything.
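
The flag-to-level mapping in the last fix can be written out as a small Python sketch. The four mappings are taken as listed above. The default level of info when no flag is given, the precedence of quiet over verbose, and the handling of more than two repetitions are assumptions, and the function name is hypothetical.

```python
def log_level(verbose=0, quiet=0):
    """Map counts of -v / -q flags to a VECTOR_LOG-style level.

    Assumptions (not stated in the notes): the default is "info",
    quiet flags win over verbose flags, and extra repetitions
    behave like two.
    """
    if quiet >= 2:
        return "fatal"   # -qq
    if quiet == 1:
        return "warn"    # -q
    if verbose >= 2:
        return "trace"   # -vv
    if verbose == 1:
        return "debug"   # -v
    return "info"        # assumed default


if __name__ == "__main__":
    for flags in [dict(), dict(verbose=1), dict(verbose=2),
                  dict(quiet=1), dict(quiet=2)]:
        print(flags, "->", log_level(**flags))
```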

Download Version 0.35.0

  • Linux (deb): deb
  • Linux (rpm): rpm
  • macOS: tar.gz
  • Windows: zip
  • Windows (MSI): msi