Kafka Source

The Vector kafka source ingests data through Kafka 0.9 or later and outputs log events.

Configuration

vector.toml
[sources.my_source_id]
# REQUIRED
type = "kafka" # must be: "kafka"
bootstrap_servers = "10.14.22.123:9092,10.14.23.132:9092" # example
group_id = "consumer-group-name" # example
topics = ["^(prefix1|prefix2)-.+", "topic-1", "topic-2"] # example
# OPTIONAL
key_field = "user_id" # example, no default

Requirements

Options

auto_offset_reset (string, optional)

If offsets for the consumer group do not exist, set them using this strategy. See the librdkafka documentation for the auto.offset.reset option for an explanation.

Default: "largest"

bootstrap_servers (string, common, required)

A comma-separated list of host and port pairs. These are the addresses of the Kafka brokers in the "bootstrap" Kafka cluster that a Kafka client connects to initially to bootstrap itself.

No default

fetch_wait_max_ms (int, milliseconds, optional)

Maximum time the broker may wait to fill the response.

Default: 100 (milliseconds)

group_id (string, common, required)

The name of the consumer group used to consume events from Kafka.

No default

key_field (string, common, optional)

The log field name to use for the Kafka message key. If unspecified, the key is not added to the log event. If the message has a null key, this field is not added to the log event.

No default

librdkafka_options (table, optional)

Advanced consumer options. See the librdkafka documentation for details.

  [field-name] (string, optional)

  An option name and its value. Only string values are accepted.

  No default

session_timeout_ms (int, milliseconds, optional)

The Kafka session timeout in milliseconds.

Default: 10000 (milliseconds)

socket_timeout_ms (int, milliseconds, optional)

Default timeout for network requests.

Default: 60000 (milliseconds)

topics ([string], common, required)

The Kafka topic names to read events from. Regex is supported if the topic name begins with ^.

No default
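
As a rough illustration of how the options above fit together, the following sketch extends the earlier example and writes out the tuning options with their documented defaults. The source ID, broker address, group name, topic names, and key field are placeholders, not values from a real deployment.

```toml
[sources.my_source_id]
# Required settings (placeholder values)
type = "kafka"
bootstrap_servers = "10.14.22.123:9092"
group_id = "consumer-group-name"
# The first entry begins with ^, so it is treated as a regex;
# the second is read as a plain topic name.
topics = ["^(prefix1|prefix2)-.+", "topic-1"]

# Optional tuning settings, written out with the defaults listed above
auto_offset_reset = "largest"
fetch_wait_max_ms = 100      # milliseconds
session_timeout_ms = 10000   # milliseconds
socket_timeout_ms = 60000    # milliseconds

# Optional: copy the Kafka message key into this log field (no default)
key_field = "user_id"
```

Because the four tuning values equal their defaults, removing those lines should leave behavior unchanged. They are spelled out only to show where each option goes.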

Output

The kafka source ingests data through Kafka 0.9 or later and outputs log events. For example:

{
  "message": "Started GET / for 127.0.0.1 at 2012-03-10 14:28:14 +0100",
  "timestamp": "2019-11-01T21:15:47+00:00"
}

More detail on the output schema is below.

message (string, common, required)

The raw event message, unaltered.

No default

timestamp (timestamp, common, required)

The exact time the event was ingested.

No default

How It Works

Environment Variables

Environment variables are supported throughout Vector's configuration. Add ${MY_ENV_VAR} to your Vector configuration file, and the variable is replaced with its value before the configuration is evaluated.

You can learn more in the Environment Variables section.
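
A minimal sketch of the substitution described above, applied to the kafka source. The variable names KAFKA_BROKERS and KAFKA_GROUP_ID are hypothetical; use whatever names your environment actually defines.

```toml
[sources.my_source_id]
type = "kafka"
# Hypothetical variable, e.g. KAFKA_BROKERS="10.14.22.123:9092"
bootstrap_servers = "${KAFKA_BROKERS}"
# Hypothetical variable holding the consumer group name
group_id = "${KAFKA_GROUP_ID}"
topics = ["topic-1"]
```

This keeps broker addresses and group names out of the configuration file, so the same file can be reused across environments.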

librdkafka

The kafka source uses librdkafka under the hood. This is a battle-tested, performant, and reliable library that facilitates communication with Kafka. Because Vector produces static MUSL builds, this dependency is packaged with Vector, so you do not need to install it separately.
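
Since the source is built on librdkafka, consumer properties that Vector does not expose directly can be set through the librdkafka_options table. The sketch below assumes the two property names shown, client.id and fetch.error.backoff.ms, which come from the librdkafka configuration reference. The values are illustrative only.

```toml
[sources.my_source_id]
type = "kafka"
bootstrap_servers = "10.14.22.123:9092"
group_id = "consumer-group-name"
topics = ["topic-1"]

# Each key is a librdkafka property name. Each value must be a string,
# even when the underlying property is numeric.
[sources.my_source_id.librdkafka_options]
"client.id" = "vector"              # illustrative value
"fetch.error.backoff.ms" = "1000"   # illustrative value
```

The property names are quoted because they contain dots, which TOML would otherwise interpret as nested table keys.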