
Kubernetes logs

Collect logs from Kubernetes Nodes

status: stable
role: daemon
delivery: best effort
acknowledgements: no
egress: stream
state: stateless
output: log
Collects Pod logs from Kubernetes Nodes, automatically enriching data with metadata via the Kubernetes API.

Requirements

Kubernetes version >= 1.19 is required.
This source requires read access to the /var/log/pods directory. When run in a Kubernetes cluster this can be provided with a hostPath volume.
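
As a rough sketch of the hostPath approach, the excerpt below shows the relevant part of a Pod template for a Vector DaemonSet. The container and volume names (vector, var-log-pods) are placeholders chosen for illustration, not names this source requires, and the rest of the manifest (image, selectors, and so on) is omitted.

```yaml
# Excerpt of spec.template.spec in a DaemonSet manifest (hypothetical names).
containers:
  - name: vector
    volumeMounts:
      # Read-only access is enough: the source only reads Pod log files.
      - name: var-log-pods
        mountPath: /var/log/pods
        readOnly: true
volumes:
  - name: var-log-pods
    hostPath:
      path: /var/log/pods
```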

Warnings

This source is only tested on Linux. Your mileage may vary for clusters on Windows.

Configuration

Example configurations

Minimal, JSON:

{
  "sources": {
    "my_source_id": {
      "type": "kubernetes_logs"
    }
  }
}

Minimal, TOML:

[sources.my_source_id]
type = "kubernetes_logs"

Minimal, YAML:

sources:
  my_source_id:
    type: kubernetes_logs

With optional settings, JSON:

{
  "sources": {
    "my_source_id": {
      "type": "kubernetes_logs",
      "auto_partial_merge": true,
      "data_dir": "/var/local/lib/vector/",
      "delay_deletion_ms": 60000,
      "exclude_paths_glob_patterns": [
        "**/exclude/**"
      ],
      "extra_field_selector": "metadata.name!=pod-name-to-exclude",
      "extra_label_selector": "my_custom_label!=my_value",
      "extra_namespace_label_selector": "my_custom_label!=my_value",
      "fingerprint_lines": 1,
      "glob_minimum_cooldown_ms": 60000,
      "ignore_older_secs": 600,
      "ingestion_timestamp_field": ".ingest_timestamp",
      "kube_config_file": "/path/to/.kube/config",
      "max_line_bytes": 32768,
      "max_read_bytes": 2048,
      "oldest_first": true,
      "read_from": "beginning",
      "self_node_name": "${VECTOR_SELF_NODE_NAME}",
      "timezone": "local"
    }
  }
}

With optional settings, TOML:

[sources.my_source_id]
type = "kubernetes_logs"
auto_partial_merge = true
data_dir = "/var/local/lib/vector/"
delay_deletion_ms = 60_000
exclude_paths_glob_patterns = [ "**/exclude/**" ]
extra_field_selector = "metadata.name!=pod-name-to-exclude"
extra_label_selector = "my_custom_label!=my_value"
extra_namespace_label_selector = "my_custom_label!=my_value"
fingerprint_lines = 1
glob_minimum_cooldown_ms = 60_000
ignore_older_secs = 600
ingestion_timestamp_field = ".ingest_timestamp"
kube_config_file = "/path/to/.kube/config"
max_line_bytes = 32_768
max_read_bytes = 2_048
oldest_first = true
read_from = "beginning"
self_node_name = "${VECTOR_SELF_NODE_NAME}"
timezone = "local"

With optional settings, YAML:

sources:
  my_source_id:
    type: kubernetes_logs
    auto_partial_merge: true
    data_dir: /var/local/lib/vector/
    delay_deletion_ms: 60000
    exclude_paths_glob_patterns:
      - "**/exclude/**"
    extra_field_selector: metadata.name!=pod-name-to-exclude
    extra_label_selector: my_custom_label!=my_value
    extra_namespace_label_selector: my_custom_label!=my_value
    fingerprint_lines: 1
    glob_minimum_cooldown_ms: 60000
    ignore_older_secs: 600
    ingestion_timestamp_field: .ingest_timestamp
    kube_config_file: /path/to/.kube/config
    max_line_bytes: 32768
    max_read_bytes: 2048
    oldest_first: true
    read_from: beginning
    self_node_name: ${VECTOR_SELF_NODE_NAME}
    timezone: local

auto_partial_merge

optional bool

Whether or not to automatically merge partial events.

Partial events are messages that were split by the Kubernetes Container Runtime log driver.

default: true

data_dir

optional string literal

The directory used to persist file checkpoint positions.

By default, the global data_dir option is used. Make sure the running user has write permissions to this directory.

If this directory is specified, then Vector will attempt to create it.

Examples
"/var/local/lib/vector/"

delay_deletion_ms

optional uint

How long to delay removing metadata entries from the cache when a Pod deletion event is received from the watch stream.

A longer delay allows for continued enrichment of logs after the originating Pod is removed. If relevant metadata has been removed, the log is forwarded un-enriched and a warning is emitted.

default: 60000 (milliseconds)

exclude_paths_glob_patterns

optional [string]

A list of glob patterns matching log files to exclude from reading.

Examples
[
  "**/exclude/**"
]
default: ["**/*.gz", "**/*.tmp"]

extra_field_selector

optional string literal

Specifies the field selector to filter Pods with, to be used in addition to the built-in Node filter.

The built-in Node filter uses self_node_name to only watch Pods located on the same Node.

Examples
"metadata.name!=pod-name-to-exclude"
"metadata.name!=pod-name-to-exclude,metadata.name=mypod"

extra_label_selector

optional string literal
Specifies the label selector to filter Pods with, to be used in addition to the built-in exclude filter.
Examples
"my_custom_label!=my_value"
"my_custom_label!=my_value,my_other_custom_label=my_value"

extra_namespace_label_selector

optional string literal
Specifies the label selector to filter Namespaces with, to be used in addition to the built-in exclude filter.
Examples
"my_custom_label!=my_value"
"my_custom_label!=my_value,my_other_custom_label=my_value"

fingerprint_lines

optional uint

The number of lines to read for generating the checksum.

If your files share a common header that is not always a fixed size, increase this value so that the checksum covers lines that differ between files.

If a file has fewer lines than this number, it won’t be read at all.

default: 1 (lines)

glob_minimum_cooldown_ms

optional uint

The interval at which the file system is polled to identify new files to read from.

This is quite efficient, yet might still create some load on the file system; in addition, it is currently coupled with checksum dumping in the underlying file server, so setting it too low may introduce a significant overhead.

default: 60000 (milliseconds)

ignore_older_secs

optional uint
Ignore files with a data modification date older than the specified number of seconds.
Examples
600

ingestion_timestamp_field

optional string literal

Overrides the name of the log field used to add the ingestion timestamp to each event.

This is useful to compute the latency between important event processing stages. For example, the time delta between when a log line was written and when it was processed by the kubernetes_logs source.

Examples
".ingest_timestamp"
"ingest_ts"

internal_metrics

optional object
Configuration of internal metrics for file-based components.

internal_metrics.include_file_tag

optional bool

Whether or not to include the “file” tag on the component’s corresponding internal metrics.

This is useful for distinguishing between different files while monitoring. However, the tag’s cardinality is unbounded.

default: false

kube_config_file

optional string literal

Optional path to a readable kubeconfig file.

If not set, a connection to Kubernetes is made using the in-cluster configuration.

Examples
"/path/to/.kube/config"

max_line_bytes

optional uint

The maximum number of bytes a line can contain before being discarded.

This protects against malformed lines or tailing incorrect files.

default: 32768 (bytes)

max_read_bytes

optional uint

Max amount of bytes to read from a single file before switching over to the next file. Note: This does not apply when oldest_first is true.

This allows distributing the reads more or less evenly across the files.

default: 2048 (bytes)

namespace_annotation_fields

optional object
Configuration for how the events are enriched with Namespace metadata.

namespace_annotation_fields.namespace_labels

optional string literal

Event field for the Namespace’s labels.

Set to "" to suppress this key.

Examples
".k8s.ns_labels"
"k8s.ns_labels"
""
default: .kubernetes.namespace_labels

node_annotation_fields

optional object
Configuration for how the events are enriched with Node metadata.

node_annotation_fields.node_labels

optional string literal

Event field for the Node’s labels.

Set to "" to suppress this key.

Examples
".k8s.node_labels"
"k8s.node_labels"
""
default: .kubernetes.node_labels

oldest_first

optional bool
Instead of balancing read capacity fairly across all watched files, prioritize draining the oldest files before moving on to read data from more recent files.
default: true

pod_annotation_fields

optional object
Configuration for how the events are enriched with Pod metadata.

pod_annotation_fields.container_id

optional string literal

Event field for the Container’s ID.

Set to "" to suppress this key.

Examples
".k8s.container_id"
"k8s.container_id"
""
default: .kubernetes.container_id

pod_annotation_fields.container_image

optional string literal

Event field for the Container’s image.

Set to "" to suppress this key.

Examples
".k8s.container_image"
"k8s.container_image"
""
default: .kubernetes.container_image

pod_annotation_fields.container_image_id

optional string literal

Event field for the Container’s image ID.

Set to "" to suppress this key.

Examples
".k8s.container_image_id"
"k8s.container_image_id"
""
default: .kubernetes.container_image_id

pod_annotation_fields.container_name

optional string literal

Event field for the Container’s name.

Set to "" to suppress this key.

Examples
".k8s.container_name"
"k8s.container_name"
""
default: .kubernetes.container_name

pod_annotation_fields.pod_annotations

optional string literal

Event field for the Pod’s annotations.

Set to "" to suppress this key.

Examples
".k8s.pod_annotations"
"k8s.pod_annotations"
""
default: .kubernetes.pod_annotations

pod_annotation_fields.pod_ip

optional string literal

Event field for the Pod’s IPv4 address.

Set to "" to suppress this key.

Examples
".k8s.pod_ip"
"k8s.pod_ip"
""
default: .kubernetes.pod_ip

pod_annotation_fields.pod_ips

optional string literal

Event field for the Pod’s IPv4 and IPv6 addresses.

Set to "" to suppress this key.

Examples
".k8s.pod_ips"
"k8s.pod_ips"
""
default: .kubernetes.pod_ips

pod_annotation_fields.pod_labels

optional string literal

Event field for the Pod’s labels.

Set to "" to suppress this key.

Examples
".k8s.pod_labels"
"k8s.pod_labels"
""
default: .kubernetes.pod_labels

pod_annotation_fields.pod_name

optional string literal

Event field for the Pod’s name.

Set to "" to suppress this key.

Examples
".k8s.pod_name"
"k8s.pod_name"
""
default: .kubernetes.pod_name

pod_annotation_fields.pod_namespace

optional string literal

Event field for the Pod’s namespace.

Set to "" to suppress this key.

Examples
".k8s.pod_ns"
"k8s.pod_ns"
""
default: .kubernetes.pod_namespace

pod_annotation_fields.pod_node_name

optional string literal

Event field for the Pod’s Node name.

Set to "" to suppress this key.

Examples
".k8s.pod_host"
"k8s.pod_host"
""
default: .kubernetes.pod_node_name

pod_annotation_fields.pod_owner

optional string literal

Event field for the Pod’s owner reference.

Set to "" to suppress this key.

Examples
".k8s.pod_owner"
"k8s.pod_owner"
""
default: .kubernetes.pod_owner

pod_annotation_fields.pod_uid

optional string literal

Event field for the Pod’s UID.

Set to "" to suppress this key.

Examples
".k8s.pod_uid"
"k8s.pod_uid"
""
default: .kubernetes.pod_uid

read_from

optional string literal enum

File position to use when reading a new file.

Option     Description
beginning  Read from the beginning of the file.
end        Start reading from the current end of the file.
default: beginning

self_node_name

optional string literal

The name of the Kubernetes Node that is running.

Configured to use an environment variable by default, to be evaluated to a value provided by Kubernetes at Pod creation.

default: ${VECTOR_SELF_NODE_NAME}
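
One common way to have Kubernetes provide this value at Pod creation is the Downward API. The excerpt below is a minimal sketch of the env section of the Vector container; it assumes the variable name matches the default shown above and that the container spec is otherwise defined elsewhere.

```yaml
# Excerpt of a container spec in the Vector Pod template.
env:
  - name: VECTOR_SELF_NODE_NAME
    valueFrom:
      fieldRef:
        # Resolved by Kubernetes to the name of the Node the Pod is scheduled on.
        fieldPath: spec.nodeName
```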

timezone

optional string literal
The default time zone for timestamps without an explicit zone.
Examples
"local"
"America/New_York"
"EST5EDT"

use_apiserver_cache

optional bool
Determines if requests to the kube-apiserver can be served by a cache.
default: false

Outputs

<component_id>

Default output stream of the component. Use this component’s ID as an input to downstream transforms and sinks.
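
For illustration, the sketch below wires the source's default output into a console sink. The sink ID my_sink_id is a hypothetical name, and the console sink is used only because it is convenient for inspecting events.

```yaml
sources:
  my_source_id:
    type: kubernetes_logs

sinks:
  my_sink_id:
    type: console
    # The source's component ID is used as the input.
    inputs:
      - my_source_id
    encoding:
      codec: json
```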

Output Data

Logs

Warning

The fields shown below will be different if log namespacing is enabled. See Log Namespacing for more details.

Line

An individual line from a Pod log file.
Fields
file required string literal
The absolute path of the originating file.
Examples
/var/log/pods/pod-namespace_pod-name_pod-uid/container/1.log
kubernetes.container_id optional string literal
Container id.
Examples
docker://f24c81dcd531c5d353751c77fe0556a4f602f7714c72b9a58f9b26c0628f1fa6
kubernetes.container_image optional string literal
Container image.
Examples
busybox:1.30
kubernetes.container_image_id optional string literal
Container image ID.
Examples
busybox@sha256:1e7b63c09af457b93c17d25ef4e6aee96b5bb95f087840cffd7c4bb2fe8ae5c6
kubernetes.container_name optional string literal
Container name.
Examples
coredns
kubernetes.namespace_labels optional object
Set of labels attached to the Namespace.
Examples
{
  "mylabel": "myvalue"
}
kubernetes.pod_annotations optional object
Set of annotations attached to the Pod.
Examples
{
  "myannotation": "myvalue"
}
kubernetes.pod_ip optional string literal
Pod IPv4 address.
Examples
192.168.1.1
kubernetes.pod_ips optional string literal
Pod IPv4 and IPv6 addresses.
Examples
192.168.1.1
::1
kubernetes.pod_labels optional object
Set of labels attached to the Pod.
Examples
{
  "mylabel": "myvalue"
}
kubernetes.pod_name optional string literal
Pod name.
Examples
coredns-qwertyuiop-qwert
kubernetes.pod_namespace optional string literal
Pod namespace.
Examples
kube-system
kubernetes.pod_node_name optional string literal
Pod node name.
Examples
minikube
kubernetes.pod_owner optional string literal
Pod owner.
Examples
ReplicaSet/coredns-565d847f94
kubernetes.pod_uid optional string literal
Pod uid.
Examples
ba46d8c9-9541-4f6b-bbf9-d23b36f2f136
message required string literal
The raw line from the Pod log file.
Examples
53.126.150.246 - - [01/Oct/2020:11:25:58 -0400] "GET /disintermediate HTTP/2.0" 401 20308
source_type required string literal
The name of the source type.
Examples
kubernetes_logs
stream required string literal
The name of the stream the log line was submitted to.
Examples
stdout
stderr
timestamp required timestamp
The exact time the event was processed by Kubernetes.
Examples
2020-10-10T17:07:36.452332Z

Telemetry

Metrics

component_discarded_events_total

counter
The number of events dropped by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
intentional
True if the events were discarded intentionally, like a filter transform, or false if due to an error.
pid optional
The process ID of the Vector instance.

component_errors_total

counter
The total number of errors encountered by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
error_type
The type of the error.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
stage
The stage within the component at which the error occurred.

component_received_bytes_total

counter
The number of raw bytes accepted by this component from source origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_event_bytes_total

counter
The number of event bytes accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_count

histogram

A histogram of the number of events passed in each internal batch in Vector’s internal topology.

Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector caused by small internal batches.

component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_total

counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_sent_event_bytes_total

counter
The total number of event bytes emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

component_sent_events_total

counter
The total number of events emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

k8s_docker_format_parse_failures_total

counter
The total number of failures to parse a message as a JSON object.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_format_picker_edge_cases_total

counter
The total number of edge cases encountered while picking format of the Kubernetes log message.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_reflector_desyncs_total

counter
The total number of desyncs for the reflector.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_state_ops_total

counter
The total number of state operations.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
op_kind optional
The kind of operation performed.
pid optional
The process ID of the Vector instance.

k8s_stream_chunks_processed_total

counter
The total number of chunks processed from the stream of Kubernetes resources.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_stream_processed_bytes_total

counter
The number of bytes processed from the stream of Kubernetes resources.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_watch_requests_failed_total

counter
The total number of watch requests failed.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_watch_requests_invoked_total

counter
The total number of watch requests invoked.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_watch_stream_failed_total

counter
The total number of watch streams failed.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_watch_stream_items_obtained_total

counter
The total number of items obtained from a watch stream.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

k8s_watcher_http_error_total

counter
The total number of HTTP error responses for the Kubernetes watcher.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

source_lag_time_seconds

histogram
The difference between the timestamp recorded in each event and the time when it was ingested, expressed as fractional seconds.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

Examples

Sample Output

Given this event...
F1015 11:01:46.499073       1 main.go:39] error getting server version: Get "https://10.96.0.1:443/version?timeout=32s": dial tcp 10.96.0.1:443: connect: network is unreachable
...and this configuration...
YAML:

sources:
  my_source_id:
    type: kubernetes_logs

TOML:

[sources.my_source_id]
type = "kubernetes_logs"

JSON:

{
  "sources": {
    "my_source_id": {
      "type": "kubernetes_logs"
    }
  }
}

...this Vector event is produced:
{
  "file": "/var/log/pods/kube-system_storage-provisioner_93bde4d0-9731-4785-a80e-cd27ba8ad7c2/storage-provisioner/1.log",
  "kubernetes.container_image": "gcr.io/k8s-minikube/storage-provisioner:v3",
  "kubernetes.container_name": "storage-provisioner",
  "kubernetes.namespace_labels": {
    "kubernetes.io/metadata.name": "kube-system"
  },
  "kubernetes.pod_annotations": {
    "prometheus.io/scrape": "false"
  },
  "kubernetes.pod_ip": "192.168.1.1",
  "kubernetes.pod_ips": [
    "192.168.1.1",
    "::1"
  ],
  "kubernetes.pod_labels": {
    "addonmanager.kubernetes.io/mode": "Reconcile",
    "gcp-auth-skip-secret": "true",
    "integration-test": "storage-provisioner"
  },
  "kubernetes.pod_name": "storage-provisioner",
  "kubernetes.pod_namespace": "kube-system",
  "kubernetes.pod_node_name": "minikube",
  "kubernetes.pod_uid": "93bde4d0-9731-4785-a80e-cd27ba8ad7c2",
  "message": "F1015 11:01:46.499073       1 main.go:39] error getting server version: Get \"https://10.96.0.1:443/version?timeout=32s\": dial tcp 10.96.0.1:443: connect: network is unreachable",
  "source_type": "kubernetes_logs",
  "stream": "stderr",
  "timestamp": "2020-10-15T11:01:46.499555308Z"
}

How it works

Checkpointing

Vector checkpoints the current read position after each successful read. This ensures that Vector resumes where it left off if restarted, preventing data from being read twice. The checkpoint positions are stored in the data directory, which is specified via the global data_dir option but can be overridden via the data_dir option of the kubernetes_logs source.

Container exclusion

The kubernetes_logs source can skip the logs from the individual Containers of a particular Pod. Add an annotation vector.dev/exclude-containers to the Pod, and enumerate the names of all the Containers to exclude in the value of the annotation like so:

vector.dev/exclude-containers: "container1,container2"

This annotation makes Vector skip logs originating from container1 and container2 in the annotated Pod, while logs from the other Containers in the Pod are still collected.
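
To show where the annotation goes, here is a minimal sketch of a Pod manifest. The Pod name, the third container, and the image references are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                      # hypothetical Pod name
  annotations:
    # Logs from these two Containers are skipped by Vector.
    vector.dev/exclude-containers: "container1,container2"
spec:
  containers:
    - name: container1
      image: example.com/sidecar:1.0   # placeholder image
    - name: container2
      image: example.com/proxy:1.0     # placeholder image
    - name: container3                 # logs from this Container are still collected
      image: example.com/app:1.0       # placeholder image
```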

Context

By default, the kubernetes_logs source augments events with helpful context keys.

Enrichment

Vector will enrich data with Kubernetes context. A comprehensive list of fields can be found in the kubernetes_logs source output docs.

Filtering

Vector provides rich filtering options for Kubernetes log collection:

  • Built-in Pod and Container exclusion rules.
  • The exclude_paths_glob_patterns option allows you to exclude Kubernetes log files by the file name and path.
  • The extra_field_selector option specifies the field selector to filter Pods with, to be used in addition to the built-in Node filter.
  • The extra_label_selector option specifies the label selector to filter Pods with, to be used in addition to the built-in vector.dev/exclude filter.
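
As a sketch of how these options can be combined in one source, the configuration below reuses the example values from the option reference. The default exclude patterns are repeated explicitly on the assumption that a custom list replaces the defaults rather than extending them.

```yaml
sources:
  my_source_id:
    type: kubernetes_logs
    # Skip files by path: the defaults plus one custom pattern.
    exclude_paths_glob_patterns:
      - "**/*.gz"
      - "**/*.tmp"
      - "**/exclude/**"
    # Applied in addition to the built-in Node filter.
    extra_field_selector: metadata.name!=pod-name-to-exclude
    # Applied in addition to the built-in vector.dev/exclude filter.
    extra_label_selector: my_custom_label!=my_value
```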

Globbing

By default, the kubernetes_logs source ignores compressed and temporary files. This behavior can be configured with the exclude_paths_glob_patterns option.

Globbing is used to continually discover Pods' log files at a rate defined by the glob_minimum_cooldown_ms option. In environments where files are rotated rapidly, we recommend lowering glob_minimum_cooldown_ms to catch files before they are compressed.
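
A minimal sketch of a source tuned for rapid rotation follows. The 15000 ms value is an arbitrary illustration, not a recommendation from this page; pick a value that suits your rotation rate and file system load.

```yaml
sources:
  my_source_id:
    type: kubernetes_logs
    # Poll for new files every 15 seconds instead of the default 60 seconds.
    glob_minimum_cooldown_ms: 15000
```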

Kubernetes API access control

Vector requires access to the Kubernetes API. Specifically, the kubernetes_logs source uses the /api/v1/pods, /api/v1/namespaces, and /api/v1/nodes endpoints to list and watch resources we use to enrich events with additional metadata.

Modern Kubernetes clusters use the RBAC (role-based access control) scheme. RBAC-enabled clusters require some configuration to grant Vector the authorization to access the Kubernetes API endpoints. As RBAC is currently the standard way of controlling access to the Kubernetes API, we ship the necessary configuration out of the box: see the ClusterRole, ClusterRoleBinding, and ServiceAccount in our Kubectl YAML config, and the rbac.yaml template configuration of the Helm chart.

If your cluster doesn’t use any access control scheme and doesn’t restrict access to the Kubernetes API, you don’t need to do any extra configuration - Vector will just work.

Clusters using the legacy ABAC scheme are not officially supported (although Vector might work if you configure access properly); we encourage switching to RBAC. If you use a custom access control scheme, make sure Vector’s Pod/ServiceAccount is granted list and watch access to the /api/v1/pods, /api/v1/namespaces, and /api/v1/nodes resources.
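
For readers writing the RBAC objects by hand, the manifests below are a minimal sketch of the permissions described above. This is not the shipped configuration; the object names and the vector namespace are assumptions made for illustration.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vector            # hypothetical name
  namespace: vector       # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vector
rules:
  # list and watch on the core-group resources used for enrichment.
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: vector
subjects:
  - kind: ServiceAccount
    name: vector
    namespace: vector
```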

Kubernetes API communication

Vector communicates with the Kubernetes API to enrich the data it collects with Kubernetes context. Therefore, Vector must be able to reach the Kubernetes API server. If Vector is running in a Kubernetes cluster, it connects to that cluster using the Kubernetes-provided access information.

In addition to access, Vector implements proper desync handling to ensure communication is safe and reliable. This ensures that Vector will not overwhelm the Kubernetes API or compromise its stability.

Namespace exclusion

By default, the kubernetes_logs source skips logs from Namespaces that have a vector.dev/exclude: "true" label. You can configure additional exclusion rules via label selectors; see the available options.
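
As an illustration, a Namespace carrying the exclusion label could look like the sketch below. The Namespace name is a hypothetical placeholder.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: noisy-namespace          # hypothetical name
  labels:
    # Label values are strings, so "true" must be quoted.
    vector.dev/exclude: "true"
```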

Partial message merging

By default, Vector merges partial messages that were split due to the Docker size limit. For everything else, we recommend the reduce transform, which can handle custom merging of things like stack traces.
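
The following is a rough sketch of a reduce transform that joins stack traces, not a definitive recipe. It assumes that continuation lines begin with whitespace, and the transform ID merge_stack_traces is a hypothetical name; adjust the starts_when condition to your application's log format.

```yaml
transforms:
  merge_stack_traces:
    type: reduce
    inputs:
      - my_source_id
    # Keep lines from different Pods, Containers, and streams apart.
    group_by:
      - kubernetes.pod_uid
      - kubernetes.container_name
      - stream
    # Join the message fields of merged events with newlines.
    merge_strategies:
      message: concat_newline
    # Assumption: a line that does not begin with whitespace starts a new event.
    starts_when: |-
      match(string!(.message), r'^[^\s]')
```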

Pod exclusion

By default, the kubernetes_logs source skips logs from Pods that have a vector.dev/exclude: "true" label. You can configure additional exclusion rules via label or field selectors; see the available options.
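
For example, a Pod excluded through the built-in label might be declared as in this sketch. The Pod name, container name, and image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-to-exclude             # hypothetical name
  labels:
    # Vector skips all logs from this Pod.
    vector.dev/exclude: "true"
spec:
  containers:
    - name: app
      image: example.com/app:1.0   # placeholder image
```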

Pod removal

To ensure all data is collected, Vector will continue to collect logs from the Pod for some time after its removal. This ensures that Vector obtains some of the most important data, such as crash details.

Resource limits

Vector recommends the following resource limits.

Agent resource limits

If you deploy Vector as an agent (collecting data on each of your Nodes), we recommend the following limits:

resources:
  requests:
    memory: "64Mi"
    cpu: "500m"
  limits:
    memory: "1024Mi"
    cpu: "6000m"

As with all Kubernetes resource limit recommendations, use these as a reference point and adjust as necessary. If your configured Vector pipeline is complex, you may need more resources; if you have a more straightforward pipeline, you may need less.

State

This component is stateless, meaning its behavior is consistent across each input.

State management

Agent state management

For the agent role, Vector stores its state in a host-mapped directory with a static path, so if it is redeployed it continues from where it was interrupted.
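
A host-mapped state directory can be sketched with a hostPath volume, as in the Pod template excerpt below. The path /var/lib/vector and the volume name data are assumptions for illustration; whichever path you choose must match the data_dir that Vector is configured with.

```yaml
# Excerpt of spec.template.spec in a DaemonSet manifest (hypothetical names and path).
containers:
  - name: vector
    volumeMounts:
      - name: data
        mountPath: /var/lib/vector
volumes:
  - name: data
    hostPath:
      # A static path on the Node, so checkpoints survive Pod redeployment.
      path: /var/lib/vector
      type: DirectoryOrCreate
```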

Testing & reliability

Vector is tested extensively against Kubernetes. In addition to Kubernetes being Vector’s most popular installation method, Vector implements a comprehensive end-to-end test suite for all minor Kubernetes versions starting with 1.19.