Docker logs

Collect logs from Docker

status: stable
role: daemon
delivery: best effort
acknowledgements: no
egress: stream
state: stateless
output: log
previously known as: docker

Alias

This component was previously called the docker source. Make sure to update your Vector configuration to accommodate the name change:

[sources.my_docker_logs_source]
+type = "docker_logs"
-type = "docker"

Warnings

To avoid collecting logs from itself when deployed as a container, the Docker source uses the current hostname to find out which container it is running in. If a container’s ID matches the hostname, that container is excluded. If you change the container’s hostname, consider manually excluding the Vector container using exclude_containers.
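
The self-exclusion check described above can be sketched roughly as follows. This is an illustration, not Vector's actual code: `is_own_container` is a hypothetical helper, and the prefix comparison is an assumption based on Docker setting a container's default hostname to the short (12-character) form of its ID.

```python
def is_own_container(container_id: str, hostname: str) -> bool:
    """Return True if the container ID appears to belong to this process.

    Assumption: the hostname is compared as a prefix of the full container
    ID, because Docker's default container hostname is the short ID.
    """
    # An empty hostname must never match, otherwise every container
    # would be treated as "self" and excluded.
    return bool(hostname) and container_id.startswith(hostname)
```

If the hostname has been overridden (for example to `my-host.local`), this check can no longer match, which is why the warning suggests listing the Vector container in exclude_containers instead.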

Configuration

Example configurations

{
  "sources": {
    "my_source_id": {
      "type": "docker_logs"
    }
  }
}
[sources.my_source_id]
type = "docker_logs"
sources:
  my_source_id:
    type: docker_logs
{
  "sources": {
    "my_source_id": {
      "type": "docker_logs",
      "auto_partial_merge": true,
      "docker_host": "http://localhost:2375",
      "exclude_containers": [
        "exclude_"
      ],
      "include_containers": [
        "include_"
      ],
      "include_images": [
        "httpd"
      ],
      "include_labels": [
        "org.opencontainers.image.vendor=Vector"
      ],
      "partial_event_marker_field": "_partial",
      "retry_backoff_secs": 2
    }
  }
}
[sources.my_source_id]
type = "docker_logs"
auto_partial_merge = true
docker_host = "http://localhost:2375"
exclude_containers = [ "exclude_" ]
include_containers = [ "include_" ]
include_images = [ "httpd" ]
include_labels = [ "org.opencontainers.image.vendor=Vector" ]
partial_event_marker_field = "_partial"
retry_backoff_secs = 2
sources:
  my_source_id:
    type: docker_logs
    auto_partial_merge: true
    docker_host: http://localhost:2375
    exclude_containers:
      - exclude_
    include_containers:
      - include_
    include_images:
      - httpd
    include_labels:
      - org.opencontainers.image.vendor=Vector
    partial_event_marker_field: _partial
    retry_backoff_secs: 2

auto_partial_merge

optional bool
Enables automatic merging of partial events.
default: true

docker_host

optional string literal

Docker host to connect to.

Use an HTTPS URL to enable TLS encryption.

If absent, the DOCKER_HOST environment variable is used. If DOCKER_HOST is also absent, the default Docker local socket (/var/run/docker.sock on Unix platforms, //./pipe/docker_engine on Windows) is used.
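
The fallback order described above can be sketched as a small lookup. This is only an illustration of the documented precedence; `resolve_docker_host` is a hypothetical helper, and treating an empty string the same as an absent value is an assumption.

```python
import os
import sys
from typing import Mapping, Optional

UNIX_DEFAULT = "/var/run/docker.sock"
WINDOWS_DEFAULT = "//./pipe/docker_engine"


def resolve_docker_host(
    configured: Optional[str],
    env: Mapping[str, str] = os.environ,
    platform: str = sys.platform,
) -> str:
    """Pick the Docker host: config option, then DOCKER_HOST, then default."""
    # 1. An explicit docker_host option wins.
    if configured:
        return configured
    # 2. Otherwise fall back to the DOCKER_HOST environment variable.
    from_env = env.get("DOCKER_HOST")
    if from_env:
        return from_env
    # 3. Otherwise use the platform's default local socket or named pipe.
    return WINDOWS_DEFAULT if platform.startswith("win") else UNIX_DEFAULT
```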

Examples
"http://localhost:2375"
"https://localhost:2376"
"unix:///var/run/docker.sock"
"npipe:////./pipe/docker_engine"
"/var/run/docker.sock"
"//./pipe/docker_engine"

exclude_containers

optional [string]

A list of container IDs or names of containers to exclude from log collection.

Matching is prefix first, so specifying a value of foo matches any container named foo as well as any container whose name starts with foo. This applies equally whether matching container IDs or names.

By default, the source collects logs for all containers. If exclude_containers is configured, any container that matches a configured exclusion is excluded, even if it is also matched by include_containers. Take care with prefix matches, because an exclusion cannot be overridden by a corresponding entry in include_containers. For example, excluding foo also excludes foo-specific-id, even if foo-specific-id is listed in include_containers.

This can be used in conjunction with include_containers.

Array string literal
Examples
[
  "exclude_",
  "exclude_me_0",
  "ad08cc418cf9"
]

host_key

optional string literal

Overrides the name of the log field used to add the current hostname to each event.

By default, the global log_schema.host_key option is used.

include_containers

optional [string]

A list of container IDs or names of containers to include in log collection.

Matching is prefix first, so specifying a value of foo matches any container named foo as well as any container whose name starts with foo. This applies equally whether matching container IDs or names.

By default, the source collects logs for all containers. If include_containers is configured, only containers that match a configured inclusion and are also not excluded get matched.

This can be used in conjunction with exclude_containers.

Array string literal
Examples
[
  "include_",
  "include_me_0",
  "ad08cc418cf9"
]
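
How include_containers and exclude_containers interact can be sketched as follows. This is a simplified model of the documented rules (prefix matching on IDs and names, exclusions always winning), not Vector's implementation; `should_collect` and `_matches_any` are hypothetical names.

```python
from typing import Iterable, Sequence


def _matches_any(prefixes: Iterable[str], container_id: str, name: str) -> bool:
    # Prefix match against either the container ID or the container name.
    return any(container_id.startswith(p) or name.startswith(p) for p in prefixes)


def should_collect(
    container_id: str,
    name: str,
    include_containers: Sequence[str] = (),
    exclude_containers: Sequence[str] = (),
) -> bool:
    """Decide whether logs from a container should be collected."""
    # Exclusions are checked first and cannot be overridden by inclusions.
    if _matches_any(exclude_containers, container_id, name):
        return False
    # With an include list, only matching containers are collected.
    if include_containers:
        return _matches_any(include_containers, container_id, name)
    # With no include list, every non-excluded container is collected.
    return True
```

For example, with `exclude_containers = ["foo"]` and `include_containers = ["foo-specific-id"]`, a container named foo-specific-id is still skipped, because the exclusion prefix foo matches it.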

include_images

optional [string]

A list of image names to match against.

If not provided, all images are included.

Array string literal
Examples
[
  "httpd",
  "redis"
]

include_labels

optional [string]

A list of container object labels to match against when filtering running containers.

Labels should follow the syntax described in the Docker object labels documentation.

Array string literal
Examples
[
  "org.opencontainers.image.vendor=Vector",
  "com.mycorp.internal.animal=fish"
]

multiline

optional object

Multiline aggregation configuration.

If not specified, multiline aggregation is disabled.

multiline.condition_pattern

required string literal

Regular expression pattern that is used to determine whether or not more lines should be read.

This setting must be configured in conjunction with mode.

Examples
"^[\\s]+"
"\\\\$"
"^(INFO|ERROR) "
";$"

multiline.mode

required string literal enum

Aggregation mode.

This setting must be configured in conjunction with condition_pattern.

Enum options
Option    Description
continue_past

All consecutive lines matching this pattern, plus one additional line, are included in the group.

This is useful in cases where a log message ends with a continuation marker, such as a backslash, indicating that the following line is part of the same message.

continue_through

All consecutive lines matching this pattern are included in the group.

The first line (the line that matched the start pattern) does not need to match the continue_through pattern.

This is useful in cases such as a Java stack trace, where some indicator in the line (such as leading whitespace) indicates that it is an extension of the preceding line.

halt_before

All consecutive lines not matching this pattern are included in the group.

This is useful where a log line contains a marker indicating that it begins a new message.

halt_with

All consecutive lines, up to and including the first line matching this pattern, are included in the group.

This is useful where a log line ends with a termination marker, such as a semicolon.

Examples
"continue_past"
"continue_through"
"halt_before"
"halt_with"

multiline.start_pattern

required string literal
Regular expression pattern that is used to match the start of a new message.
Examples
"^[\\s]+"
"\\\\$"
"^(INFO|ERROR) "
";$"

multiline.timeout_ms

required uint

The maximum amount of time to wait for the next additional line, in milliseconds.

Once this timeout is reached, the buffered message is guaranteed to be flushed, even if incomplete.

Examples
1000
600000
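
The four aggregation modes can be compared with a small sketch. It is one plausible reading of the mode descriptions above, not Vector's implementation: `aggregate` is a hypothetical function, lines are joined with a newline by assumption, and the timeout is not modelled (the final flush at the end of input stands in for it).

```python
import re
from typing import Iterable, List, Optional


def aggregate(
    lines: Iterable[str], start_pattern: str, condition_pattern: str, mode: str
) -> List[str]:
    """Group lines into messages according to a multiline mode."""
    start = re.compile(start_pattern)
    condition = re.compile(condition_pattern)
    out: List[str] = []
    buf: Optional[List[str]] = None  # lines of the group being built, if any

    for line in lines:
        if buf is None:
            # No open group: start one if the line matches start_pattern,
            # otherwise pass the line through unchanged.
            if start.search(line):
                buf = [line]
            else:
                out.append(line)
            continue

        matched = bool(condition.search(line))
        if mode == "continue_through":
            if matched:
                buf.append(line)
            else:
                out.append("\n".join(buf))
                buf = [line]  # the non-matching line begins the next group
        elif mode == "continue_past":
            buf.append(line)  # matching lines plus one additional line
            if not matched:
                out.append("\n".join(buf))
                buf = None
        elif mode == "halt_before":
            if matched:
                out.append("\n".join(buf))
                buf = [line]  # the matching line begins the next group
            else:
                buf.append(line)
        elif mode == "halt_with":
            buf.append(line)  # the matching line closes the group
            if matched:
                out.append("\n".join(buf))
                buf = None
        else:
            raise ValueError(f"unknown mode: {mode}")

    if buf is not None:
        out.append("\n".join(buf))  # stands in for the timeout flush
    return out
```

With `mode = "continue_through"`, a start pattern of `^[^\s]`, and a condition pattern of `^[\s]+`, an exception line followed by indented stack frames is emitted as a single message.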

partial_event_marker_field

optional string literal

Overrides the name of the log field used to mark an event as partial.

If auto_partial_merge is disabled, partial events are emitted with a log field, set by this configuration value, indicating that the event is not complete.

default: _partial

retry_backoff_secs

optional uint
The amount of time to wait before retrying after an error.
default: 2 (seconds)
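
A fixed-delay retry of this kind can be sketched as follows. This is a generic illustration only: `retry_with_backoff` and its `max_attempts` bound are hypothetical, and the text above does not say whether Vector limits the number of retries.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    retry_backoff_secs: float = 2,
    max_attempts: int = 5,  # hypothetical bound, not a documented option
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, waiting retry_backoff_secs between failed attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose for the sketch
            last_error = error
            if attempt < max_attempts:
                sleep(retry_backoff_secs)
    assert last_error is not None
    raise last_error
```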

tls

optional object

Configuration of TLS when connecting to the Docker daemon.

Only relevant when connecting to Docker with an HTTPS URL.

If not configured, the environment variable DOCKER_CERT_PATH is used. If DOCKER_CERT_PATH is absent, then DOCKER_CONFIG is used. If both environment variables are absent, the certificates in ~/.docker/ are read.

tls.ca_file

required string literal
Path to the CA certificate file.

tls.crt_file

required string literal
Path to the TLS certificate file.

tls.key_file

required string literal
Path to the TLS key file.
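
The certificate lookup order for HTTPS connections can be sketched as follows. `resolve_tls_files` is a hypothetical helper. The file names ca.pem, cert.pem, and key.pem are documented for DOCKER_CERT_PATH; using the same names under DOCKER_CONFIG and ~/.docker/ is an assumption made for this sketch.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def resolve_tls_files(
    tls: Optional[Mapping[str, str]],
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Dict[str, Path]:
    """Return the CA, certificate, and key paths to use."""
    # 1. Explicit tls.* options take precedence.
    if tls:
        return {key: Path(tls[key]) for key in ("ca_file", "crt_file", "key_file")}
    # 2. Then DOCKER_CERT_PATH, then DOCKER_CONFIG.
    cert_dir = env.get("DOCKER_CERT_PATH") or env.get("DOCKER_CONFIG")
    # 3. Finally the ~/.docker/ directory.
    base = Path(cert_dir) if cert_dir else (home or Path.home()) / ".docker"
    return {
        "ca_file": base / "ca.pem",
        "crt_file": base / "cert.pem",
        "key_file": base / "key.pem",
    }
```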

Environment variables

DOCKER_CERT_PATH

common optional string literal

Path to look for TLS certificates when tls configuration is absent. Vector will use:

  • $DOCKER_CERT_PATH/ca.pem: CA certificate.
  • $DOCKER_CERT_PATH/cert.pem: TLS certificate.
  • $DOCKER_CERT_PATH/key.pem: TLS key.
Examples
certs/

DOCKER_CONFIG

common optional string literal
Path to look for TLS certificates when both tls configuration and DOCKER_CERT_PATH are absent.
Examples
certs/

DOCKER_HOST

common optional string literal
The Docker host to connect to when docker_host configuration is absent.
Examples
unix:///var/run/docker.sock

Outputs

<component_id>

Default output stream of the component. Use this component’s ID as an input to downstream transforms and sinks.

Output Data

Logs

Warning

The fields shown below will be different if log namespacing is enabled. See Log Namespacing for more details.

Log

A Docker log event
Fields
container_created_at required timestamp
A UTC timestamp representing when the container was created.
Examples
2020-10-10T17:07:36.452332Z
container_id required string literal
The Docker container ID that the log was collected from.
Examples
9b6247364a03
715ebfcee040
container_name required string literal
The Docker container name that the log was collected from.
Examples
evil_ptolemy
nostalgic_stallman
host required string literal
The local hostname, equivalent to the gethostname command.
Examples
my-host.local
image required string literal
The image name that the container is based on.
Examples
ubuntu:latest
busybox
timberio/vector:latest-alpine
label required object
Each container label is inserted with its exact key/value pair.
Examples
{
  "mylabel": "myvalue"
}
message required string literal
The raw log message.
Examples
Started GET / for 127.0.0.1 at 2012-03-10 14:28:14 +0100
source_type required string literal
The name of the source type.
Examples
docker
stream required string literal
The standard stream that the log was collected from.
Examples
stdout
stderr
timestamp required timestamp
The UTC timestamp extracted from the Docker log event.
Examples
2020-10-10T17:07:36.452332Z

Telemetry

Metrics


component_discarded_events_total

counter
The number of events dropped by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
intentional
True if the events were discarded intentionally, like a filter transform, or false if due to an error.
pid optional
The process ID of the Vector instance.

component_errors_total

counter
The total number of errors encountered by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
error_type
The type of the error.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
stage
The stage within the component at which the error occurred.

component_received_bytes_total

counter
The number of raw bytes accepted by this component from source origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_event_bytes_total

counter
The number of event bytes accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_count

histogram

A histogram of the number of events passed in each internal batch in Vector’s internal topology.

Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector caused by small internal batches.

component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_total

counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_sent_event_bytes_total

counter
The total number of event bytes emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

component_sent_events_total

counter
The total number of events emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

container_processed_events_total

counter
The total number of container events processed.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

containers_unwatched_total

counter
The total number of times Vector stopped watching for container logs.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

containers_watched_total

counter
The total number of times Vector started watching for container logs.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

source_lag_time_seconds

histogram
The difference between the timestamp recorded in each event and the time when it was ingested, expressed as fractional seconds.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
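
As a worked example of what a single source_lag_time_seconds observation represents, the lag for one event can be computed like this. `lag_seconds` is a hypothetical helper for illustration and is not part of Vector.

```python
from datetime import datetime, timezone


def lag_seconds(event_timestamp: str, ingested_at: datetime) -> float:
    """Seconds between an event's own timestamp and its ingestion time."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    event_time = datetime.fromisoformat(event_timestamp.replace("Z", "+00:00"))
    return (ingested_at - event_time).total_seconds()


# An event stamped 2020-10-10T17:07:36.452332Z and ingested half a second later.
ingested = datetime(2020, 10, 10, 17, 7, 36, 952332, tzinfo=timezone.utc)
lag = lag_seconds("2020-10-10T17:07:36.452332Z", ingested)
```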

Examples

Dummy Logs

Given this event...
```json
{
  "stream": "stdout",
  "message": "150.75.72.205 - - [03/Oct/2020:16:11:29 +0000] \"HEAD /initiatives HTTP/1.1\" 504 117"
}
```
...and this configuration...
sources:
  my_source_id:
    type: docker_logs
    include_images:
      - mingrammer/flog
[sources.my_source_id]
type = "docker_logs"
include_images = [ "mingrammer/flog" ]
{
  "sources": {
    "my_source_id": {
      "type": "docker_logs",
      "include_images": [
        "mingrammer/flog"
      ]
    }
  }
}
...this Vector event is produced:
{
  "container_created_at": "2020-10-03T16:11:29.443232Z",
  "container_id": "fecc98177eca7fb75a2b2186c418bf9a0cd3a05a1169f2e2293bf8987a9d96ab",
  "container_name": "flog",
  "host": "my-host.local",
  "image": "mingrammer/flog",
  "message": "150.75.72.205 - - [03/Oct/2020:16:11:29 +0000] \"HEAD /initiatives HTTP/1.1\" 504 117",
  "source_type": "docker",
  "stream": "stdout"
}

How it works

Context

By default, the docker_logs source augments events with helpful context keys.

Merging Split Messages

By default, Docker splits log messages that exceed 16 KB. This is frustrating because it produces malformed log messages that are difficult to work with. Vector solves this by default, automatically merging the split parts into a single message. You can turn this off via the auto_partial_merge option. You can also change the name of the field used to mark an event as partial via the partial_event_marker_field option.
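
If auto_partial_merge is disabled, a downstream consumer could reassemble the pieces itself. The sketch below shows the idea under stated assumptions: events arrive in order for a single container and stream, a partial event carries a truthy marker field, and the last piece of a message carries no marker. `merge_partial_events` is a hypothetical helper, not Vector's implementation.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def merge_partial_events(
    events: Iterable[Dict[str, Any]],
    marker_field: str = "_partial",
    message_key: str = "message",
) -> Iterator[Dict[str, Any]]:
    """Concatenate consecutive partial events into complete events."""
    pending: Optional[Dict[str, Any]] = None
    for event in events:
        is_partial = bool(event.get(marker_field))
        if pending is None:
            # Start a new message; the marker is dropped from the result.
            merged = {k: v for k, v in event.items() if k != marker_field}
        else:
            # Append this piece to the message being assembled.
            merged = pending
            merged[message_key] += event[message_key]
        if is_partial:
            pending = merged  # more pieces are expected
        else:
            pending = None
            yield merged
    # Note: a trailing partial with no final piece is dropped in this sketch.
```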

State

This component is stateless, meaning its behavior is consistent across each input.