JournalD

Collect logs from JournalD

status: stable
role: daemon
delivery: at-least-once
acknowledgements: yes
egress: batch
state: stateless
output: log

Requirements

This source requires permission to run journalctl. When Vector is installed from a package manager, this should be handled automatically. Otherwise, ensure the running user is part of the systemd-journal group.

Configuration

Example configurations

{
  "sources": {
    "my_source_id": {
      "type": "journald"
    }
  }
}

[sources.my_source_id]
type = "journald"

sources:
  my_source_id:
    type: journald

{
  "sources": {
    "my_source_id": {
      "type": "journald",
      "batch_size": 16,
      "current_boot_only": true,
      "data_dir": "/var/lib/vector",
      "exclude_matches": {
        "_SYSTEMD_UNIT": [
          "sshd.service",
          "ntpd.service"
        ],
        "_TRANSPORT": [
          "kernel"
        ]
      },
      "exclude_units": [
        "badservice"
      ],
      "extra_args": [
        "--merge"
      ],
      "include_matches": {
        "_SYSTEMD_UNIT": [
          "sshd.service",
          "ntpd.service"
        ],
        "_TRANSPORT": [
          "kernel"
        ]
      },
      "include_units": [
        "ntpd"
      ]
    }
  }
}

[sources.my_source_id]
type = "journald"
batch_size = 16
current_boot_only = true
data_dir = "/var/lib/vector"
exclude_units = [ "badservice" ]
extra_args = [ "--merge" ]
include_units = [ "ntpd" ]

  [sources.my_source_id.exclude_matches]
  _SYSTEMD_UNIT = [ "sshd.service", "ntpd.service" ]
  _TRANSPORT = [ "kernel" ]

  [sources.my_source_id.include_matches]
  _SYSTEMD_UNIT = [ "sshd.service", "ntpd.service" ]
  _TRANSPORT = [ "kernel" ]

sources:
  my_source_id:
    type: journald
    batch_size: 16
    current_boot_only: true
    data_dir: /var/lib/vector
    exclude_matches:
      _SYSTEMD_UNIT:
        - sshd.service
        - ntpd.service
      _TRANSPORT:
        - kernel
    exclude_units:
      - badservice
    extra_args:
      - --merge
    include_matches:
      _SYSTEMD_UNIT:
        - sshd.service
        - ntpd.service
      _TRANSPORT:
        - kernel
    include_units:
      - ntpd

acknowledgements

optional object

Deprecated

This field is deprecated.

Controls how acknowledgements are handled by this source.

This setting is deprecated in favor of enabling acknowledgements at the global or sink level.

Enabling or disabling acknowledgements at the source level has no effect on acknowledgement behavior.

See End-to-end Acknowledgements for more information on how event acknowledgement is handled.

Whether or not end-to-end acknowledgements are enabled for this source.

batch_size

optional uint

The systemd journal is read in batches, and a checkpoint is set at the end of each batch.

This option limits the size of the batch.

default: 16 (events)

current_boot_only

optional bool
Only include entries that occurred after the current boot of the system.
default: true

data_dir

optional string literal

The directory used to persist checkpoint positions.

By default, the global data_dir option is used. Make sure the running user has write permissions to this directory.

If this directory is specified, then Vector will attempt to create it.

Examples
"/var/lib/vector"

emit_cursor

optional bool
Whether to emit the __CURSOR field. See also sd_journal_get_cursor.
default: false

exclude_matches

optional object

A list of sets of field/value pairs. If any of these pairs is present in a journal entry, the entry is excluded from this source.

If exclude_units is specified, it is merged into this list.

exclude_matches.*

required [string]
The set of field values to match in journal entries that are to be excluded.

exclude_units

optional [string]

A list of unit names to exclude from monitoring.

Unit names lacking a . have .service appended to make them a valid service unit name.

Array string literal
Examples
[
  "badservice",
  "sysinit.target"
]
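
The unit-name rule above can be illustrated with a minimal Python sketch. It is not Vector's implementation: the function names normalize_unit and merge_units are hypothetical, and the sketch assumes that unit names are matched through the _SYSTEMD_UNIT journal field when they are merged into exclude_matches.

```python
def normalize_unit(name: str) -> str:
    """Append ".service" to unit names that contain no dot."""
    return name if "." in name else f"{name}.service"


def merge_units(matches: dict, units: list) -> dict:
    """Return a copy of `matches` with the normalized units added.

    Assumption: units are matched via the _SYSTEMD_UNIT journal field.
    """
    merged = {field: list(values) for field, values in matches.items()}
    for unit in units:
        values = merged.setdefault("_SYSTEMD_UNIT", [])
        normalized = normalize_unit(unit)
        if normalized not in values:
            values.append(normalized)
    return merged


exclude_matches = {"_TRANSPORT": ["kernel"]}
exclude_units = ["badservice", "sysinit.target"]
print(merge_units(exclude_matches, exclude_units))
# {'_TRANSPORT': ['kernel'], '_SYSTEMD_UNIT': ['badservice.service', 'sysinit.target']}
```

Here "badservice" gains the .service suffix, while "sysinit.target" already contains a dot and is left unchanged.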

extra_args

optional [string]

A list of extra command line arguments to pass to journalctl.

If specified, the arguments are appended to the journalctl command line as-is.

Array string literal
Examples
[
  "--merge"
]

include_matches

optional object

A list of sets of field/value pairs to monitor.

If empty or not present, all journal fields are accepted.

If include_units is specified, it is merged into this list.

include_matches.*

required [string]
The set of field values to match in journal entries that are to be included.
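
How include_matches and exclude_matches combine can be sketched in Python as follows. This is an illustration of the semantics described in this section, not Vector's code: the names has_match and is_accepted are hypothetical, and the sketch assumes an entry passes when it satisfies at least one include pair (or no include pairs are configured) and no exclude pair.

```python
def has_match(matches: dict, entry: dict) -> bool:
    """True if any configured field/value pair is present in the entry."""
    return any(entry.get(field) in values for field, values in matches.items())


def is_accepted(entry: dict, include_matches=None, exclude_matches=None) -> bool:
    """Apply include filters first, then let exclude filters veto the entry."""
    include_matches = include_matches or {}
    exclude_matches = exclude_matches or {}
    included = not include_matches or has_match(include_matches, entry)
    excluded = has_match(exclude_matches, entry)
    return included and not excluded


include = {"_SYSTEMD_UNIT": ["sshd.service", "ntpd.service"]}
exclude = {"_TRANSPORT": ["kernel"]}

entry = {"_SYSTEMD_UNIT": "ntpd.service", "_TRANSPORT": "stdout"}
print(is_accepted(entry, include, exclude))  # True

kernel_entry = {"_SYSTEMD_UNIT": "ntpd.service", "_TRANSPORT": "kernel"}
print(is_accepted(kernel_entry, include, exclude))  # False
```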

include_units

optional [string]

A list of unit names to monitor.

If empty or not present, all units are accepted.

Unit names lacking a . have .service appended to make them a valid service unit name.

Array string literal
Examples
[
  "ntpd",
  "sysinit.target"
]

journal_directory

optional string literal

The full path of the journal directory.

If not set, journalctl uses the default system journal path.

journal_namespace

optional string literal

The journal namespace.

This value is passed to journalctl through the --namespace option. If not set, journalctl uses the default namespace.

journalctl_path

optional string literal

The full path of the journalctl executable.

If not set, Vector searches for journalctl in the environment path.

remap_priority

optional bool

Deprecated

This option has been deprecated. Use the remap transform and the to_syslog_level function instead.

Enables remapping the PRIORITY field from an integer to string value.

Has no effect unless the value of the field is already an integer.

default: false
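
For readers unfamiliar with the mapping involved, the following Python sketch shows the conventional syslog severity keywords for priorities 0 through 7. It is only an illustration of the integer-to-keyword idea: priority_to_level is a hypothetical helper, not part of Vector, and the exact strings produced by remap_priority are not specified on this page.

```python
# Conventional syslog severity keywords, indexed by numeric priority.
SYSLOG_LEVELS = ("emerg", "alert", "crit", "err", "warning", "notice", "info", "debug")


def priority_to_level(priority):
    """Return the keyword for a priority of 0 to 7, or the input unchanged."""
    try:
        index = int(priority)
    except (TypeError, ValueError):
        return priority  # not an integer: leave the value as it is
    if 0 <= index < len(SYSLOG_LEVELS):
        return SYSLOG_LEVELS[index]
    return priority  # out of range: leave the value as it is


print(priority_to_level("6"))  # info
print(priority_to_level(3))    # err
```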

since_now

optional bool
Only include entries that are appended to the journal after Vector starts reading it.
default: false

Outputs

<component_id>

Default output stream of the component. Use this component’s ID as an input to downstream transforms and sinks.

Output Data

Logs

Warning

The fields shown below will be different if log namespacing is enabled. See Log Namespacing for more details.

Event

A Journald event
Fields
* optional string literal
Any Journald field
Examples
/usr/sbin/ntpd
c36e9ea52800a19d214cb71b53263a28
host required string literal
The local hostname, equivalent to the gethostname command.
Examples
my-host.local
message required string literal
The raw message text of the journal entry.
Examples
53.126.150.246 - - [01/Oct/2020:11:25:58 -0400] "GET /disintermediate HTTP/2.0" 401 20308
source_type required string literal
The name of the source type.
Examples
journald
timestamp required timestamp
The time at which the event appeared in the journal.
Examples
2020-10-10T17:07:36.452332Z
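
As a rough illustration of how a journal time maps to a timestamp, the Python sketch below converts a __REALTIME_TIMESTAMP value, which journald records as microseconds since the Unix epoch, into a UTC datetime. It assumes the event timestamp is derived from that field; realtime_to_datetime is a hypothetical helper, and the input value is the one shown in the sample output on this page.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def realtime_to_datetime(realtime_usec: str) -> datetime:
    """Convert microseconds since the Unix epoch into an aware UTC datetime."""
    # Integer arithmetic avoids the rounding that float seconds would introduce.
    return EPOCH + timedelta(microseconds=int(realtime_usec))


print(realtime_to_datetime("1564173027000443").isoformat())
# 2019-07-26T20:30:27.000443+00:00
```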

Telemetry

Metrics

component_discarded_events_total

counter
The number of events dropped by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
intentional
True if the events were discarded intentionally, like a filter transform, or false if due to an error.
pid optional
The process ID of the Vector instance.

component_errors_total

counter
The total number of errors encountered by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
error_type
The type of the error.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
stage
The stage within the component at which the error occurred.

component_received_bytes_total

counter
The number of raw bytes accepted by this component from source origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_event_bytes_total

counter
The number of event bytes accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_count

histogram

A histogram of the number of events passed in each internal batch in Vector’s internal topology.

Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector that are caused by small internal batches.

component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_total

counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_sent_event_bytes_total

counter
The total number of event bytes emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

component_sent_events_total

counter
The total number of events emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

source_lag_time_seconds

histogram
The difference between the timestamp recorded in each event and the time when it was ingested, expressed as fractional seconds.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.

Examples

Sample Output

Given this event...
2019-07-26 20:30:27 reply from 192.168.1.2: offset -0.001791 delay 0.000176, next query 1500s
...and this configuration...
sources:
  my_source_id:
    type: journald

[sources.my_source_id]
type = "journald"

{
  "sources": {
    "my_source_id": {
      "type": "journald"
    }
  }
}
...this Vector event is produced:
[
  {
    "log": {
      "PRIORITY": "6",
      "SYSLOG_FACILITY": "3",
      "SYSLOG_IDENTIFIER": "ntpd",
      "_BOOT_ID": "124c781146e841ae8d9b4590df8b9231",
      "_CAP_EFFECTIVE": "3fffffffff",
      "_CMDLINE": "ntpd: [priv]",
      "_COMM": "ntpd",
      "_EXE": "/usr/sbin/ntpd",
      "_GID": "0",
      "_MACHINE_ID": "c36e9ea52800a19d214cb71b53263a28",
      "_PID": "2156",
      "_STREAM_ID": "92c79f4b45c4457490ebdefece29995e",
      "_SYSTEMD_CGROUP": "/system.slice/ntpd.service",
      "_SYSTEMD_INVOCATION_ID": "496ad5cd046d48e29f37f559a6d176f8",
      "_SYSTEMD_SLICE": "system.slice",
      "_SYSTEMD_UNIT": "ntpd.service",
      "_TRANSPORT": "stdout",
      "_UID": "0",
      "__MONOTONIC_TIMESTAMP": "98694000446",
      "__REALTIME_TIMESTAMP": "1564173027000443",
      "host": "my-host.local",
      "message": "reply from 192.168.1.2: offset -0.001791 delay 0.000176, next query 1500s",
      "source_type": "journald",
      "timestamp": "2019-07-26T20:30:27.000443Z"
    }
  }
]

How it works

Checkpointing

Vector checkpoints the current read position after each successful read. This ensures that Vector resumes where it left off when restarted, which prevents data from being read twice. Checkpoint positions are stored in the data directory, which is specified via the global data_dir option and can be overridden via the data_dir option on the journald source itself.
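
The checkpointing idea can be sketched in Python as below. This is a simplified model under stated assumptions, not Vector's implementation: the helper names, the checkpoint file name, and the use of an in-memory list in place of a live journal are all hypothetical. Entries are delivered in batches, and the cursor of the last entry in each batch is persisted only after that batch has been handed off.

```python
import os
import tempfile
from pathlib import Path


def load_checkpoint(path: Path):
    """Return the saved cursor, or None if no checkpoint exists yet."""
    try:
        return path.read_text().strip() or None
    except FileNotFoundError:
        return None


def save_checkpoint(path: Path, cursor: str) -> None:
    """Write to a temporary file, then rename it over the checkpoint file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(cursor)
    os.replace(tmp, path)


def read_in_batches(entries, batch_size, checkpoint_path, deliver):
    """Deliver entries after the saved cursor, checkpointing once per batch."""
    cursor = load_checkpoint(checkpoint_path)
    cursors = [entry["__CURSOR"] for entry in entries]
    start = cursors.index(cursor) + 1 if cursor in cursors else 0
    for i in range(start, len(entries), batch_size):
        batch = entries[i:i + batch_size]
        deliver(batch)
        save_checkpoint(checkpoint_path, batch[-1]["__CURSOR"])


with tempfile.TemporaryDirectory() as data_dir:
    checkpoint = Path(data_dir) / "checkpoint.txt"  # hypothetical file name
    journal = [{"__CURSOR": f"c{n}", "MESSAGE": f"entry {n}"} for n in range(1, 6)]
    delivered = []
    read_in_batches(journal, 2, checkpoint, delivered.append)
    print([len(batch) for batch in delivered])  # [2, 2, 1]
    print(load_checkpoint(checkpoint))          # c5

    # Simulated restart: only entries after the saved cursor are read.
    journal.append({"__CURSOR": "c6", "MESSAGE": "entry 6"})
    delivered.clear()
    read_in_batches(journal, 2, checkpoint, delivered.append)
    print([e["MESSAGE"] for batch in delivered for e in batch])  # ['entry 6']
```

In this model, a crash between delivering a batch and saving its cursor would cause that batch to be read again on restart, which is in line with the at-least-once delivery guarantee listed for this source.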

Communication Strategy

To ensure the journald source works across all platforms, Vector interacts with the systemd journal via the journalctl command. This is accomplished by spawning a subprocess that Vector interacts with. If the journalctl command is not in the environment path, you can specify its exact location via the journalctl_path option. For more information on this communication strategy, see issue #1473.
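
The subprocess approach can be approximated with the Python sketch below. It is a hedged illustration, not the command line Vector actually builds: build_journalctl_command and stream_entries are hypothetical names, and the mapping from source options to journalctl flags is an assumption. The flags themselves (--follow, --output=json, --boot, --since, --after-cursor, --directory, --namespace) are standard journalctl options.

```python
import json
import subprocess


def build_journalctl_command(
    journalctl_path="journalctl",
    current_boot_only=True,
    since_now=False,
    journal_directory=None,
    journal_namespace=None,
    cursor=None,
    extra_args=(),
):
    """Assemble an argument list for following the journal as JSON lines."""
    command = [journalctl_path, "--follow", "--output=json"]
    if journal_directory:
        command.append(f"--directory={journal_directory}")
    if journal_namespace:
        command.append(f"--namespace={journal_namespace}")
    if current_boot_only:
        command.append("--boot")
    if cursor:
        # Assumption: a saved checkpoint takes precedence over since_now.
        command.append(f"--after-cursor={cursor}")
    elif since_now:
        command.append("--since=now")
    command.extend(extra_args)
    return command


def stream_entries(command):
    """Spawn journalctl and yield one parsed journal entry per output line."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield json.loads(line)
    finally:
        proc.terminate()
        proc.wait()


print(build_journalctl_command(extra_args=["--merge"]))
# ['journalctl', '--follow', '--output=json', '--boot', '--merge']
```

Passing the result to stream_entries requires a host with systemd and sufficient permissions, so that function is defined here but not called.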

Context

By default, the journald source augments events with helpful context keys.

Non-ASCII Messages

When journald has stored a message that is not strict ASCII, journalctl will output it in an alternate format to prevent data loss. Vector handles this alternate format by translating such messages into UTF-8 in “lossy” mode, where characters that are not valid UTF-8 are replaced with the Unicode replacement character, .
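
A minimal Python sketch of lossy decoding follows. It assumes the alternate format is the one journalctl uses in JSON output, where a field that is not valid UTF-8 is serialized as an array of byte values instead of a string. decode_field is a hypothetical helper, not Vector's code.

```python
def decode_field(value):
    """Decode one journalctl JSON field value into text.

    A list of integers is treated as raw bytes and decoded lossily, so any
    invalid UTF-8 sequence becomes the replacement character U+FFFD.
    """
    if isinstance(value, list) and all(isinstance(b, int) for b in value):
        return bytes(value).decode("utf-8", errors="replace")
    return value


print(decode_field("plain ascii"))          # plain ascii
print(decode_field([99, 97, 102, 195, 169]))  # café
print(repr(decode_field([104, 105, 255])))  # 'hi\ufffd'
```

The last call shows the lossy behavior: the byte 255 can never appear in valid UTF-8, so it is replaced rather than causing an error.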

State

This component is stateless, meaning its behavior is consistent across each input.