Pulsar

Collect observability events from Apache Pulsar topics

status: beta
role: aggregator
delivery: at-least-once
acknowledgements: yes
egress: stream
state: stateless
output: log

Configuration

Example configurations

JSON

{
  "sources": {
    "my_source_id": {
      "type": "pulsar",
      "endpoint": "pulsar://127.0.0.1:6650",
      "topics": "topic-1234"
    }
  }
}

TOML

[sources.my_source_id]
type = "pulsar"
endpoint = "pulsar://127.0.0.1:6650"
topics = "topic-1234"

YAML

sources:
  my_source_id:
    type: pulsar
    endpoint: pulsar://127.0.0.1:6650
    topics: topic-1234

auth

optional object
Options for the authentication strategy.

auth.name

optional string literal
The basic authentication name.
Examples
"${PULSAR_NAME}"
"name123"

auth.oauth2

optional object
Options for OAuth2 authentication.
auth.oauth2.audience
optional string literal
OAuth2 audience.
Examples
"${OAUTH2_AUDIENCE}"
"pulsar"
auth.oauth2.credentials_url
required string literal
The URL for the credentials. Data URLs are also supported.
Examples
"${OAUTH2_CREDENTIALS_URL}"
"file:///oauth2_credentials"
"data:application/json;base64,cHVsc2FyCg=="
auth.oauth2.issuer_url
required string literal
The issuer URL.
Examples
"${OAUTH2_ISSUER_URL}"
"https://oauth2.issuer"
auth.oauth2.scope
optional string literal
OAuth2 scope.
Examples
"${OAUTH2_SCOPE}"
"admin"
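
As an illustration of how the OAuth2 options above nest under auth.oauth2, here is a minimal sketch. The source ID, endpoint, and topic come from the example configuration, and the option values are the placeholder examples listed above, not working credentials.

```yaml
sources:
  my_source_id:
    type: pulsar
    endpoint: pulsar://127.0.0.1:6650
    topics: topic-1234
    auth:
      oauth2:
        # Required options (placeholder values from the examples above).
        issuer_url: "https://oauth2.issuer"
        credentials_url: "file:///oauth2_credentials"
        # Optional options.
        audience: "pulsar"
        scope: "admin"
```

The required values can also be supplied through environment variables such as "${OAUTH2_ISSUER_URL}", as the examples above suggest.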

auth.token

optional string literal
The basic authentication password.
Examples
"${PULSAR_TOKEN}"
"123456789"
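
The following sketch shows one way the auth.name and auth.token options might be set together. It assumes that both values are provided through the PULSAR_NAME and PULSAR_TOKEN environment variables used in the examples above, and that your Pulsar cluster accepts this name/token pair; check which values your deployment expects.

```yaml
sources:
  my_source_id:
    type: pulsar
    endpoint: pulsar://127.0.0.1:6650
    topics: topic-1234
    auth:
      # Both values are read from environment variables (placeholder names),
      # which keeps secrets out of the configuration file.
      name: "${PULSAR_NAME}"
      token: "${PULSAR_TOKEN}"
```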

endpoint

required string literal
The endpoint to which the Pulsar client should connect.
Examples
"pulsar://127.0.0.1:6650"

topics

required string literal
The Pulsar topic names to read events from.
Examples
"topic-1234"

Outputs

<component_id>

Default output stream of the component. Use this component’s ID as an input to downstream transforms and sinks.
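
To make the wiring concrete, the sketch below feeds the source from the example configuration into a console sink by listing the source's component ID under inputs. The sink ID my_sink_id is a hypothetical name chosen for illustration.

```yaml
sinks:
  my_sink_id:            # hypothetical sink ID
    type: console
    inputs:
      - my_source_id     # component ID of the pulsar source
    encoding:
      codec: json        # print each event as a JSON object
```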

Output Data

Logs

Warning

The fields shown below will be different if log namespacing is enabled. See Log Namespacing for more details.

Record

An individual Pulsar record
Fields
message required string literal
The raw line from the Pulsar record.
Examples
53.126.150.246 - - [01/Oct/2020:11:25:58 -0400] "GET /disintermediate HTTP/2.0" 401 20308
producer_name required string literal
The Pulsar producer’s name which the record came from.
Examples
pulsar-client
publish_time required timestamp
The timestamp encoded in the Pulsar message.
Examples
2020-10-10T17:07:36.452332Z
source_type required string literal
The name of the source type.
Examples
pulsar
timestamp required timestamp
The timestamp of the event, or the current time if a timestamp cannot be fetched.
Examples
2020-10-10T17:07:36.452332Z
topic required string literal
The Pulsar topic that the record came from.
Examples
topic

Telemetry

Metrics

component_discarded_events_total

counter
The number of events dropped by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
intentional
True if the events were discarded intentionally, like a filter transform, or false if due to an error.
pid optional
The process ID of the Vector instance.

component_errors_total

counter
The total number of errors encountered by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
error_type
The type of the error.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
stage
The stage within the component at which the error occurred.

component_received_bytes_total

counter
The number of raw bytes accepted by this component from source origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_event_bytes_total

counter
The number of event bytes accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_count

histogram

A histogram of the number of events passed in each internal batch in Vector’s internal topology.

Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector caused by small internal batches.

component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_received_events_total

counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
container_name optional
The name of the container from which the data originated.
file optional
The file from which the data originated.
host optional
The hostname of the system Vector is running on.
mode optional
The connection mode used by the component.
peer_addr optional
The IP from which the data originated.
peer_path optional
The pathname from which the data originated.
pid optional
The process ID of the Vector instance.
pod_name optional
The name of the pod from which the data originated.
uri optional
The sanitized URI from which the data originated.

component_sent_event_bytes_total

counter
The total number of event bytes emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

component_sent_events_total

counter
The total number of events emitted by this component.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
output optional
The specific output of the component.
pid optional
The process ID of the Vector instance.

source_lag_time_seconds

histogram
The difference between the timestamp recorded in each event and the time when it was ingested, expressed as fractional seconds.
component_id
The Vector component ID.
component_kind
The Vector component kind.
component_type
The Vector component type.
host optional
The hostname of the system Vector is running on.
pid optional
The process ID of the Vector instance.
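
One possible way to inspect the metrics listed above is to collect them with an internal_metrics source and expose them through a prometheus_exporter sink. This is a sketch under stated assumptions: the component IDs vector_metrics and metrics_endpoint are hypothetical, and the listen address is an illustrative choice.

```yaml
sources:
  vector_metrics:            # hypothetical component ID
    type: internal_metrics   # emits Vector's own component metrics

sinks:
  metrics_endpoint:          # hypothetical component ID
    type: prometheus_exporter
    inputs:
      - vector_metrics
    address: "0.0.0.0:9598"  # illustrative listen address for scraping
```

Filtering the scraped series by the component_id tag narrows the output to the pulsar source.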

How it works

Context

By default, the pulsar source augments events with helpful context keys.

State

This component is stateless, meaning its behavior is consistent across each input.