AWS Kinesis Firehose Source

The Vector aws_kinesis_firehose source receives logs from AWS Kinesis Firehose.

Configuration

[sources.my_source_id]
type = "aws_kinesis_firehose" # required
access_key = "A94A8FE5CCB19BA61C4C08" # optional, no default
address = "0.0.0.0:443" # required
region = "us-east-1" # required when endpoint is not set

  • common, optional, string

    access_key

    AWS Kinesis Firehose can be configured to pass along an access key to authenticate requests. If configured, access_key should be set to the same value. If not specified, Vector will treat all requests as authenticated. See Forwarding CloudWatch Log events for more info.

  • common, required, string

    address

    The address to listen for connections on.

  • optional, string

    assume_role

    The ARN of an IAM role to assume at startup. See AWS Authentication for more info.

  • optional, string

    endpoint

    Custom endpoint for use with AWS-compatible services. Providing a value for this option will make region moot.

    • Only relevant when: region = null
  • common, required*, string

    region

    The AWS region of the target service. If endpoint is provided it will override this value since the endpoint includes the region.

    • Only required when: endpoint = null
  • optional, table

    tls

    Configures the TLS options for incoming connections.

    • optional, string

      ca_file

      Absolute path to an additional CA certificate file, in DER or PEM format (X.509), or an in-line CA certificate in PEM format.

    • optional, string

      crt_file

      Absolute path to a certificate file used to identify this server, in DER or PEM format (X.509) or PKCS#12, or an in-line certificate in PEM format. If this is set, and is not a PKCS#12 archive, key_file must also be set. This is required if enabled is set to true.

    • optional, bool

      enabled

      Require TLS for incoming connections. If this is set, an identity certificate is also required.

      • Default: false
    • optional, string

      key_file

      Absolute path to a private key file used to identify this server, in DER or PEM format (PKCS#8), or an in-line private key in PEM format.

    • optional, string

      key_pass

      Pass phrase used to unlock the encrypted key file. This has no effect unless key_file is set.

    • optional, bool

      verify_certificate

      If true, Vector will require a TLS certificate from the connecting host and terminate the connection if the certificate is not valid. If false (the default), Vector will not request a certificate from the client.

      • Default: false

Env Vars

  • common, optional, string

    AWS_ACCESS_KEY_ID

    The AWS access key ID. Used for AWS authentication when communicating with AWS services. See AWS Authentication for more info.

  • common, optional, string

    AWS_CONFIG_FILE

    Specifies the location of the file that the AWS CLI uses to store configuration profiles.

    • Default: "~/.aws/config"
  • common, optional, string

    AWS_CREDENTIAL_EXPIRATION

    Expiration time in RFC 3339 format. If unset, credentials won't expire.

  • common, optional, string

    AWS_DEFAULT_REGION

    The default AWS region.

    • Only relevant when: endpoint = null
  • common, optional, string

    AWS_PROFILE

    Specifies the name of the CLI profile with the credentials and options to use. This can be the name of a profile stored in a credentials or config file.

    • Default: "default"
  • common, optional, string

    AWS_ROLE_SESSION_NAME

    Specifies a name to associate with the role session. This value appears in CloudTrail logs for commands performed by the user of this profile.

  • common, optional, string

    AWS_SECRET_ACCESS_KEY

    The AWS secret access key. Used for AWS authentication when communicating with AWS services. See AWS Authentication for more info.

  • common, optional, string

    AWS_SESSION_TOKEN

    The AWS session token. Used for AWS authentication when communicating with AWS services.

  • common, optional, string

    AWS_SHARED_CREDENTIALS_FILE

    Specifies the location of the file that the AWS CLI uses to store access keys.

    • Default: "~/.aws/credentials"

Output

This component outputs log events with the following fields:

{
  "message": "Started GET / for 127.0.0.1 at 2012-03-10 14:28:14 +0100",
  "request_id": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
  "source_arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
  "timestamp": "2020-10-10T17:07:36+00:00"
}

  • common, required, string

    message

    The raw record from the incoming payload.

  • common, required, string

    request_id

    The AWS Kinesis Firehose request ID, value of the X-Amz-Firehose-Request-Id header.

  • common, required, string

    source_arn

    The AWS Kinesis Firehose delivery stream that issued the request, value of the X-Amz-Firehose-Source-Arn header.

  • common, required, timestamp

    timestamp

    The exact time the event was ingested into Vector.


Telemetry

This component provides the following metrics that can be retrieved through the internal_metrics source. See the metrics section in the monitoring page for more info.

  • counter

    request_read_errors_total

    The total number of request read errors for this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    processed_events_total

    The total number of events processed by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • file - The file that produced the error

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    requests_received_total

    The total number of requests received by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    processed_bytes_total

    The total number of bytes processed by the component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

Examples

Given the following input:

{
"requestId": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
"timestamp": 1600110760138,
"records": [
{
"data": "H4sIABk1bV8AA52TzW7bMBCE734KQ2db/JdI3QzETS8FAtg91UGgyOuEqCQq5Mqua+TdS8lu0hYNUpQHAdoZDcn9tKfJdJo0EEL5AOtjB0kxTa4W68Xdp+VqtbheJrPB4A4t+EFiv6yzVLuHa+/6blARAr5UV+ihbH4vh/4+VN52aF37wdYIPkTDlyhF8SrabFsOWhIrtz+Dlnto8dV3Gp9RstshXKhMi0xpqk3GpNJccpFRKYw0WvCM5kIbzrVWipm4VK55rrSk44HGHLTx/lg2wxVYRiljVGWGCvPiuPRn2O60Se6P8UKbpOBZrulsk2xLhCEjljYJk2QFHeGU04KxQqpCsumcSko3SfQ+uoBnn8pTJmjKWZYyI0axAXx021G++bweS5136CpXj8WP6/UNYek5ycMOPPhReETsQkHI4XBIO2/bynZlXXkXwryrS9w536TWkab0XwED6e/tU2/R9eGS9NTD5VgEvnWwtQikcu0e/AO0FYyu4HpfwR3Gf2R0Btza9qxgiUNUISiLr30AP7fbyMzu7OWA803ynIzdfJ69B1EZpoVhsWMRZ8a5UVJoRoUyUlDNspxzZWiEnOXiXYiSvQOR5TnN/xsiNalmKZcy5Yr/yfB6+RZD/gbDC0IbOx8wQrMhxGGYx4lBW5X1wJBLkpO981jWf6EXogvIrm+rYYrKOn4Hgbg4b439/s8cFeVvcNwBtHBkOdWvQIdRnTxPfgCXvyEgSQQAAA=="
}
]
}

And the following configuration:

[sources.aws_kinesis_firehose]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"

The following Vector log event will be output:

[
{
"log": {
"request_id": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
"source_arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
"timestamp": "2020-09-14T19:12:40.138Z",
"message": "{\"messageType\":\"DATA_MESSAGE\",\"owner\":\"111111111111\",\"logGroup\":\"test\",\"logStream\":\"test\",\"subscriptionFilters\":[\"Destination\"],\"logEvents\":[{\"id\":\"35683658089614582423604394983260738922885519999578275840\",\"timestamp\":1600110569039,\"message\":\"{\\\"bytes\\\":26780,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"157.130.216.193\\\",\\\"method\\\":\\\"PUT\\\",\\\"protocol\\\":\\\"HTTP/1.0\\\",\\\"referer\\\":\\\"https://www.principalcross-platform.io/markets/ubiquitous\\\",\\\"request\\\":\\\"/expedite/convergence\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":301,\\\"user-identifier\\\":\\\"-\\\"}\"},{\"id\":\"35683658089659183914001456229543810359430816722590236673\",\"timestamp\":1600110569041,\"message\":\"{\\\"bytes\\\":17707,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"109.81.244.252\\\",\\\"method\\\":\\\"GET\\\",\\\"protocol\\\":\\\"HTTP/2.0\\\",\\\"referer\\\":\\\"http://www.investormission-critical.io/24/7/vortals\\\",\\\"request\\\":\\\"/scale/functionalities/optimize\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":502,\\\"user-identifier\\\":\\\"feeney1708\\\"}\"}]}"
}
}
]
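
The data field of each record in the input above is base64 encoded, and its H4sI prefix is the base64 form of the gzip magic bytes, so the decoded bytes are gzip compressed. As a rough illustration of how a record's data relates to the message field in the output, the following Python sketch decodes such a value. The function name decode_record and the small payload are made up for the example; this is not Vector's own implementation.

```python
import base64
import gzip
import json


def decode_record(data: str) -> str:
    """Decode one Firehose record: base64 first, then gzip."""
    compressed = base64.b64decode(data)
    return gzip.decompress(compressed).decode("utf-8")


# Round trip with a small, made-up CloudWatch Logs style payload.
payload = {"messageType": "DATA_MESSAGE", "logEvents": [{"message": "hello"}]}
encoded = base64.b64encode(
    gzip.compress(json.dumps(payload).encode("utf-8"))
).decode("ascii")

decoded = json.loads(decode_record(encoded))
print(decoded["logEvents"][0]["message"])
```

Passing the data string from the example input to decode_record should yield the JSON text shown in the message field of the example output.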

How It Works

AWS Authentication

Vector checks for AWS credentials in the following order:

  1. The environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
  2. The credential_process command in the AWS config file (usually located at ~/.aws/config).
  3. The AWS credentials file (usually located at ~/.aws/credentials).
  4. The IAM instance profile (only works when running on an EC2 instance with an instance profile/role).

If credentials are not found, the health check will fail and an error will be logged.
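
The lookup order above can be sketched in Python as a simple chain of checks. This only illustrates the ordering and is not how Vector or the AWS SDK implements it: the function name credential_source, its parameters, and the returned labels are hypothetical, and profile handling is simplified to the AWS_PROFILE variable.

```python
import configparser


def credential_source(env, config_path, credentials_path):
    """Return a label for the first credential source that applies."""
    # 1. Environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"

    profile = env.get("AWS_PROFILE", "default")

    # 2. credential_process in the AWS config file.
    config = configparser.ConfigParser()
    config.read(config_path)  # missing files are silently skipped
    section = "default" if profile == "default" else f"profile {profile}"
    if config.has_option(section, "credential_process"):
        return "credential_process"

    # 3. Access key in the AWS credentials file.
    credentials = configparser.ConfigParser()
    credentials.read(credentials_path)
    if credentials.has_option(profile, "aws_access_key_id"):
        return "credentials file"

    # 4. Nothing else matched, so fall back to the instance profile.
    return "instance profile"


# Made-up values; the paths are placeholders that do not exist.
env = {"AWS_ACCESS_KEY_ID": "example-id", "AWS_SECRET_ACCESS_KEY": "example-secret"}
print(credential_source(env, "/nonexistent/config", "/nonexistent/credentials"))
```

In real use, config_path and credentials_path would point at ~/.aws/config and ~/.aws/credentials, or at the locations given by AWS_CONFIG_FILE and AWS_SHARED_CREDENTIALS_FILE.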

Obtaining an access key

In general, we recommend using instance profiles/roles whenever possible. Where this is not possible, you can generate an AWS access key for any user within your AWS account. AWS provides a detailed guide on how to do this.

Assuming roles

Vector can assume an AWS IAM role via the assume_role option. This is an optional setting that is helpful for a variety of use cases, such as cross-account access.

Context

By default, the aws_kinesis_firehose source will augment events with helpful context keys as shown in the "Output" section.

Forwarding CloudWatch Log events

This source is the recommended way to ingest logs from AWS CloudWatch Logs via AWS CloudWatch Logs subscriptions. To set this up:

  1. Deploy Vector with a publicly exposed HTTP endpoint using this source. You will likely also want to use the aws_cloudwatch_logs_subscription_parser transform to extract the log events. Make sure to set the access_key to secure this endpoint. Your configuration might look something like:

    [sources.firehose]
    # General
    type = "aws_kinesis_firehose"
    address = "127.0.0.1:9000"
    access_key = "secret"

    [transforms.cloudwatch]
    type = "aws_cloudwatch_logs_subscription_parser"
    inputs = ["firehose"]

    [sinks.console]
    type = "console"
    inputs = ["cloudwatch"]
    encoding.codec = "json"

  2. Create a Kinesis Firehose delivery stream in the same region as the CloudWatch Logs groups you want to ingest.

  3. Set the stream to forward to your Vector instance via its HTTP Endpoint destination. Make sure to configure the same access_key you set earlier.

  4. Set up a CloudWatch Logs subscription to forward the events to your delivery stream.
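
Before wiring up a real delivery stream, it can help to send a hand-built request to the source from step 1. The Python sketch below builds a Firehose-style request body and headers. The X-Amz-Firehose-Request-Id and X-Amz-Firehose-Source-Arn headers are the ones described in the Output section; using X-Amz-Firehose-Access-Key to carry the access key follows the AWS HTTP endpoint delivery format and is assumed here to be what the source checks. The function names and sample values are hypothetical, and records are gzip compressed to match the example input shown earlier.

```python
import base64
import gzip
import json
import time
import urllib.request
import uuid


def build_firehose_request(lines, access_key, source_arn):
    """Build headers and a JSON body shaped like a Firehose HTTP delivery."""
    request_id = str(uuid.uuid4())
    records = [
        # Each record is gzip compressed, then base64 encoded.
        {"data": base64.b64encode(gzip.compress(line.encode("utf-8"))).decode("ascii")}
        for line in lines
    ]
    body = json.dumps(
        {
            "requestId": request_id,
            "timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "records": records,
        }
    ).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Amz-Firehose-Request-Id": request_id,
        "X-Amz-Firehose-Source-Arn": source_arn,
        "X-Amz-Firehose-Access-Key": access_key,  # assumed header name
    }
    return headers, body


def send(url, headers, body):
    """POST the request and return the HTTP status code (not called here)."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status


headers, body = build_firehose_request(
    ["hello from a test record"],
    "secret",
    "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
)
print(json.loads(body)["requestId"] == headers["X-Amz-Firehose-Request-Id"])
```

With the configuration from step 1, calling send("http://127.0.0.1:9000", headers, body) would post the request to the local listener; adjust the URL to wherever the source is actually exposed.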

Transport Layer Security (TLS)

Vector uses OpenSSL for TLS protocols. You can adjust TLS behavior via the tls.* options.