AWS Kinesis Firehose Source
The Vector aws_kinesis_firehose source receives logs from AWS Kinesis Firehose.
Configuration
[sources.my_source_id]
type = "aws_kinesis_firehose" # required
access_key = "A94A8FE5CCB19BA61C4C08" # optional, no default
address = "0.0.0.0:443" # required
region = "us-east-1" # required, required when endpoint = null
- access_key (optional, string)
  AWS Kinesis Firehose can be configured to pass along an access key to authenticate requests. If configured, access_key should be set to the same value. If not specified, Vector will treat all requests as authenticated. See Forwarding CloudWatch Log events for more info.
- address (required, string)
  The address to listen for connections on.
- assume_role (optional, string)
  The ARN of an IAM role to assume at startup. See AWS Authentication for more info.
- endpoint (optional, string)
  Custom endpoint for use with AWS-compatible services. Providing a value for this option will make region moot.
  Only relevant when: region = null
- region (required*, string)
  The AWS region of the target service. If endpoint is provided, it will override this value since the endpoint includes the region.
  Only required when: endpoint = null
- tls (optional, table)
  Configures the TLS options for incoming connections.
  - ca_file (optional, string)
    Absolute path to an additional CA certificate file, in DER or PEM format (X.509), or an in-line CA certificate in PEM format.
  - crt_file (optional, string)
    Absolute path to a certificate file used to identify this server, in DER or PEM format (X.509) or PKCS#12, or an in-line certificate in PEM format. If this is set, and is not a PKCS#12 archive, key_file must also be set. This is required if enabled is set to true.
  - enabled (optional, bool)
    Require TLS for incoming connections. If this is set, an identity certificate is also required.
    Default: false
  - key_file (optional, string)
    Absolute path to a private key file used to identify this server, in DER or PEM format (PKCS#8), or an in-line private key in PEM format.
  - key_pass (optional, string)
    Pass phrase used to unlock the encrypted key file. This has no effect unless key_file is set.
  - verify_certificate (optional, bool)
    If true, Vector will require a TLS certificate from the connecting host and terminate the connection if the certificate is not valid. If false (the default), Vector will not request a certificate from the client.
    Default: false
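The relationship between endpoint and region described in the option reference can be sketched as follows. This is a minimal illustration, not a definitive configuration: the source ID firehose_custom_endpoint and the endpoint URL are hypothetical placeholders, and region is left out on the assumption, taken from the reference above, that it is only required when endpoint is unset.

```toml
# Hypothetical source ID and endpoint URL, for illustration only.
[sources.firehose_custom_endpoint]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"
# Custom endpoint for an AWS-compatible service. Per the option reference,
# `region` is only required when `endpoint` is not set, so it is omitted here.
endpoint = "http://127.0.0.1:5000"
```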
Env Vars
- AWS_ACCESS_KEY_ID (optional, string)
  The AWS access key id. Used for AWS authentication when communicating with AWS services. See AWS Authentication for more info.
- AWS_CONFIG_FILE (optional, string)
  Specifies the location of the file that the AWS CLI uses to store configuration profiles.
  Default: "~/.aws/config"
- AWS_CREDENTIAL_EXPIRATION (optional, string)
  Expiration time in RFC 3339 format. If unset, credentials won't expire.
- AWS_DEFAULT_REGION (optional, string)
  The default AWS region.
  Only relevant when: endpoint = null
- AWS_PROFILE (optional, string)
  Specifies the name of the CLI profile with the credentials and options to use. This can be the name of a profile stored in a credentials or config file.
  Default: "default"
- AWS_ROLE_SESSION_NAME (optional, string)
  Specifies a name to associate with the role session. This value appears in CloudTrail logs for commands performed by the user of this profile.
- AWS_SECRET_ACCESS_KEY (optional, string)
  The AWS secret access key. Used for AWS authentication when communicating with AWS services. See AWS Authentication for more info.
- AWS_SESSION_TOKEN (optional, string)
  The AWS session token. Used for AWS authentication when communicating with AWS services.
- AWS_SHARED_CREDENTIALS_FILE (optional, string)
  Specifies the location of the file that the AWS CLI uses to store access keys.
  Default: "~/.aws/credentials"
Output
This component outputs log events with the following fields:
{
  "message": "Started GET / for 127.0.0.1 at 2012-03-10 14:28:14 +0100",
  "request_id": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
  "source_arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
  "timestamp": "2020-10-10T17:07:36+00:00"
}
- message (required, string)
  The raw record from the incoming payload.
- request_id (required, string)
  The AWS Kinesis Firehose request ID, value of the X-Amz-Firehose-Request-Id header.
- source_arn (required, string)
  The AWS Kinesis Firehose delivery stream that issued the request, value of the X-Amz-Firehose-Source-Arn header.
- timestamp (required, timestamp)
  The exact time the event was ingested into Vector.
Telemetry
This component provides the following metrics that can be retrieved through the internal_metrics source. See the metrics section in the monitoring page for more info.
- request_read_errors_total (counter)
  The total number of request read errors for this component. This metric includes the following tags:
  - component_kind: The Vector component kind.
  - component_name: The Vector component ID.
  - component_type: The Vector component type.
  - instance: The Vector instance identified by host and port.
  - job: The name of the job producing Vector metrics.
- processed_events_total (counter)
  The total number of events processed by this component. This metric includes the following tags:
  - component_kind: The Vector component kind.
  - component_name: The Vector component ID.
  - component_type: The Vector component type.
  - file: The file that produced the error.
  - instance: The Vector instance identified by host and port.
  - job: The name of the job producing Vector metrics.
- requests_received_total (counter)
  The total number of requests received by this component. This metric includes the following tags:
  - component_kind: The Vector component kind.
  - component_name: The Vector component ID.
  - component_type: The Vector component type.
  - instance: The Vector instance identified by host and port.
  - job: The name of the job producing Vector metrics.
- processed_bytes_total (counter)
  The total number of bytes processed by the component. This metric includes the following tags:
  - component_kind: The Vector component kind.
  - component_name: The Vector component ID.
  - component_type: The Vector component type.
  - instance: The Vector instance identified by host and port.
  - job: The name of the job producing Vector metrics.
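As a rough sketch of how these metrics might be inspected, the configuration below wires an internal_metrics source to a console sink. The component IDs vector_metrics and metrics_console are hypothetical names chosen for illustration, and the console sink with JSON encoding is just one possible destination, not a recommendation.

```toml
# Hypothetical component IDs, for illustration only.
[sources.vector_metrics]
type = "internal_metrics"

# Print the collected metrics to stdout as JSON so they can be inspected.
[sinks.metrics_console]
type = "console"
inputs = ["vector_metrics"]
encoding.codec = "json"
```

With this in place, counters such as requests_received_total should appear in the console output alongside the tags listed above.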
Examples
Given the following input:
{"requestId": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804","timestamp": 1600110760138,"records": [{"data": "H4sIABk1bV8AA52TzW7bMBCE734KQ2db/JdI3QzETS8FAtg91UGgyOuEqCQq5Mqua+TdS8lu0hYNUpQHAdoZDcn9tKfJdJo0EEL5AOtjB0kxTa4W68Xdp+VqtbheJrPB4A4t+EFiv6yzVLuHa+/6blARAr5UV+ihbH4vh/4+VN52aF37wdYIPkTDlyhF8SrabFsOWhIrtz+Dlnto8dV3Gp9RstshXKhMi0xpqk3GpNJccpFRKYw0WvCM5kIbzrVWipm4VK55rrSk44HGHLTx/lg2wxVYRiljVGWGCvPiuPRn2O60Se6P8UKbpOBZrulsk2xLhCEjljYJk2QFHeGU04KxQqpCsumcSko3SfQ+uoBnn8pTJmjKWZYyI0axAXx021G++bweS5136CpXj8WP6/UNYek5ycMOPPhReETsQkHI4XBIO2/bynZlXXkXwryrS9w536TWkab0XwED6e/tU2/R9eGS9NTD5VgEvnWwtQikcu0e/AO0FYyu4HpfwR3Gf2R0Btza9qxgiUNUISiLr30AP7fbyMzu7OWA803ynIzdfJ69B1EZpoVhsWMRZ8a5UVJoRoUyUlDNspxzZWiEnOXiXYiSvQOR5TnN/xsiNalmKZcy5Yr/yfB6+RZD/gbDC0IbOx8wQrMhxGGYx4lBW5X1wJBLkpO981jWf6EXogvIrm+rYYrKOn4Hgbg4b439/s8cFeVvcNwBtHBkOdWvQIdRnTxPfgCXvyEgSQQAAA=="}]}
And the following configuration:
[sources.aws_kinesis_firehose]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"
The following Vector log event will be output:
[{"log": {"request_id": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804","source_arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/test","timestamp": "2020-09-14T19:12:40.138Z","message": "{\"messageType\":\"DATA_MESSAGE\",\"owner\":\"111111111111\",\"logGroup\":\"test\",\"logStream\":\"test\",\"subscriptionFilters\":[\"Destination\"],\"logEvents\":[{\"id\":\"35683658089614582423604394983260738922885519999578275840\",\"timestamp\":1600110569039,\"message\":\"{\\\"bytes\\\":26780,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"157.130.216.193\\\",\\\"method\\\":\\\"PUT\\\",\\\"protocol\\\":\\\"HTTP/1.0\\\",\\\"referer\\\":\\\"https://www.principalcross-platform.io/markets/ubiquitous\\\",\\\"request\\\":\\\"/expedite/convergence\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":301,\\\"user-identifier\\\":\\\"-\\\"}\"},{\"id\":\"35683658089659183914001456229543810359430816722590236673\",\"timestamp\":1600110569041,\"message\":\"{\\\"bytes\\\":17707,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"109.81.244.252\\\",\\\"method\\\":\\\"GET\\\",\\\"protocol\\\":\\\"HTTP/2.0\\\",\\\"referer\\\":\\\"http://www.investormission-critical.io/24/7/vortals\\\",\\\"request\\\":\\\"/scale/functionalities/optimize\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":502,\\\"user-identifier\\\":\\\"feeney1708\\\"}\"}]}"}}]
How It Works
AWS Authentication
Vector checks for AWS credentials in the following order:
- Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
- The credential_process command in the AWS config file (usually located at ~/.aws/config).
- The AWS credentials file (usually located at ~/.aws/credentials).
- The IAM instance profile (only works when running on an EC2 instance with an instance profile/role).
If credentials are not found, the health check will fail and an error will be logged.
Obtaining an access key
In general, we recommend using instance profiles/roles whenever possible. Where this is not possible, you can generate an AWS access key for any user within your AWS account. AWS provides a detailed guide on how to do this.
Assuming roles
Vector can assume an AWS IAM role via the assume_role option. This is an optional setting that is helpful for a variety of use cases, such as cross-account access.
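A minimal sketch of setting assume_role is shown below. The source ID firehose_cross_account, the account number, and the role name in the ARN are hypothetical placeholders; substitute the ARN of a role that your credentials are actually allowed to assume.

```toml
# Hypothetical source ID and role ARN, for illustration only.
[sources.firehose_cross_account]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"
region = "us-east-1"
# ARN of the IAM role to assume at startup (placeholder account and role name).
assume_role = "arn:aws:iam::123456789012:role/my_role"
```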
Context
By default, the aws_kinesis_firehose source will augment events with helpful context keys as shown in the "Output" section.
Forwarding CloudWatch Log events
This source is the recommended way to ingest logs from AWS CloudWatch Logs via AWS CloudWatch Logs subscriptions. To set this up:
- Deploy Vector with a publicly exposed HTTP endpoint using this source. You will likely also want to use the aws_cloudwatch_logs_subscription_parser transform to extract the log events. Make sure to set the access_key to secure this endpoint. Your configuration might look something like:
  [sources.firehose]
  # General
  type = "aws_kinesis_firehose"
  address = "127.0.0.1:9000"
  access_key = "secret"

  [transforms.cloudwatch]
  type = "aws_cloudwatch_logs_subscription_parser"
  inputs = ["firehose"]

  [sinks.console]
  type = "console"
  inputs = ["cloudwatch"]
  encoding.codec = "json"
- Create a Kinesis Firehose delivery stream in the region where the CloudWatch Logs groups that you want to ingest exist.
- Set the stream to forward to your Vector instance via its HTTP Endpoint destination. Make sure to configure the same access_key you set earlier.
- Set up a CloudWatch Logs subscription to forward the events to your delivery stream.
Transport Layer Security (TLS)
Vector uses OpenSSL for TLS protocols. You can adjust TLS behavior via the tls.* options.
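As an illustration of the tls.* options, the sketch below enables TLS for incoming connections using only option names from the configuration reference. The source ID firehose_tls and both file paths are hypothetical; point crt_file and key_file at your own certificate and private key.

```toml
# Hypothetical source ID and file paths, for illustration only.
[sources.firehose_tls]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"
region = "us-east-1"
access_key = "secret"

# Require TLS for incoming connections. Enabling TLS requires an identity
# certificate, and a non-PKCS#12 certificate also requires key_file.
tls.enabled = true
tls.crt_file = "/path/to/host_certificate.crt"
tls.key_file = "/path/to/host_certificate.key"
```

verify_certificate is left at its default of false here, so Vector does not request a certificate from the connecting client.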