Kinesis Data Firehose

  • Fully managed service: no administration, automatic scaling, serverless
  • Destinations
    • AWS: Redshift / Amazon S3 / Amazon OpenSearch Service (formerly Amazon Elasticsearch Service)
    • 3rd-party partners: Splunk / MongoDB / Datadog / New Relic / ...
    • Custom: send to any HTTP endpoint
  • Pay for data going through Firehose
  • Near Real Time
    • Minimum 60 seconds of latency for non-full batches
    • Or a minimum of 32 MB of data at a time
  • Supports many data formats, conversions, transformations, compression
  • Supports custom data transformations using AWS Lambda
  • Can send failed or all data to a backup S3 bucket
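
The custom Lambda transformation mentioned above can be sketched as follows. This is a minimal example, not a definitive implementation: it assumes the incoming records are JSON objects with a hypothetical "message" field, and the uppercase step stands in for whatever transformation is actually needed. The record shape (recordId, result, base64-encoded data) follows the contract Firehose uses for transformation Lambdas.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record; mark unparseable ones as failed."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers each payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: uppercase an assumed "message" field.
            payload["message"] = str(payload.get("message", "")).upper()
            data = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": data}
            )
        except (ValueError, KeyError, TypeError, AttributeError):
            # Return the original data so Firehose can route it as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every recordId received must appear in the response. Records marked ProcessingFailed are the kind of "failed data" that can be sent to the backup S3 bucket.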

Kinesis Data Streams vs Firehose

  • Kinesis Data Streams
    • Streaming service for ingest at scale
    • Write custom code (producer / consumer)
    • Real-time (~200ms)
    • Manage scaling (shard splitting / merging)
    • Data storage for 1 to 365 days
    • Supports replay capability
  • Kinesis Data Firehose
    • Load streaming data into S3 / Redshift / ES / 3rd party / Custom HTTP
    • Fully managed
    • Near real-time (buffer time min 60 sec)
    • Automatic scaling
    • No data storage
    • Doesn't support replay capability
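
The "near real-time" behavior of Firehose in the comparison above comes from buffering: a batch is delivered when it is full or when the buffer interval has elapsed, whichever comes first. The toy model below illustrates that rule only. It is not how Firehose is implemented, and the class name, fields, and default thresholds (taken from the figures in these notes) are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BufferSketch:
    """Toy size-or-time buffer; names and defaults are illustrative only."""

    max_bytes: int = 32 * 1024 * 1024  # size threshold quoted in the notes
    max_seconds: float = 60.0          # interval threshold quoted in the notes
    records: List[bytes] = field(default_factory=list)
    size: int = 0
    opened_at: Optional[float] = None

    def add(self, record: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer a record; return the flushed batch if a threshold is hit."""
        if self.opened_at is None:
            self.opened_at = now  # the clock starts with the first record
        self.records.append(record)
        self.size += len(record)
        return self.flush_if_due(now)

    def flush_if_due(self, now: float) -> Optional[List[bytes]]:
        """Flush when the buffer is full or the interval has elapsed."""
        if not self.records:
            return None
        full = self.size >= self.max_bytes
        expired = now - self.opened_at >= self.max_seconds
        if not (full or expired):
            return None
        batch, self.records, self.size, self.opened_at = self.records, [], 0, None
        return batch
```

With a small threshold such as BufferSketch(max_bytes=10), three 4-byte records flush on the third add because the buffer is full. A single small record, by contrast, waits until max_seconds has passed, which is why low-volume streams see the full interval as latency.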