Kinesis
- Kinesis is a managed alternative to [[Apache Kafka]]
- Great for application logs, metrics, IoT, clickstreams
- Great for "real-time" [[big data]]
- Great for stream processing frameworks ([[Spark]], [[NiFi]], etc.)
- Data is automatically replicated to 3 Availability Zones
- [[Kinesis Streams]]: low latency streaming ingest at scale
- [[Kinesis Analytics]]: perform real-time analytics on streams using SQL
- Kinesis Firehose: load streams into Amazon S3, Redshift, Elasticsearch...
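
To make "streaming ingest" concrete, here is a minimal producer sketch. It assumes a client object that behaves like `boto3.client("kinesis")` and calls its `put_record` method; the helper name `send_click`, the stream name, and the event fields are made up for illustration.

```python
import json


def send_click(client, stream_name, user_id, page):
    """Send one clickstream event to a Kinesis data stream.

    `client` is assumed to behave like boto3.client("kinesis");
    `send_click` and the event layout are hypothetical.
    """
    event = {"user_id": user_id, "page": page}
    # Records sharing a partition key are routed to the same shard,
    # so using the user id keeps one user's clicks in order.
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=user_id,
    )


# Usage against real AWS (needs the third-party boto3 package, credentials,
# and an existing stream; "clickstream-demo" is a placeholder name):
#   import boto3
#   response = send_click(boto3.client("kinesis"), "clickstream-demo", "user-42", "/home")
#   print(response["ShardId"], response["SequenceNumber"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching AWS.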
Kinesis overview
- Streams are divided into Shards / Partitions; records within a shard are ordered (like an ordered queue)
- Data retention is 1 day (24 hours) by default and can be extended (the limit was originally 7 days, now up to 365 days)
- Ability to reprocess / replay data
- Multiple applications can consume the same stream
- Real-time processing, with throughput that scales with the number of shards
- Once data is inserted in Kinesis, it can't be deleted ([[immutability]])
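
The points above (ordered shards, replay, multiple consumers, immutability) can be illustrated with a toy in-memory model. This is only a sketch, not how the service is implemented: `ToyStream`, `Shard`, and the evenly split hash ranges are invented for illustration. The one detail taken from Kinesis itself is that a partition key is MD5-hashed to a 128-bit integer and matched against each shard's hash key range; the big-endian byte order used here is an assumption of the sketch.

```python
import hashlib
from dataclasses import dataclass, field

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5 (read as big-endian here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


@dataclass
class Shard:
    start: int  # first hash key owned by this shard
    end: int    # last hash key owned by this shard (inclusive)
    records: list = field(default_factory=list)  # append-only, ordered


class ToyStream:
    """Hypothetical stand-in for a stream with evenly split shards."""

    def __init__(self, shard_count: int):
        step = HASH_SPACE // shard_count
        self.shards = [
            Shard(i * step,
                  HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1)
            for i in range(shard_count)
        ]

    def shard_for(self, partition_key: str) -> int:
        h = hash_key(partition_key)
        for index, shard in enumerate(self.shards):
            if shard.start <= h <= shard.end:
                return index
        raise AssertionError("hash ranges should cover the whole key space")

    def put(self, partition_key: str, data: str):
        """Append a record; returns (shard index, position within the shard)."""
        index = self.shard_for(partition_key)
        shard = self.shards[index]
        shard.records.append((partition_key, data))
        return index, len(shard.records) - 1

    def read(self, shard_index: int, position: int = 0):
        """Return records from `position` onward; reading never removes data."""
        return self.shards[shard_index].records[position:]


stream = ToyStream(shard_count=4)
shard_index, _ = stream.put("user-1", "click-a")
stream.put("user-1", "click-b")

# Two independent consumers (or one consumer replaying) see the same ordered data.
print(stream.read(shard_index))
print(stream.read(shard_index))
```

Because `read` only slices the list and there is no delete operation, any number of consumers can read the same shard, and a consumer can replay by starting again from an earlier position.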