AWS Glue
- [[Fully managed]] [[ETL (Extract, Transform & Load)]] service
- [[Automating]] time-consuming steps of [[data preparation for analytics]]
- Serverless and pay-as-you-go; provisions an [[Apache Spark]] environment to run jobs
- [[Crawl]]s data sources and identifies data formats ([[schema inference]])
- Automated [[Code Generation]] of ETL scripts (Python or Scala)
- Sources: Amazon Aurora, Amazon RDS, Amazon Redshift & Amazon S3
- Sinks: Amazon S3, Amazon Redshift, etc.
- [[Glue Data Catalog]]: [[Metadata]] (definition & schema) of [[Source Table]]s
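
The crawler and Data Catalog bullets above can be sketched with `boto3`, roughly as follows. This is a minimal sketch, not a definitive setup: the helper names (`crawler_request`, `columns_from_table`, `crawl_and_describe`) are hypothetical, as are any crawler name, IAM role ARN, database, S3 path and table name passed to them. It assumes AWS credentials and a suitable IAM role already exist.

```python
import time


def crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue `create_crawler` call (S3 source)."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def columns_from_table(get_table_response):
    """Extract (column name, type) pairs from a Glue `get_table` response."""
    columns = get_table_response["Table"]["StorageDescriptor"]["Columns"]
    return [(col["Name"], col["Type"]) for col in columns]


def crawl_and_describe(request, table_name, poll_seconds=15):
    """Create and run a crawler, then read the inferred schema from the catalog.

    `table_name` is an assumption: the crawler chooses the table name itself,
    so in practice it has to be looked up after the run.
    """
    import boto3  # imported lazily so the helpers above work without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])

    # The crawler runs asynchronously; wait until it returns to the READY state.
    while glue.get_crawler(Name=request["Name"])["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)

    table = glue.get_table(DatabaseName=request["DatabaseName"], Name=table_name)
    return columns_from_table(table)
```

The request builder and the response parser are kept separate from the AWS calls so they can be checked locally; only `crawl_and_describe` touches the service.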