AWS Glue

  • [[Fully managed]] [[ETL (Extract, Transform & Load)]] service
  • [[Automating]] time-consuming steps of [[data preparation for analytics]]
  • Serverless and pay-as-you-go: provisions [[Apache Spark]] environments on demand
  • [[Crawl]]s data sources and identifies data formats ([[schema inference]])
  • Automated [[Code Generation]]
  • Sources: Amazon Aurora, Amazon RDS, Amazon Redshift & Amazon S3
  • Sinks: Amazon S3, Amazon Redshift, etc.
  • [[Glue Data Catalog]]: [[Metadata]] (definition & schema) of [[Source Table]]s
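
To make the crawl and catalog idea above concrete, here is a toy sketch in plain Python. It is not Glue's actual crawler logic and does not call any AWS API. The function names (`infer_type`, `crawl_csv`), the sample data, and the shape of the resulting dictionary are all assumptions made for illustration. The type names (`bigint`, `double`, `string`) are chosen to resemble Hive-style names. The sketch reads a small CSV sample, guesses a type per column, and returns a catalog-style metadata entry holding the table name and its schema.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"  # nothing to inspect, fall back to the widest type
    # Try the narrowest type first; move on if any value fails to parse.
    for type_name, cast in (("bigint", int), ("double", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def crawl_csv(text, table_name):
    """Build a catalog-style entry (table name + column schema) from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    rows = list(reader)
    return {
        "Name": table_name,
        "Columns": [
            # Missing cells come back as None from DictReader; treat them as empty.
            {"Name": col, "Type": infer_type([row.get(col) or "" for row in rows])}
            for col in columns
        ],
    }


# Hypothetical sample standing in for a file found in a data source.
SAMPLE = "id,price,city\n1,9.99,Berlin\n2,12,Paris\n"
catalog_entry = crawl_csv(SAMPLE, "orders")
print(catalog_entry)
```

The point of the sketch is the separation of concerns: the data stays where it is, and only the metadata (definition and schema) is recorded, which is what lets later ETL steps look up a source table by name.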