A fully managed service to extract, transform and load (ETL) your data for analytics
Discover and search across different AWS data sets without moving your data
AWS Glue retrieves data from sources and writes data to targets; the data can be stored and transported in various formats
- If your data is stored or transported in the Parquet format, AWS Glue provides features for working with it
AWS Glue consists of:
- Central metadata repository
- ETL engine
- Flexible scheduler
Use Cases:
- Run queries against an Amazon S3 data lake
- You can use AWS Glue to make your data available for analytics without moving your data
- Analyze the log data in your data warehouse
- Create ETL scripts to transform, flatten and enrich the data from source to target
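To make "transform, flatten and enrich" concrete, here is a minimal sketch in plain Python. It does not use the AWS Glue API; it only illustrates the kind of record-level work an ETL script performs. The sample log record, the dotted key naming, and the `is_error` rule are all assumptions made up for illustration.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = ".") -> Dict[str, Any]:
    """Collapse nested dicts into one level, joining key names with `sep`."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested structures, carrying the prefix along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def enrich(row: Dict[str, Any]) -> Dict[str, Any]:
    """Add a derived column (hypothetical rule: status >= 500 is an error)."""
    out = dict(row)  # copy so the input row is left untouched
    out["is_error"] = int(row.get("response.status", 0)) >= 500
    return out


# Hypothetical source record, invented for this example.
source = {
    "host": "web-1",
    "request": {"method": "GET", "path": "/"},
    "response": {"status": 503},
}

target = enrich(flatten(source))
print(target)
```

In an actual Glue job the same steps would typically run over a whole dataset rather than a single dict, but the shape of the work (nested source in, flat enriched target out) is the same.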
Athena integration with AWS Glue
- To create a database and table schema in the AWS Glue Data Catalog, you can run an AWS Glue crawler from within Athena on a data source, or you can run Data Definition Language (DDL) queries directly in the Athena Query Editor.
- Then, using the database and table schema that you created, you can use Data Manipulation Language (DML) queries in Athena to query the data.
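The DDL-then-DML flow described above could look roughly like the following sketch. The database name `weblogs`, table name `requests`, column list, and S3 locations are hypothetical placeholders. Identifiers are assumed to be simple lowercase names that need no quoting. The `submit` helper uses the boto3 Athena client and is shown only for orientation; it needs AWS credentials, a configured region, and an existing database, and is not called here.

```python
from typing import Dict


def create_table_ddl(database: str, table: str, columns: Dict[str, str], location: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement for Parquet data stored in S3."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {cols}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


def top_status_dml(database: str, table: str, limit: int = 10) -> str:
    """Build a DML query that counts rows per status code."""
    return (
        f"SELECT status, COUNT(*) AS hits\n"
        f"FROM {database}.{table}\n"
        f"GROUP BY status\n"
        f"ORDER BY hits DESC\n"
        f"LIMIT {limit}"
    )


def submit(query: str, output_location: str) -> str:
    """Send a query to Athena and return its execution id (requires AWS access)."""
    import boto3  # third-party; imported lazily so the builders work without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=query,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical names and paths, for illustration only.
ddl = create_table_ddl(
    "weblogs",
    "requests",
    {"host": "string", "status": "int", "path": "string"},
    "s3://example-bucket/logs/",
)
dml = top_status_dml("weblogs", "requests")
print(ddl)
print(dml)
```

Running the DDL registers the table schema in the Data Catalog; the DML then reads the Parquet files in place, without copying them out of S3.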