Applicable when
- The Glue job needs to save data into different tables (for example, a separate table per customer)
- The data format is one of json, csv, avro, or glueparquet
Non-applicable when
- A different file format is required
Implementation
The code below is part of the Glue job's Python script. It writes a data frame (df) into a specific Glue table. If the table does not exist, it is created automatically.
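# The table name is derived per customer: <database>_<customer id>
catalog_table_name = CATALOG_DB_NAME + "_" + CUSTOMER_ID
# enableUpdateCatalog lets the job create or update the Data Catalog table on write
sink = glue_context.getSink(connection_type="s3", path=OUTPUT_FILE_PATH,
                            enableUpdateCatalog=True, updateBehavior="UPDATE_IN_DATABASE")
sink.setFormat("glueparquet")
sink.setCatalogInfo(catalogDatabase=CATALOG_DB_NAME, catalogTableName=catalog_table_name)
# writeFrame expects a Glue DynamicFrame
sink.writeFrame(df)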
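To reuse the script above for several customers, the sink arguments can be built in a small helper. The following is a minimal sketch, not a definitive implementation: build_sink_settings, write_customer_frame, and SUPPORTED_FORMATS are hypothetical names introduced here for illustration and are not part of the Glue API. Only getSink, setFormat, setCatalogInfo, and writeFrame come from the script above.

```python
# Formats listed in the "Applicable when" section of this article.
SUPPORTED_FORMATS = {"json", "csv", "avro", "glueparquet"}


def build_sink_settings(catalog_db_name, customer_id, output_path,
                        data_format="glueparquet"):
    """Collect the sink arguments for one customer (hypothetical helper)."""
    if data_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Format not covered by this approach: {data_format!r}")
    return {
        "sink_options": {
            "connection_type": "s3",
            "path": output_path,
            "enableUpdateCatalog": True,
            "updateBehavior": "UPDATE_IN_DATABASE",
        },
        "format": data_format,
        "catalog_info": {
            "catalogDatabase": catalog_db_name,
            # Same naming rule as the script above: <database>_<customer id>
            "catalogTableName": catalog_db_name + "_" + customer_id,
        },
    }


def write_customer_frame(glue_context, frame, settings):
    """Write one frame with the settings above (hypothetical helper).

    Needs a real GlueContext and a DynamicFrame when run inside a Glue job.
    """
    sink = glue_context.getSink(**settings["sink_options"])
    sink.setFormat(settings["format"])
    sink.setCatalogInfo(**settings["catalog_info"])
    sink.writeFrame(frame)
```

Inside the job, this could be called as write_customer_frame(glue_context, df, build_sink_settings(CATALOG_DB_NAME, CUSTOMER_ID, OUTPUT_FILE_PATH)). Keeping the settings in a plain dictionary means the table naming and the format check can be tested without a Glue environment.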