dayrize-usecase/dags/sustainability_score
Ricard Illa 5ed10fb179
feat: added DAG readme.md
2023-06-26 08:48:37 +02:00

Sustainability score

This DAG orchestrates the ingestion and transformation of products from Target's website in order to compute their sustainability score.

Steps

  • create_products_table: create the products table with its schema
  • etl_pipeline: run the Apache Beam ETL process
  • dbt_run: execute dbt run to apply the transformations
  • dbt_test: execute dbt test to check the data quality
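
As a rough illustration of how these steps relate, the sketch below models them as a dependency graph and derives an execution order using only the Python standard library. It assumes that the steps form a simple linear chain in the order listed; the README does not state the dependencies explicitly, and this is not the DAG's actual definition.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Assumption: the tasks form a linear chain in the order of the list above.
dependencies = {
    "create_products_table": set(),
    "etl_pipeline": {"create_products_table"},
    "dbt_run": {"etl_pipeline"},
    "dbt_test": {"dbt_run"},
}

# static_order() yields the tasks in an order that respects the dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(" >> ".join(execution_order))
```

Placing dbt_test last means the data quality checks would run against the tables that dbt_run has just built.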

Config

The following parameters are available:

  • input: location of the CSV input file
  • beam_etl_path: location of the Apache Beam pipeline
  • dbt_path: location of the dbt project
  • products_table: name of the products table
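
A minimal sketch of what a set of values for these parameters might look like, serialized as JSON. All values are hypothetical placeholders (the bucket, paths and table name are not taken from the project), and the helper that flags unrecognized keys is an illustrative addition, not part of the DAG.

```python
import json

# Parameter names documented in the list above.
DOCUMENTED_PARAMS = {"input", "beam_etl_path", "dbt_path", "products_table"}

# Hypothetical placeholder values, for illustration only.
params = {
    "input": "gs://example-bucket/products.csv",
    "beam_etl_path": "/opt/pipelines/etl",
    "dbt_path": "/opt/dbt/sustainability_score",
    "products_table": "example_dataset.products",
}


def unknown_params(candidate: dict) -> set:
    """Return the keys of candidate that are not documented parameters."""
    return set(candidate) - DOCUMENTED_PARAMS


# Serialize the parameters, e.g. to supply them as a run configuration.
conf_json = json.dumps(params, indent=2)
print(conf_json)
```

A check such as unknown_params can catch a misspelled key (for example product_table) before a run is started.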

I decided not to make the remaining table locations configurable here, because it makes more sense to define them in dbt.