From 5ed10fb179c79b2540b36588c2ecfb28b8415dc8 Mon Sep 17 00:00:00 2001
From: Ricard Illa
Date: Mon, 26 Jun 2023 08:48:37 +0200
Subject: [PATCH] feat: added DAG readme.md

---
 dags/sustainability_score/README.md | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/dags/sustainability_score/README.md b/dags/sustainability_score/README.md
index 3a60a92..ebd1ffb 100644
--- a/dags/sustainability_score/README.md
+++ b/dags/sustainability_score/README.md
@@ -1,3 +1,23 @@
 # Sustainability score
 
-Placeholder
+This DAG orchestrates the ingestion and transformation of products from
+Target's website to compute their sustainability score.
+
+## Steps
+
+* `create_products_table`: create the products table with its schema
+* `etl_pipeline`: run the Apache Beam ETL process
+* `dbt_run`: run `dbt run` to apply transformations
+* `dbt_test`: run `dbt test` to test the data quality
+
+## Config
+
+The following parameters are available:
+
+* `input`: location of the CSV input file
+* `beam_etl_path`: location of the Apache Beam pipeline
+* `dbt_path`: location of the dbt project
+* `products_table`: name of the products table
+
+I decided not to make the other table locations configurable because it makes
+more sense to define them in dbt.
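
The steps and parameters described in the README above can be sketched in plain Python. This is only an illustration, not the DAG's actual code: the step and parameter names come from the README, but the strict one-after-another dependency chain is an assumption (the README lists the steps without stating their dependencies), every parameter value is a hypothetical placeholder, and `resolve_params` and `run_order` are made-up helper names.

```python
from graphlib import TopologicalSorter

# Step names come from the README. The dependency chain is an assumption:
# each step is taken to wait for the step listed before it.
DEPENDENCIES = {
    "create_products_table": set(),
    "etl_pipeline": {"create_products_table"},
    "dbt_run": {"etl_pipeline"},
    "dbt_test": {"dbt_run"},
}

# Parameter names come from the README. Every value is a hypothetical
# placeholder, not a real location or table.
DEFAULT_PARAMS = {
    "input": "gs://example-bucket/products.csv",
    "beam_etl_path": "gs://example-bucket/beam/etl.py",
    "dbt_path": "/opt/dbt/sustainability_score",
    "products_table": "example_dataset.products",
}


def resolve_params(overrides=None):
    """Merge user overrides into the defaults, rejecting unknown keys."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULT_PARAMS, **overrides}


def run_order():
    """Return the steps in an order that respects the dependencies."""
    return list(TopologicalSorter(DEPENDENCIES).static_order())


if __name__ == "__main__":
    print(run_order())
    print(resolve_params({"input": "gs://example-bucket/other.csv"}))
```

Rejecting unknown keys in `resolve_params` mirrors the README's choice to expose only four parameters: anything else, such as the remaining table locations, is expected to be defined in the dbt project instead.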