feat: added DAG readme.md

main
Ricard Illa 2023-06-26 08:48:37 +02:00
parent 3d230263e2
commit 5ed10fb179
No known key found for this signature in database
GPG Key ID: F69A672B72E54902
1 changed files with 21 additions and 1 deletions

@@ -1,3 +1,23 @@
# Sustainability score
Placeholder
This DAG orchestrates the ingestion and transformation of products from
Target's website to compute their sustainability score.
## Steps
* `create_products_table`: create the products table with its schema
* `etl_pipeline`: run the Apache Beam ETL process
* `dbt_run`: run `dbt run` to apply transformations
* `dbt_test`: run `dbt test` to test the data quality
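
The order of these steps can be sketched with Python's standard-library
`graphlib`. This is only an illustration: it assumes the four steps run strictly
one after another in the order listed, and the `STEP_DEPENDENCIES` mapping and
`execution_order` helper are hypothetical names, not code from the DAG itself.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it has to wait for.
# Assumption: the steps form a linear chain in the order listed above.
STEP_DEPENDENCIES = {
    "create_products_table": set(),
    "etl_pipeline": {"create_products_table"},
    "dbt_run": {"etl_pipeline"},
    "dbt_test": {"dbt_run"},
}


def execution_order(dependencies):
    """Return the step names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Print the chain in a notation similar to how DAG edges are often written.
    print(" >> ".join(execution_order(STEP_DEPENDENCIES)))
```

Under that assumption, `dbt_test` can only start once the table exists, the ETL
has loaded it, and `dbt run` has finished.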
## Config
The following parameters are available:
* `input`: location of the CSV input file
* `beam_etl_path`: location of the Apache Beam pipeline
* `dbt_path`: location of the dbt project
* `products_table`: name of the products table
I decided not to make the remaining table locations configurable, because it
makes more sense to define them in dbt.
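
A minimal sketch of how the four parameters could be checked before a run. The
`validate_config` helper and every example value (bucket, paths, table name) are
hypothetical and only illustrate the shape of the configuration.

```python
# The four parameters listed in the Config section.
REQUIRED_PARAMS = ("input", "beam_etl_path", "dbt_path", "products_table")


def validate_config(config):
    """Return only the known parameters, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_PARAMS if not config.get(name)]
    if missing:
        raise ValueError("missing DAG parameters: " + ", ".join(missing))
    return {name: config[name] for name in REQUIRED_PARAMS}


# Hypothetical example values, not taken from the project.
EXAMPLE_CONFIG = {
    "input": "gs://example-bucket/products.csv",
    "beam_etl_path": "/opt/pipelines/beam_etl",
    "dbt_path": "/opt/pipelines/dbt",
    "products_table": "example_dataset.products",
}


if __name__ == "__main__":
    print(validate_config(EXAMPLE_CONFIG))
```

Failing early on a missing parameter keeps a misconfigured run from getting as
far as the table creation step.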