# Sustainability score
This DAG orchestrates the ingestion and transformation of products from
Target's website to compute their sustainability score.
## Steps
* `create_products_table`: create the products table with its schema
* `etl_pipeline`: run the Apache Beam ETL process
* `dbt_run`: run `dbt run` to apply transformations
* `dbt_test`: run `dbt test` to check data quality
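
The sketch below illustrates the order in which the steps are expected to run. It is not the DAG definition itself: it assumes the steps form a simple linear chain in the order listed above, and it uses only Python's standard `graphlib` module (Python 3.9+). The `dependencies` mapping and the `execution_order` helper are illustrative names, not part of this project.

```python
from graphlib import TopologicalSorter

# Map each step to the set of steps it waits for.
# Assumption: the steps run strictly in the order they are listed above.
dependencies = {
    "create_products_table": set(),
    "etl_pipeline": {"create_products_table"},
    "dbt_run": {"etl_pipeline"},
    "dbt_test": {"dbt_run"},
}


def execution_order(deps):
    """Return the step names in an order that respects their dependencies."""
    return list(TopologicalSorter(deps).static_order())


# Print the chain in a readable "a >> b >> c" form.
print(" >> ".join(execution_order(dependencies)))
```

Under this assumption, `dbt_test` only runs once the table exists, the ETL has loaded it, and the dbt models have been built.
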
## Config
The following parameters are available:
* `input`: location of the CSV input file
* `beam_etl_path`: location of the apache beam pipeline
* `dbt_path`: location of the dbt project
* `products_table`: name of the products table

I decided not to make the remaining table locations configurable here, because
they are better defined in the dbt project.
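
As a rough illustration, the four parameters could be collected in a dictionary and checked before a run is triggered. This README does not state how the DAG receives its parameters, so the sketch stays independent of any orchestration API. All values are hypothetical placeholders, and `REQUIRED_PARAMS` and `missing_params` are illustrative names rather than part of this project.

```python
# The four parameter names documented in the Config section.
REQUIRED_PARAMS = ("input", "beam_etl_path", "dbt_path", "products_table")


def missing_params(params):
    """Return the required parameter names that are absent or empty."""
    return [name for name in REQUIRED_PARAMS if not params.get(name)]


# Hypothetical example values; replace them with your own locations.
example_params = {
    "input": "/data/products.csv",
    "beam_etl_path": "/opt/pipelines/etl",
    "dbt_path": "/opt/pipelines/dbt",
    "products_table": "products",
}

missing = missing_params(example_params)
if missing:
    raise ValueError(f"Missing DAG parameters: {', '.join(missing)}")
```

Checking the configuration up front lets a run fail early with a clear message instead of partway through the ETL step.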