Commit Graph

52 Commits (main)

Author SHA1 Message Date
Ricard Illa e9446f42f3 feat: added a few more plots to notebooks 2023-06-26 13:03:06 +02:00
Ricard Illa 39d279f089 refactor: some cleanup on etl's code structure 2023-06-26 12:36:19 +02:00
Ricard Illa b28ddc350d doc: added readme to etl 2023-06-26 12:09:50 +02:00
Ricard Illa 4ba29e7e1d feat: postgresql port doesn't need to be exposed 2023-06-26 11:57:50 +02:00
Ricard Illa 0dd81715d4 feat: ignore dbt/.user.yaml 2023-06-26 11:54:17 +02:00
Ricard Illa 543004b51c feat: added pylint and pytest for etl 2023-06-26 11:53:40 +02:00
Ricard Illa ac1101ae96 fix: fix dbt writeable directories permissions 2023-06-26 10:04:53 +02:00
Ricard Illa 7d898c6297 fix: airflow's BashOperator's cwd cannot be templated 2023-06-26 10:04:28 +02:00
Ricard Illa 06c76f0b65 docs: added dbt readme 2023-06-26 08:51:54 +02:00
Ricard Illa 743c57a0d1 feat: minor changes 2023-06-26 08:49:16 +02:00
Ricard Illa 5ed10fb179 feat: added DAG readme.md 2023-06-26 08:48:37 +02:00
Ricard Illa 3d230263e2 feat: added reame 2023-06-26 08:42:26 +02:00
Ricard Illa d97fb6456a feat: added some plots 2023-06-25 23:52:23 +02:00
Ricard Illa 15178b0b3c feat: added analysis notebook 2023-06-25 23:24:54 +02:00
Ricard Illa 900955d92d feat: some cleanup on exploration notebook 2023-06-25 23:24:43 +02:00
Ricard Illa c479a66405 feat: implemented incremental model for scored products 2023-06-25 23:21:40 +02:00
Ricard Illa 73df832a6c feat: implemented incremental model for scored products 2023-06-25 22:41:58 +02:00
Ricard Illa 1268537695 feat: moved transformations to dbt 2023-06-25 20:53:51 +02:00
Ricard Illa 1bc5daa29e feat: removed dimensions from schema 2023-06-25 13:20:26 +02:00
Ricard Illa 6d7f4909cc feat: upsert products 2023-06-25 13:18:44 +02:00
Ricard Illa 2cf2007434 feat: added score calculation 2023-06-25 12:46:53 +02:00
Ricard Illa 6bb944c114 feat: minor renaming 2023-06-25 12:22:38 +02:00
Ricard Illa 4a3d43bcc5 feat: autocommit session 2023-06-25 12:22:06 +02:00
Ricard Illa 915f3c2a4c feat: tcin is the primary key now 2023-06-25 11:33:06 +02:00
Ricard Illa 9f3a4865ce refactor: some refactoring of file placements 2023-06-24 19:08:14 +02:00
Ricard Illa 02ad9fab8d refactor: some refactoring of file placements 2023-06-24 19:07:02 +02:00
Ricard Illa 55b20bb897 feat: update table if there's a primary key conflict 2023-06-23 18:25:11 +02:00
Ricard Illa 8e89404b76 feat: import elements into database using beam 2023-06-23 18:02:01 +02:00
Ricard Illa 66782ec2ef feat: added query to create table to airflow 2023-06-23 16:23:21 +02:00
Ricard Illa cbfbf42d53 feat: added gtin13 column 2023-06-23 15:29:57 +02:00
Ricard Illa 3e42d55e7c feat: added postgres and terraform 2023-06-23 15:20:40 +02:00
Ricard Illa 63e961e9a2 feat: added module for reading and writing data 2023-06-23 11:20:42 +02:00
Ricard Illa 7a86988bd1 misc changes 2023-06-23 11:11:32 +02:00
Ricard Illa a0c868f2f0 refactor: generalize unit conversion function 2023-06-23 10:30:07 +02:00
Ricard Illa 9e3936fdc0 fix: typo 2023-06-23 10:20:08 +02:00
Ricard Illa e716fc1cd3 feat: added parsing dimensions 2023-06-23 10:05:24 +02:00
Ricard Illa 9c19835746 doc: small doc added 2023-06-22 17:36:49 +02:00
Ricard Illa 050323d583 feat: added clean_origin_name 2023-06-22 17:32:08 +02:00
Ricard Illa b018abe00e feat: remove older file 2023-06-22 17:11:49 +02:00
Ricard Illa f2a996d42e feat: tests for materials 2023-06-22 17:11:31 +02:00
Ricard Illa 691b8898a0 docs: added docstring to parse_xml module 2023-06-22 15:51:47 +02:00
Ricard Illa b3069d4ca2 refactor: put helpers in a separate folder 2023-06-22 15:43:26 +02:00
Ricard Illa 0b58d47acc feat: handle malformed xml file 2023-06-22 15:34:38 +02:00
Ricard Illa 9b5ce6d36f feat: more tests for parse_raw_specs 2023-06-22 15:34:11 +02:00
Ricard Illa 725114805a feat: added parse_raw_specification 2023-06-22 09:40:26 +02:00
Ricard Illa 5a4bca756e added skeleton for airflow DAG 2023-06-21 19:12:03 +02:00
Ricard Illa 02eaa4b7ff added skeleton for beam etl pipeline 2023-06-21 19:11:17 +02:00
Ricard Illa b5ab3ca687 feat: added files for containerized airflow 2023-06-21 16:02:12 +02:00
Ricard Illa 0a3323170f data exploration 2023-06-21 15:46:28 +02:00
Ricard Illa 1e7c1eac1c feat: added docker-compose for containerized notebook 2023-06-21 15:45:49 +02:00