Ricard Illa | e9446f42f3 | feat: added a few more plots to notebooks | 2023-06-26 13:03:06 +02:00
Ricard Illa | 39d279f089 | refactor: some cleanup on etl's code structure | 2023-06-26 12:36:19 +02:00
Ricard Illa | b28ddc350d | doc: added readme to etl | 2023-06-26 12:09:50 +02:00
Ricard Illa | 4ba29e7e1d | feat: postgresql port doesn't need to be exposed | 2023-06-26 11:57:50 +02:00
Ricard Illa | 0dd81715d4 | feat: ignore dbt/.user.yaml | 2023-06-26 11:54:17 +02:00
Ricard Illa | 543004b51c | feat: added pylint and pytest for etl | 2023-06-26 11:53:40 +02:00
Ricard Illa | ac1101ae96 | fix: fix dbt writeable directories permissions | 2023-06-26 10:04:53 +02:00
Ricard Illa | 7d898c6297 | fix: airflow's BashOperator's cwd cannot be templated | 2023-06-26 10:04:28 +02:00
Ricard Illa | 06c76f0b65 | docs: added dbt readme | 2023-06-26 08:51:54 +02:00
Ricard Illa | 743c57a0d1 | feat: minor changes | 2023-06-26 08:49:16 +02:00
Ricard Illa | 5ed10fb179 | feat: added DAG readme.md | 2023-06-26 08:48:37 +02:00
Ricard Illa | 3d230263e2 | feat: added readme | 2023-06-26 08:42:26 +02:00
Ricard Illa | d97fb6456a | feat: added some plots | 2023-06-25 23:52:23 +02:00
Ricard Illa | 15178b0b3c | feat: added analysis notebook | 2023-06-25 23:24:54 +02:00
Ricard Illa | 900955d92d | feat: some cleanup on exploration notebook | 2023-06-25 23:24:43 +02:00
Ricard Illa | c479a66405 | feat: implemented incremental model for scored products | 2023-06-25 23:21:40 +02:00
Ricard Illa | 73df832a6c | feat: implemented incremental model for scored products | 2023-06-25 22:41:58 +02:00
Ricard Illa | 1268537695 | feat: moved transformations to dbt | 2023-06-25 20:53:51 +02:00
Ricard Illa | 1bc5daa29e | feat: removed dimensions from schema | 2023-06-25 13:20:26 +02:00
Ricard Illa | 6d7f4909cc | feat: upsert products | 2023-06-25 13:18:44 +02:00
Ricard Illa | 2cf2007434 | feat: added score calculation | 2023-06-25 12:46:53 +02:00
Ricard Illa | 6bb944c114 | feat: minor renaming | 2023-06-25 12:22:38 +02:00
Ricard Illa | 4a3d43bcc5 | feat: autocommit session | 2023-06-25 12:22:06 +02:00
Ricard Illa | 915f3c2a4c | feat: tcin is the primary key now | 2023-06-25 11:33:06 +02:00
Ricard Illa | 9f3a4865ce | refactor: some refactoring of file placements | 2023-06-24 19:08:14 +02:00
Ricard Illa | 02ad9fab8d | refactor: some refactoring of file placements | 2023-06-24 19:07:02 +02:00
Ricard Illa | 55b20bb897 | feat: update table if there's a primary key conflict | 2023-06-23 18:25:11 +02:00
Ricard Illa | 8e89404b76 | feat: import elements into database using beam | 2023-06-23 18:02:01 +02:00
Ricard Illa | 66782ec2ef | feat: added query to create table to airflow | 2023-06-23 16:23:21 +02:00
Ricard Illa | cbfbf42d53 | feat: added gtin13 column | 2023-06-23 15:29:57 +02:00
Ricard Illa | 3e42d55e7c | feat: added postgres and terraform | 2023-06-23 15:20:40 +02:00
Ricard Illa | 63e961e9a2 | feat: added module for reading and writing data | 2023-06-23 11:20:42 +02:00
Ricard Illa | 7a86988bd1 | misc changes | 2023-06-23 11:11:32 +02:00
Ricard Illa | a0c868f2f0 | refactor: generalize unit conversion function | 2023-06-23 10:30:07 +02:00
Ricard Illa | 9e3936fdc0 | fix: typo | 2023-06-23 10:20:08 +02:00
Ricard Illa | e716fc1cd3 | feat: added parsing dimensions | 2023-06-23 10:05:24 +02:00
Ricard Illa | 9c19835746 | doc: small doc added | 2023-06-22 17:36:49 +02:00
Ricard Illa | 050323d583 | feat: added clean_origin_name | 2023-06-22 17:32:08 +02:00
Ricard Illa | b018abe00e | feat: remove older file | 2023-06-22 17:11:49 +02:00
Ricard Illa | f2a996d42e | feat: tests for materials | 2023-06-22 17:11:31 +02:00
Ricard Illa | 691b8898a0 | docs: added docstring to parse_xml module | 2023-06-22 15:51:47 +02:00
Ricard Illa | b3069d4ca2 | refactor: put helpers in a separate folder | 2023-06-22 15:43:26 +02:00
Ricard Illa | 0b58d47acc | feat: handle malformed xml file | 2023-06-22 15:34:38 +02:00
Ricard Illa | 9b5ce6d36f | feat: more tests for parse_raw_specs | 2023-06-22 15:34:11 +02:00
Ricard Illa | 725114805a | feat: added parse_raw_specification | 2023-06-22 09:40:26 +02:00
Ricard Illa | 5a4bca756e | added skeleton for airflow DAG | 2023-06-21 19:12:03 +02:00
Ricard Illa | 02eaa4b7ff | added skeleton for beam etl pipeline | 2023-06-21 19:11:17 +02:00
Ricard Illa | b5ab3ca687 | feat: added files for containerized airflow | 2023-06-21 16:02:12 +02:00
Ricard Illa | 0a3323170f | data exploration | 2023-06-21 15:46:28 +02:00
Ricard Illa | 1e7c1eac1c | feat: added docker-compose for containerized notebook | 2023-06-21 15:45:49 +02:00