- Roles: Self-Study, Software Developer, Technical Writer
- Skills: SQL, PostgreSQL, Bash, AsciiDoc, Mermaid & Graphviz
Curious about transitioning from publishing engineer to data engineer, I’ve been looking into the tools and processes involved in the field. I have experience with transactional SQL for application development, but this course was my first in-depth look at data warehousing, with a star schema and surrogate keys for joining facts to dimensions.
On my Linux laptop, I chose to use PostgreSQL,
converting Baraa’s samples from Microsoft SQL Server throughout the course.
Because it was PostgreSQL, I used psql to run the DDL, transforms, and
dynamic procedures from a small Bash script. The script runs a pipeline
from bronze to silver to gold, running tests and logging messages
along the way to exercise good serviceability practices.
Baraa’s course is on YouTube, and my project repository is on GitHub.
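To give a feel for how such a runner can be put together, here is a minimal sketch, not the script from the repository. The `log` and `run_layer` functions and the `PSQL` override are hypothetical names chosen for illustration. The only psql features it relies on are the `-f` and `-v ON_ERROR_STOP=1` options and the standard `PG*` connection environment variables.

```shell
#!/usr/bin/env bash
# Sketch of a bronze -> silver -> gold runner. Function names and the PSQL
# override are hypothetical; they are not taken from the project repository.
set -euo pipefail

# Command used to execute one SQL file. psql reads its connection settings
# from the standard PG* environment variables (PGHOST, PGDATABASE, PGUSER).
# Override PSQL (for example with "echo") to dry-run without a database.
PSQL="${PSQL:-psql -v ON_ERROR_STOP=1 -f}"

# Print a timestamped message to stderr so it stays separate from query output.
log() {
  printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" >&2
}

# Run every SQL file given for one layer, stopping at the first failure.
run_layer() {
  local layer="$1"; shift
  local file
  log INFO "Starting ${layer} layer"
  for file in "$@"; do
    log INFO "Running ${file}"
    # $PSQL is intentionally unquoted so its options split into separate words.
    if ! $PSQL "$file"; then
      log ERROR "${file} failed; aborting pipeline"
      return 1
    fi
  done
  log INFO "Finished ${layer} layer"
}
```

A call such as `run_layer silver src/silver/validate_silver.sql` would then run that file and log around it. Keeping the log messages on stderr means the NOTICE lines and query results from psql can be redirected independently.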
===================================================
Validating Silver Layer...
===================================================
---------------------------------------------------
CRM Data
---------------------------------------------------
-----------------------
crm_prd_info...
-----------------------
>> Check for nulls or duplicates.
psql:src/silver/validate_silver.sql:81: NOTICE: ✔ No duplicates found.
DO
>> Check for unwanted spaces
psql:src/silver/validate_silver.sql:96: NOTICE: ✔ No unwanted spaces found.
DO
>> Check for nulls or negative values in cost
psql:src/silver/validate_silver.sql:111: NOTICE: ✔ No negative values or nulls found.
DO
>> Data standardization & consistency
psql:src/silver/validate_silver.sql:126: NOTICE: ✔ No unwanted values found.
DO
>> Check for invalid date orders
psql:src/silver/validate_silver.sql:141: NOTICE: ✔ No invalid dates found.
DO
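Each check in the log above pairs a heading with a NOTICE and the `DO` command tag, which is the pattern an anonymous PL/pgSQL block produces. The following is a sketch of how one such check could be written, under stated assumptions: the `duplicate_check_sql` function, the `prd_id` column, and the choice to raise an exception on failure are illustrative guesses, not the contents of `validate_silver.sql`.

```shell
#!/usr/bin/env bash
# Sketch of one validation check. The function name, the prd_id column, and
# the failure behaviour are assumptions made for illustration.
set -euo pipefail

# Print a psql script that reports duplicate or null keys in silver.crm_prd_info.
# The heredoc delimiter is quoted so Bash does not expand $$ into its process ID.
duplicate_check_sql() {
  cat <<'SQL'
\echo '>> Check for nulls or duplicates.'
DO $$
DECLARE
    bad_keys integer;
BEGIN
    -- Count key values that are null or that occur more than once.
    SELECT count(*) INTO bad_keys
    FROM (
        SELECT prd_id
        FROM silver.crm_prd_info
        GROUP BY prd_id
        HAVING count(*) > 1 OR prd_id IS NULL
    ) AS offending;

    IF bad_keys = 0 THEN
        RAISE NOTICE '✔ No duplicates found.';
    ELSE
        RAISE EXCEPTION '✘ % duplicate or null key(s) found.', bad_keys;
    END IF;
END
$$;
SQL
}
```

Piping the result into psql, as in `duplicate_check_sql | psql -v ON_ERROR_STOP=1`, would run the check against the database named by the `PG*` environment variables. Raising an exception rather than a warning is a design choice in this sketch so that a failed check stops the run.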
