4 Comments

I think most schemas have become a bit less necessary since SSDs and Python became common in data analysis.

They're still needed in some situations, but with cloud computing, where compute is cheap and fast, a schema is mainly relevant in big teams and corporations that need good data governance, auditing, permission control, history tracking, and so on. For pure data analysis, a star schema may not be as useful.

Such a good consolidation of knowledge!

Every time I read about the Star Schema I struggle with one thing: what is the process for taking the data from the OLTP system and putting it into the OLAP system? I'm not talking about a DMS like Data Factory, but about how to store the data.

Now that I'm writing this, another question comes to mind: in a lakehouse (I use Databricks + Azure DL), can we just extract everything from the OLTP system, put it into a staging or raw layer, and then model our facts and dimensions in the bronze layer?

I'd appreciate it if you could clarify those questions for me!

A good refresher on the basics.

BTW, Start Schema should be Star Schema

Oh, thanks a lot, Jove! I edited it.
