Anyone have any good reads on the architecture of having separate steps of "dump raw data quickly with no transformations" and "process and model the data"? I'm wondering if that would make sense for my needs, or if it only makes sense at the huge scale of Uber.
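For anyone unfamiliar with the pattern being asked about, here's a minimal sketch of the two-step split: step 1 dumps raw payloads to a landing area with no transformation, and step 2 runs separately to parse and model them. All paths, field names, and helpers here are made up for illustration, not from any particular system.

```python
import json
import pathlib

# Hypothetical landing area for untouched raw data (step 1's output).
LANDING = pathlib.Path("landing/events")

def ingest_raw(events, batch_id):
    """Step 1: dump raw events as-is (JSON lines), no parsing or cleanup.

    Keeping this step dumb makes ingestion fast and means a buggy
    transform can always be re-run against the original raw data.
    """
    LANDING.mkdir(parents=True, exist_ok=True)
    path = LANDING / f"{batch_id}.jsonl"
    with path.open("w") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")
    return path

def transform(path):
    """Step 2: a separate job reads the raw dump and models it."""
    rows = []
    with path.open() as f:
        for line in f:
            event = json.loads(line)
            # Modeling happens here, decoupled from ingestion:
            # rename fields, coerce types, normalize units.
            rows.append({
                "user_id": event.get("user"),
                "amount_cents": int(float(event.get("amount", 0)) * 100),
            })
    return rows
```

The upside even at small scale is replayability (raw data is never lost to a bad transform); the cost is extra storage and a second job to operate.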
I think they're now planning to move to GCP for their data pipelines, etc.