S3 as the Universal Back End
medium.com
I'm still thinking about this post. S3 as a backend plus the Rust/Arrow ecosystems are making the unbundling of data and compute a clean, straightforward reality for data engineering systems. I think this will be scary for companies that charge by storage.
I believe the main pitfall of this approach is what the author of the original article mentioned about read/write and query latency from S3. I like the old Twitter engineering blog post on their distributed key-value store, Manhattan (https://blog.x.com/engineering/en_us/a/2014/manhattan-our-re...). For any serious real-time data, we probably still need some form of fast read/write storage, but analytics workloads or long-running tasks can definitely benefit from cheap blob storage.
Using S3 as a storage layer is certainly interesting, especially the argument for hosting on your customer's cloud, which would reduce hosting costs.
How well would this scale to a few hundred thousand users per day?