Intro
This week, I’m excited to explore AutoMQ, a cloud-native, Kafka-compatible streaming system developed by former Alibaba engineers. In this article, we’ll dive into one of AutoMQ’s standout technical features: running Kafka entirely on object storage.
Overview
Before we move on, let’s revisit Kafka’s design. Kafka uses the OS filesystem for data storage and leverages the kernel’s page cache. Rather than keeping as much data as possible in its own process memory and flushing it to the filesystem later, Kafka hands data to the OS, which places it in the page cache before flushing it to disk. All message reads and writes go through the page cache.
Modern operating systems usually borrow otherwise unused RAM for the page cache. Frequently accessed disk data is kept in this cache, so reads rarely have to touch the disk directly, which improves performance.
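To make the write path concrete, here is a minimal sketch in Python (not Kafka’s actual code). It assumes a simplified record layout, a 4-byte length prefix followed by the payload, which is not Kafka’s real on-disk format. The helper names `append_record`, `read_records`, and `demo` are hypothetical. The point it illustrates: a plain write returns once the kernel has copied the bytes into the page cache, a read right after it is served from that cache, and only an explicit `fsync` forces the data to the device.

```python
import os
import tempfile


def append_record(fd: int, payload: bytes) -> None:
    """Append one length-prefixed record (illustrative framing, not Kafka's)."""
    view = memoryview(len(payload).to_bytes(4, "big") + payload)
    while view:
        # os.write returns after the kernel copies the bytes into the
        # page cache; it does not wait for the disk.
        written = os.write(fd, view)
        view = view[written:]


def read_records(path: str) -> list[bytes]:
    """Read back every record; recently written data comes from the page cache."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            records.append(f.read(int.from_bytes(header, "big")))
    return records


def demo() -> list[bytes]:
    with tempfile.TemporaryDirectory() as tmp:
        # File name is only meant to resemble a log segment.
        path = os.path.join(tmp, "00000000000000000000.log")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            append_record(fd, b"hello")
            append_record(fd, b"world")
            # No fsync has happened yet, but the data is already readable.
            records = read_records(path)
            # Explicit durability point: ask the OS to flush to the device.
            os.fsync(fd)
        finally:
            os.close(fd)
        return records


if __name__ == "__main__":
    print(demo())
```

The sketch leaves flushing to the OS and calls `fsync` only once at the end. That mirrors the idea described above: the application appends to a file and relies on the page cache for both write buffering and fast reads.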
This design tightly couples computing and storage, meaning adding more machines is the only way to scale storage. If…