Scheduling workloads in a cluster is not a new topic and has years of research behind it. But scheduling has recently gained attention thanks to the popularity of containers (Docker) and the rise of scheduling frameworks that can run more than just MapReduce and OpenMP jobs (Mesos and Kubernetes).
Having the opportunity to work on Mesos at Mesosphere and becoming a PMC member of the project has made me much more aware of what’s going on in this space. However, as I see areas where we can improve scheduling in Mesos or Spark on Mesos, along with some problems we are facing, I’d like to go back to the scheduling literature so that I have a better understanding when considering how to improve our schedulers.
Motivated by these reasons, I’ve decided to cover recent datacenter scheduling papers in chronological order. Inspired by Adrian Colyer’s `The Morning Paper`, I’d like to post a short summary of each scheduling paper on this blog. I will also add my personal comments and thoughts, from a Mesos perspective, after reading each paper.
I’m also very thankful to Malte Schwarzkopf (MIT, co-author of Google Omega and Firmament) for providing me with a list of papers that I should cover.
I’ll update the list below as I add more blog posts, so stay tuned if you’d also like to learn more about datacenter scheduling!
Papers covered
- Improving MapReduce Performance in Heterogeneous Environments
- Quincy: Fair Scheduling for Distributed Computing Clusters
- Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling
- Jockey: Guaranteed Job Latency in Data Parallel Clusters
- alsched: Algebraic Scheduling of Mixed Workloads in Heterogeneous Clouds
- Tetrisched: Space-Time Scheduling for Heterogeneous Datacenters