Zircon Fair Scheduler
> A NOTE ABOUT DEADLINES: While fair scheduling is appropriate for the vast majority of workloads, there are some tasks that require very specific timing and/or do not adapt well to overload conditions. For example, these workloads include low-latency audio / graphics, high-frequency sensors, and high-rate / low-latency networking. These specialized tasks are better served with a deadline scheduler, which is planned for later in the Zircon scheduler development cycle.
Those seem like important workloads. Does this imply that the deadline scheduler runs concurrently with the fair scheduler? Otherwise, what's the point of developing an ideal scheduler for common workloads if it cannot be used for critical workloads? Is it common to run two different schedulers in the same system?
Different scheduling algorithms and implementations have tradeoffs. Pick one based on your workload. No need to have only one for all applications.
You can boil the choice down to two dimensions: latency vs. throughput. Pick your poison.
For instance, I assume a common workload that would otherwise benefit from the fair scheduler has a fair chance of wanting to do low-latency audio. I believe Android has had this issue. As soon as the platform cares about low-latency audio it would need to abandon the fair scheduler?
Low-latency audio would fit live or studio audio production, not necessarily watching videos, listening to music, or doing video calls (where there is already network delay an order of magnitude larger than anything the scheduler would impart). "Low" will have a context-dependent meaning.
If you're doing low-latency audio or flight controls on an aircraft, for example, you would absolutely need to abandon the fair scheduler for one that could guarantee the deadlines you need are met. You'd be sacrificing "performance" in the throughput/efficiency sense of your processor for timing performance.
Think of the difference between doing statistics on a billion lines of data and flight controls on a missile. VERY different needs for scheduling, why not have a selectable algorithm?
> (where there is already network delay an order of magnitude larger than the scheduler would impart)
Not the right way to think about it; the connection latency is irrelevant. What is relevant is that you need to play audio in sync with the video, and that audio is coming to you approximately simultaneously with the video it's meant to be synced with.
Three. And they're blended, which makes the selection hard. Latency, throughput, tolerance for dropped work.
Yes, it's common, though in many cases it's an implicit rather than explicit part of the design—"deadline scheduling" is just what happens when you allow kernel-mode drivers to register CPU interrupts (usually, these days, coming from DMA completion events.)
It does imply that. Multiple concurrent scheduling algorithms is nothing new, Linux and MacOS both support per-thread algorithm selection.
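A rough sketch of what per-thread selection looks like on Linux (the worker and the priority value are made up for illustration; the RT thread needs root or CAP_SYS_NICE):

```c
/* One thread keeps the default fair policy (SCHED_OTHER) while a worker is
 * switched to the real-time SCHED_FIFO policy. */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static void *rt_worker(void *arg) {
    (void)arg;
    /* ... latency-sensitive work, e.g. an audio callback loop ... */
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, rt_worker, NULL);

    struct sched_param p = { .sched_priority = 50 };  /* 1..99 for SCHED_FIFO */
    int err = pthread_setschedparam(t, SCHED_FIFO, &p);
    if (err != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));

    /* This (main) thread stays under the fair scheduler. */
    pthread_join(t, NULL);
    return 0;
}
```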
What does the super scheduler that coordinates the two schedulers approximately look like?
RT tasks get prioritized over normal ones but only up to a configurable fraction of CPU time slices.
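On Linux, for example, that fraction is exposed through two procfs knobs; a minimal sketch that just reads them (paths and the typical 95% default are from mainline Linux, not guaranteed on every system):

```c
/* SCHED_FIFO/SCHED_RR tasks may use at most sched_rt_runtime_us out of every
 * sched_rt_period_us; the remainder is left for normal (fair) tasks. */
#include <stdio.h>

static long read_long(const char *path) {
    long v = -1;
    FILE *f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%ld", &v) != 1)
            v = -1;
        fclose(f);
    }
    return v;
}

int main(void) {
    long runtime = read_long("/proc/sys/kernel/sched_rt_runtime_us");
    long period  = read_long("/proc/sys/kernel/sched_rt_period_us");
    printf("RT tasks may consume %ld us of every %ld us\n", runtime, period);
    return 0;
}
```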
If you need even more guarantees than that, another option is to pin tasks to a set of CPU cores to isolate workloads from each other. The kernel can also be told to not use certain cores for interrupt handling or kernel-internal tasks. So with some effort it's possible to almost entirely dedicate a core to a single thread.
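A minimal sketch of the pinning part (assuming Linux; core 3 is an arbitrary choice), with the isolation pieces left to boot-time options like isolcpus= and irqaffinity=:

```c
/* Pin the calling thread to a single core with sched_setaffinity. Combined with
 * boot parameters that keep the kernel's own work off that core, this comes
 * close to dedicating the core to one thread. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(3, &set);   /* core 3: purely an example */

    if (sched_setaffinity(0 /* calling thread */, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to core 3\n");
    return 0;
}
```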
A simple hierarchy: when there is deadline work to do that work takes precedence, fair work gets the rest of the time. This is effective because deadline work has bounded execution time, whereas fair work is elastic and can adjust to use the available bandwidth.
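As an illustration only (a toy sketch in C, not Zircon's actual code), the pick-next logic for that hierarchy looks something like:

```c
/* Deadline work, whose execution time is bounded, always runs first; fair work
 * elastically absorbs whatever CPU bandwidth is left over. */
#include <stddef.h>
#include <stdio.h>

struct task { const char *name; };

/* Toy run queues; a real scheduler would use per-CPU priority structures. */
static struct task *deadline_q[8]; static size_t deadline_n;
static struct task *fair_q[8];     static size_t fair_n;

static struct task *pick_next(void) {
    if (deadline_n)                       /* deadline work takes precedence */
        return deadline_q[--deadline_n];
    if (fair_n)                           /* fair work gets the remaining time */
        return fair_q[--fair_n];
    return NULL;                          /* nothing runnable: idle */
}

int main(void) {
    struct task audio = { "audio callback (deadline)" };
    struct task build = { "compile job (fair)" };
    deadline_q[deadline_n++] = &audio;
    fair_q[fair_n++] = &build;

    for (struct task *t; (t = pick_next()); )
        printf("run: %s\n", t->name);
    return 0;
}
```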
Where is io scheduling in this?
It's my perception that on current workstations, for example, putting "ionice -c3" in front of any build I do is far more useful than simply nicing it.
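For reference, this is roughly the programmatic equivalent of "ionice -c3" on Linux; the constants are copied here from the kernel's ioprio definitions so the sketch stands alone:

```c
/* Put the current process into the "idle" I/O scheduling class, so its disk
 * traffic only proceeds when nothing else wants the device. */
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mirrors include/uapi/linux/ioprio.h */
#define IOPRIO_WHO_PROCESS  1
#define IOPRIO_CLASS_IDLE   3
#define IOPRIO_CLASS_SHIFT  13
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))

int main(void) {
    if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0 /* self */,
                IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)) != 0) {
        perror("ioprio_set");
        return 1;
    }
    /* exec the build from here, or just keep doing I/O-heavy work. */
    printf("now in the idle I/O class\n");
    return 0;
}
```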
IO scheduling is a separate problem, though it shares the same fundamental properties. The same goes for network packet scheduling. All of these are ongoing efforts. Stay tuned! :)
Well I look forward to seeing this develop.
Two of my primary areas of interest, audio engineering and VR both REALLY show how poorly modern operating systems do with user interactivity when it counts.
There are many references to multiple-CPU systems throughout the document. Maybe I missed something, but I didn't know Fuchsia was aimed at systems like that. I am no expert, but aren't the vast majority of multiple-CPU systems servers or high-end workstations? If Google can supply their own server OS, Linux could lose a lot of support and funding.
"cpu" is "core". Almost all phone CPUs are multi-core. The low end of phone (and Raspberry Pi 2 V1.2 and newer) CPUs is quad core ARM Cortex A53, which is a small slow in-order design, similar to the original Intel Pentium from 1995. Older low end phones used the 32bit quad core ARM Cortex-A7. single core ARM11 phones are extinct.
They don't mean multi-socket systems, and I don't see any mention of NUMA, which is the interesting case for servers: RAM is connected to a memory controller in one socket, and reaching it from a CPU in a different socket requires extra hops, so some memory addresses are more distant than others and schedulers should take that into account to achieve good performance.
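As a rough illustration (assuming Linux with libnuma installed; link with -lnuma), a program can see exactly the topology a NUMA-aware scheduler has to reason about:

```c
/* Print which NUMA node each configured CPU belongs to; memory attached to the
 * same node is "close", memory on other nodes costs extra interconnect hops. */
#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }
    int ncpus = numa_num_configured_cpus();
    for (int cpu = 0; cpu < ncpus; cpu++)
        printf("cpu %2d -> node %d\n", cpu, numa_node_of_cpu(cpu));
    return 0;
}
```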
And phone CPUs can be more complex than most desktops, afaik big.LITTLE is fairly common (multi core, with different performance / power tradeoffs between the cores).
https://lwn.net/Articles/501501/ has some pointers.
big.LITTLE is fascinating, and I'm somewhat curious to see if Apple tries to do something along those lines as it supposedly pushes towards ARM chips in their machines.
Google have added some interesting stuff to Android to help it learn over time if tasks should be allocated to the bigger cores or the lower powered smaller ones, based on all sorts of metrics, and I believe this stuff is making its way upstream. There's certainly some really interesting potential around that.
Apple chips are big.LITTLE too, at least since the 2016 A10 in the iPhone 7.
https://en.wikichip.org/wiki/apple/ax/a12
https://en.wikichip.org/wiki/apple/ax/a10
https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-re...
Sure, I should be clear I was thinking about OSX and laptops/desktops, rather than mobile devices.
I was just saying the low-end chips have only "little" cores, no "big" cores, but they are still multi-core. AFAIK nobody makes single-core Cortex-A chips. Single-core Cortex-M chips are made, but those are really puny.
Hilariously, it is common to see SoCs, like the i.MX8, that have both Cortex-A and Cortex-M cores. You can do fun stuff like run different operating systems on them too. They can be used for real-time applications, safety-critical functions, etc.
And by "fun" I mean not-fun, because you've got multiple build processes and releases to manage.
But I think that's not used as a multiprocessor, with threads getting scheduled on either (or both) depending on power policy. The embedded cores are, I think, used as separate systems running a separate OS that happen to have access to the same DRAM, like a peripheral that can do DMA.
Thinking of running that as a single multiprocessing system I'm reminded of the bug caused by Samsung doing big.LITTLE with cores that have the same instruction set but differently sized cache lines: https://news.ycombinator.com/item?id=12481700
> But I think that's not used as a multiprocessor, with threads getting scheduled on either (or both) depending on power policy. The embedded cores are, I think, used as separate systems running a separate OS that happen to have access to the same DRAM, like a peripheral that can do DMA.
Right, I didn't mean to imply otherwise.
There are variations on this theme where the two types of cores are also cache coherent with each other.
I'm no expert, but when they talk about multiple CPUs I understood them to mean multiple cores, not necessarily multiple CPUs on different motherboard sockets. Even midrange phones today typically have CPUs with multiple cores.
Google may have some intention of running Fuchsia on servers, but even if you're developing a kernel for pocket mobile devices, you're still going to want to handle multiple cores.
Whether this signals Google's intentions to stop supporting Linux, who knows, but there are still a lot of other organizations invested in supporting Linux on servers.
Most (all?) phone SoCs these days are multi-core. My pixel 2 has 8 cores (4 big, 4 little) for example.
Hopefully the industry learned its lesson with Google and its handling of the Android Open Source Project [1].
I don't think manufacturers of devices can bet their futures on Google.
1: https://arstechnica.com/gadgets/2018/07/googles-iron-grip-on...
Well, Samsung wrote their wonderful OS named Tizen. Not sure why they still sell Android phones. And if a company as big as Samsung couldn't do it, then the lesson for the industry would be that writing an OS for their own hardware is pretty much a guaranteed failure.
Tizen was a security laughing stock, iirc. This is from 2017: https://arstechnica.com/gadgets/2017/04/samsungs-tizen-is-ri...
Anecdata, but my tiny ARM server at home has eight cores on one SoC and costs $80. I believe my phone also has eight cores.
$80 sounds nice! Which one is it?
Odroid hc2
https://cchalpha.blogspot.com/2019/03/bmq-scheduler-call-out...
Someone made a scheduler on Linux based on some of the ideas here. It's included in the postfactum (linux-pf) patch set, I believe, which might have packages for your distro.
No, that's based on the deprecated multi-level round-robin scheduler, which we find tremendously amusing. :)
Oh gosh well thank you for pointing that out
Who's "we"?
This post has sky-rocketed to the top! I'm genuinely curious: can someone explain what's cool/interesting/important about this (maybe ELI5)? Thanks!
Well, Google intends for this new OS to replace Android. They'll need to convince the public that this new OS is somehow better than Android, which everyone has come to know and love. It seems that they believe the best way to do that is a grassroots approach beginning with tech discussion hubs like HN and Reddit.
Of course, they could have just meme'd hard about the fact that they're moving away from evil Oracle technology and we'd have already all been on board.
Nothing at all has demonstrated that they plan for it to replace Android -- that was a narrative various tech blogs invented. Nor would there be any benefit in moving away from the Java-inspired/cloned underpinnings of the Android user layer.
Google has a variety of initiatives, and they really like reinventing things (which can sometimes yield great outcomes). This is a kernel that is in contrast with Linux.
https://www.tomsguide.com/us/google-fuchsia-os-replace-andro...
From the article, the OS allows for full compatibility with all Android apps. Furthermore, it notes that Google is going out of its way to avoid mentioning Android anymore.
Of course, if I were Google and trying to sell the public on my new OS, I'd want them all to think that I'm not scrapping the old OS so that they feel they have a choice.
"the OS allows for full compatibility with all Android apps"
The article does not say that. The article mentions an ART target for Fuchsia -- you have to build from something. That is approximately 0.1% of the way towards full compatibility. And for that matter you can run Android apps on a load of targets (although far from full compatibility), but that doesn't mean that they're replacing Android.
Google may absolutely replace Android -- they've made loads and loads of mistakes along the way -- but the way people keep arguing it doesn't make sense, using examples that jettison the parts that work well and somehow keep the parts that don't work well (which includes ART, as an aside). And indeed I'm falling into this same trap while talking about Fuchsia like it's a kernel, when really the kernel is a small part and they're, at a very small scale, spitballing a new take on virtually every part of the system.
Regarding Google distancing itself from the Android name, that's just branding. To quote one analysis -- "Android sounds technical, has baggage, and might be stale". They've had enough missteps that it's an anchor more than a lift, so it makes sense that they stop highlighting it.
> Android, which everyone has come to know and love.
Haha, honestly now. If my Android didn't cost $700 I would long since have smashed it to bits. Its scheduler is total garbage, to the point where Google's own media apps like YouTube and Music drop samples while the screen is redrawing. Who "loves" Android? To me it is the Win98 of mobile operating systems.
When you only have two real choices, each with their own significant set of distinct problems, I think the term "love" can be substituted for "hate the least". Android, which everyone has come to know and hate the least. Well, not everyone, but my point is the same regardless.
I left Symbian for Android 2.1; after several Android devices, I became part of the 10% WP market share in Europe, and I'm still using one of those phones as a secondary device.
Android has been pretty solid for me. I'm on my second Android device, this one cost me like $150 or so.
I've never had a higher-end phone, so I'm not missing anything with the mid-range. Web works great, as do most apps.
That's a turn of phrase; it doesn't mean literally everyone loves it, but it _is_ well known and probably the biggest consumer OS on the planet in terms of volume.
Also though, I do love it. Different strokes and all
It is very easy to convince the public, OEMs are the ones that need to be convinced.
ART is already being ported into Fuchsia and Linuxisms are not part of NDK stable APIs.
So managed Android apps will just work, and NDK libraries only need to be recompiled.
In that case it wouldn't have replaced Android at all -- it would have simply replaced the Android kernel, which happens to be Linux. And at that point you have to ask what is gained, and at this point the answer is "nothing".
> And at that point you have to ask what is gained, and at this point the answer is "nothing".
Not having to care about the permanent breaking of internal kernel APIs is not "nothing".
How, exactly, will they "not have to care"? The identical ramifications occur with Fuchsia as they do with Linux! This is farce.
Every single problem that Android has had, from low-latency audio issues (they've rebuilt that a dozen times in a dozen cartoonish ways) to driver stagnancy, is completely and directly a result of Google choices and implementations (and they do the same thing again and again! It's remarkable). The notion that Google is going to fix all of their own self-sabotage by starting anew is comedic in a sense, and is the folly of countless foolish projects. "We keep fucking up again and again...let's start from scratch and this time we'll surely do it right!"
This time, however, it'll be different...
It makes a difference whether the fault for this lies with the Google engineers or with the kernel maintainers.
Google has never been required to commit upstream, but they've chosen to do so because of the ease of integrating changes downstream. They could have forked off and bashed the code into whatever form they wanted to. Breaking anything and everything.
But they didn't. They kept keeping it in sync. They probably had a reason for doing that.
Google has chosen to cherry pick commits to upstream, there is still plenty of stuff on AOSP, and much more on internal Googleplex repos.
Well a change of the kernel's license from GPLv2 to BSD is certainly a change. I'm sure the OEMs think that gains them something.
Android OEMs are not in the business of writing kernels, though, and the changes they do are minimal. And their HAL/chipset code -- the thing they might actually care about as IP -- is not governed by the GPL at all, nor is any of the enormous volume of system and userspace code they write.
It's a neat initiative and might yield something interesting, but if the Linux kernel were replaced by Fuchsia the ramifications would seemingly be very minor. Android's many issues have never been at the kernel level.
The components of their HAL/chipset code in the kernel are most of the time governed by the GPL. We can take some examples apart if need be.
After Project Treble that only applies to the legacy HAL code.
Even fully after Treble, there are absolutely still kernel drivers.
Qualcomm has no obligation to GPL their kernel drivers. Nor does Nvidia, or any other LKM maker. The GPL has never been held to demand that kernel drivers be open sourced. That's why it's impossible to make a runnable AOSP image for many devices.
Vendors still make generally minor changes to the kernel, but these are not unique or special or some competitive edge. They're just nuisance necessities.
It's way more grey than that. Nvidia claims that their driver doesn't need to be opened because A) it comes from a codebase that existed prior to the Linux port, and B) it doesn't actually import any Linux code or symbols (that's left to a dual-licensed shim layer), so in a legal sense their driver isn't 'derived from' the Linux kernel and isn't subject to the viral parts of the GPL. All of this is the greyest of grey areas. Towards that end, Nvidia has been committing to Nouveau lately for Tegra chips, because I think they realize how precarious their legal situation is for chips essentially designed to run Linux.
That reasoning doesn't apply to most kernel drivers, and those vendors are just openly in conflict with the GPL.
Sure there are, they are known as "legacy HALs" on Project Treble documentation.
Literally any driver that has a piece that runs in interrupt context has to have (in part) a kernel mode driver under Linux.
On top of that, GPUs generally have to have a kernel component, even under systems that put as much as possible in user mode like seL4, because they have their own MMUs that can subvert kernel integrity.
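To make the "interrupt context" point concrete, here's a bare-bones sketch of a Linux kernel module registering an IRQ handler (the IRQ number and name are made up; a real driver gets them from its bus/device):

```c
/* An interrupt handler can only live in kernel code; everything it defers can
 * be pushed out to userspace, but this part cannot. Builds as an out-of-tree
 * module against kernel headers. */
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/module.h>

#define EXAMPLE_IRQ 42   /* placeholder; real drivers get this from the device */
static int cookie;       /* dev_id token required for shared IRQ lines */

static irqreturn_t example_isr(int irq, void *dev_id)
{
    /* Acknowledge the hardware here, defer the heavy lifting elsewhere. */
    return IRQ_HANDLED;
}

static int __init example_init(void)
{
    return request_irq(EXAMPLE_IRQ, example_isr, IRQF_SHARED, "example", &cookie);
}

static void __exit example_exit(void)
{
    free_irq(EXAMPLE_IRQ, &cookie);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```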
Again, yes there are drivers running in the kernel space, yes they are mostly GPL, no Google isn't re-writing them into Binderized HALs, they can stay as Passthrough HALs.
None of that prevents all new drivers, except the ones marked as legacy, from being required to be Binderized.
The kernel code of a driver from OEMs allergic to the GPL is hardly different from a signal handler, implementing the minimal set of kernel code and deferring everything else to userspace.
Sort of...
I'm going to go out on a limb here and guess that you don't actually write kernel drivers like I do professionally.
Ah, appealing to authority as a last argument. Naturally, any opinion and knowledge I might have as an Android developer, or any former professional experience in systems programming, is worthless.
Nice.
More that I recognized a position that comes from a cursory understanding of the situation. About the level I'd expect from someone who has taken a few classes on it but lacks true experience in the domain. One of those "in theory, theory and practice are the same. In practice, they're not" sort of situations. You've taken those ideas to conclusions that could have been valid, but the industry took other options.
Now, as someone who has spent a decade in that domain, I'm down to continue having a conversation about it and helping overcome a few of those hurdles, but you need to be willing to hear something that isn't your own conclusions.
Just like you need to be willing to accept that GPL won't stay forever on Android, but you are right I don't have a clue.
As for systems programming experience, you would be surprised, but I am not here to justify myself with appeals to authority.
Linux has countless LKMs (modules that run in the kernel, but are not -- broadly -- bound by the GPL). Many, many are closed-source and proprietary. Others aren't because they don't have to be.
You hinted that this transgresses the GPL, and there are those that argue that virtually anything transgresses the GPL. But a lot of very large corporations say otherwise, and there have been zero successful challenges against it.
Well, they're not moving away from Android per se. ART is being ported to run on Fuchsia.
It is sort of replacing Android, but not really. But it sort of is.
The kernel is designed to have a stable binary interface for drivers. This has been a problem with Linux-based Android devices (all of them so far), because the OEMs (or more properly the chipset vendors like Qualcomm) will only support a particular chipset for a short amount of time (maybe a couple years, it depends).
After that, it becomes hard to bring new kernels to the platform, so we all end up with phones stuck at whatever major release was out at the time, with low prospects of upgrades.
If you can make a stable binary API, and furthermore keep to the micro-kernel model, then most of the OS can be easily upgraded, because you don't need (and aren't going to get) new versions of the device drivers for the chipset.
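Loosely speaking (this is a generic illustration, not Fuchsia's actual driver interface), a "stable binary API" for drivers means something like a versioned table that the OS and the vendor binary both agree on, so the two can be upgraded independently:

```c
/* A vendor driver exports this table once; as long as its layout stays
 * compatible, new OS releases can keep loading the old driver binary. */
#include <stddef.h>
#include <stdint.h>

#define DRIVER_ABI_VERSION 1u   /* bumped only for incompatible layout changes */

struct driver_ops {
    uint32_t abi_version;                          /* must equal DRIVER_ABI_VERSION */
    int  (*bind)(void *device_ctx);                /* device appeared */
    void (*unbind)(void *device_ctx);              /* device removed */
    int  (*message)(void *device_ctx,              /* request/response messaging,  */
                    const void *req, size_t len);  /* e.g. over microkernel IPC    */
};

/* The OS side checks the version before calling anything else. */
static inline int driver_compatible(const struct driver_ops *ops) {
    return ops && ops->abi_version == DRIVER_ABI_VERSION;
}
```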
Also, there's an effort towards better low-latency real-time support. This is critical for AR/VR applications with tight rendering deadlines.
> Well, Google intends for this new OS to replace Android.
Can you point me to anything that demonstrates that intention?
Currently it's just rumors that it will replace Android, based on Fuchsia and Flutter's target devices. You can see a summary on the Wikipedia page: https://en.wikipedia.org/wiki/Google_Fuchsia
I don't care about the Android aspects. I'm just fascinated by OS development. Fuchsia is a new capability-secure microkernel OS backed by Big Google, so it's got a lot of potential and engineering support behind it. I read and upvote pretty much anything about Fuchsia.
Depending on how internal politics play out at Google, Android's kernel might eventually be replaced by Fuchsia (ART is already being ported, commits are visible on AOSP).
And Fuchsia is microkernel-based.
tl;dr: Good question, it's not answered anywhere in the family of repositories that Zircon touches.