Alpine Linux is reducing dependencies on Busybox

gitlab.alpinelinux.org

200 points by senzilla 3 years ago · 86 comments

g051051 3 years ago

That's not what I read in the link:

> More generally, and this is more a matter of opinion and totally debatable, I would like functionality to be progressively stripped from busybox-initscripts, which is a package that gathers a bunch of miscellaneous policy scripts that are only related by the fact that their mechanism is provided by busybox. I don't think this package makes sense from a semantics point of view; it is more logical to provide the policy scripts classified by service, no matter whether or not the implementation of the service is done by busybox. To me, ideally, busybox-initscripts would be empty, and we'd have virtual packages for every service that is currently defined in it, so support for alternative implementations can be added over time. This would also ease the path to getting out of busybox, or at least providing alternative coreutils/low-level utilities implementations, is there is ever a will from Alpine to do so.

So it sounds like they just want to change how the scripts are packaged. The only mention of getting away from busybox is at the end, which is qualified with "[if] there is ever a will from Alpine to do so".
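
For what it's worth, a rough sketch of what "virtual packages for every service" could look like in APKBUILD terms (every name below is made up for illustration, not an actual Alpine package):

    # hypothetical APKBUILD fragment: the mdevd packaging declares that it
    # provides the same virtual service name as the busybox-based packaging,
    # so boot scripts can depend on the interface rather than on busybox
    pkgname=mdevd-openrc
    pkgver=0.1
    pkgrel=0
    depends="mdevd openrc"
    provides="dev-manager-openrc"   # assumed virtual name, invented for this sketch
    provider_priority=10            # lets apk pick a default among competing providers

    package() {
        install -Dm755 "$srcdir"/mdevd.initd "$pkgdir"/etc/init.d/mdevd
    }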

  • wpietri 3 years ago

    I like this part:

    > I don't think this package makes sense from a semantics point of view; it is more logical to provide the policy scripts classified by service, no matter whether or not the implementation of the service is done by busybox.

    That's a lesson I see learned over and over. Something like, "Group by meaning, not mechanism."

    • WatchDog 3 years ago

      It's one of the many system factoring challenges. It's difficult to define a hard and fast rule as to which is better. Often some combination of the two is ideal, particularly as the system grows.

    • andy_ppp 3 years ago

      Is this so that you don’t end up creating the wrong abstraction when inevitably over time the meaning changes and you end up special casing the implementation?

  • tremon 3 years ago

    From Ariadne's update:

    > The TSC [..] has concluded that there is a general need to begin decoupling hardcoded preferences for BusyBox from the distribution.

    That's a bit stronger than just "we want to reorganize our script packaging". It still isn't explicitly "reducing dependencies on Busybox", but removing hardcoded dependencies is a prerequisite for that.

    • dundarious 3 years ago

      Debatable. I view it as a restatement of the goal to package scripts by service, instead of having a grab bag package for scripts, and one tied to a specific impl at that.

      > More generally, and this is more a matter of opinion and totally debatable, I would like functionality to be progressively stripped from busybox-initscripts, which is a package that gathers a bunch of miscellaneous policy scripts that are only related by the fact that their mechanism is provided by busybox. I don't think this package makes sense from a semantics point of view; it is more logical to provide the policy scripts classified by service, no matter whether or not the implementation of the service is done by busybox. To me, ideally, busybox-initscripts would be empty, and we'd have virtual packages for every service that is currently defined in it, so support for alternative implementations can be added over time.

  • rkangel 3 years ago

    The underlying goal (in the long follow up message) seems to be the desire to move from mdev (in busybox) to mdevd (not in busybox) so the title is to some degree justified. Although the situation is a complicated and subtle one, as per usual with Linux distributions.

rahen 3 years ago

For the trivia: this is being pushed by Laurent Bercot (skarnet), creator of s6, execline and many other tools. He's also working on making s6 Alpine's init and rc system.

https://skarnet.org/software/s6/

https://skarnet.com/projects/service-manager.html

  • 1vuio0pswjnm7 3 years ago

    Further trivia: Going one level up from busybox, the maintainer of dash is the creator of runit. Both runit and s6 are "inspired by" djb's daemontools. In truth, neither [wc]ould have been created if not for daemontools.

  • wpietri 3 years ago

    Oh, interesting! Anybody used s6? I like the theory, but for me what matters is mainly the practice.

    • nisa 3 years ago

      Used s6-overlay[1] to start a lot of daemons in a docker-image for demo purposes - postgres, tomcat, mysql, php-fpm, apache (don't ask why ;) - s6 worked really well and was reliable and stable - I enjoyed it very much. It was also possible to reliably pass SIGTERM to the daemons in the image for clean shutdown, and it was easy to configure logging to stdout with a prefix. Modelling dependencies (waiting on the database before starting the app etc.) is possible via shell scripts.

      It's super flexible, but out of the box it's more a collection of powerful tools than a complete package - and that's good. It's in the tradition of djb's daemontools and is very unix - as in, it doesn't talk a lot and you'd better know how each part works - but, and that's really cool, it's modular and simple, and once you get a grip on it you can easily reason about it. systemd takes a completely different approach and also solves a somewhat different problem - this is small pieces of lego that compose well instead of one big chunk of glib/dbus/glibc-only C code.

      1: https://github.com/just-containers/s6-overlay
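
      A minimal sketch of the layout (s6-overlay v2 conventions): each supervised process gets a directory under /etc/services.d/ with an executable run script, and the image's ENTRYPOINT is s6-overlay's /init. The daemon name below is a placeholder:

          #!/bin/sh
          # /etc/services.d/myapp/run - exec the daemon in the foreground so s6
          # can supervise it and forward SIGTERM to it on container shutdown
          exec my-daemon 2>&1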

    • bruce_one 3 years ago

      We do - and have found it very reliable and I really like it :-)

      One of the members of our team finds it more complex when it comes to diagnosing why things aren't running/starting as expected, but that's also down to the complexity we have around s6 with other setup scripts (we use it to manage the full suite of processes in our product).

      Hence, they're not the biggest fan of it (and would talk negatively about it), but I _think_ s6 isn't really the culprit and instead the other complexity is.

      Although, when things go wrong it can be a little bit harder to chase down than it was with our former manual "start this process" type scripts... But, you can just `./run` the run script which may tell you enough :-)

      • wpietri 3 years ago

        Thanks! That makes sense to me. Personally, I usually like tools and practices that can feel rough when there's too much complexity. When there are problems, I think it's generally good that people feel like there are problems.

    • ajnin 3 years ago

      linuxserver.io uses s6-overlay for all their images, which are very popular in the self-hosting community.

  • jart 3 years ago

    They're really still going through with that? I didn't expect I'd have to find another distro so soon.

    • stock_toaster 3 years ago

      Based on my reading of [1] and [2] it sounds a bit more nuanced than "just incorporate s6 into alpine".

      [1]: https://ariadne.space/2021/03/25/lets-build-a-new-service-ma...

      [2]: https://skarnet.com/projects/service-manager.html

    • CharlesW 3 years ago

      Can you elaborate on why this is relationship-ending for you? https://skarnet.org/software/s6/why.html makes it seem like a reasonable direction.

      • pronoiac 3 years ago

        I'm not the OP, but a past employer used it on some legacy systems. It makes some choices, like avoiding spinning up new processes, that feel like it's been optimized for embedded systems. It uses a new-to-me language that we never invested time in. So doing things like "send polite shutdown signals, wait 30 seconds, send harsher shutdown signals" became a matter of separate scripts or documentation. In short, you'd need more context for working with it than with other systems.

        • skarnet 3 years ago

          Your information is incorrect.

          "Avoiding spinning up new processes" is incorrect characterization of s6. Processes are not a scarce resource; spawning a process is not a costly operation in the context of process supervision, even on embedded systems. s6 focuses on optimizing some metrics that are indeed important to embedded systems, like RAM use and code path length, but "spinning up new processes" isn't one of these metrics.

          It is not, and has never been, necessary to learn execline, the scripting language you're speaking of, in order to use s6. execline is used _internally_, and you can also use it in your own scripts if you so choose, but it is not a requirement.

          "Sending a polite shutdown signal, waiting for some time, and sending a harsher shutdown signal" is a matter of exactly one command: s6-svc -d. That is precisely one of the benefits of s6 over other daemontools-style supervisors: it handles this sequence natively.

          I welcome fact-based criticism of s6. I do not welcome FUD-based criticism.

    • rahen 3 years ago

      It will only be an option; you can keep sysvinit + openrc if you prefer it that way.

    • dannyobrien 3 years ago

      what's the objection?

  • silon42 3 years ago

    How about alpine-systemd?

    • nine_k 3 years ago

      Why would Alpine need systemd? Why would you need it in a container?

      • Deukhoofd 3 years ago

        Why would you assume Alpine is only used for containers? I use it for my own local home server, as it's very lightweight and easy to tweak to how I want it.

      • rahen 3 years ago

        I'm not a fan of using Alpine in containers.

        Use it baremetal for your servers and desktops, then use Debian/CentOS containers on top when something is missing.

        • mkesper 3 years ago

          Sadly, when playing the CVE game, Debian containers are no match for Alpine. For CentOS in a container I can see absolutely no reason, except if the software is only tested against it.

      • josteink 3 years ago

        To make sure that once services are up and running, they keep running and are restarted automatically? To have good, centralized logging? To manage in-container services the same way you do on the container host?

        There's plenty of good reasons for one to want to do so.

        • spockz 3 years ago

          I think it is counter to best practices to run multiple services in a single container. Although maybe you just need to with some proprietary software.

        • raverbashing 3 years ago

          None of those things are actually needed in a container.

          If you're actually doing those things inside one, you're literally doing it wrong

          Yes you might have some special cases where some of this is needed but I'd use anything besides systemd

          • btbuilder 3 years ago

            Some use-cases do require an init-providing process of some sort but you likely want to use tini.

            See for more details: https://github.com/krallin/tini/issues/8#issuecomment-146135...
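
            If all you actually need is zombie reaping and signal forwarding, Docker can also inject tini for you (image and command below are placeholders):

                # --init makes tini PID 1 inside the container, so zombies get
                # reaped and signals are forwarded to the actual workload
                docker run --init --rm my-image my-daemon --foreground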

          • otabdeveloper4 3 years ago

            a) Starting all your services and shims and mocks inside a CI/CD tests container is not "doing it wrong"; in fact, it's the only correct way to do it.

            b) systemd is the only thing that can start services cleanly and correctly. Every other solution is in various states of brokenness.

          • megous 3 years ago

            All of the cheap Linux container hosting services are doing it wrong, I guess.

            • nine_k 3 years ago

              Many of them offer k8s out of the box.

              I still think that orchestration should live outside containers, whether within one box or several. BTW systemd has some vestiges of container orchestration built in.

              OTOH there are other approaches; say, LXD directly assumes an ecosystem of processes within a container, more akin to a VM than to a single chroot-ed / jailed service process.

              • megous 3 years ago

                > OTOH there are other approaches; say, LXD directly assumes an ecosystem of processes within a container, more akin to a VM than to a single chroot-ed / jailed service process.

                Those are what I had in mind. Basically a cheaper VPS-like service, but built on Linux container tech instead of full virtualization. Maybe they are less of a rage these days, when fully virtualized KVM/VMware VPSes are so cheap. But about 5 years ago I did run my email server, webserver and database on such a container for $1-1.5/mo. When I was able to switch to full virtualization at a similar price point, I did. But there's nothing weird about running multiple different processes in a "container" :)

notacoward 3 years ago

This really looks like an example of open source done right. Obviously there are some strong opinions, but the person suggesting the change was pretty gracious about the pushback they got. Since then, stakeholders have had a chance to discuss and agree on a way forward. Nobody is trying to sweep all the "nasty bits" under the rug, like most developers tend to, and there's even mention of regression tests. I've seen few other projects (including but not limited to those where I was a maintainer) handle possibly-disruptive change so well. Kudos.

  • pmarreck 3 years ago

    > and there's even mention of regression tests

    I've been doing (mostly) full-coverage unit and integration testing since, oh... 2005? At least in the Ruby on Rails and now Elixir/Phoenix development spaces, it's absolutely de rigueur, and it has probably saved me countless hours of debugging, kept me from breaking stuff that already worked, and validated that things worked the way I expected them to.

    The fact that in 2022 someone even has to qualify regression testing with an "even" (as in "EVEN mention of regression testing!") saddens me. Tests reduce developer pain and increase developer productivity, full stop. If you get hit by a bus, someone else who is working on your code will know they didn't break anything thanks to your test suite. Get with the program, folks, it's been decades now since this was known.

    • notacoward 3 years ago

      It saddens me too, but it still seems necessary. In my (quite long and varied) experience, most developers do not appreciate the value of regression tests. Therefore, reminders and positive reinforcement are still beneficial.

      • R0b0t1 3 years ago

        It is possible to make useless tests.

        • notacoward 3 years ago

          Does that seem like a strong argument to you? Of course it's possible to make useless tests. It's possible to be injured by an airbag. It's possible to overdose on a drug that usually saves lives. That doesn't mean any of these things aren't generally beneficial, or that they should be foregone.

          As I said recently on Twitter, I've measured the likely cost of production bugs that were fixed early by static analysis, by leaving log messages to mark where the bug would have occurred. I know that's not the same as regression tests, but if anything it's even more of a long shot, more of a hard sell to my fellow developers, and even in that case the value was strongly positive. About a half million 2005 dollars in that case, for less than a fifth of that in license costs and my own time. The ROI for regression tests is likely to be in the same ballpark.

          The fact that something can be done poorly by lazy people is in no way an argument against trying to do it well, or even semi-competently, by people who take their profession seriously.

          • jhugo 3 years ago

            I think it's a big assumption that the ROI of tests in general is in the same ballpark as the ROI of static analysis.

            In my experience, many tests written in commercial software engineering have ~zero or even negative ROI. This mostly applies to micro-level testing like unit tests; macro-level testing like integration tests can be fantastically valuable and I've even come to believe that they're the only type of tests most teams should spend time writing.

            • pmarreck 3 years ago

              It depends on how you design your software and where the API surfaces are and how isolated the pieces can be made to be.

              Which, if you're writing unit tests at the same time as the unit under test, leads naturally to pro-isolation, pro-modular designs, which are easier to test, more reliable, and generally more focused in purpose.

            • notacoward 3 years ago

              Well, we were talking about regression tests, which are much closer to integration tests than unit tests, so I'm not sure we actually disagree. Writing regression tests is still valuable, and still too rare.

        • pmarreck 3 years ago

          Yeah, what exactly are you arguing here? "assert 1=1" is a useless test, but that doesn't invalidate tests whatsoever.

    • yjftsjthsd-h 3 years ago

      While I agree that it should be a standard feature, it is worth pointing out that operating systems tend to be more difficult and more expensive to run full tests for.

      • cortesoft 3 years ago

        Yeah, the scope of things that an operating system needs to be able to do is basically, “all things that can be done on a computer”, so if you are trying to write full regression tests you are never going to hit all the possible combinations.

        • yjftsjthsd-h 3 years ago

          It's not even just that, but also that a lot of the things OSs do are painful to artificially test because they're on the edge of hardware and software, or involve building the abstractions that let other software run without worrying about those details. How do you make a CI job that tests that mdevd correctly handles enumerating devices and setting their /dev nodes correctly, when the edge cases are finicky hardware devices and nondeterministic enumeration?

          • pmarreck 3 years ago

            Does mdevd not send a signal or command out, and get one back? If so, the hardware behavior can be simulated, as can timing issues.

            • notacoward 3 years ago

              You can test the response to a signal that way, but testing whether it was sent when it should have been can get arbitrarily hairy. I've worked places where we had to do things like rig up relays to push physical buttons (or in one case physically pull a cable) to do this level of testing. An adjacent team (in 1994!) had an actual robot that would rove over a disk-controller circuit board delivering small electric shocks to get the same kind of test coverage on the other end of the SCSI bus. When your code has to deal with devices, which can misbehave in arbitrary ways from the voltage levels on single electrical pulses up to complex protocol violations, simulating purely-software behaviors doesn't cut it. Maybe preach about the cost/benefit ratio of different kinds of testing after you've had to literally build the tests out in the physical world.

        • pmarreck 3 years ago

          So because you see testing all the things together as too prohibitive, you balk at testing the individual bits in isolation (hence "UNIT" testing)?

      • pmarreck 3 years ago

        That's not an excuse (speaking as someone who has worked on a million-plus-line codebase). Pieces can be broken out and tested. That's the entire point of UNIT testing, it's right in the name

    • unethical_ban 3 years ago

      Heh. I'm writing a custom tool for a security product that pulls configs down and looks for deviations from expected config values.

      Instead of running the script against the client config and validating it works correctly, I thought to myself "Hey, what if I made a sample configuration with known good and bad values, and have a known result output to quickly validate the script's function?"

      I just invented testing. No, large scale programming and devops is not my primary job. Yes, I have built validation before, but it isn't habit and this is a bespoke project so I didn't think about it at first.
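
      For the record, the poor man's version of that looks roughly like this (file names, settings, and the check-config script are all made up):

          # fixture.conf: one known-good value and one deliberately-bad value
          printf '%s\n' 'password_min_length = 12' 'ssh_root_login = yes' > fixture.conf

          # the checker should flag exactly the bad line; diff against a saved
          # expected report so any drift in behaviour fails loudly
          ./check-config fixture.conf > actual.txt
          diff -u expected.txt actual.txt || exit 1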

yjftsjthsd-h 3 years ago

> The TSC has discussed this issue at today's meeting and has concluded that there is a general need to begin decoupling hardcoded preferences for BusyBox from the distribution.

Neat. I wonder if the general decoupling will eventually make it easy to drop in, e.g., toybox or one of the Rust/Go coreutils implementations. Or, for that matter, to drop in GNU coreutils, since the current way to add those to Alpine strikes me as a little inelegant in comparison.

  • senzillaOP 3 years ago

    As far as I understand, this initiative is primarily about reducing hardcoded dependencies on Busybox. As such, this is indeed what would enable alternative implementations to exist cleanly alongside whatever is the default.

    Because yeah, trying to change Alpine's init system, mdev, or other coreutils is indeed not easy/feasible at the moment.

    • blueflow 3 years ago

      You can already install coreutils and udev over busybox, without busybox being removed.
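
      (e.g., on a stock image; package names as found in current aports, and the GNU tools typically win on PATH once installed:)

          apk add coreutils util-linux eudev
          ls --version    # now answered by GNU coreutils rather than the busybox applet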

      • skarnet 3 years ago

        And if you look under the hood, you will see that it's only possible because of a bunch of ad-hoc hacks, which is exactly what my proposal was about: level the playing field and make it possible to have alternatives _without_ the ad-hoc hacks.

        • blueflow 3 years ago

          Are you referring to the busybox symlink farm?

          • skarnet 3 years ago

            Not at all. I'm referring to the structure of the Alpine packaging, especially around boot scripts, that hardcodes busybox in a number of places and makes it difficult to package an alternative without workarounds - which accumulates tech debt.

  • nibbleshifter 3 years ago

    One of the reasons people use alpine is to get away from GNU cruft.

    • Bender 3 years ago

      The simplicity of Alpine and the lack of systemd are what drew me to it. I have converted all my routers, firewalls, VMs and home appliances to Alpine. I do not have a single regret.

    • yjftsjthsd-h 3 years ago

      Sure, I like having a distro that defaults to not using GNU (if nothing else, to flavor the GNU/Linux naming debate), but there's nothing wrong with allowing GNU or others to be used. Further, Alpine already packages GNU tools, so I'd prefer that support to be implemented as elegantly as possible.

    • tannhaeuser 3 years ago

      If that's indeed the case, I think it might be because of fear of license trouble, i.e. the whole business of Docker images referencing/installing a Debian base as the first thing always had the smell of a GPL circumvention device to me, in part at least. But it seems license holders don't really bother, or we'd have heard about it by now. And in that context, Alpine already got rid of the OS and glibc, leaving only the rest of the userland GPLish. OTOH busybox is GPL, too, and actually known for going after violators, so what do I know.

      • nibbleshifter 3 years ago

        It's not the licence shit that bothers me, it's the absolute garbage state of GNU-maintained code.

LAC-Tech 3 years ago

I really like Alpine Linux. I used it as my WSL2 env for years. I run Void Linux on actual hardware these days (better to use Proton for games than WSL2 for work), but would probably switch back to Alpine if it had more packages and a rolling release, as it had the best package manager I had ever used.

  • rurban 3 years ago

    Better than void? I'm convinced void has the best package manager I've ever used.

  • yjftsjthsd-h 3 years ago

    You could always just use edge if you want a rolling release.
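
    Roughly: point /etc/apk/repositories at the edge branch and upgrade (the CDN mirror below is just the default one):

        # /etc/apk/repositories should contain the edge repos, e.g.:
        #   https://dl-cdn.alpinelinux.org/alpine/edge/main
        #   https://dl-cdn.alpinelinux.org/alpine/edge/community
        apk update
        apk upgrade --available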

  • blueflow 3 years ago

    Is Alpine edge not a rolling release? I'm using it on my laptop right now.

  • ryan-duve 3 years ago

    I've only had to use `apk` for a Dockerfile layer when we were really trying to minimize the footprint of an image. From what I could tell, there was no discernible difference from `yum` or `apt`. What are the features that make it stand out to you?

    • 3np 3 years ago

      The packaging format and system are very different from yum and deb, despite the similarities in CLI interface for local maintenance. It's quite similar to PKGBUILD in Arch, except with more streamlined tooling.
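
      A trimmed-down APKBUILD for flavor (checksums and a few mandatory fields omitted; see aports for real examples):

          # APKBUILD - same declarative shell style as Arch's PKGBUILD
          pkgname=hello
          pkgver=2.12
          pkgrel=0
          pkgdesc="GNU Hello"
          url="https://www.gnu.org/software/hello/"
          arch="all"
          license="GPL-3.0-or-later"
          source="https://ftp.gnu.org/gnu/hello/hello-$pkgver.tar.gz"

          build() {
              ./configure --prefix=/usr
              make
          }

          package() {
              make DESTDIR="$pkgdir" install
          }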

    • LAC-Tech 3 years ago

      intuitive CLI, things are packaged a bit better, easy configuration.

      I swore off apt-based distros after I accidentally installed some graphical things on my WSL and multiple Debian wizards couldn't figure out how to remove them, even when I installed stuff like dpigs and aptitude.

      • jack_pp 3 years ago

        I've been running Debian for 8 or 9 years now, on hardware not WSL2, and have no issues with apt. I use Debian on my own machine because we use Debian on all our cloud VMs and I don't want to learn more than one distro, so idk if your WSL2 problems are a great reason to swear Debian off.

      • onefuncman 3 years ago

        My golden rule of sys admin is if you can't fix it, you reformat and replace it.

        If you reproduce the error, congrats, now you get to figure out where to file the bug report with reproduction steps.

        • herewulf 3 years ago

          My golden rule of sys admin is that anything on a Linux based OS can be fixed. Preferably without a reboot.

          Though a friend of mine once discovered that it's a bad idea to force remove glibc (on a non BusyBox distro, of course).

          • 5e92cb50239222b 3 years ago

            God, yes. If all you know is reboot & reinstall (bad habits probably brought from the windows world where you generally can't do anything else), you'll never get past the basics.

            • Gordonjcp 3 years ago

              Reinstalling broken Linux systems has been my go-to technique for 20-odd years. It shouldn't take any time at all to get them back up and running, because all the installation and deployment is automated.

              I've never tried Windows but I've heard it's a faff to do this. Good to hear it's catching up, though!

          • Gordonjcp 3 years ago

            > My golden rule of sys admin is that anything on a Linux based OS can be fixed. Preferably without a reboot.

            Is it worth your time, though?

octoberfranklin 3 years ago

Editorialized title is extremely misleading.

At most, this MR is reducing dependencies on busybox's init scripts.

A far more accurate title would be the title of the MR itself: "main/mdevd: make it a fully supported alternative to mdev". The MR is mainly about mdev.
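
For context, the practical difference between the two (invocations are approximate; see the respective docs):

    # busybox mdev: one-shot coldplug scan, plus hotplug via the kernel hotplug
    # helper or busybox's own daemon mode
    mdev -s

    # mdevd (skarnet): a long-running netlink listener, normally run under a
    # supervisor, with coldplug handled by a separate one-shot tool
    mdevd &
    mdevd-coldplug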

freemint 3 years ago

Good.

gtirloni 3 years ago

Do not editorialize submissions.
