Beyond Ctrl-C: The dark corners of Unix signal handling

sunshowers.io

165 points by PuercoPop a year ago · 75 comments

chrsig a year ago

My favorite signal surprise was running nginx and/or httpd in the foreground and wondering why on earth it quit whenever I resized the window.

Turns out, they use SIGWINCH (which is sent on WINdow CHange) for graceful shutdown.

It's a silly, silly problem.
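
For context, SIGWINCH is ignored by default, so a resize only affects a process that installs a handler for it. A minimal sketch of that mechanism, assuming a Unix system and Python's standard `signal` module; `resize_events`, `on_winch` and `install` are illustrative names, not anything taken from nginx or httpd:

```python
import shutil
import signal

resize_events = []  # illustrative: one (columns, lines) entry per SIGWINCH


def on_winch(signo, frame):
    # The kernel delivers SIGWINCH to the terminal's foreground process
    # group when the window size changes.
    size = shutil.get_terminal_size()
    resize_events.append((size.columns, size.lines))


def install():
    # Must be called from the main thread; returns the previous handler.
    return signal.signal(signal.SIGWINCH, on_winch)


if __name__ == "__main__":
    install()
    signal.raise_signal(signal.SIGWINCH)  # simulate one resize
    print(resize_events)
```

A server that binds this signal to "shut down the workers" instead of "re-read the window size" will do exactly that whenever it runs in a foreground terminal that gets resized.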

  • eadmund a year ago

    > Turns out, they use SIGWINCH (which is sent on WINdow CHange) for graceful shutdown.

    That’s … that’s even worse than people who send errors with an HTTP 200 response code.

    • aunderscored a year ago

      Disagree. Annoyingly, there is a reasonable case for a 200 with an error: if HTTP is your transport but not your application, then 200 says "yes, the message was transferred and understood correctly, here is your response", which may be an error response from the application.

      • eadmund a year ago

        If you’re using HTTP for something other than transferring hypertext — i.e., if your application is not a hypermedia application — then you are doing something just as wrong as encoding IP in DNS packets or email messages. Don’t do that. It’s wrong, even if it is technically interesting.

        If, OTOH, your application is a hypermedia application, then returning a success status for errors is just wrong.

        • aunderscored a year ago

          Every JSON API under the sun disagrees, but I do agree in principle. People very much like using HTTP as a JSON (or XML) transfer protocol

        • sunshowers a year ago

          This ship sailed the day the first HTTP proxy was installed, and likely well before that.

        • andreyvit a year ago

          Sorry, what? HTTP is perfectly fine for APIs which are not hypermedia.

      • Izkata a year ago

        For example: Apache (httpd) replaces the 4xx and 5xx response body with its own content instead of whatever you'd returned from an external handler like wsgi. You have to use a 2xx (except for 204) to get a relevant error message back out.

        • AdieuToLogic a year ago

          > For example: Apache (httpd) replaces the 4xx and 5xx response body with its own content instead of whatever you'd returned from an external handler like wsgi.

          This is the default behavior. Apache httpd can be configured to produce different responses by way of ErrorDocument[0]. From the documentation:

            Customized error responses can be defined for any HTTP
            status code designated as an error condition - that is,
            any 4xx or 5xx status.
          
          HTH

          0 - https://httpd.apache.org/docs/trunk/custom-error.html

          • jjnoakes a year ago

            Even with custom error documents configured in the web server, you still lose the application-specific (and probably request- and error-specific) message generated by the application itself.

            • Izkata a year ago

              Yeah, this is how we ran across it - whoever originally wrote a particular feature was trying to do the right thing by using an HTTP error code, but with a message that would be presented to the user about why that operation failed. A generic response wouldn't work, there were multiple possible reasons all fixable by the user, and tying a whole error code to one specific feature would've probably been a bad idea anyway.

      • Groxx a year ago

        Which is why "you resized the terminal window, clearly you meant to shut down this web server" is even crazier, yes

    • thezilch a year ago

      That's ... not what most people are doing. People send _application_ errors on HTTP 200 response codes, because HTTP response codes are for HTTP and not applications. Most "REST" libraries and webdev get this wrong, building ever more fragile web services.

      • ChocolateGod a year ago

        Applications using status codes is useful because it can tell browsers and load balancers to not cache the page in a uniform way.

      • sunshowers a year ago

        I don't think the distinction is as clear-cut as you're making it out to be.

        For example, HTTP 409 Conflict generally means an application-level conflict (e.g. an optimistic concurrency mechanism detected a conflict).

        HTTP 422 Unprocessable Entity is also usually an application-level error (e.g. hash validation failure, or identifier not recognized by the server).
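
One way to reconcile the two positions is to keep the HTTP status meaningful and still carry the application's own message in the body. A rough sketch, not tied to any framework; the exception classes, `STATUS_FOR`, `error_response` and the JSON field names are all made up for illustration:

```python
import json


class ConflictError(Exception):
    """Hypothetical: an optimistic concurrency check failed."""


class ValidationError(Exception):
    """Hypothetical: the request was well-formed but not acceptable."""


# Assumed mapping from application errors to HTTP status codes.
STATUS_FOR = {ConflictError: 409, ValidationError: 422}


def error_response(exc):
    """Return (status, headers, body) for an application-level error."""
    status = STATUS_FOR.get(type(exc), 500)
    body = json.dumps({"error": type(exc).__name__, "message": str(exc)})
    return status, {"Content-Type": "application/json"}, body


if __name__ == "__main__":
    print(error_response(ConflictError("record was modified by another request")))
```

Caches and load balancers can act on the status line while the client still gets the request-specific message. Whether an intermediary preserves that body is a separate configuration question.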

      • LoganDark a year ago

        Task failed successfully

    • chrsig a year ago

      y'know...what really is an error, anyway?

  • thayne a year ago

    Why? That's what SIGTERM is for.

    • chrsig a year ago

      No clue what the decision making process was.

      There's a bug report for httpd dating back to 2011[0]. The nginx mailing list also has a grumpy person contemporary with that[1].

      My guess is someone thought "httpd is a server running somewhere without a monitor attached, why on earth would it get a SIGWINCH!? surely it's available to use for something completely different", not considering users running it in the foreground during development. Nginx probably followed suit for convention, but that's pure speculation on my part.

      Also that was before docker really took off (I'm not sure if it was around in 2011 yet; still in its infancy maybe). Running it in the foreground didn't happen as much yet. People were still using wamp or installing it via apt and restarting via sudo.

      [0] https://bz.apache.org/bugzilla/show_bug.cgi?id=50669

      [1] https://mailman.nginx.org/pipermail/nginx/2011-August/028640...

      • hulitu a year ago

        > why on earth would it get a SIGWINCH!?

        Reminds me of those "/* not reached */" stories.

    • lolinder a year ago

      They use SIGWINCH for gracefully shutting down workers but not the main process [0]. SIGQUIT is used for a graceful shutdown and SIGTERM for a sort of graceful shutdown (with timeouts).

      SIGWINCH is apparently used for an online upgrade [1]. Because it only shuts the workers down you can quickly transition back to the old binary and old configuration if there's a problem, even after upgrading the binary or config stored on the hard drive.

      I'm sure there are other ways to get a similar capability, but this set of signals is apparently what they came up with.

      [0] http://nginx.org/en/docs/dev/development_guide.html#processe...

      [1] https://www.digitalocean.com/community/tutorials/how-to-upgr...

    • ibash a year ago

      I tried to find out why.

      Unfortunately the change that introduces it predates the official release by a few months. And predates the mailing list by about a year:

      https://trac.nginx.org/nginx/changeset/5238e93961a189c13eeff...

  • ykonstant a year ago

    I don't know whether to laugh or cry.

    • chrsig a year ago

      definitely laugh! life's too short, you'll never get out alive :)

layer8 a year ago

> Another common extension is to use what is sometimes called a double Ctrl-C pattern. The first time the user hits Ctrl-C, you attempt to shut down the database cleanly, but the second time you encounter it, you give up and exit immediately.

This is a terrible behavior, because users tend to hit Ctrl-C multiple times without intending anything different than on a single hit (not to mention bouncing key mechanics and short key repeat delays). Unclean exits should be reserved for SIGQUIT (Ctrl-\) and SIGKILL (by definition).

  • tripdout a year ago

    If you don't know about it, sure, but I find it's kind of convenient to get a safe shutdown and then be able to easily say "I don't care, just stop this program" without needing a separate kill -9 command or something.

    • wombatpm a year ago

      Kids these days. Try resetting server windows on a sgi.

      Subject: -42- How can I restart the X server? Date: 10 Sep 1995 00:00:01 EST

        To restart the X server (Xsgi) once, do any one of the following
        (in increasing order of brutality):
      
        - killall -TERM Xsgi
        - hold down the left-Control, left-Shift, F12 and keypad slash keys
          (this is fondly known as the "Vulcan Death Grip")
        - /usr/gfx/stopgfx; /usr/gfx/startgfx
        - reboot
      
        To restart the X server every time someone logs out of the console,
        edit /var/X11/xdm/xdm-config, change the setting of
        "DisplayManager._0.terminateServer" from "False" to "True" and do
        'killall -HUP xdm'.

    • layer8 a year ago

      As I wrote, Ctrl-\ should do the trick. And it’s just not practical having to know which program applies the double pattern, and having to train yourself to not accidentally hit Ctrl-C twice.

      • __MatrixMan__ a year ago

        My brush with the double-ctrl-C pattern was in a place that wrote a lot of Java. It was generally frowned on to write any code that relied on signals which windows users can't send, and if I recall, Java made it quite difficult anyhow.

        Windows does have a tradition of using ctrl-c to quit though, so SIGINT ends up being one of the few that you can use in both places. It's not pretty, but giving it a different meaning based on how many times you've ordered it seems like a somewhat natural next step, if a hacky one.

  • bonzini a year ago

    In the Meson build system's test harness, a single Ctrl-C terminates the longest running test with a SIGTERM; while three Ctrl-C in a second interrupt the whole run as if you sent the harness a SIGTERM. This was done because it's not uncommon that there are hundreds of tests left to run and you have seen what you want, and it's useful to have an intuitive shortcut for that case.

    However, in both cases it's a clean shutdown: all running tests are terminated and the test report is printed.

  • jcelerier a year ago

    > Unclean exits should be reserved for SIGQUIT (Ctrl-\) and SIGKILL (by definition).

    I don't know how it works on your keyboard but on french layout, Ctrl-\ is a two-hands, three-fingers, very unpleasant on the wrist, keyboard shortcut. Not a chance I'd use that for such a common operation.

    • mananaysiempre a year ago

      The byte that sends SIGQUIT is very much configurable with stty quit ^X , but unfortunately X has to be a-z or one of \]^_ (that is, 0x41 through 0x5F except 0x5B = [ which would conflict with other uses of ESC = ^[ = 0x1B) because of how the Ctrl modifier traditionally works. Looking at a map of AZERTY, I don’t see any good options, but you may still want to experiment.
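
The traditional Ctrl mapping described here amounts to keeping only the low five bits of the character's code. A small sketch of that arithmetic (`ctrl_byte` is an illustrative helper, not part of any library):

```python
def ctrl_byte(ch):
    """Byte a terminal traditionally sends for Ctrl-<ch>.

    Only characters in the range '@' (0x40) through '_' (0x5F) are
    accepted, after folding lowercase letters to uppercase.
    """
    code = ord(ch.upper())
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no traditional control byte for {ch!r}")
    return code & 0x1F


if __name__ == "__main__":
    for ch in "c\\[]^_@":
        print(f"Ctrl-{ch} -> 0x{ctrl_byte(ch):02X}")
```

This is why Ctrl-[ is indistinguishable from ESC (0x1B) and why the choices for `stty quit` are limited to this one block of 32 characters.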

      • jks a year ago

        Curiously, on many terminal emulators the following work:

        Ctrl-2 = Ctrl-@ = NUL byte

        Ctrl-3 = Ctrl-[ = ESC

        Ctrl-4 = Ctrl-\ = default for SIGQUIT

        Ctrl-5 = Ctrl-] = jump to definition in vim

        Ctrl-6 = Ctrl-^ = mosh escape key

        Ctrl-7 = Ctrl-_ = undo in Emacs

        I think these probably originate in xterm.

      • cperciva a year ago

        I map SIGQUIT to ^Q because that's the easiest to remember.

        • glandium a year ago

          I suppose you never hit CTRL+S by accident?

          • kzrdude a year ago

            stty -ixon

            Make sure that thing is disabled

            • marcosdumay a year ago

              I like that Konsole defaults to disabling that thing. And also that there is a visual sign of the terminal being stopped.

          • icedchai a year ago

            Ctrl-S / Ctrl-Q was super useful in the dialup modem days.

          • cperciva a year ago

            Rarely enough that needing to open another terminal and use kill to send a signal doesn't bother me.

    • remram a year ago

      I think the point is that it is not to be a common operation.

      • jcelerier a year ago

        well I don't know, it feels like I must mash ctrl-c twenty times per day on average at least

    • Sophira a year ago

      While on UK keyboards it's the opposite "problem" - the left Ctrl key and the \ key are right next to each other (making it potentially a one-finger operation), which is the opposite of how a US keyboard is laid out (where Ctrl-\ was presumably intended to need to be a two-handed, two-finger operation).

      • Izkata a year ago

        > which is the opposite of how a US keyboard is laid out (where Ctrl-\ was presumably intended to need to be a two-handed, two-finger operation).

        We have a right Ctrl, so one-hand two-finger.

      • LtWorf a year ago

        two handed operations shouldn't exist.

        • Sophira a year ago

          I completely agree - they're very inaccessible. That's why I quoted the word "problem"; it's not actually a problem at all.

      • mananaysiempre a year ago

        stty quit ^] ?

  • marcosdumay a year ago

    It's worse, because there are languages that encode interruption into the error handling functionality, so it's common that people mismanage their errors and programs require several Ctrl-C presses to actually reach the interruption handler.

    Which means that you have to memorize a list of "oh, this program needs Ctrl-C 3 times; oh, this program must only receive Ctrl-C once!"... I don't know of any "oh, this program needs Ctrl-C exactly 2 times", but it's an annoying possibility.

    • wongarsu a year ago

      Any software I've come across that uses intentional double ctrl-c shows a message after the first ctrl-c. Something to the effect of "shutting down gracefully, press ctrl-c again for immediate shutdown".

      Hence you can just press it once and wait half a second, if no message to this effect appears you can spam ctrl-c.
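
The behaviour described here, combined with layer8's worry about key bounce, can be sketched as a small decision function. This is a hypothetical policy, not taken from any particular tool; `InterruptPolicy`, `min_gap` and the three return values are invented for the example:

```python
import time


class InterruptPolicy:
    """Decide what a Ctrl-C should do, given when the first one arrived."""

    def __init__(self, min_gap=0.5, clock=time.monotonic):
        self.min_gap = min_gap  # assumed debounce window, in seconds
        self.clock = clock
        self.first_at = None

    def on_interrupt(self):
        now = self.clock()
        if self.first_at is None:
            self.first_at = now
            return "graceful"   # start a clean shutdown and tell the user
        if now - self.first_at < self.min_gap:
            return "ignored"    # key bounce or an impatient double tap
        return "immediate"      # a deliberate later press


if __name__ == "__main__":
    ticks = iter([0.0, 0.1, 2.0])
    policy = InterruptPolicy(clock=lambda: next(ticks))
    print([policy.on_interrupt() for _ in range(3)])
```

In a real SIGINT handler, "graceful" would print the "press Ctrl-C again for immediate shutdown" message and set a flag, while "immediate" would exit without further cleanup.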

  • bcrl a year ago

    That shouldn't matter. Your database should be consistent in the face of an unclean exit. ACID has been around for a long time.

  • Levitating a year ago

    Programs can print a message stating that they are attempting to quit cleanly but can be forced to quit by pressing Ctrl+C again. Unison does this.

  • sunshowers a year ago

    While I agree in spirit, I also want to meet users where they are.

cperciva a year ago

The article doesn't mention the most useful of all signals: SIGINFO, aka "please print to stderr your current status". Very useful for tools like dd and tar.

Probably because Linux doesn't implement it. Worst mistake Linus ever made.

Also, it talks about self-pipe but doesn't mention that self-socket is much better since you can't select on a pipe.

  • epcoa a year ago

    > self-socket is much better since you can't select on a pipe.

    This needs further explanation. Why can’t you select on a pipe? You certainly can use select/poll on pipes in general and I’m not sure of any reason in particular they won’t work for the self pipe notification.

    It's even right in the original: https://cr.yp.to/docs/selfpipe.html

    • cperciva a year ago

      Oops, brainfart. Sadly it's too late for me to edit that comment.

      Yes, you can select just fine on pipes. What I was thinking of is that recv and send don't work on pipes, and asynchronous I/O frameworks typically want to use send/recv rather than write/read because the latter don't have a flags parameter.

  • sunshowers a year ago

    Thanks for the feedback! As the talk and the post both mentioned, I was focusing on signals that work on all Unix platforms. Within the constraints of a 30 minute talk there must be material left on the cutting room floor. (If I started talking about the specifics of various Unix lineages I could fill up a whole day...)

    For most users in the real world, self-pipes are sufficient. This includes mio (Tokio's underlying library)'s portable Unix implementation of wakers (how parts of the system tell other parts to wake up).
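
For reference, a self-pipe that is waited on with select can be sketched in a few lines. This assumes CPython on Unix and leans on `signal.set_wakeup_fd`, which writes each received signal number to the given descriptor as a single byte; `make_wakeup_pipe` and `wait_for_signals` are illustrative names:

```python
import os
import select
import signal


def make_wakeup_pipe():
    """Create a non-blocking pipe; returns (read_fd, write_fd)."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    return read_fd, write_fd


def wait_for_signals(read_fd, timeout):
    """Wait in select() until the pipe is readable; return the bytes read."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return []
    return list(os.read(read_fd, 64))


if __name__ == "__main__":
    r, w = make_wakeup_pipe()
    # A Python-level handler must exist, or SIGUSR1 keeps its default action.
    signal.signal(signal.SIGUSR1, lambda signo, frame: None)
    signal.set_wakeup_fd(w)  # main thread only
    signal.raise_signal(signal.SIGUSR1)
    print(wait_for_signals(r, 1.0))
```

If a framework insists on send/recv, `socket.socketpair()` can stand in for `os.pipe()` with the rest unchanged.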

  • avidiax a year ago

    SIGSTOP and SIGCONT are very useful as well.

    SIGSTOP is the equivalent of Ctrl-Z in a shell, but you can address it to any process. If you have a server being bogged down, you can stop the offending process temporarily.

    SIGCONT undoes SIGSTOP.

    The cpulimit tool does this in an automated way so that a process can be limited to use x% of CPU. Nice/renice doesn't keep your CPU from hitting 100% even with an idle priority process, which may be undesirable if it drains battery quickly or makes the cooling fan loud.
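
The duty-cycle idea can be sketched as a loop that alternates SIGCONT and SIGSTOP. This is a rough illustration of the approach, not cpulimit's actual code; `throttle` and its parameters are invented here, and `send` and `sleep` are injectable only so the loop can be exercised without a real process:

```python
import os
import signal
import time


def throttle(pid, run_fraction, period=0.1, cycles=10,
             send=os.kill, sleep=time.sleep):
    """Let `pid` run for roughly `run_fraction` of each period."""
    if not 0.0 < run_fraction <= 1.0:
        raise ValueError("run_fraction must be in (0, 1]")
    for _ in range(cycles):
        send(pid, signal.SIGCONT)   # let the process run
        sleep(period * run_fraction)
        send(pid, signal.SIGSTOP)   # freeze it for the rest of the period
        sleep(period * (1.0 - run_fraction))
    send(pid, signal.SIGCONT)       # never leave the target stopped
```

Something like `throttle(pid, 0.3)` would hold a process to about 30% of one core for a second. A real tool would also have to measure actual CPU use and deal with child processes.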

    • sunshowers a year ago

      Note Ctrl-Z is actually SIGTSTP, which is basically "SIGSTOP except the process can install a signal handler for it".

      I have a very exciting blog post about debugging a nasty bug with how SIGTSTP works, coming very soon.

  • fragmede a year ago

    dd prints out status when sent SIGUSR1, but yeah that would be cool if other utilities did that as well off SIGINFO.

    • cperciva a year ago

      And does ^T map to SIGUSR1? That's the other thing which makes it so useful in BSD.

      • saagarjha a year ago

        You wouldn’t want it to, because the default behavior for SIGUSR1 is to terminate.

        • cperciva a year ago

          Exactly. Whereas on BSD hitting ^T is (a) very likely to print useful information, and (b) if it doesn't do that, won't do anything at all.
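
A program can offer this kind of status report itself. A minimal sketch, assuming Python's `signal` module exposes `SIGINFO` on platforms that define it and falling back to SIGUSR1 elsewhere; `Progress` and `install_status_handler` are hypothetical names:

```python
import signal
import sys


class Progress:
    """Tracks work done and reports it on request."""

    def __init__(self, total, stream=None):
        self.total = total
        self.done = 0
        self.stream = stream if stream is not None else sys.stderr

    def status(self):
        return f"{self.done}/{self.total} items processed"

    def report(self, signo=None, frame=None):
        # Signature matches what signal.signal() expects of a handler.
        print(self.status(), file=self.stream)


def install_status_handler(progress):
    """Hook SIGINFO where it exists, otherwise SIGUSR1. Main thread only."""
    signo = getattr(signal, "SIGINFO", signal.SIGUSR1)
    signal.signal(signo, progress.report)
    return signo
```

The fallback illustrates saagarjha's point: a process that has not installed a handler is terminated by SIGUSR1, whereas an unhandled SIGINFO is simply ignored.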

efxhoy a year ago

I recently wrote a little data transfer service in python that runs in ECS. When developing it locally it was easy to handle SIGINT: try write a batch, except KeyboardInterrupt, if caught mark the transfer as incomplete and finally commit the change and shut down.

But there’s no exception in python to catch for a SIGTERM, which is what ECS and other service managers send when it’s time to shut down. So I had to add a signal handler. Would have been neat if SIGTERM could be caught like SIGINT with a “native” exception.

  • mananaysiempre a year ago

      import sys  # for excepthook
      from signal import SIGTERM, raise_signal, signal

      class Terminate(BaseException):
          """Raised in the main thread when SIGTERM arrives."""

      def _excepthook(type, value, traceback):
          if not issubclass(type, Terminate):
              return _prevhook(type, value, traceback)
          # If a Terminate went unhandled, make sure we are killed
          # by SIGTERM as far as wait(2) and friends are concerned.
          signal(SIGTERM, _prevterm)
          raise_signal(SIGTERM)

      _prevhook, sys.excepthook = sys.excepthook, _excepthook

      def terminate(signo=SIGTERM, frame=None):
          # Restore the previous disposition so a second SIGTERM is not
          # turned into another exception, then unwind via Terminate.
          signal(SIGTERM, _prevterm)
          raise Terminate

      # Install the handler, remembering whatever was there before.
      _prevterm = signal(SIGTERM, terminate)

  • Spivak a year ago

    I mean you can just have the signal handler throw StopRequested in your Python boilerplate and never think about it again.

    One common pattern is raising KeyboardInterrupt from your handler so it's all handled the same.
