Settings

Theme

The Linux Backdoor Attempt of 2003 (2013)

freedom-to-tinker.com

298 points by t4h4 5 years ago · 143 comments

Reader

yyyk 5 years ago

This is an obvious backdoor attempt, as the code doesn't make sense otherwise. Yet, the attempt was far too unsubtle and underspecific for agencies such as the NSA. The payoff was low compared to the possibilities - local privilege escalations were a dime-a-dozen.

Worse, agencies such as the NSA have two missions: offence and defence. Adding in backdoors helps the offensive mission, but hurts the defensive mission, so it only makes sense if the backdoor isn't so easy to find. An obvious backdoor hurts the US far more than it helps the US. This one was too obvious.

Some ideas:

1) A script kiddie found some way to break-in and edit CVS. The entire idea being to have something to brag about. This was caught too early to be brag-worthy (breaking ancient CVS isn't something to brag about).

2) It was a warning shot from some Western agency meaning "tighten up your security".

  • sslalready 5 years ago

    If memory serves me right the CVS bug was originally discovered and exploited by a member of an infamous file sharing site. After descriptions(?) of that bug were leaked in underground circles, an east European hacker wrote up his own exploit for it. This second exploit was eventually traded for hatorihanzo.c, a kernel exploit, which was also a 0-day at the time.

    The recipient of the hatorihanzo.c then tried to backdoor the kernel after first owning the CVS server and subsequently getting root on it.

    The hatorihanzo exploit was left on the kernel.org server, but encrypted with an (at the time) popular ELF encrypting tool. Ironically the author of that same tool was on the forensic team and managed to crack the password, which turned out to be a really lame throwaway password.

    And that's the story of how two fine 0-days were killed in the blink of an eye.

    • dzdt 5 years ago

      This sounds like a very interesting tale. Are there more details written somewhere? How close to first-person is your source of information?

      • sslalready 5 years ago

        Not that I'm aware of, but I wish. Memory is getting hazy these days. AFAIK the kernel.org breaches were made by the kind of hackers doing it for fun and games (if you get that thing) and not the kind working for nation states. I'm sure you can (or at least, at some point could) find others who know more details at your favorite compsec conf.

      • unixhero 5 years ago

        The right questions...

    • mherdeg 5 years ago

      This is great storytelling, thanks. Maybe worth a letter to 2600 magazine?

  • badRNG 5 years ago

    > 2) It was a warning shot from some Western agency meaning "tighten up your security".

    That's an interesting theory that'd certainly make for a powerful message. Has anything like that been done before or is there any precedence for Western agencies to do these sorts of things covertly?

    • yyyk 5 years ago

      I can't point to any evidence, but two things to note:

      A) Even on HN the temptation has come up. e.g. some comments in posts about ransomware make a similar argument for transparently damaging and self-serving actions. Three letter agencies with much more power and ability probably had people making the same arguments.

      B) The payoff was extremely low compared to the possibilities. Either whomever did this was unaware of the possibilities or not really interested in a major hack. Perhaps the idea was that if this actually works, the damage isn't so big, aside from embarrassing the Linux kernel team, and when the team noticed they'd tighten up.

      • ed25519FUUU 5 years ago

        This theory seems so outrageously far fetched to me. Why in the world would a "friendly" intelligence agency sneak a working backdoor into a project to "teach a lesson"??

        Here's what our intelligence agencies do when they decide to "teach a lesson"[1]. It doesn't include sneaking working backdoors into software. They do THAT when they plan on using the backdoors.

        https://www.marketwatch.com/story/nsa-alerts-microsoft-of-ma...

        • yyyk 5 years ago

          I'm pretty fine with the script kiddie thesis. But if we go for an intelligence agency, we have to explain why the hack was so.. small. A local privilege escalation that is relatively easy to find* should be of very limited use at best. They(tm) get ability to fake linux kernel source and that's all they do!?

          * Even if the linux kernel folks had failed to notice the CVS hack, someone would have eventually diffed the kernel versions and found it. Assigning uid to 0 is rather obvious, and quite a lot of linters warn about assignment in comparison.

          But if they had included (for example) some sort of off-by-one buffer overflow, the hack would have been a lot less apparent. Now do that for a remote exploit, and they get way more possibilities.

    • InfiniteRand 5 years ago

      An easy to spot exploit as a warning shot is probably more likely to come from a random grumpy hacker, there is a long history of that

  • NelsonMinar 5 years ago

    The NSA used to have a defensive mission. They fully compromised their ability to do that by subverting the security of American products time and time again. The Shadow Brokers disclosure alone has completely undermined any trust anyone in the industry has for the NSA.

    • ardy42 5 years ago

      > The NSA used to have a defensive mission.

      The NSA still has a defensive mission, and it hasn't changed. It just might not be the defensive mission you assumed it was. IIRC, it's mainly to defend US Government systems and communications from adversaries. To the extent they help with the defense of civilian systems, their goal seems to be to give them adequate security, not absolute security.

      For instance, take this episode from the development of DES during the 70s:

      https://en.wikipedia.org/wiki/Data_Encryption_Standard#NSA's...

      > NSA worked closely with IBM to strengthen the algorithm against all except brute-force attacks and to strengthen substitution tables, called S-boxes. Conversely, NSA tried to convince IBM to reduce the length of the key from 64 to 48 bits. Ultimately they compromised on a 56-bit key.

    • xxpor 5 years ago

      They still get to have input into FIPS whether anybody likes it or not

WhiteSage 5 years ago

From the article comment section:

> this is not a mistake.

> Assume that the coder meant == 0 what is he trying to enforce. If these 2 bits (_WCLONE and _WALL) are set and your are root then the call is invalid. The bit combination is harmless (setting WALL implies WCLONE [...]), and why would you forbid it for root only.

  • roenxi 5 years ago

    > ...code change in the CVS copy that did not have a pointer to a record of approval. Investigation showed that the change had never been approved and, stranger yet, that this change did not appear in the primary BitKeeper repository at all...

    I'll attach this here for people who read the article too quickly and think it may, somehow, have been a bug. This code was a very deliberate attack.

  • mikkom 5 years ago

    This is also very relevant comment:

    > In addition, parentheses were not required for the final comparison. This was done to prevent compiler warnings. This looks deliberate.

    • simias 5 years ago

      I would put parentheses here, I never like mixing logical operators with other types (or even different types of logical operators). While it's of course entirely redundant here, it also makes the code easier to read IMO.

      I think the parent's point is more convincing: why make this check only for root in the first place?

      • kazinator 5 years ago

        The parentheses are required if == is changed to =.

        == has a higher precedence than &&, but = has a lower precedence.

           a = b && c && d
        
        means

           a = (b && c && d)
        • simias 5 years ago

          Sure, my point was that even with the proper == comparison I'd still write the (now redundant) parens because I find it more readable that way.

          Actually in languages like Rust with type inference that make it cheap and non-verbose to declare intermediate values I tend to avoid complicated conditions altogether, I could rewrite the provided expression like:

              let invalid_options = (options == (__WCLONE|__WALL));
              let is_root = (current->uid == 0);
          
              if invalid_options && is_root {
                  // ...
              }
          
          One might argue that it's overkill but I find that it's more "self-documenting" that way. I find that the more experienced I get, the most verbose my code becomes. Maybe it's early onset dementia, or maybe it's realizing that it's easier to write the code than to read it.

          Of course you can do that in C as well but you have to declare the boolean variables, and in general it's frowned upon to intermingle code and variable declarations so you'd have to add them at the beginning of the function so it adds boilerplate etc...

londons_explore 5 years ago

The underhanded C contest shows it is so easy to insert backdoors into C code that even someone staring at the code for a while wouldn't find.

So why did this attacker choose such an obvious 'typo' rather than a subtle flaw in a large patch set?

  • GuB-42 5 years ago

    It is not so easy, it is a contest, and they show you the winners.

    And if you look at the "Scoring and Extra Points" section of http://underhanded-c.org/_page_id_5.html you will notice that it checks most of the boxes.

    It is short, errors based on human perception (here = vs ==) are good enough, it is innocent looking under syntax highlighting, is is not platform dependent, and it even passes the "irony" check. It is just the plausible deniability that is not great, but it is still defensible with a lot of bad faith.

    • mercer 5 years ago

      now I'm wondering if syntax highlighting shouldn't somehow make an assignment inside an if statement (and the variants) a bright red, or something like that.

      • mhh__ 5 years ago

        It should really be forbidden by the compiler these days, or at least a very loud warning.

        • GuB-42 5 years ago

          It would break these kinds of constructs, which are common.

            if ((fd = open(...)) != -1) {
                /* do something with fd */
            } else {
                perror("open");
            }
          
          The compiler outputs a warning when you have something like "if (a = b)". If that's what you mean (it sometimes is), you have to write it "if ((a = b))" to silence the warning.
          • a1369209993 5 years ago

            TBF, that doesn't actually work - you have to write:

              int fd; /* fd must be declared in order to be assigned */
              if ((fd = open(...)) != -1) {
                  /* do something with fd */
              } else {
                  perror("open");
              }
            
            which would be better written as:

              int fd = open(...);
              if (fd != -1) /* etc */
            
            It would be nice for scoping reasons to be able to write something like:

              if ((int fd = open(...)) != -1) /* etc */
            
            but

              if ((int fd == open(...)) != -1) /* etc */
            
            isn't valid code.
            • GuB-42 5 years ago

              Exactly my thoughts, I was about to comment on that but I was too lazy so I omitted the declaration. Obviously, this is HN, so someone had to point it out ;)

              Anyways, I am a bit torn about the second option. I like the idea of putting the call inside the if clause as it makes for a very explicit error handling but the uninitialized declaration is ugly. What I do in practice tends to depend on the situation but it is rarely satisfying.

              Your last suggestion would be ideal, but as you said, it is invalid code unfortunately.

              Maybe this

                for (int fd = open(...); fd != -1; fd = -1) {
                    /* do stuff */
                }
              
              Just kidding, don't do that.
              • a1369209993 5 years ago

                  > for (int fd = open(...); fd != -1; fd = -1) {
                
                that doesn't correctly (or rather at all) handle the failure case. You should do (IIRC):

                  #define LET(...) for(__VA_ARGS__,_tmp[0],*_once=_tmp; _once ;_once=0)
                  /* ^^^ goes in a header file */
                  LET(int fd = open())
                    {
                    if(fd == -1) { /* handle error, return or break */ }
                    /* do stuff */
                    }
            • saagarjha 5 years ago
          • saagarjha 5 years ago

            Here's a classic:

              char *strcpy(char *d, char *s) {
                  char *r = d;
                  while (*d++ = *s++);
                  return r;
              }
            
            (If you don't need the return value, this becomes even nicer.)
        • djeiasbsbo 5 years ago

          What about:

          while((i = getchar()) != EOF) putchar(i);

          This type of thing seems to be "encouraged"...

      • jhardy54 5 years ago

        You may be interested in linters.

  • IshKebab 5 years ago

    Because of selection bias - if they had chosen something less subtle then we wouldn't be talking about it.

    • brobdingnagians 5 years ago

      Maybe it is a smoke screen, put in something likely to be found and something that won't. Everyone pats themselves on the back for finding the obvious one...

      • nkrisc 5 years ago

        Why not just include the thing that won't be found and call it a day?

        Including a red herring to invite extra scrutiny doesn't seem wise if you're trying to hide something.

      • flingo 5 years ago

        Step 1: Get 3 pigs. Step 2: Number them 1, 2, and 4. ...

      • gvjddbnvdrbv 5 years ago

        This seems highly likely.

    • kubanczyk 5 years ago

      You've probably meant: if they had chosen something much subtler we wouldn't be talking about it.

  • staycoolboy 5 years ago

    GCC warns of assignment in conditional, even without -Wall or -pendantic. I don't know when it started doing that, but it seems like a sore thumb today, different in 2003 maybe?

    • not2b 5 years ago

      It only warns if the assignment doesn't have an extra pair of parentheses. These were added in this case, to silence the warning (so the attack would not be noticed). The parentheses are also needed in this case to get the precedence right, but they won't be needed if '==' were written, so anyone coding this by accident would immediately be warned of the mistake.

hmottestad 5 years ago

I admit that I read the code and completely overlooked the single equals sign. Makes me wonder why it would be so easy to change the userid. Shouldn’t there be some safeguards in place to stop the userid from being updated from unsafe places.

  • cyphar 5 years ago

    These days it'd be harder to write code which is "easy to overlook" -- the innocent version would be something like

      if (/* ... */ || current_euid() == GLOBAL_ROOT_KUID)
    
    But the "backdoor" version would fail to compile (current_euid() is a macro but it's written to not be a permitted lvalue). You would need to write something more obvious like the following (and kernel devs would go "huh?" upon seeing the usage of current_cred() in this context)

      if (/* ... */ || current_cred()->euid = GLOBAL_ROOT_KUID)
    
    In addition, comparisons against UIDs directly are no longer as common because of user namespaces and capabilities -- correct code would be expected to look more like

      if (/* ... */ || capable(CAP_SYS_ADMIN))
    
    Which you can't write as an "accidental" exploit. And since most permission checks these days use capabilities rather than raw UIDs you'd need to do

      commit_creds(get_cred(&init_cred));
    
    Which is bound to raise more than a couple of eyebrows and is really non-trivial to hide (assuming you put it somewhere as obvious as this person did).

    But I will say that it would've been much more clever to hide it in a device driver which is widely included as a built-in in distribution kernels. I imagine if you managed to compromise Linus' machine (and there are ways of "hiding" changes in merge commits) then the best place would be to shove the change somewhere innocuous like the proc connector (which is reachable via unprivileged netlink, is enabled on most distribution kernels, and is not actively maintained so nobody will scream about it). But these days we also have bots which actively scan people's trees and try to find exploits (including the 0day project and syzkaller), so such obvious bugs probably would still be caught.

    • perihelions 5 years ago

      "(current_euid() is a macro but it's written to not be a permitted lvalue)"

      I'm not an expert at C. I followed up on this kernel macro out of curiosity, and it was a confusing learning experience because it turns out the forbidden assignment

          ({ x; }) = y;
      
      is silently permitted by GCC (for example, with -Wall --std={c99,c11,c18}), and does actually assign x=y. Even though that's expressly prohibited by the C standard (-Wpedantic).

      I assume this is old news to C programmers, but its insidiousness surprised me.

      • simias 5 years ago

        Example #5984 of why I don't like the kernel's convention of masquerading macros as function by making them lowercase. I wasted so much time deciphering weird compile errors or strange behaviour only to finally realize that one of the function calls in the offending code was actually a macro in disguise.

        It's especially bad when some kernel macros, such as wait_event, don't even behave like a function would (evaluating the parameter repeatedly).

        One more thing Rust got right by suffixing macros with a mandatory !.

        • cyphar 5 years ago

          The best one is "current" -- which is a macro that looks like a variable but becomes a function call and thus if you ever want to use the variable name "current" you will get build errors. :D

      • cyphar 5 years ago

        Huh, I assumed (just as you did) that this would obviously not work -- but you're right that GCC ignores this and allows the assignment anyway.

        However it turns out that you still get a build error, and even the more explicit versions also give you a error:

          kernel/cred.c:763:17: error: assignment of member ‘euid’ in read-only object
            763 |  current_euid() = GLOBAL_ROOT_UID;
                |                 ^
          kernel/cred.c:764:23: error: assignment of member ‘euid’ in read-only object
            764 |  current_cred()->euid = GLOBAL_ROOT_UID;
                |                       ^
          kernel/cred.c:765:22: error: assignment of member ‘euid’ in read-only object
            765 |  current->cred->euid = GLOBAL_ROOT_UID;
                |                      ^
        
        So it is blocked but not for the reason I thought. current_cred() returns a const pointer and all of the cred pointers in task_struct are also const. So you'd need to do something more like:

          ((struct cred *)current_cred())->euid = GLOBAL_ROOT_UID;
        
        Which is well beyond "eyebrow-raising" territory.
      • pwdisswordfish4 5 years ago

        The C standard cannot expressly prohibit anything about a feature that isn't part of the C standard.

      • kazinator 5 years ago

        What you quoted:

            ({ x; }) = y;
        
        isn't ISO C syntax; it's a GNU extension.

        -Wpedantic diagnoses ISO C syntax errors, even if they are GNU extensions.

  • db48x 5 years ago

    Absolutely. This is an example of the poor design of the C language. Other languages that were around at the time C was created choose `:=` as assignment and `=` for equality tests, making this type of typo quite impossible.

    Common Lisp makes the Hamming distance even larger; equality tests are written as `(eq foo bar)`, while changing a value is `(setf foo bar)`. Common Lisp may have features which are undesirable in an OS kernel (garbage collection), but it does make the code wonderfully clear and easy to read.

    • ed25519FUUU 5 years ago

      := Still seems easy to overlook at a cursory glance.

      • kazinator 5 years ago

        What db48x neglected to mention is that some of those languages also featured assignment as strictly a statement; it could not be a subexpression. As in:

           fun(x := 42);   (* syntax error in Pascal *)
        
           x := 42;  (* OK *)
        
           x = 42; (* hopefully a statement with no effect warning *)
        
        If assignment is a statement, it's possible to use the same token. Classic BASIC:

           10 X = 5
           20 IF X = 5 GOTO 10
        
        This doesn't cause the C problem of mistaken assignment in place of a test, so it's rather ironic that C managed to shoot itself in the foot in spite of dedicating twice the number of tokens.
        • a1369209993 5 years ago

          That is true, but has nothing to do with "==" vs "=" vs ":=". You can do:

            func(x = 42); /* syntax error: expression operator expected, got '=' */
            x = 42; /* OK */
            x == 42; /* warning: statement expression has no effect */
          
          just fine if you require "=" to be a statement.
        • db48x 5 years ago

          I'd forgotten that!

  • jdblair 5 years ago

    You make a good point, but in a monolithic kernel the kernel is the “safe place.” Most likely the effect of this would be subtle and not necessarily long lived.

  • Cthulhu_ 5 years ago

    Same; my Java indoctrination is kicking in and is asking why that field is apparently public and there's no controls as to what process can set it.

    That said, counterpoint, it's the kernel and performance is super important; the overhead of adding setters (etc) or an utility function like "current->isRoot()" is probably a tradeoff they made at some point.

  • jrockway 5 years ago

    Same! I saw the if statement, was 100% sure this was going to be an "= instead of ==" thing... and still missed it. I spent too much mental energy looking at ((__LOUD|__NOISES)) and missed the obvious "current_user = 'root'" statement.

dang 5 years ago

If curious see also

2018 https://news.ycombinator.com/item?id=18173173

Discussed at the time (of the article): https://news.ycombinator.com/item?id=6520678

davidhyde 5 years ago

A uid of 0 being root is just such a bad idea to begin with because 0 is a default value of so many data types. It’s an accident waiting to happen and, in this case, a good way to hide something malicious as an accident.

  • chmod775 5 years ago

    >and, in this case, a good way to hide something malicious as an accident

    The number could've been 2342 and the backdoor would've worked exactly the same way.

  • lqet 5 years ago

    AFAIK only external and static variables are default initialized in C. For all other variables, the default value is undefined, so 0 is as good a choice as any other here.

    • DC-3 5 years ago

      Except that uninitialised memory is substantially more likely to be 0 than any other value.

      • grishka 5 years ago

        Except sometimes it is not and forgetting to initialize a variable in C/C++ leads to very insidious bugs that no one can reliably reproduce.

    • kevincox 5 years ago

      That's not quite true. While it is undefined 0 is a fairly common value for memory and registers meaning that your "undefined" values is likely 0 a higher than average amount of the time.

    • anonymousiam 5 years ago

      There is also the issue that (at least on some platforms) ECC memory must be initialized before being read, or an exception will occur.

    • fnord77 5 years ago

      if you use malloc, yes. calloc will initialize the variables

aborsy 5 years ago

There should be safe guards against such errors. Even with approval, the reviewer may not notice it.

Which brings up the question: how many more root-based backdoors are there now in the source code?

  • Cthulhu_ 5 years ago

    Unfortunately in C and its derivatives, the safeguards would have to be external tools (static analysis, linters); it's a perfectly valid statement in code.

    I wouldn't mind if languages simply mark assignments in conditions as errors. It's clever code, but clever code should be avoided in critical systems. And in general, I guess.

    • asveikau 5 years ago

      Not all c-syntax languages let you implicitly convert from integer or pointer to boolean though. Java and C# don't. I have heard MISRA C doesn't allow it.

      I actually don't mind this feature of C personally, just playing devil's advocate. Some people feel really strongly about not implicitly allowing conversion to bool. This is why.

    • wycy 5 years ago

      Assignments in conditionals can be handy handy, but I think it's better when there's a keyword for it. The Rust/Swift `if let` syntax is pretty nice for this.

          if let userID = 0 {}
      
      vs

          if userID == 0 {}
      
      The let syntax makes this error more obvious.
      • kibwen 5 years ago

        Since you're using Rust as an example there, worth noting that unlike in C the assignment operator in Rust does not evaluate to the assigned value (it evaluates to the unit value `()` instead). In combination with the fact that `if` expressions in Rust require their conditions to be bools (the language has no automatic coercion to bool), this means that `if foo = 0` is guaranteed to be a type error.

        (This difference in the behavior of the assignment operator is a result of Rust's ownership semantics; since assignment transfers ownership, having the assignment operator evaluate to the assigned value would actually result in the original assignment being entirely undone!)

    • tenebrisalietum 5 years ago

      Assignments in conditions can sometimes be useful and lend clarity, if it makes sense for the assignment to "fail".

      For the rough, rough example the below is probably not too clever.

      `if (!(my_socket=new_socket(inet_addr)) { fail(); }`

  • gonzo41 5 years ago

    A classic paranoid security question.

    The ace in Linux's pocket is that you're free to read it all. That can't be said for Apple, and Microsoft or any of the OS's running switches and hubs out there. Let alone all the server side cloud code.

    • kubanczyk 5 years ago

      Parent said "in the source code" not "in the Linux source code". Given the abysmal standards of security everywhere, it's quite logical thing to assume that many parties have backdoors scattered around various OSes. A tempting target with such multiplicative benefits.

      I don't think it's a paranoid question and I don't think it's even a question. It's a natural assumption and I'd demand exceptionally good evidence to challenge that.

      Points for Linux for its openness, people will probably catch some of these.

  • rectang 5 years ago

    This particular glitch was inserted via an attack on the BitKeeper repository. (EDIT: it was actually a CVS mirror of the repo.)

    But for the normal contribution flow, code review isn't the only safeguard. There's also a deterrent in that should a backdoor be inserted via a contribution that went through the normal process, an audit trail exists. If the backdoor is later discovered, there would be reputation harm to the contributor.

    Depending on how much an open source project knows about its contributors, it may be more or less difficult to track down a culprit, but in any case the audit trail makes such attacks more complicated.

    • FartyMcFarter 5 years ago

      > This particular glitch was inserted via an attack on the BitKeeper repository.

      No, it was inserted into the CVS mirror.

widforss 5 years ago

Won't gcc complain if you assign a variable within an if-statement?

  • GuB-42 5 years ago

    Not if you surround the expression with extra parenthesis. And that's what they did here.

    Assignments in if-statement can be useful, and that's how you prevent the compiler from complaining. That warning is intended for honest mistakes, not to catch backdoors.

    • tsbinz 5 years ago

      The parentheses here aren't actually "extra", without them the meaning would change - since && binds tighter than = without the parentheses the left hand side of = would not be an lvalue and compilation would fail.

msla 5 years ago

This is something C linters have been catching probably since there have been C linters, either from looking for that specific pattern (a lone equals sign in a conditional) or by "inventing" the notion of a boolean type long before C had one and then pretending that only comparison operators had such a type.

Needless to say, the better class of compiler catches this fine. gcc 9 does with -Wall and makes it an error with -Werror. Ditto clang 9. (Look at me giving version numbers as if this were recent. Any non-antediluvian C compiler worth using will do the same.) My point is, any reasonable build would at least pop up some errors for this, making it appear amateurish to me.

  • KMag 5 years ago

    > non-antediluvian C compiler

    Contrary to popular opinion, Noah's C compiler was actually highly advanced, but he only brought one copy on the ark with him. No backups, and less than ideal storage conditions... you can guess what happened next. A triceratops ate the parchment tape containing the only copy of Noah CC, and Noah threw the offending triceratops off the Ark, because in his rage, he thought "I have a spare tricero". Only afterword did he realize the error in his logic, thus dooming the triceratops to extinction.

    * Only found in highly divergent manuscripts, widely assumed to be late additions.

  • smcl 5 years ago

    I think I recall reading that around that time (remember this is 2003) Linus was either against -Werror or against spending effort eliminate warnings. The reason being that GCC had a few false positives, and the effort of making Linux kernel build with these spurious errors was not worth the risk of breaking code that likely worked ok.

    However I can't find anything where this is directly said, all I can find is a collection of Linus' early 00s emails on the subject of GCC which includes a LOT of reference to said warnings: https://yarchive.net/comp/linux/gcc.html

  • tsbinz 5 years ago

    I would be careful with statements like this. New compilers do NOT make this a warning/error, see for example

    https://godbolt.org/z/5zzz33

    Note that there are parentheses around the assignment which the compiler takes as an indication that this is intentional. Also note that the parentheses are required because without them the precedence would be wrong.

    • kazinator 5 years ago

      Since the parentheses are required due to precedence, then they are not there to show "I intend this assignment to happen". That would have to be:

        if ((options == (__WCLONE|__WALL)) && ((current->uid = 0)))
      
      As an aside, note that this particular case also has the problem that the assignment expression makes the entire test expression false, which is suspicious. If an assignment expression occurs in the controlling expression of a selection or iteration statement, such that the entire expression is always true or false as a result, that should probably be warned about no matter how many parentheses have been heaped on to the assignment.
      • tsbinz 5 years ago

        I'm not saying that a compiler shouldn't flag this. I'm just saying that current compilers don't.

        I'd guess that static analysis tools do flag it, but haven't checked.

  • phh 5 years ago

    I don't think gcc 9 was available in 2003.

    • nurettin 5 years ago

      We had gcc3, but some people were still stuck with redhat's patched 2.96 (which was officially 2.95 + some security patches)

gitgud 5 years ago

Has this happened since the source-control was changed to git? I imagine it would be almost impossible to break into Linus Torvald's git server amend previous commits, considering each one's hashed on the previous commits...

  • eru 5 years ago

    If you can break SHA1, that task would be easier.

    • Cthulhu_ 5 years ago

      SHA1 is close to being broken, but it's not there yet, and Git will be migrating to a better algorithm.

      That said, if you could rewrite an older commit, the change would only be applied in a fresh clone, right?

      • db48x 5 years ago

        Even if you could break SHA1, it's unlikely that your replacement source code would look like it was human-written. Instead, it's going to look like human-written source code containing kilobytes or megabytes of random-looking comments. The comments will only be there to change the hash of the new content back to the hash of the original content. It's not going to be subtle at all.

        • flingo 5 years ago

          Why would it require that much data? I always thought you wouldn't need to add or change more bytes than are in the output.

          Also, git hashes aren't just based on source code. You can add that data anywhere that git uses to generate the hash.

          • db48x 5 years ago

            That's true of a CRC code, but hashes are a lot harder to break.

            Git hashes each file, and puts those hashes into a tree object, like a directory listing. Then it hashes the trees, recursively back up to the root of the repository. Finally the hash of the root tree is put in the commit object, and the commit object is hashed. Thus the two places you can put additional data to be hashed are the file contents (either in existing files or new files), or in the commit message. You can get a few free bits by adjusting less obvious things like the commit timestamp or the author's email address, but not nearly enough to make your forged commit have the same hash as an existing commit.

            • flingo 5 years ago

              I'm still not following why it'd require so much data? I thought the goal was to have the commit hash collide with an existing commit hash, is that not enough?

              I looked around, and it seems like the right place to hide the added data is in the "trailer" section of the commit. It's where signed-off-by lives and is used to generate the commit hash.

              You might want to come up with a plausible reason for random data to go in there though. (likely using a header that wouldn't normally get printed out)

              • db48x 5 years ago

                In a CRC-style code, you're essentially adding up all the bytes and letting it overflow the counter, so that the counter is a fixed size (usually 16 or 32 bits). Then you add a few more bytes, exactly the same size as the counter, so that the data bytes plus the extra bytes add up to zero. The extra bytes are delivered along with the data bytes, so that the recipient can repeat the calculation and verify that the total is still zero. If you modify the data, it is trivial to recalculate the CRC code so that the total is still zero.

                Hashes are much, much more complex, and they're non-linear. Each bit of the hash output is intended to depend on every single bit of the input, so that changing a single bit in the input creates a radically different hash output.

                In a paper published this year, https://eprint.iacr.org/2020/014.pdf, the authors Gaëtan Leurent and Thomas Peyrin changed the values of 825 bytes out of a 1152 byte PGP key in order to generate a new key with the same signature (aka, the same hash). It only cost about $45k, too.

        • kibwen 5 years ago

          The git hash surely also takes the contents of binary files into account, so I imagine that in any repo that contains non-text files, an attacker would try to hide the garbage inside e.g. some metadata field of an image file.

          • db48x 5 years ago

            That's true. PDFs and other document formats are also great because you can include large volumes of data that is never used in the final output.

      • tomxor 5 years ago

        > That said, if you could rewrite an older commit, the change would only be applied in a fresh clone, right?

        I think so, assuming the fetch algorithm is using the hashes to get the deltas which I think it does.

        I'm not sure about CVS but with GIT rewriting a _previous_ commit _object_ itself with different blobs but making the commit object itself have the _same_ hash by messing with it's comment wouldn't cause any difference in child commits since commits are pretty much independent other than the pointers to parent/child and incorporating that into it's hash (i.e they would have different trees so the changes would not propagate to the HEAD of the branch).

        I think the only way have something end up in the HEAD of a branch AND persist is to break the SHA1 of a blob (i.e a file) by inserting the extra SHA1 breaking content into the blob itself rather than a commit tree (provided that exact blob hash is part of the tree in the HEAD of a branch). Then you would also need to hope that the malicious blob is fetched by the person who writes the next commit to be based upon the HEAD of that branch AND modifies the same file blob so that it persists into the next revision of the blob... seems pretty hard to pull off - pun intended

        There is also the issue of pushing a blob that already exists on the remote according to the hash. Even with re-write permission GC might make that hard to do quickly.... I wonder if you would need direct access to the git server to do this.

        [EDIT]

        Thinking about swapping out SHA1 in the future, you would still want to rehash all of the blobs and trees to prevent SHA1 attacks on old blobs that are unchanged going forward to essentially prevent what I described above.

        If you only hashed new blobs with the new algorithm you would need to wait until every file had been touched to be safe.

        • eru 5 years ago

          Yes, I would assume that most git repositories would want to re-hash all old commits when SHA1 gets replaced.

          For backwards compatibility, I suspect we'll add the new hash and keep SHA1 around, unless you specifically disable SHA1.

blauditore 5 years ago

I'm curious, wouldn't this also be caught by static code analysis tools, at least today? An assigment inside an if condition is both, most likely a mistake, and fairly easy to detect automatically.

  • rocqua 5 years ago

    I would guess this is part of the reason why most modern compilers will indeed emit a warning about assignment within if, for, and while - branch checks.

    At the same time, the standard implementation of strcpy is:

        while((*dst++ = *src++));
    
    which has a legitimate reason for doing assignment inside the while condition. Then again, one could argue that the above code is 'too clever'. And I would probably agree.
    • ancgop 5 years ago

      However they do not emit a warning if the assignment is parenthesized, like in the exploit. I think static analysis tools are the same, they would be way too chatty if they emitted warning for a parenthesized assignment.

      Static analysis already has way too many false positives as it stands. For a well maintained code base the rate can easily be 100% false positives, which gets annoying after some time.

    • asddubs 5 years ago

      could do this instead, right?

          do {
             *dst = *src;
             *dst++;
             *src++;
          } while(*dst);
      • josefx 5 years ago

        I think you are not copying the terminating nul character.

        • FartyMcFarter 5 years ago

          Unless the first character was null, in which case it would be ignored by the condition... Also, you don't need to dereference a pointer in order to increase it.

          The grandparent post's code is just nonsensical.

          • asddubs 5 years ago

            haha, I was genuinely asking if it made sense, I don't usually do C. Maybe it helped illustrate that the code is too clever for me, at least.

            e: I submit my revised code:

                do {
                   *dst = *src;
                   src++;
                   dst++;
                } while(*(dst - 1));
            • asveikau 5 years ago

              Looking at src[-1] would work but feels like poor form. My advice is stop trying to make do ... while work here, it isn't looking good.

              I feel like more idiomatic iterating through a string tends to loop on src not being a terminator.

                  while (*src)
                  {
                     *dst = *src;
                     ++src;
                     ++dst;
                  }
              
                  *dst = '\0';
              
              I feel like this is idiomatic C but needlessly verbose. Most people would combine the increment with the assignment. And most people would recognize putting it in the while condition as a common strcpy.
  • lqet 5 years ago

    I think this is why there are parantheses around current->uid = 0. gcc has the option -Wparentheses, which gives a warning if you write something like this:

      if (a = b) doSomething;
    
    But there is no warning if you write it like this:

      if ((a = b)) doSomething;
    
    The convention is that with these unneeded parantheses, you are signalling that you actually want the assignment here. I would assume other static code analysis tools use this convention as well.
stareatgoats 5 years ago

Was this a backdoor or not? Following the comments on the article and previous posts here on HN it seems the jury is out AFAICS.

The crucial question to me seems to be if this condition:

    options == (__WCLONE|__WALL) 
can be willfully introduced by a bad actor, and otherwise never really occur. Unfortunately I don't know this (not familiar with Linux development) but herein lies the answer it would seem.
  • hyperman1 5 years ago

    Following the man pages:

    wait4's man page points to waitpid for details, and notes wait4 is deprecated in favor of waitpid.

    So see the linux notes of this: https://man7.org/linux/man-pages/man2/waitpid.2.html

      The following Linux-specific options [..] can also, since Linux 4.7, be used with waitid():
      __WCLONE  [...] This option is ignored if __WALL is also specified.
      __WALL
    
    So to trigger this:

    * You have to call a deprecated function

    * With a flag that was at that time illegal (linux < 4.7)

    * And a second illegal flag that is cancelled out by the first illegal flag.

    This is something any userspace process can do, but no sane process should ever do.

  • speedgoose 5 years ago

    Definitely a door for a local privilege escalation. But since it's so obvious, we may call it a second front door.

reactchain 5 years ago

What are the chances major projects we use today aren't backdoored similarly? It's so easy to do and so hard to detect.

  • coldpie 5 years ago

    > What are the chances major projects we use today aren't backdoored similarly?

    Basically zero. There is no such thing as computer security in 2020.

bugeats 5 years ago

ITT: everyone pretending they've never burned hours troubleshooting only to find a stupid `=` instead of a `==`.

  • ViViDboarder 5 years ago

    Yea. Who hasn’t slipped up and forgotten an equals sign... and then accidentally exploited the Linux CVS and pushed their code without approval...

    We’ve all been there! /s

  • vlovich123 5 years ago

    It has been a long time since I make sure my codebases have `-Wall -Werror`. This bug is from 2003 both when that wasn't as common & when compiler diagnostics weren't as good/reliable.

    • not2b 5 years ago

      This code would not trigger under -Wall -Werror. Try it.

      • vlovich123 5 years ago

        I was referring to what the parent wrote:

        > ITT: everyone pretending they've never burned hours troubleshooting only to find a stupid `=` instead of a `==`.

        In the general case that OP was talking about, not for underhanded code, my comment holds.

pyuser583 5 years ago

How would git have handled the same issue?

I imagine if Linus pushed to the remote repo, it would have said “your repo isn’t up to date”.

But AFAIK, it doesn’t have the same sort of built in checksum checkers.

If an attacker signed the commit insecurely, would git complain? Can you set git to require PGP signatures?

Probably.

  • woodrowbarlow 5 years ago

    each commit's id is an integrity hash of the repository at the time of commit. git doesn't provide access control; it relies on access controls built-into whichever transport mechanisms you choose to enable (https, ssh, etc).

    you can sign commits with PGP signatures and with hooks, you can reject commits that aren't signed. i believe maintainers sign commits in the linux repo.

_urga 5 years ago

Something as important as "uid" should be "const".

  • brongondwana 5 years ago

    I mean... my first reading of that is "what a dumb idea, the reason it isn't const is that there are legit reasons to switch userid".

    But then I have used exactly this pattern, and it looks something like:

    struct protected_stuff { int userid; ... };

    void set_userid(const struct protected_stuff prot, int newuserid) { struct protected_stuff backdoor = (struct protected_stuff *)prot; backdoor->userid = newuserid; }

    and then the compiler complains if you go fiddling with userid outside this function where you deliberately opened a backdoor to write to it. (and you can wrap pragmas around that function to turn off warnings).

  • grishka 5 years ago

    There are legitimate reasons to change the uid at runtime. For example, some server software starts as root and then drops to a less-privileged user. Android relies on this too, zygote, the fully-initialized "blank" runtime process, runs as root and gets forked and changes uid to the corresponding unprivileged user whenever an app is launched.

    • _urga 5 years ago

      Sure, and I don't disagree that uid might need to change at runtime, but here we're talking about a struct field being const.

tester34 5 years ago

I think code like this shouldn't even compile like in other languages

>Operator '&&' cannot be applied to operands of type 'bool' and 'int'

  • tleb_ 5 years ago

    That would require some pretty big changes in the C programming language. Static analysis should detect it though, and probably does.

  • josefx 5 years ago

    _Bool is a relatively new addition to C, those operators return int results that are either zero or not.

baybal2 5 years ago

Why attempted that backdooring attempt you think?

stiray 5 years ago

I think that this might be an typo or at least it has plausible deniability. I have changed my coding style to always put constant on left side just to avoid such an error (such typo gave me a few days of debugging multithreaded code and I have just said "Never again!!" :D)

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection