Twitch is hacked, and its source code leaked

kotaku.com

556 points by goldenzun 4 years ago · 324 comments

nemothekid 4 years ago

This is a pretty thorough and high profile hack on a major tech company - this isn't something I'd expect from an Amazon owned property. The hack (allegedly, I haven't downloaded it) includes

* Entire git histories

* Internal/Private AWS SDKs

* Encrypted Password dumps and payout reports

It's so comprehensive that I'm very curious about how an attacker got that level of access. I can't think of another large, corporate web 2.0 startup that's gotten owned in a similar fashion. Could the same attack work on Amazon? YouTube?

It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data? Maybe they did try to ransom it, but I'd imagine someone with this kind of access inside Twitch must have had some creative way of making money.

  • madrox 4 years ago

    There were no encrypted password dumps. No production secrets were leaked (according to the article). What's here is no more than what your average Twitch engineer has access to.

    Yes, that included payout data. Anyone with "staff" access to the site (which any employee can have) has access to any streamer's dashboard, which includes payout data.

    I don't think this was an attack. Based on the data so far I think it was a disgruntled engineer. Obviously if more gets leaked later I may revise that opinion.

    • ergerger 4 years ago

      I also worked for Twitch and can confirm what you're saying is true. Any staff member had access to these repos, including non-engineering staff.

      For the longest time, seeing revenue was as simple as navigating to a streamer's dashboard as staff. They did finally gate that away from staff who don't need to see that info, but I am sure there are other ways to obtain revenue reporting info.

      I am assuming all data - including personal - has been compromised but so far, the data leaked is data that most staff would have access to in some way or another. Some may find that shocking, but this was not a "high level hack".

      • madrox 4 years ago

        I'm actually very happy to hear they finally added a flag for payout access. It's been years since I was there and my eyes bugged out when I saw what I had access to without needing it.

      • bengale 4 years ago

        Why did non engineers have access to repos?

        • vkou 4 years ago

          The better question is, why did random engineers have access to the financials of the streamers on the platform, without having to go through a break-glass, audited, emergency access escalation.

      • realyashnag 4 years ago

        Allegedly it also contains AWS access keys. I feel bad for the engineers who will have to answer for this.

    • twistedpair 4 years ago

      So much for information compartmentalization. Does the typical engineer need access to payment details for their daily work?

      • nonameiguess 4 years ago

        No, but doing least privilege, separation of privileges, and RBAC correctly is tedious and difficult and slows development velocity, so companies rarely do it well if they even bother trying unless some outside force compels them.

        I highly doubt it would be possible to do something like this at AWS, just because hosting multitenant infrastructure and working with the government forces you to implement security since you're being audited and awarded contracts on that basis. Twitch users don't give a crap about the security of the platform. They just want to monetize as quickly as they can, too.

        So I'm not hugely surprised that practices and culture would be different even if they have the same parent company, especially since Twitch was an acquisition. Even if not, though, I'd expect security at Prime to be better than Twitch but worse than Marketplace, Marketplace to be worse than AWS, etc. All speculation since I've never worked at any Amazon product, but that's what I would expect.

      • oconnor663 4 years ago

        The tradeoffs for any individual piece of data are different from the tradeoffs of a company-wide policy. Siloing off one little thing (e.g. credit card info) usually doesn't inconvenience very many people, but at the same time it only provides marginal security. No front page headline has ever read "At Least The Credit Card Info Was Safe". On the other hand, a company-wide policy of siloing everything can have more of a security impact, but it also inconveniences everyone frequently. That's the tradeoff that many tech companies don't want to make.

        • zerkten 4 years ago

          I don't see how this precludes just-in-time access. Even if people can re-up on their own, you can still observe the data access patterns and manage the risk. Further, when you see someone is getting blocked a lot you can improve the experience for them so they are unblocked, or have more efficient access to the data. This is just mature data and security management.

          Quality of life and developer experience are important topics in many ways, but should they really trump security consistently? It's always going to be dependent on people's risk assessment and comfort, but frequently it skews the wrong way because the people making the decisions know that they'll be gone.
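
          A minimal sketch of the just-in-time idea described above, not anything Twitch is known to run: access is granted for a short window, people can re-up on their own, and every grant, read, and denial is logged so access patterns can be reviewed. The class name `JitAccess`, its methods, and the resource and ticket names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JitAccess:
    """Toy just-in-time access broker: grants expire and every check is logged."""
    ttl_seconds: int = 3600
    grants: dict = field(default_factory=dict)     # (user, resource) -> expiry time
    audit_log: list = field(default_factory=list)  # (time, user, resource, event)

    def request(self, user: str, resource: str, reason: str, now: float) -> None:
        # Self-service "re-up": anyone may request, but the request and its
        # stated reason are recorded so access patterns can be reviewed later.
        self.grants[(user, resource)] = now + self.ttl_seconds
        self.audit_log.append((now, user, resource, f"granted: {reason}"))

    def can_read(self, user: str, resource: str, now: float) -> bool:
        # Access is allowed only while an unexpired grant exists.
        allowed = self.grants.get((user, resource), 0.0) > now
        self.audit_log.append((now, user, resource, "read" if allowed else "denied"))
        return allowed

    def denial_counts(self) -> dict:
        # Users who are blocked often are candidates for a better workflow.
        counts: dict = {}
        for _, user, _, event in self.audit_log:
            if event == "denied":
                counts[user] = counts.get(user, 0) + 1
        return counts
```

          Passing `now` in explicitly keeps the sketch deterministic; a real broker would read a clock and persist the log somewhere tamper-resistant.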

          • myohmy 4 years ago

            Implementing just-in-time access on legacy systems that pre-date just-in-time architectures is extremely expensive. It's cheaper to either give all info or no info, which is what every legacy company does instead.

            My company can shut off my access to all the databases when they stop asking me to troubleshoot any and all data issues. Which will never happen.

      • ABeeSea 4 years ago

        Part of the appeal of working at a place like Amazon is having a voice in decision making in the product you’re building. It's hard to make informed decisions or form opinions without the data. Engineers in Amazon retail definitely have broad access to sales data.

    • ljm 4 years ago

      Why would an intern at Twitch have access to data in production?

      Saying that no 'secrets' were leaked is effectively burying the lede.

      • ijcd 4 years ago

        In general the broad access was to code repos early on. Some were gated. There’s lots of collaboration and the need to study other code bases for learning and collaboration, read only. It’s micro services galore there so one didn’t tend to have access to production databases for services or systems you didn’t work on. You were opted in there. Teams did their own devops for the most part.

        The payout data likely wasn’t ripped from a DB but rather dashboards which customer service or partnerships likely had access to. Tier1 or Tier2 support kinda stuff.

        This smells like a stolen backup or maybe network access and http scanning, finding the internal GitHub and maybe a support admin cred that allowed dashboard view.

      • madrox 4 years ago

        By secrets, I mean salts, password hashes, etc.

        • ihattendorf 4 years ago

          > You also get access to every streamer's dashboard and their analytics

          I would classify that as access to production systems.

        • ljm 4 years ago

          Why is getting access to prod, or prod data, considered a perk, exactly?

          • manquer 4 years ago

            The perk is the wrench UX denoting to the community that you are an employee. Reddit/Twitch allow employees to communicate with the users. It is a social media platform; being able to indicate that you are special is street cred.

            The other access rights that come from staff access are either incidental or a miss/debt in the architecture.

            • zerkten 4 years ago

              It's understandable why this is a neat perk, but it also seems absurd when you look at Twitch as an entity owned by a global corporation.

              • VRay 4 years ago

                Man oh man, big "No thanks" to a perk like that from me

                "Hey, pick through everything I say with a fine-toothed comb and treat it as the official company stance!"

                • iotku 4 years ago

                  From prior (user) experience, typically staff have non-wrenched alt accounts for just chilling out in streams (with less concern about conduct, but generally tame by twitch standards). But will wrench up for higher profile streams or folks they're otherwise pretty close with personally.

                  I suspect that's a lot more controlled these days, but it wasn't very uncommon for signified staff to be trolling along with everyone else.

        • skilled 4 years ago

          This statement makes no sense.

          The leak includes source code of multiple active websites and applications that are operated under the umbrella of Twitch/Amazon.

          Why would an intern have access to this data?

          • isbvhodnvemrwvn 4 years ago

            In many companies, source code for all products is available to each and every developer.

            • skilled 4 years ago

              In what world would all the data in this leak be stored together in a unified ecosystem? It makes absolutely no sense.

              If you're saying that Twitch runs their developer environment in a lousy manner (and you have proof of this), then please go ahead.

              But to imply that an intern/average developer would be given access to all this branching information is ignorant.

              • barkingcat 4 years ago

                I think many people are trying to say that in this world at many companies all that data is indeed stored and accessed together.

                Maybe the super secure siloed world doesn't really exist outside of military/government organizations.

              • imwillofficial 4 years ago

                I have first hand info, and this is how it’s done. Don’t call somebody ignorant if you don’t have first hand info. Leave that for somebody who does.

          • nemothekid 4 years ago

            >Why would an intern have access to this data?

            monorepos are a thing at several companies (e.g. Google).

    • popotamonga 4 years ago

      I worked for a multi billion company and even 6 month contractors had access to basically everything with little effort.

    • unethical_ban 4 years ago

      No one in IT should have access to business data. That's simply best practice. Worst case would be a database engineer who has access to backups or some prod data for troubleshooting, and even that should be under tight control with good access accounting.

      • hamburglar 4 years ago

        Welcome to devops. Ask Mike down the hall to add you to the “admin” group. Tell him you’re a new dev so you need everything.

        (This is a joke but also, at many companies, it’s not. Twitch was once small and grew. Who knows what ancient all-access switches are still critical to running the systems, marked “tech debt” in someone’s backlog)

        • unethical_ban 4 years ago

          The whole point of devops is to automate everything according to best practices, so fuckups are a thing of the past! The only fuckups, of course, will be Terraform state issues.

          • syshum 4 years ago

            No the whole point of devops is to get rid of those terrible sysadmins always keeping the devs from doing anything...

            Once you have "DevOps" the devs are ops, your head count drops, and all that pesky security and other things those dirty sysadmins wanted are gone

            Kinda /s. Sometimes I think that is really what managers think about devops.

          • hamburglar 4 years ago

            As it turns out, the entire industry doesn't quite agree on "the whole point of devops."

      • cube00 4 years ago

        Until the business raises a priority one incident that their monthly reports are not looking right and you need to dive into the data to find out why some other API back end decided to present its numbers this month divided by 1000 for ease of display to their own users.

        I know, I know, service contracts, but my point is that engineers sometimes need at least temporary access to provide support.

    • weaksauce 4 years ago

      Could have been a hack of a twitch engineer's laptop or something like that.

      • bdreadz 4 years ago

        This is what I thought of as well. Maybe just an engineer was hacked.

    • syshum 4 years ago

      Sounds like someone in Twitch Security needs to take a course on Least Privileged Access then

  • 63 4 years ago

    > It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data? Maybe they did try to ransom it, but I'd imagine someone with this kind of access inside Twitch must have had some creative way of making money.

    Notably, the initial leak didn't actually include the password data which the leaker claims to have, just source code and payment data which has been verified by several affected streamers. It's possible that this first leak was just to establish trust so they can ransom or auction password hashes later.

    • ganoushoreilly 4 years ago

      Given the torrent is labeled "twitch-leaks-part-one" I'm curious too as to what they have. The torrent breaks out into a lot of compressed volumes, so it's clear this wasn't just a backup file, but a curated collection of files. I'm very curious if we will see any other amazon related leaks come from it.

      Either way, I can only imagine the chaos inside as they try to figure out what has transpired here.

    • nemothekid 4 years ago

      >It's possible that this first leak was just to establish trust so they can ransom or auction password hashes later.

      Password hashes are relatively useless though? Once the leak is announced I imagine most of the big targets will rotate their credentials. Then the next thing you need to do is spend possibly thousands in CPU time bruteforcing bcrypt hashes. Then I'm not sure what you can even do with those.

      I'm not criminally creative but I imagine you could make more by abusing trust with payment processors or fraudulent invoices.

      • tyingq 4 years ago

        >Then I'm not sure what you can even do with those

        Assume some end users used the same passwords on other, non-twitch accounts. That's what makes hacked passwords valuable, no matter where they came from.

        • twofornone 4 years ago

          That's something I've wondered - do password hashes tend to be the same across platforms? Is everyone using the same hashing algorithm? Isn't this also what salting is for?

          Never implemented auth myself.

          • tyingq 4 years ago

            Yes, the hashes are (usually) different due to different algorithms and/or salts. But, if you've brute forced one by using good guesses, and know the email/userid for other sites, and the user used the same or a similar password...that doesn't matter.

          • franga2000 4 years ago

            If everyone did things the way they're supposed to then no, hashes should never be the same between platforms. Using the same algorithm is likely, but as you said, salting solves that.

            But mistakes such as salting with just the username are sometimes made even by very large companies and in that case, hashes could be the same.

            • evandwight 4 years ago

              Why does it matter if hashes are the same?

              That only tells you the passwords are the same.

              • mindwok 4 years ago

                If they are the same everywhere, you can precompute a huge database of hashes (called a rainbow table) and simply lookup the hash in the table when breaches occur to find the password. By salting, every provider who stores credentials has different hashes for the same inputs which makes the approach far less attractive at a large scale.
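
                A rough sketch of the precomputed-lookup idea, under stated assumptions: the wordlist is a made-up handful of passwords, SHA-1 is used only because the thread discusses it, and the plain dictionary stands in for a real rainbow table (which stores hash chains to trade time for space). It shows that an unsalted hash can be looked up directly, while the same password hashed with a random salt is not in the table.

```python
import hashlib
import os

WORDLIST = ["123456", "password", "qwerty", "letmein"]  # illustrative only


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# Precompute once: unsalted hash -> plaintext. A real rainbow table stores
# hash chains to save space; this plain dict only shows the lookup idea.
LOOKUP = {sha1_hex(word.encode()): word for word in WORDLIST}


def crack_unsalted(leaked_hash: str):
    """Return the plaintext if the hash is in the precomputed table, else None."""
    return LOOKUP.get(leaked_hash)


def salted_hash(password: str, salt: bytes) -> str:
    # The salt is prepended before hashing, so the digest no longer matches
    # anything that was precomputed without that exact salt.
    return sha1_hex(salt + password.encode())
```

                The table only helps against hashes that were computed exactly the way the table was; a per-site or per-user salt means a separate table would be needed for each salt value.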

                • thaumasiotes 4 years ago

                  > If they are the same everywhere, you can precompute a huge database of hashes (called a rainbow table) and simply lookup the hash in the table when breaches occur to find the password.

                  You can do this anyway. But the space requirements of a rainbow table are so large that including an account's username in the password would make a rainbow table completely unfeasible.

              • thaumasiotes 4 years ago

                It doesn't matter at all if one person's hashed password is identical across two of that person's accounts on two different websites. The identical hash will instantly let an attacker (with access to both hashes) know that this person shares the same password across two accounts. But that is of no value; the attacker is going to start by assuming that it's true anyway.

                Salts are there to ensure that two accounts on the same website which have identical passwords nevertheless have different password hashes.

          • bduerst 4 years ago

            In a perfect world, no, but lazily someone could skip salting and/or use common hashing functions. IIRC this was a problem at Sony not too long ago.

        • axpy906 4 years ago

          Pretty much this. If they gain one email/username password combination - they can use it elsewhere.

          • growt 4 years ago

            If they are properly hashed and salted, they can not.

            • xorcist 4 years ago

              The point here is that once you brute force the plaintext password, the same password might be used elsewhere.

              • mercurywells 4 years ago

                What if you did something like hash(plaintext_pw+"twitchsalt") <browser> ---> <server> hash(browser_hash + db_salt)

                • growt 4 years ago

                  If I understand this right, the problem is "twitchsalt" has to be known so that you can generate the same hash for future logins. So it's just one iteration of hashing more for a brute force attempt (modern hashing algorithms already use multiple iterations of hashing to make brute forcing harder)

                • ocdtrekkie 4 years ago

                  Well, bear in mind, the hacker also has the exact code Twitch uses to salt its hashes.

                • xorcist 4 years ago

                  The browser_hash is now the password.
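
                  The replies above can be sketched in code. This is only an illustration of the proposed scheme, assuming SHA-256 for both steps (the original comment does not name a hash function); the function names are hypothetical. It shows that whoever holds the value the browser sends can log in without knowing the plaintext, and that an offline attacker with the dump and the source simply chains both steps per guess.

```python
import hashlib
import hmac
import os

SITE_SALT = b"twitchsalt"  # public constant shipped to every browser, as in the proposal


def browser_hash(plaintext_pw: str) -> bytes:
    # What the browser would send instead of the plaintext password.
    return hashlib.sha256(plaintext_pw.encode() + SITE_SALT).digest()


def server_store(received: bytes, db_salt: bytes) -> bytes:
    # What the server would keep in its database.
    return hashlib.sha256(received + db_salt).digest()


def server_login(received: bytes, db_salt: bytes, stored: bytes) -> bool:
    # The server never sees the plaintext; it only checks the value it was sent.
    return hmac.compare_digest(server_store(received, db_salt), stored)


def offline_guess(candidate: str, db_salt: bytes, stored: bytes) -> bool:
    # An attacker with the dump and the source chains both steps for each guess.
    return server_login(browser_hash(candidate), db_salt, stored)
```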

            • thaumasiotes 4 years ago

              Password salting has nothing to do with password reuse.

              Imagine two people have accounts on each of two websites:

                           eBay           YouTube
                 
                 Alice     sunlight       bobrules
                 
                 Bob       bobrules       bobrules
              
              A password reuse attack dumps the YouTube database, cracks Bob's password, and then accesses Bob's eBay account. The fix for this is that Bob should use different passwords on his different accounts. Hashing helps by making step 2 ("crack Bob's password") more difficult. Salting does not affect this attack in any way. Note that the attacker didn't bother to dump the eBay database.

              The attack that salting protects against dumps the YouTube database, cracks Bob's password, and then accesses Alice's YouTube account.
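
              The second attack can be made concrete with a small sketch. It reuses the YouTube column of the table above and assumes SHA-256 purely for illustration (a fast hash like this is not a recommendation for password storage); the helper names are hypothetical. Without salts, Alice and Bob have the same stored hash, so cracking one exposes both; with per-account salts the hashes differ.

```python
import hashlib
import os
from collections import defaultdict

YOUTUBE_PASSWORDS = {"Alice": "bobrules", "Bob": "bobrules"}  # from the table above


def unsalted_dump(passwords: dict) -> dict:
    return {user: hashlib.sha256(pw.encode()).hexdigest() for user, pw in passwords.items()}


def salted_dump(passwords: dict) -> dict:
    dump = {}
    for user, pw in passwords.items():
        salt = os.urandom(16)  # fresh random salt per account, stored next to the hash
        dump[user] = (salt.hex(), hashlib.sha256(salt + pw.encode()).hexdigest())
    return dump


def accounts_sharing_a_hash(dump: dict) -> list:
    """Group users whose stored hash is identical; one crack opens the whole group."""
    groups = defaultdict(list)
    for user, record in dump.items():
        digest = record if isinstance(record, str) else record[1]
        groups[digest].append(user)
    return [sorted(users) for users in groups.values() if len(users) > 1]
```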

              • growt 4 years ago

                "Salting does not affect this attack in any way." Yes it does. If you have unsalted passwords you can just use a rainbow table to look passwords up.

                • thaumasiotes 4 years ago

                  And that is not affected by salting. You can use a rainbow table to look passwords up whether or not those passwords are salted. There is zero conceptual connection between the two ideas.

                  Now, realistically, you can't use a rainbow table on passwords of any noticeable length, and a salt may push the password over the edge of that threshold. If that's really what you want... enforce a minimum password length.

                  • growt 4 years ago

                    "Use of a key derivation that employs a salt makes this attack infeasible." https://en.wikipedia.org/wiki/Rainbow_table

                    "Salts defend against attacks that use precomputed tables (e.g. rainbow tables)" https://en.wikipedia.org/wiki/Salt_(cryptography)

                    • tyingq 4 years ago

                      Salts do nothing for people with predictable passwords though. The salt is in the dump, so I can hash known plaintext with the algorithm and the dumped data.

                      Even if I can only hash a million a day, if your password is one of the top million most popular, and I have a good list, I'll have your password in a day. And if you re-used it...

                      Salts do make naïve brute-force, all-possible-strings approaches useless, yes.
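
                      A minimal sketch of that dictionary attack, assuming a salted SHA-256 as a stand-in for whatever algorithm the dumped site actually used; the function names and sample passwords are made up. Because the salt sits in the dump next to the hash, each popular password can be hashed with it and compared.

```python
import hashlib
from typing import Iterable, Optional


def hash_password(password: str, salt: bytes) -> str:
    # Stand-in for whatever algorithm the dumped site used (an assumption).
    return hashlib.sha256(salt + password.encode()).hexdigest()


def dictionary_attack(salt: bytes, target: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash each popular password with the dumped salt until one matches."""
    for candidate in candidates:
        if hash_password(candidate, salt) == target:
            return candidate
    return None


def days_to_exhaust(list_size: int, hashes_per_day: int) -> float:
    # Worst-case time to try the whole list against one salted hash.
    return list_size / hashes_per_day
```

                      With the figures from the comment, `days_to_exhaust(1_000_000, 1_000_000)` is one day per account; the salt forces the work to be repeated per account but does not make any single guess harder.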

                      • growt 4 years ago

                        Yes, but nothing will make predictable passwords safe (at least when you have the hash). Enforcing password guidelines helps a bit.

          • skeeks 4 years ago

            I would be very deeply concerned if Twitch, a multi-billion dollar company owned by Amazon, does not properly hash and salt the passwords of its users.

            • isbvhodnvemrwvn 4 years ago

              You don't brute force it, you find the password for accounts with the same e-mail in leaks from other sites and try only those.

            • tyingq 4 years ago

              You can still run those "top billion popular password" lists against properly salted/hashed passwords.

      • errantspark 4 years ago

        A few things here. If you're the sort of person who runs a crypto mine, which I assume many of the people interested in breaking hashes are, you have enough firepower at your disposal to at least perform a targeted attack on a few hashes with relative ease.

        Ideally that would be useless because things are properly salted and you don't know the salt, however with access to all of the source code as we have here I think it isn't as clear cut, as it may be possible to reverse out the salts as well.

        I'm not a cybersec guy so please take my speculation with a grain of salt.

        • exitheone 4 years ago

          The salt is usually stored next to the password. The point of a salt is just to make the hash unique to prevent the use of rainbow tables, it's not a separate secret.

        • shkkmo 4 years ago

          I think it is pretty common to store the salts alongside the password hashes. They are used by the same pieces of code so it is generally unrealistic to think that your salts will be secure if your hashes are obtained.

          Salting isn't really supposed to make a hashing algorithm secure by being secret but by being unique. Unique salts make hashing more secure because an attacker can't re-use a single rainbow table for multiple hashed passwords. That, combined with a sufficiently computationally difficult hashing algorithm, makes it prohibitively expensive to reverse the hashes of all your users.

          This may not be enough to protect high value users or those who use fairly common or easily guessable passwords. This is part of why it is so important that you don't reuse passwords. It's also why your application should reject all known passwords using something like https://haveibeenpwned.com/Passwords or any of the "common password" lists you can find online.

          Edit: If you do include a secret that is stored separately and added to the password and salt when hashing, this is called "peppering", and these peppers are generally not unique per user.
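
          One way the salt-plus-pepper arrangement could look, as a sketch rather than a recommendation: the pepper is mixed in with HMAC-SHA256 and the result is fed to PBKDF2 with a per-user salt. The construction, the iteration count, and the names `PEPPER`, `hash_with_pepper`, `new_record`, and `verify` are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

PEPPER = os.urandom(32)  # in practice loaded from a secrets store, not the database
ITERATIONS = 100_000     # illustrative cost setting


def hash_with_pepper(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    # Mix the secret pepper in first, then run a salted, deliberately slow KDF.
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)


def new_record(password: str) -> tuple:
    salt = os.urandom(16)  # unique per user, stored alongside the hash
    return salt, hash_with_pepper(password, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(hash_with_pepper(password, salt), stored)
```

          The salt travels with the hash, so a database dump contains it; the pepper only adds protection if it is kept somewhere the dump does not reach.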

        • maccard 4 years ago

          I've heard this before, and queried how feasible an attack would be, as people always talk about just how bad this is but yet I've _never_ heard of someone having an account compromised through this vector, and I'd like to know how feasible it really is. Here's the sha1 of an unsalted password b85ffa7dae2cbed04e7d3335f6ebc43c8a5764dd

          How long does it actually take in practice to break something like this? I would love it if someone could prove it to me.

          • jaredsohn 4 years ago

            Is the password ncc1701e?

            I just googled it and found https://hashtoolkit.com/decrypt-sha1-hash/b85ffa7dae2cbed04e... along with other results.

            • maccard 4 years ago

              It is! I guess using a password from Google isn't the best idea, and kind of defeated the point of what I wanted to ask (if your password isn't already hashed online how long does it actually take to break a sha1 hash), but definitely proves the point.

              Can I try again? Sha1 e7b7cdf949007abe7e8a190ba8eae56c60018c1f

              • Behemoth66 4 years ago

                Couldn’t find it in 1.4 Trillion combinations. Used rockyou.txt with dive.rule.

                Took me 6 minutes to try all 1.4 trillion passwords. So either you have a strong password or I messed something up. What is it?

                In theory if your password was weak enough to be on this list it would take on average 3 minutes to break it on a GTX 1080.

                • maccard 4 years ago

                  Thanks for trying! This somewhat supports what I'm suggesting - because that password hasn't been leaked by being posted in plaintext as a verified password, it's not available as a lookup, therefore it doesn't matter whether they used bcrypt, sha1 or md5, or even just pgp encrypted it, the password is likely "secure".

                  • Behemoth66 4 years ago

                    It depends. It doesn’t have to strictly be a leaked password. If it’s similar to a leaked password then the permutation rule-set will catch it.

                    Anything under 9 characters I can brute force in minutes. 9 character passwords would take me 9 hours.

                    Obviously if someone has a nest of the latest GPUs then they could go a lot faster.

                    But yes if your password is uwv&6qu_brusb618_$@618jg then it doesn’t really matter how you hash it.
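
                      A back-of-the-envelope sketch of how such brute-force estimates are derived. The rate comes from the run reported earlier in this thread (1.4 trillion candidates in 6 minutes); the alphabet sizes are assumptions, since the comment does not say which character set "brute force" covers, and the result is very sensitive to that choice.

```python
def hashes_per_second(candidates: float, seconds: float) -> float:
    return candidates / seconds


def exhaustive_search_hours(alphabet_size: int, length: int, rate: float) -> float:
    """Worst-case hours to try every string of exactly `length` characters."""
    return alphabet_size ** length / rate / 3600


# Rate implied by the reported run: 1.4 trillion candidates in 6 minutes.
RATE = hashes_per_second(1.4e12, 6 * 60)

# Alphabet sizes are assumptions about what "brute force" covers.
ALPHABETS = {
    "lowercase": 26,
    "lowercase+digits": 36,
    "mixed case+digits": 62,
    "printable ASCII": 95,
}

if __name__ == "__main__":
    for name, size in ALPHABETS.items():
        hours = exhaustive_search_hours(size, 9, RATE)
        print(f"{name:>18}: {hours:,.1f} hours for 9 characters")
```

                      Each extra character multiplies the work by the alphabet size, which is why length matters more than the choice of fast hash.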

                    • maccard 4 years ago

                      The reason I didn't give any more information on the password above is because you don't have any extra information on a dump of hashes from a twitch database either. If a password is only feasibly brute forceable for a specific algorithm by reducing the search space by many orders of magnitude, it kind of shows that there's not really any risk even if the passwords are unsalted for a person who hasn't reused a password.

                      • thaumasiotes 4 years ago

                        > it kind of shows that there's not really any risk even if the passwords are unsalted for a person who hasn't reused a password.

                        No, it doesn't. You could reuse uwv&6qu_brusb618_$@618jg everywhere and it wouldn't get cracked. If the plaintext password leaked, then you'd be in more trouble.

                        What matters is whether your password is easy to guess, not whether you've reused it. If you have all unique passwords, they can still all be trivial to crack.

                  • ghostway-chess 4 years ago

                    Well. Sha1 is not _that_ hard to break. It's a solved algorithm

                    • astrange 4 years ago

                      That's for generating collisions, not preimage resistance. It's not particularly easy to reverse.

              • shkkmo 4 years ago

                The point of the salt isn't that it makes it take longer to break any one password. What it does is prevent you from re-using the rainbow table you generate breaking one password when you break the next one.

                Sha1 is not a very secure/expensive hashing algorithm and thus does make it significantly cheaper to break even with a unique salt.

                • thaumasiotes 4 years ago

                  > What [a password salt] does is prevent you from re-using the rainbow table you generate breaking one password when you break the next one.

                  Your idea of what a rainbow table is appears to be unrelated to what a rainbow table actually is. A rainbow table is prepared in advance, not generated in the process of cracking an individual password.

                • maccard 4 years ago

                  > Sha1 is not a very secure/expensive hashing algorithm and thus does make it significantly cheaper to break even with a unique salt.

                  Ok, so how long does it take to break the hash I've provided if it's not very secure?

                  • shkkmo 4 years ago

                    It's not so much "how long does it take" as it is "how much does it cost" and the answer to that really depends on what sort of compute infrastructure you have access to. Using a more appropriate hashing algorithm with a sufficient cost factor can massively increase the amount of compute needed. Preventing the re-use of that computational effort on additional users is why unique salts are important.
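
                    A small sketch of the cost-factor point, using PBKDF2 from the standard library as one example of a tunable algorithm (the comment does not name one). The helper names and the way total cost is estimated are assumptions for illustration.

```python
import hashlib
import os
import time


def slow_hash(password: str, salt: bytes, iterations: int) -> bytes:
    # PBKDF2's iteration count plays the role of the cost factor here.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


def seconds_per_guess(iterations: int, samples: int = 3) -> float:
    """Rough wall-clock cost of one guess at the given iteration count."""
    salt = os.urandom(16)
    start = time.perf_counter()
    for i in range(samples):
        slow_hash(f"guess-{i}", salt, iterations)
    return (time.perf_counter() - start) / samples


def attack_cost_seconds(guesses: int, users: int, per_guess: float) -> float:
    # With unique salts, work done against one user cannot be reused for another,
    # so the total cost scales with guesses * users.
    return guesses * users * per_guess
```

                    Comparing `seconds_per_guess(1_000)` with `seconds_per_guess(1_000_000)` on the same machine gives a feel for how the cost setting scales an attacker's bill; the absolute numbers depend entirely on the hardware.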

                    • maccard 4 years ago

                      > It's not so much "how long does it take" as it is "how much does it cost"

                      So the answer is "It's too expensive to figure out in practice, unless you're being explicitly targeted by someone with nation state level credentials?", i.e. it's pretty much fine?

                      > Using a more appropriate hashing algorithm with a sufficient cost factor can massively increase the amount of compute needed.

                      But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                      > Preventing the re-use of that computational effort on additional users is why unique salts are important.

                      The person who "cracked" my first hash found it in a list of passwords which was actually gotten from a plain text dump 15 years ago. That wasn't found by reversing a hash, so the compute wasn't reused. You are right that once it's cracked, it's cracked and that's that, but if your password _isn't_ cracked it's moot whether it's hashed with SHA1 or something more secure, as per above?

                      • theneworc 4 years ago

                        >But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                        SHA1 is "more than enough" for this specific interaction in which you chose a complex password and/or your only opponents are unmotivated/non-incentivized HN commenters that don't have a password cracker at their immediate disposal. That doesn't mean anything outside of this context.

                        If your opponent was a motivated hacker with dedicated password cracking machines (which do not require anything even close to a nation-state budget, btw), your SHA1 hash would be much more likely to be cracked. If you were a specific target of a hacker group, such as an employee of a company that is being targeted by an attack or someone known to have a BTC wallet with $10 million in it, your SHA1 hash would be much more likely to be cracked. If your password was a relatively simple phrase like "dog$aregreat2019", like the vast majority of user passwords are, it would almost certainly be cracked.

                        SHA1 is not even anywhere close to "enough" for general password hashing use. Don't think otherwise just because a couple of random HNers failed your little game.

                        edit: The premise of your "challenge" is also not equivalent to the goals of most hackers. Unless you are a specifically known and prioritized target (because you're a celeb, VIP, wealthy person or something like that), the goal of a hacker is not to take one specific hash and crack it, because the success of that will depend a lot on the complexity of your password. The goal of most hackers in a breach like this Twitch one is more like "just throw it all at the wall and see what sticks". They take a massive database of thousands of hashes and spend a few hours to see what can be cracked, taking advantage of the fact that while some people may have complex passwords, most do not. After a few hours, maybe they crack 90% of the SHA1 hashes in a leak. Maybe your password was complex enough that it was in the 10% that wasn't cracked; good for you, but just because your password remained uncracked doesn't mean SHA1 is "enough". The hackers still got the other 90%.

                        • ghostway-chess 4 years ago

                          But you shared a hash of an uncommon password. We probably have the salt (probably somewhere in the code) and people don't use password managers. So rainbow tables are enough. Oh, I thought the first sentence was you and not quoted. Agreed with the above.

                      • shkkmo 4 years ago

                        > But by the sounds of it, SHA1 is more than enough (given that nobody here is willing to brute force the hash I shared above?)

                        Absolutely not and that is a ridiculous conclusion to draw. State-level resources are absolutely not required to break sha1.

                        > but if your password _isn't_ cracked it's moot whether it's hashed with SHA1 or something more secure, as per above?

                        Again, absolutely not. The algorithm and cost setting have a huge impact on the practical likelihood that an attacker will crack your password.

          • isoskeles 4 years ago

            Many hashes are trivial to target, until you start getting to password hashers that force you to use lots of RAM or CPU (or ideally both) to check a single password. As long as you know what hashing algorithm was used (often inferred by the hash length or other details), you can shove it into hashcat or some alternatives and wait, either using a good dictionary or bruteforce. If you've configured hashcat to work well with a decent GPU, you're good to go.

            Even bcrypt is not that hard to find a solution to a hash if it didn't use enough rounds.

            I learned a bunch of this when a company I worked for was breached and wanted to see just how easy it was to solve out weaker passwords in our db.
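            A minimal sketch of the workflow described above, in plain Python rather than hashcat: guess the algorithm from the digest length, then run candidates from a word list. The names `guess_algorithm` and `dictionary_attack` and the tiny word list are made up for illustration; real tools do the same thing on a GPU at billions of guesses per second.

```python
import hashlib

# Hex digest lengths of a few common unsalted hashes (illustrative, not exhaustive;
# different algorithms can share a length, so this is only a first guess).
HEX_LENGTH_TO_ALGORITHM = {32: "md5", 40: "sha1", 64: "sha256"}


def guess_algorithm(hex_digest):
    """Return a likely algorithm name for a hex digest, or None if unknown."""
    return HEX_LENGTH_TO_ALGORITHM.get(len(hex_digest))


def dictionary_attack(hex_digest, candidates):
    """Hash each candidate and return the one matching hex_digest, if any."""
    algorithm = guess_algorithm(hex_digest)
    if algorithm is None:
        return None
    target = hex_digest.lower()
    for word in candidates:
        if hashlib.new(algorithm, word.encode("utf-8")).hexdigest() == target:
            return word
    return None


# Hypothetical example: a hash we generate ourselves, then "recover".
leaked = hashlib.sha1(b"hunter2").hexdigest()
wordlist = ["password", "letmein", "hunter2", "qwerty"]
print(guess_algorithm(leaked))              # sha1 (40 hex characters)
print(dictionary_attack(leaked, wordlist))  # hunter2
```

            A password that is not in the word list simply comes back as `None`, which is where brute force (and the cost of each guess) starts to matter.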

            • maccard 4 years ago

              As I said, I've heard the claim, but still question it. Here's a sha1 e7b7cdf949007abe7e8a190ba8eae56c60018c1f, how long does it take hashcat to break it?

              • filleokus 4 years ago

                I don't really follow your argument. You've never heard of a hash being brute forced? I've done it myself multiple times, both for pen testing purposes and for password recovery on systems I control myself.

                The LinkedIn password leak contained hashed (but not salted) passwords, and some of those were cracked and exploited in the wild.

                My old gaming PC with a 1060 can apparently do ≈ 6300 * 10^6 hashes per second. Assuming your password above is a-z, A-Z, 0-9 = 62 possibilities per character (with no salt), it would take me 10 seconds to test all combinations for 6 characters and 30 days for 9 characters. And it's a trivially parallel problem, making it easy to throw money at to make it wall-clock quicker.

                It's just a simple brute force problem, I don't see what there is to question (beside the choice of SHA1 for password hashing...).
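                A quick sketch of that arithmetic, assuming the 62-character alphabet and the 6300 * 10^6 hashes/second figure quoted above. The function names are made up; the times are worst case (the whole keyspace), so the average is about half.

```python
def keyspace(alphabet_size, length):
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length


def seconds_to_exhaust(alphabet_size, length, hashes_per_second):
    """Worst-case time to try every candidate of the given length."""
    return keyspace(alphabet_size, length) / hashes_per_second


GTX_1060_RATE = 6300 * 10**6   # hashes/second, the figure quoted above
ALPHANUMERIC = 62              # a-z, A-Z, 0-9

for length in (6, 8, 9, 10):
    seconds = seconds_to_exhaust(ALPHANUMERIC, length, GTX_1060_RATE)
    print(f"{length} chars: {seconds:,.0f} s = {seconds / 86400:,.2f} days")
```

                With these inputs it works out to about 9 seconds for 6 characters and roughly 25 days for 9, the same ballpark as the numbers above. Each extra character multiplies the time by 62.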

                • maccard 4 years ago

                  > The LinkedIn password leak contained hashed (but not salted) passwords, and some of those were cracked and exploited in the wild.

                  The hashes of previously unused passwords were brute forced, or passwords were reused across sites from a previous plain text dump and exploited? Because there's a big difference between those two things. If your password is reused and originally compromised, you're screwed regardless, and having the leaked hashed passwords doesn't leave you in any worse a situation than before.

                  > My old gaming PC with a 1060 can apparently do ≈ 6300 * 10^6 hashes per second. Assuming your password above is a-z, A-Z, 0-9 = 62 possibilities per character (with no salt), it would take me 10 seconds to test all combinations for 6 characters and 30 days for 9 characters. And it's a trivially parallel problem, making it easy to throw money at to make it wall-clock quicker.

                  So practically infeasible to exploit? The claims that are being made (even in this thread) are that having a mining rig would let you brute force a SHA1 hash, but based on the numbers

                  > It's just a simple brute force problem, I don't see what there is to question

                  If it's "just a simple brute force problem", and SHA1 is the only issue, then my question is what's the password in the hash above? You (and others here, on reddit, online) are telling us that this is a trivial problem.

                  • filleokus 4 years ago

                    > The hashes of previously unused passwords were brute forced, or passwords were reused across sites from a previous plain text dump and exploited?

                    I believe there are documented instances where previously not leaked passwords were cracked. Of course not 128 bit random strings, but still passwords more "complex" than what you previously posted. If you have 100 million hashes to try, you will crack some. People generally have bad passwords, especially in 2012, even if the plaintext wasn't available anywhere...

                    > So practically infeasible to exploit?

                    It depends on how strong the password is and how much money you have to spend. For 32 USD I get an hour with a p4d.24xlarge that has 8 graphics cards, which in total can do about 175 * 10^9 hashes per second. 20 hours (and 640 USD) of machine time (not wall clock time) on that machine can do what 30 days on my old PC does.

                    > If it's "just a simple brute force problem" […]

                    If you can give me a bound on the number of combinations, and an AWS account to bill, I and many others would gladly attempt to crack your hash :-). But if your second hash is >9 alphanumerical characters we will probably just burn electricity to no avail.

                    I don't even know what you are arguing?

                    EDIT: Now that you have some numbers for hashing rates and cost, you can figure out how expensive different passwords are to crack with different approaches. Two common dictionary words with two numbers appended? 6 random alphanumeric characters? Then think about how expensive the cheapest non-leaked password in a database of 100 million users is...
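                    A rough sketch of that exercise, using the 175 * 10^9 hashes/second and 32 USD/hour figures quoted above. The 100,000-word dictionary size is purely an assumption for illustration, and the costs are worst case for exhausting each candidate set.

```python
def crack_cost_usd(candidates, hashes_per_second, usd_per_hour):
    """Worst-case rental cost to hash every candidate once."""
    hours = candidates / hashes_per_second / 3600
    return hours * usd_per_hour


CLOUD_RATE = 175 * 10**9    # hashes/second for the 8-GPU instance quoted above
CLOUD_PRICE = 32            # USD per hour, as quoted above
DICTIONARY_WORDS = 100_000  # assumed dictionary size, not from the thread

scenarios = {
    "6 random alphanumeric characters": 62 ** 6,
    "9 random alphanumeric characters": 62 ** 9,
    "two dictionary words + two digits": DICTIONARY_WORDS ** 2 * 100,
}

for name, candidates in scenarios.items():
    cost = crack_cost_usd(candidates, CLOUD_RATE, CLOUD_PRICE)
    print(f"{name}: {candidates:.2e} candidates, about {cost:,.2f} USD")
```

                    Under these assumptions the 9-character random password costs several hundred dollars to exhaust, while the "two words plus two digits" pattern costs pennies, even though it is longer.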

                    Is it bad to store plaintext passwords? Yes, obviously. Is some hashing better than none? Yes, obviously. Is salting your hashes much better than not? Yes, because with a salt, your first password wouldn't have turned up on Google / in rainbow tables. Is it even better to use a proper PBKDF? Yes, with a pretty aggressive PBKDF, brute forcing even low-complexity passwords becomes expensive very quickly, and we get the benefits of salting "built in".

                    Can SHA1 / MD5 hashes be cracked even if the _exact_ password-hash pair has not been leaked previously? Yes, very much so.
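                    To make the PBKDF point concrete, here is a small sketch comparing one SHA-1 guess with one PBKDF2 guess using Python's standard `hashlib`. The 200,000 iteration count and the helper names are assumptions chosen for illustration, not a recommendation and not what Twitch used.

```python
import hashlib
import os
import time


def fast_hash(password):
    """Single unsalted SHA-1, the kind of hash discussed above."""
    return hashlib.sha1(password).hexdigest()


def slow_hash(password, salt, iterations=200_000):
    """PBKDF2-HMAC-SHA256; salt and iteration count are stored next to the hash."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations).hex()


def seconds_for(func, *args):
    """Wall-clock time of one call; crude, but enough to show the gap."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


password = b"dog$aregreat2019"
salt = os.urandom(16)

fast = seconds_for(fast_hash, password)
slow = seconds_for(slow_hash, password, salt)
print(f"SHA-1:  {fast:.6f} s per guess")
print(f"PBKDF2: {slow:.6f} s per guess")
```

                    The exact timings depend on the machine, but every guess an attacker makes pays the same per-guess penalty, which is what turns a cheap dictionary run into an expensive one.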

                  • vel0city 4 years ago

                    Right? "It's just a simple brute force problem", but sometimes that still takes a lot of force. Sometimes far more force than breaking a single account password.

                    I managed to lock myself out of a dogecoin wallet. I have the hash of the passphrase, so I figured I'd give it a go cracking it. After a few weeks (and a larger than usual power bill) I sent it to some friends with good mining rigs to try and take a stab at it, willing to split the amount 50/50. It's only the passphrase, not the full wallet, so I'm not worried about someone stealing the doge.

                    The passphrase is probably 15-25 characters, mostly not dictionary words or simple letter/number/symbol substitution, only symbols easy to type on a US keyboard. I'm now about 6 months trying to crack that password with probably a few hundred dollars of electricity used overall between myself and friends (I don't know their power bill), excluding hardware cost as it was already owned, and I'm not even halfway through the search space.

                    Can it be done? Sure. Will I be able to crack that password with a cost that's less than the value of the DOGE in the wallet? Probably not. Right now it's really more of a gamble that I'll get lucky with the rigs running. I had to tone down some of my rigs as it was getting quite hot over the summer, but over the winter I'll be chugging away as the waste heat is just additional home heat. I'll probably need to rent a considerable amount of GPU power on a cloud provider to crack it, at which point maybe it'll take me days to crack it but ultimately cost me many, many thousands of dollars in GPU-time.

        • Deathmax 4 years ago

          Salts being exposed is not a massive risk in and of itself, as the purpose of the salt is to prevent the use of pre-computed tables to reverse a hash into plaintext, forcing an attacker to bruteforce each individual hash+salt instead of being able to reuse work.

          With regards to crypto mines being used for breaking hashes, if you have one based on GPUs, yes, you could reuse GPU mining hardware for cracking hashes, albeit with relatively low hashrates for current best practice hashing algorithms.

          If you're looking at something like Bitcoin's hashrate and thinking that it could be used to break SHA2 hashes, as far as I understand ASIC miners, this is not possible, as ASIC miners are designed only for mining, and they don't really accept non-mining related inputs (ie, no arbitrary inputs to be hashed, unless it matches Bitcoin's specific steps for iterating over nonces).
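          For context, a sketch of what Bitcoin mining actually hashes: a fixed 80-byte block header, run through SHA-256 twice, with only the nonce (and a few other fields) varying. The field layout is Bitcoin's; the all-zero placeholder values and function names are made up for illustration. Hardware wired for this exact shape has no obvious way to be fed `salt + candidate_password`.

```python
import hashlib
import struct


def block_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    """Pack the six header fields into the fixed 80-byte structure miners hash."""
    return (
        struct.pack("<I", version)
        + prev_hash        # 32 bytes
        + merkle_root      # 32 bytes
        + struct.pack("<III", timestamp, bits, nonce)
    )


def mining_hash(header):
    """Bitcoin's proof-of-work hash: SHA-256 applied twice to the header."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


# Placeholder values, not a real block.
PREV_HASH = bytes(32)
MERKLE_ROOT = bytes(32)

for nonce in range(3):
    header = block_header(1, PREV_HASH, MERKLE_ROOT, 1633500000, 0x1D00FFFF, nonce)
    print(len(header), nonce, mining_hash(header).hex()[:16])
```

          A password cracker, by contrast, needs to hash arbitrary, variable-length candidates, usually once rather than twice, which is why GPUs transfer to that job and mining ASICs (as far as I understand them) do not.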

        • thaumasiotes 4 years ago

          > Ideally that would be useless because things are properly salted and you don't know the salt

          I'm really curious where people get their ideas about salting. It's not just a word. It doesn't make one password any more difficult to crack. It makes cracking every password in a given database more difficult to do. A password's salt is public information.
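          A small sketch of that distinction, using SHA-1 only because it is the hash under discussion. The users, passwords, and helper names are hypothetical. The salt defeats the precomputed table and stops work being shared between users, but it does nothing for a single weak password once the salt is known.

```python
import hashlib
import os


def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()


# A precomputed table: built once, reusable against every unsalted leak.
COMMON_PASSWORDS = [b"123456", b"password", b"qwerty"]
PRECOMPUTED = {sha1_hex(p): p for p in COMMON_PASSWORDS}

# Two hypothetical users who picked the same weak password.
alice_salt, bob_salt = os.urandom(16), os.urandom(16)
alice_hash = sha1_hex(alice_salt + b"password")
bob_hash = sha1_hex(bob_salt + b"password")

print(PRECOMPUTED.get(sha1_hex(b"password")))  # unsalted: instant table hit
print(PRECOMPUTED.get(alice_hash))             # salted: None, the table is useless
print(alice_hash == bob_hash)                  # False, so no shared work either


def crack_with_known_salt(target_hash, salt, candidates):
    """The salt is public: knowing it, an attacker just redoes the work per user."""
    for candidate in candidates:
        if sha1_hex(salt + candidate) == target_hash:
            return candidate
    return None


print(crack_with_known_salt(alice_hash, alice_salt, COMMON_PASSWORDS))
```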

      • j_walter 4 years ago

        Relatively useless...but if even a few percent of people recycle passwords used for banking or crypto platforms it could be a profitable cache of data.

    • zinekeller 4 years ago

      Maybe Twitch is competent in the password department, so they decided against it? But thinking about it, although it's unclear whether two-factor secrets are included in the leak, those secrets may be usable by someone who already has a victim's password. Unless it's the dongle-type one (WebAuthn/FIDO), the secret is common to both the server and the user, so two-factor bypass is almost certain in this case.
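      To illustrate why a shared secret is the weak point, here is a minimal sketch of TOTP as specified in RFC 6238, assuming that is the kind of two-factor in question (the thread does not say). Anyone holding the secret, server or attacker, computes the same code. The secret below is the RFC's published test value, not anything from the leak.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Time-based one-time password (RFC 6238, HMAC-SHA1 variant)."""
    counter = struct.pack(">Q", unix_time // step)
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


shared_secret = b"12345678901234567890"  # RFC 6238 test secret

# The "server" and an "attacker" with a copy of the secret agree exactly.
print(totp(shared_secret, 59, digits=8))  # 94287082, the RFC 6238 test vector
print(totp(shared_secret, 59, digits=8) == totp(bytes(shared_secret), 59, digits=8))
```

      WebAuthn/FIDO avoids this because the server only stores a public key, so a server-side leak does not hand over anything that can generate valid responses.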

    • mdoms 4 years ago

      Doesn't seem likely to me. If the attacker has password hashes then they would want to keep this attack quiet so that the buyer of the hashes would have time to compute the passwords. If Twitch gets wind of this happening then a simple password reset would foil any efforts.

  • skilled 4 years ago

    I'm hoping we will get to see a transparent report (from hacker or Twitch) on how this happened.

    I think anyone would be excited to hack Twitch as the site alone - or any big platform for that matter - but this is quite literally someone just downloading the entire Twitch ecosystem and publishing it online.

  • leros 4 years ago

    It's something I would expect security hardware to have automatically stopped. Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

    • munk-a 4 years ago

      Gosh - I've worked at shops where we handled multi-terabyte images and we'd regularly stream large chunks of that while debugging tools. I've also worked at places where data was king and 125GB of stuff might be a reasonable dispatch of data to help someone debug.

      The volume of data is irrelevant - source code is usually teensy tiny and of far more value to companies than, say, three months of livestream chat logs.

      I'm not certain what security hardware you're thinking of - but I'm pretty sure I hate it already since it doesn't effectively guard anything while making everyone's lives difficult. For effective corporate security you need 1) data use policies and 2) access control lists - both of those are generally more effectively implemented at an entirely software level.

      • retbull 4 years ago

        Yeah volume is a terrible metric to go by. I work as a data engineer and a lot of the time, if I am working between environments or migrating between data centers, I will have a copy of the data locally that I can write tests against or move to somewhere I can compare it to a running output. This would be possible to do entirely remotely I guess but not nearly as easy. (note I never do this with anything that contains PII)

        • manquer 4 years ago

          It is still fraught with problems. That you wouldn't (knowingly) do it with PII is not all that reassuring: others could, or a compromised system could be used to exfiltrate this data, if the only control is trusting users to behave well with their access.

          The fact that, across the industry in general, controls on how PII is accessed internally are so lightly managed should worry everyone.

    • AshamedCaptain 4 years ago

      Trying to protect against leaking developers/employees is like trying to protect against lone gunman terrorists: useless. And, if you try anyway, it is likely to cause more annoyance to everyone involved than actual protection (think TSA).

      • yupper32 4 years ago

        I disagree. Locking down and logging access to raw data like password hashes or payout information to only those who absolutely need it doesn't cause much annoyance and is very useful.

        It protects the company against rogue employees (not even strictly malicious, but also curious employees who want to see more than they should). It limits exposure if an employee's account gets hacked (my pet theory for this Twitch hack). And if something does go wrong, logs help track down the issue/leak.

        And at the end of the day, there should be a lightweight way to request access. Many times I've seen people request access that they didn't actually need. And most other times they have access pretty quickly.

        • AshamedCaptain 4 years ago

          Note that it was code that was leaked. Preventing developers from leaking the codebase they are working with is outright impossible. Now combine that with a "monorepo" and even the most junior developer has access to practically the entire company codebase and version control history.

          And you can try to prevent them from accessing live/real customer data, but the cost is that they will never be able to debug issues in production. Most companies, even very large ones, are just not able to pay that cost. Not to mention that once you have access to the codebase there are a million ways to leak customer data anyway -- it is a lost battle.

          • yupper32 4 years ago

            Of course, some stuff you can't avoid, especially code leaking. Luckily code isn't usually that interesting or useful to external parties which is the only reason it isn't leaked more.

            For the rest of the stuff, there's a sliding scale. In no universe does your average twitch developer need raw access to password hashes, for example.

            • AshamedCaptain 4 years ago

              What with security as it is at these companies, the code is literally the most sensitive information they can hold, especially in terms of value to the company. With the code out, expect lots more high-profile cracks in the coming months...

              "your average twitch developer" needs access to the password hashes or at least the code that checks these hashes the moment they need to debug an issue which involves logging in, and from then it's all downwards.

          • lrem 4 years ago

            Nope, it was code AND data, including the sensitive type (e.g. user payouts).

        • xmprt 4 years ago

          Adding to your pet theory I think that WFH has led to a lot of people being casual about their workplace security. For example, leaving a laptop unattended at a Starbucks.

          This is just a guess but I wouldn't be surprised if companies have to start taking stricter precautions with their security in a WFH world.

      • hn_throwaway_99 4 years ago

        This isn't accurate. There are certainly companies that have extremely in-depth Data Loss Prevention toolsets and teams - everything anyone downloads or moves is logged and alerts fire if things look out of the ordinary. Google clearly had tons of data about how Anthony Levandowski was able to exfiltrate lots of info when he left.

        The issue is that building these systems accurately so they are NOT a constant annoyance is difficult, expensive, and takes a large team to support well.

      • unethical_ban 4 years ago

        There are ways to look for anomalous behavior without creeping too hard (even though it's a business's right to view and monitor all network traffic on their system).

        If someone who doesn't have a business need to upload lots of traffic begins uploading large amounts of data, you may ask questions. Maybe you kick off a scripted playbook that then checks for increased logins to other privileged systems, or for large transfers of data from internal sources to the user's desktop.
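        A toy sketch of that first check: compare each user's upload volume today against their own history instead of a global limit. The threshold rule (10x the user's median, with a 1 GB floor), the names, and the numbers are all assumptions for illustration; a real SIEM rule would be much richer.

```python
from statistics import median


def flag_unusual_uploaders(history_gb, today_gb, multiplier=10.0, floor_gb=1.0):
    """Return users whose upload today far exceeds their own typical day."""
    flagged = []
    for user, uploaded in today_gb.items():
        baseline = median(history_gb.get(user) or [0.0])
        threshold = max(baseline * multiplier, floor_gb)
        if uploaded > threshold:
            flagged.append(user)
    return sorted(flagged)


# Hypothetical data: GB uploaded per day.
history = {
    "alice": [0.2, 0.3, 0.1],     # rarely uploads anything
    "bob": [50.0, 60.0, 55.0],    # video engineer, large uploads are normal
}
today = {"alice": 40.0, "bob": 70.0, "carol": 0.5}

print(flag_unusual_uploaders(history, today))  # ['alice']
```

        The per-user baseline is what keeps this from firing constantly on people like "bob"; flagged users would then feed the follow-up playbook described above.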

      • xwolfi 4 years ago

        I don't know dude, I work in an enormous company that you've heard of, and it's impossible for me to imagine how to extract code out. I can't do it, except if I get remote access and film my screen while scrolling.

        Anything else is found quickly. I certainly wouldn't even dream of someone extracting the repo.

        • AshamedCaptain 4 years ago

          Really, you can't simply copy files from a code repo you're working on? You work on an isolated workstation, not connected to any external network, where you are not allowed to bring anything other than plain clothes (TSA-style)? With a sizable army of developers all working this way?

          And if it's a remote FB/VNC connection, what is preventing you from just recording the screen? Not really hard...

          Most companies I've seen could see all their code extracted with one malformed NFS packet. These are "air gapped" systems holding the type of industrial secrets that we don't want to leak to China. Practically the only real line of defense they have is employee screening, which does not really stop the lone guy.

    • CobrastanJorji 4 years ago

      If the bulk of it is a git repo, it's probably expected that every engineer will download it regularly.

      • manojlds 4 years ago

        Case against monorepos?

        • crdrost 4 years ago

          There are much better cases than this; in this case a monorepo makes it slightly more likely to be caught rather than less. (A monorepo can get to Google size and then you can't check it all out at once and it needs bespoke tooling, which can make it harder to pull this off.)

          On the flip side while many smaller repos _can_ have independent ACLs, you are very unlikely to set those up until you reach a certain scale -- and then when you reach that scale it gets hard to implement ACLs across everything at once. So your engineers probably all have access to all your repos until you reach a very large size anyway. So the question becomes just "can someone write a for-loop over all of the repo names and check them all out," and it's like, yeah, that's not terribly hard, I as a programmer can do that pretty easily in bash.
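          The for-loop really is that short. Here it is sketched in Python rather than bash, with a hypothetical `git.example.com` host and made-up repo names; by default it only prints the commands instead of running them.

```python
import subprocess


def clone_commands(base_url, repo_names, dest_dir):
    """Build one `git clone` command per repository name."""
    return [
        ["git", "clone", f"{base_url}/{name}.git", f"{dest_dir}/{name}"]
        for name in repo_names
    ]


def clone_all(base_url, repo_names, dest_dir, dry_run=True):
    """Print the commands, or actually run them when dry_run is False."""
    for command in clone_commands(base_url, repo_names, dest_dir):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Hypothetical host and repo names.
clone_all("https://git.example.com/acme", ["web", "payments", "chat"], "checkouts")
```

          Which is the point: splitting a codebase into many repos without per-repo ACLs adds one loop, not a security boundary.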

          Ideal repo size should not in my view be directed at "how do I prevent compromise to the external world," because VCS is not designed to give you the superpower of being resilient around being compromised. Rather VCS is trying to give you the superpower of time travel. So you should probably scope your repo to "what is the unit that makes sense to time travel with?" -- in other words if you are adamant that you have these independent services which operate decoupled and running this one backwards by a year should not affect that one, then those services should be in separate repos. If on the other hand they have some moderate coupling and rewinding this service by 1 year would break the APIs that that service uses to communicate... then those should ideally be in the same repo so that you can coordinate changes between them to their shared protocol.

          • vineyardmike 4 years ago

            > So your engineers probably all have access to all your repos until you reach a very large size anyway.

            Happens at my company. We have rudimentary ACL but not sure how it's implemented because you can find things via explicit searching, or via "organic finding" via links from repo->repo but it won't be surfaced if you just search for code.

        • packetslave 4 years ago

          You can still have a monorepo and restrict who has access to certain parts of it. You just have to build the tools to do it.

          Google, for example, has a small number of subdirectories in the tree that only certain engineers can view (the really sensitive stuff, like the actual ranking algorithms for search and ads) but the build system is set up to allow you to still link against it.
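          A minimal sketch of what path-based read restrictions inside a monorepo could look like. The directory names, group names, and `can_read` function are invented for illustration and say nothing about how Google's tooling actually works.

```python
from pathlib import PurePosixPath

# Hypothetical restricted subtrees and the groups allowed to read them.
RESTRICTED = {
    "search/ranking": {"ranking-team"},
    "ads/ranking": {"ads-ranking-team"},
}


def can_read(path, user_groups):
    """Everything is readable by default; restricted subtrees need group membership."""
    target = PurePosixPath(path)
    for prefix, allowed_groups in RESTRICTED.items():
        root = PurePosixPath(prefix)
        if target == root or root in target.parents:
            return bool(allowed_groups & user_groups)
    return True


print(can_read("maps/tiles/render.cc", {"maps-team"}))        # True
print(can_read("search/ranking/model.cc", {"maps-team"}))     # False
print(can_read("search/ranking/model.cc", {"ranking-team"}))  # True
```

          The hard part in practice is everything around this check: the version control client, code search, and build system all have to enforce it consistently.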

        • munk-a 4 years ago

          Not particularly - unless different teams are highly focused on certain subsections of the repository. If everyone might have to look anywhere then you'll need to download all the repos - whether that's one or five hundred.

      • javajosh 4 years ago

        How often do devs delete and re-clone?

        • Retric 4 years ago

          Clean OS install or new hardware should both be daily events at even mid sized companies. Because even if it’s once every 2-4 years per developer that still becomes extremely common in aggregate.

          • Marsymars 4 years ago

            I think the tech giants have warped some people's expectations of what a "mid-sized" company is. I work for a mid-sized company where we roll our own ERP system and we probably average about two clean OS installs per year across the entire development team.

          • javajosh 4 years ago

            Yes, but GP said "every engineer, regularly" which seems odd.

        • vineyardmike 4 years ago

          I suspect lots of junior devs will clone fresh, push changes, nuke repo and repeat. I did when I was young instead of syncing state and rebasing.

    • com2kid 4 years ago

      > Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

      I am trying to recall, but I am pretty sure when I worked in Microsoft Office that a build would pull down many tens of gigabytes of data.

      125GB in one day from the build system wouldn't be uncommon!

      • Raidion 4 years ago

        That's ingress though. Companies should be monitoring and worrying about egress.

        Edit: This won't help against a thumbdrive, but that type of thing should be also tracked.

        • AustinDev 4 years ago

          I'm working on a project and just had to repull my workspace after some local corruption. I pulled 1.2TB out of the office and never got an email. I think it's pretty common for places not to monitor egress that closely.

    • tptacek 4 years ago

      There was a fad for tools that accomplished this in enterprise networks, with much clearer rules for who needs to access what (it was called "data loss prevention", or DLP) and those tools for the most part don't work. This is a harder problem than it looks like.

      • unethical_ban 4 years ago

        DLP products tend to be more about scanning the contents of data for sensitive patterns, at least in my observation of the market. There are other products (typically built into SIEM) that do correlation on login events, network traffic and whatnot to detect anomalous behavior.
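        As a sketch of what "scanning the contents of data for sensitive patterns" usually means, here is a toy card-number detector: a regular expression to find candidates, then the Luhn checksum to cut false positives. The pattern and function names are my own, and real DLP rule sets are far larger.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits):
    """Luhn checksum used by payment card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:       # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "card 4111 1111 1111 1111 on file, order 1234567890123456"
print(find_card_numbers(sample))  # ['4111111111111111']
```

        Note what this cannot do: it is blind to anything compressed, encrypted, or simply reformatted, which is part of why content scanning alone does not stop a determined insider.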

        • AmericanChopper 4 years ago

          I’ve worked on a lot of DLP projects in big enterprise, and I have a very dim view of the entire category of product. A lot of their functionality is just magic black boxes, that unsurprisingly achieve very little. The primary motive for deploying them is not that they’re particularly effective, it’s so that you can tell auditors and other scrutineers that you’ve got a “DLP solution”. The idea that you can grant people access to huge quantities of information, but then very strictly control what they do with it is fundamentally flawed. Especially on networks that require large amounts of in and outflow for BAU. Even the most tightly controlled data in the world cannot be protected from an inside leaker (or adversary who has taken control of an insider’s access), because it runs into the same “analog hole” issue that DRM products have.

      • tfigment 4 years ago

        My company has this. It encrypts any file touched on USB. And other software logs every app run. Prevents casual copying but easily circumvented. But somewhere logs may have enough info to trace the source of leak I guess.

      • realitylabs 4 years ago

        These tools (DLP) have gotten better with app migration to K8s, since traffic can be watched prior to encryption in a standardized way. Just an FYI….

        • tptacek 4 years ago

          The enterprise DLP tools were deployed fleetwide as agents and at network choke points; getting access to the raw data wasn't the problem.

      • cpach 4 years ago

        Thank you for mentioning this. I always had a gut feeling that it seems like an extremely hard problem to solve in a sensible way.

    • outworlder 4 years ago

      > It's something I would expect security hardware to have automatically stopped. Even an employee shouldn't be able to download 125GB of stuff without flipping a safety switch somewhere.

      Remember that Twitch handles streams. Good luck implementing this without having all sorts of false alarms everywhere.

      Plus, you don't have to exfiltrate 125GB in one go.

    • cheeze 4 years ago

      I feel like once you have it pulled down, it would be as simple as an upload to S3 (which wouldn't trigger any flags), then making the bucket public whenever you want. Hell, S3 used to (still does?) support being part of a torrent swarm...

    • ljm 4 years ago

      Why would that help? They just have to accumulate work over a period of time and then 'lose' their laptop.

    • toomuchtodo 4 years ago

      That's 6.25GB/day over a 20 day working month. More time, less data per work day, harder to detect.
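      The same arithmetic for a few other time windows, as a quick sketch. The 125 GB total is the figure from the thread; the window lengths are arbitrary examples.

```python
def daily_rate_gb(total_gb, working_days):
    """Average data moved per working day to reach total_gb."""
    return total_gb / working_days


LEAK_SIZE_GB = 125  # figure quoted in the thread

for days in (1, 20, 60, 250):
    print(f"{days:>3} working days: {daily_rate_gb(LEAK_SIZE_GB, days):.2f} GB/day")
```

      Spread over a working year that is half a gigabyte a day, well inside what a normal developer workflow moves.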

      • jandrese 4 years ago

        And it might be disguised as a video stream coming out of the video streaming servers.

        But it could also be a 128GB thumb drive plugged into the system somewhere.

        • vineyardmike 4 years ago

          > And it might be disguised as a video stream coming out of the video streaming servers.

          Just log in to FB messenger or Discord and egress it as small data chunks that way. Lots of people have private chats on work computer for practical purposes.

          Discord allows for bots, so you could easily write a script to chunk data and egress, and another to re-assemble.

    • ABeeSea 4 years ago

      ML engineers / data scientists are regularly moving terabytes of data around at Amazon.

    • yawaworht1978 4 years ago

      Indeed, how could this happen? I'm really curious.

      So let's say someone with access to all GitHub repos gave the password to someone else, maybe then it was downloaded from another machine?

      Or someone stole the credentials and downloaded from another machine?

      Or someone got access to such a machine?

      Is it not possible to prevent these cases?

      How long does such a download take?

    • stefan_ 4 years ago

      Cue monorepo discussion

      • munk-a 4 years ago

        Cue "Don't check payment receipts into git" discussion - although I strongly suspect this hack wasn't just about acquiring appropriate credentials and then running `git clone`. It sounds to me like a backup service was compromised.

  • ArlenBales 4 years ago

    There are so many indiscreet USB pentesting devices easily purchasable by anyone today, I'm actually surprised this sort of thing doesn't happen more often.

    • SketchySeaBeast 4 years ago

      Shouldn't that be discreet devices? Or do they make a really high pitched whine with a big flashing light when they start transferring data?

      • angst_ridden 4 years ago

        "Hey, Jeff, what's that weird thumb-drive over there that keeps texting me `I'm in your datacenter downloading your datas'?"

  • aahortwwy 4 years ago

    ITT: people shocked that something like this could happen at a company the size and profile of Twitch.

    Running security at scale in a hypergrowth B2C company is very difficult. It's also completely different from running security at a startup, in a B2B company, or a slower-growth situation. _Every_ security executive and manager I've met has given up in frustration after 12-24 months and gone to take a cushy FAANG job instead.

    I'm not surprised at all. My experience in security at a larger SV unicorn was that changes only happened in the immediate aftermath of a security crisis. Otherwise, there was incredible inertia and you just wouldn't be able to get the institutional support you needed to make progress.

  • koolba 4 years ago

    How much of this is a holdover of lax security practices from before they were acquired? I can’t imagine AWS being managed in a way where local network access gives you keys to the kingdom. Then again, EC2 instance profiles do let you do quite a bit.

    • lamontcg 4 years ago

      Conflating AWS security with twitch security is probably the wrong way to think about it.

      Within Amazon those are almost going to be two entirely separate companies, with very different security focuses.

      The idea that Amazon is monolithic and uniform wasn't true when I left there in 2006, and I'm certain it is less so now.

      And that isn't just because it's related to the merger, but because fundamentally they're different business orgs with different focuses.

      • vineyardmike 4 years ago

        But does Twitch not share the same Amazon-wide git service? Could most of Amzn code be leaked or compromised? Seems like all of the Amazon internals that share security measures are at risk...

        • cheeze 4 years ago

          I've heard (but don't have any actual evidence more than hearsay) that Twitch generally operates independently of Amazon/AWS. I'm sure that they share some things, but I wouldn't be surprised if their source was separate from the "main repo"

        • bleepblooop 4 years ago

          Remember that Amazon runs one of the biggest multi-tenant service platforms in the industry! A separate business unit like Twitch is likely to be set up a lot like any other random AWS customer, and you wouldn't expect compromising servers used by one AWS customer to automatically compromise the underlying infrastructure.

          (I would also expect that the Amazon retail systems are in most senses "just another tenant" on AWS, albeit with much more liberal quotas!)

    • this_user 4 years ago

      I always had the impression that Twitch were operating in a largely independent fashion. For instance, it had been an open secret for years that one of their executives had been sexually harassing female streamers. Only a year ago he was finally fired. If Amazon had a firmer grip on Twitch, I'm sure they would have stepped in much earlier.

    • ganoushoreilly 4 years ago

      If you go back to the Adobe software breach circa 2013, a large part of their issues were the bolt on connections between acquisitions. It's honestly the most common thing I see in the startup world.

  • slightwinder 4 years ago

    > It's also strange that someone who has this level of access to what is presumably a multi-billion dollar company decided to just leak the data?

    From what I heard about Twitch-interns over the years, it seems the company is more a third-rate-s**hole that grew too big too fast and accumulated a huge amount of technical debt and fatal security flaws. Making billions doesn't mean anything if you don't invest them back into the important corners of the company. It's considered a miracle that the platform is still working that well in that state. And what comes from the leaks so far supports this view.

    That said, it seems they did start to improve one or two years ago, just too late to prevent this critical hit. But considering this was also a strike that avoided the deadly parts (so far), maybe there is a different aim here and the company can grow from this? It will be interesting to see how Amazon reacts to this.

    • superfrank 4 years ago

      > From what I heard about Twitch-interns over the years, it seems the company is more a third-rate-s*hole that grew too big too fast and accumulated a huge amount of technical debt and fatal security flaws.

      I mean this as a genuine question, but is there any company that didn't end up like this after an exponential growth phase? I'm not saying it's okay, but this feels par for the course. I've now been at two start ups during that hockey stick growth time and both went through this as well.

      I'd be curious if anyone here has worked at a large, fast growing tech company where they didn't accumulate a ton of technical debt during growth. If so, what did the company do to prevent that?

      • slightwinder 4 years ago

        Generally yes, but Twitch is not your average startup. It's now 10 years old, and for 7 of those years it has been owned by Amazon, which should have enough competence and manpower to bring it onto a good course. But from what I heard, Amazon neglected Twitch for a long time and focused too much on making it a profitable business at all costs, which is why they had all those scandals and problems in the last few years. It's a business platform where technology is just an afterthought.

  • yupper32 4 years ago

    Does anyone know if Twitch employees have two factor auth? Having access to an employee's account would be the easiest way to pull this off.

    It'd be strange if they don't have two factor auth, of course, but it's just as strange to have this large of a hack.

    I think if it is a simple case of an employee account takeover, then the attack would "work" to some extent at any company. Larger companies typically have strict data access requirements, though. Good luck finding the few employees who have raw access to Google password hashes, for example. And even more luck knowing how to get that data if you do.

    • some_furry 4 years ago

      > Does anyone know if Twitch employees have two factor auth?

      Yes, IIRC everyone at Amazon has a hardware security key (which is more secure than the standard mobile app TOTP most of us use everywhere online).

      • ramesh31 4 years ago

        >(which is more secure than the standard mobile app TOTP most of us use everywhere online).

        Is it though? The "wrench theory" applies here. It's not unthinkable that an employee was stalked on social media and had their key stolen.

        • bawolff 4 years ago

          It's still more secure. Rubber-hose cryptanalysis applies to both equally, but that doesn't mean there aren't other attacks that apply to TOTP and not to YubiKeys.

          More secure != perfectly secure.

          • xmprt 4 years ago

            With a phone you need my passcode to accept the 2FA request (assuming lock screen notifications are disabled). I think YubiKeys can work without a passcode as long as you plug them in, right?

            • bawolff 4 years ago

              Right, but presumably the site is already asking for a password, and if the attacker can bypass one password, I'm not sure it's a safe assumption that they can't bypass two. However, fair enough. Some YubiKeys do involve fingerprint scans too, though.

              The main security benefit is unphishability. With YubiKey/WebAuthn, cryptography binds the login to the site, so you can't give the code to the wrong website. Phishing is a pretty major cause of account hacks generally, so pragmatically that is a very big win.
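
              A minimal sketch of why origin binding defeats phishing, under loud assumptions: the function names are hypothetical, and an HMAC stands in for the public-key signature a real authenticator produces. In real WebAuthn the browser, not the user, supplies the origin that ends up under the signature.

```python
import hashlib
import hmac
import secrets

# Hypothetical names throughout. HMAC is only a stand-in here for the
# public-key signature that a real hardware key would produce.

def authenticator_sign(device_key: bytes, challenge: bytes, origin: str) -> bytes:
    """Sign the server's challenge together with the origin the browser reports."""
    return hmac.new(device_key, challenge + origin.encode(), hashlib.sha256).digest()

def server_verify(device_key: bytes, challenge: bytes, signature: bytes,
                  expected_origin: str) -> bool:
    """Accept the response only if it was produced for the server's own origin."""
    expected = hmac.new(device_key, challenge + expected_origin.encode(),
                        hashlib.sha256).digest()
    return hmac.compare_digest(signature, expected)

device_key = secrets.token_bytes(32)
challenge = secrets.token_bytes(16)

# Response produced while the user is on the real site.
real = authenticator_sign(device_key, challenge, "https://example.com")
# Response produced on a look-alike site that relays the same challenge.
phished = authenticator_sign(device_key, challenge, "https://examp1e.com")

print(server_verify(device_key, challenge, real, "https://example.com"))
print(server_verify(device_key, challenge, phished, "https://example.com"))
```

              A six-digit TOTP code has no such binding: whoever receives it can replay it to the real site within its validity window.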

            • cheeze 4 years ago

              It's still the same, 2fa.

              With a Yubikey, you need to use your password to log in to your computer, and then need to auth using Yubikey.

              With OTP app, you need to use your password to log into your computer, passcode for phone, and then auth.

              In both cases, it's something you know and something you have. You could argue that the app-based approach is a bit more secure in that you need two passwords. On the flip side, if your phone gets pwned, someone can get access completely remotely.

              Everything is a tradeoff.

              • bawolff 4 years ago

                Why would you need to log into your computer with a yubikey? Wouldn't any computer (including the attacker's computer) work?

            • tenryuu 4 years ago

              Amazon still has a passkey requirement, it's not just a touch of the key, and these passwords are different to your user passwords at login.

            • some_furry 4 years ago

              They require a physical touch.

        • some_furry 4 years ago

          Yes.

          I don't know which protocols they use (obviously), but if they use WebAuthn, everything is public-key signatures. Even if you leak everything from the server, public keys buy you nothing.

          https://webauthn.guide/
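
          For contrast, here is a sketch of TOTP as specified in RFC 6238 (built on HOTP from RFC 4226), using only the Python standard library. The point it illustrates: the server must store the same secret as the phone, so a server-side leak lets an attacker compute valid codes. The variable names are illustrative, and the secret is the RFC's published test secret, not anything from the leak.

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the counter, then dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from the current time."""
    return hotp(secret, unix_time // step, digits)

# Test secret published in RFC 6238, Appendix B.
shared_secret = b"12345678901234567890"

# The server and anyone holding a leaked copy of the secret compute the same code.
server_code = totp(shared_secret, 59, digits=8)
attacker_code = totp(shared_secret, 59, digits=8)
print(server_code == attacker_code)
```

          With public-key signatures the server holds only a public key, which cannot be used to produce a valid response.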

    • AustinDev 4 years ago

      Every Twitch developer has 2FA; even third-party developers are required to have 2FA. I also think, but don't know, that this applies to Twitch broadcaster partners as well, in order to have their tax information in the system.

      Luckily, IIRC from a conversation with a senior Twitch engineer, the tax information backend has been migrated to Amazon. So hopefully that did not leak... because that would be the full legal names and addresses of a ton of streamers who likely have stalkers.

      • lrae 4 years ago

        Twitch partners also have forced 2FA for quite some time now, should be a couple of years now - at least more than a year though. Covid killed my sense of time.

  • gorgoiler 4 years ago

    Facebook [2011] was pretty bad…

    https://www.theguardian.com/technology/2012/feb/17/facebook-...

    …except Mangham didn’t ever get to release his spoils to The Internet?

  • dilyevsky 4 years ago

    > I can't think of another, large, corporate web 2.0 startup who's gotten owned in a similar fashion

    Linkedin, Microsoft, Yahoo, Google

  • FormerBandmate 4 years ago

    I mean, it did work on Amazon (a division with poorer security probably, but still). 4chan is a truly special place

  • kordlessagain 4 years ago

    From an ethical standpoint, any code that amplifies and profits from radical speech should be fair game for release. If employees or hackers feel the need to release info in that regard, so be it. This is the risk defined in such models and should be mitigated accordingly.

    • heurisko 4 years ago

      Who decides what speech is radical enough to compromise the privacy of users?

      And if speech is "radical" meaning to the point of illegality, shouldn't the legal system decide, rather than the court of public opinion?

  • Hokusai 4 years ago

    > this isn't something I'd expect from an Amazon owned property

    Because you expect Amazon to put security priority over new features and profit? We have very different understandings of what Amazon stands for.

    • nemothekid 4 years ago

      >Because you expect Amazon to put security priority over new features and profit?

      I don't know what you think Amazon stands for, but Amazon runs the largest cloud hosting service in the world - AWS, which not only runs a large number of other large companies but governments as well. I know, first hand, that their datacenter security protocols are state of the art.

      Amazon has a much larger surface attack area so if they were playing fast and loose with security, chances are we would know already.

      • Hokusai 4 years ago

        > Amazon has a much larger surface attack area so if they were playing fast and loose with security, chances are we would know already.

        I get your point, and I am not talking about AWS but about Twitch. Each part of the company has its own incentives. Amazon is well known for not caring about quality or its employees. In my experience with corporations there is little to no technical sharing between different parts of the company. AWS could have the best SecOps in the world and Twitch could have no security at all. Is your experience different?

        • nemothekid 4 years ago

          I'm not sure what point you are trying to make. If you look at most of the high profile hacks and leaks in the past 20 years, very few of them are from web 2.0 tech companies (e.g. Google, Facebook) rather than dinosaurs (ex. Target). Those that have (like Google) have only been successfully breached by nation actors (e.g. China, NSA).

          As far as I can tell, there's no data to back up the assertion that these large tech companies are disregarding security in favor of profits, except for Twitch now, which is why this leak is interesting to me.

        • vineyardmike 4 years ago

          > In my experience with corporations there is little to no technical sharing between different parts of the company.

          Amazon is all about sharing efforts within the company. That's the whole point of AWS - it's a monetization of these efforts. Most older AWS services started out as internal services that someone realized were generally useful.

    • adrusi 4 years ago

      EC2, Amazon's cash cow, competes with nearly identical offerings from Microsoft and Google, and is not a place where additional features are often all that valuable to customers. Any sort of breach like this on EC2 would seriously hurt Amazon's bottom line and they know it.

dolores_ab 4 years ago

Someone actually started streaming going through the code ... on twitch.

https://www.twitch.tv/deepfrieddev

  • dolores_ab 4 years ago
    • kuroguro 4 years ago

      On one hand I understand why you'd ban that kind of content, on the other it's essentially public information now... what's the point.

      • AnIdiotOnTheNet 4 years ago

        Because everyone else doing it still doesn't make it right.

        • kuroguro 4 years ago

          The streaming part or the downloading/looking at code?

          You can look at leaked source code for educational purposes in most places (not legal advice). As far as I understand leaks are commonly used in vulnerability research for example (if the bad guys can use it so can bug hunters).

          Streaming copyrighted material is a separate issue - but using it for "criticism, comment, news reporting, teaching" should fall under fair use, no?

        • OnlineGladiator 4 years ago

          What's wrong with looking at public code? The code is public, regardless of how it became public - this isn't someone's personal life being exposed. If twitch is damaged by streaming this, it's only because their poor code quality is being examined publicly.

          I can certainly understand why twitch banned this and don't blame them (although I think it's stupid), but I see nothing unethical about openly talking about this code in the public now that it's already there.

          • AnIdiotOnTheNet 4 years ago

            > What's wrong with looking at public code? The code is public, regardless of how it became public

            Copyright would disagree with you, and I would say that ethically it is basically the same as stealing it yourself. You're profiting off of someone else having done the dirty work for you.

            > this isn't someone's personal life being exposed.

            Apparently a lot of payment information, telephone numbers, etc. was also in the leak. I don't think we should be downloading or encouraging people to download and peruse that stuff.

            • OnlineGladiator 4 years ago

              > You're profiting off of someone else having done the dirty work for you.

              I don't think anybody is streaming this stuff on Twitch with the intention of making money, any more than someone sharing it on a blog is trying to make money. Sure, in that edge case I'd agree with you, but it seems like the exception to the rule (after all, people can just go look at the code themselves for free). I'm not talking about the guy who stole the code and is likely ransoming Amazon with it - I'm talking about people who just like to talk about code because it's something they like to do (there's an entire category for it on Twitch already).

              > Apparently a lot of payment information, telephone numbers, etc. was also in the leak. I don't think we should be downloading or encouraging people to download and peruse that stuff.

              My limited understanding is none of this information actually has been leaked yet, and is likely part of a future ransom (I could be wrong, I haven't looked because I don't care). I don't condone sharing that either, but that's not what the guy streaming was sharing. I'm talking about discussing the source code which is already publicly available.

              > Copyright would disagree with you

              I know very little about copyright so I'll just assume you're right. I still see no ethical problem with openly discussing this code publicly though. Anyway, agree to disagree.

    • Philip-J-Fry 4 years ago

      They = you. It's fine to be honest, you're not exactly making it unobvious.

  • CoolGuySteve 4 years ago

    "Sorry. Unless you’ve got a time machine, that content is unavailable."

    Too bad, it would be nice to see someone go through and document how Twitch works. I've never worked at "web scale" so I'd probably learn a lot.

    • yupper32 4 years ago

      > I've never worked at "web scale" so I'd probably learn a lot.

      As someone who has worked at both large and small companies, you'd probably be disappointed.

      • AustinDev 4 years ago

        It's likely lots of bubble gum and chicken wire. I'm sure in the video ingest and transcode side of things there are some really interesting bits though. When you're owned by Amazon you don't need to optimize too much to achieve web scale... just leverage AWS services. It's not like you're going to get a bill.

        • whack 4 years ago

          > When you're owned by Amazon you don't need to optimize too much to achieve web scale... just leverage AWS services. It's not like you're going to get a bill.

          Oh, you'd be surprised. Divisions get billed constantly for the AWS resources they consume, and this bill gets taken out of their annual budget. From what I hear, this is a common practice in most large organizations.

          Also, the AWS services you can access from within Amazon are almost identical to the AWS services you can access as an external customer. It's equally easy/hard for a random company to achieve web scale, compared to Twitch.

    • peterkos 4 years ago

      A lot of it is probably hacked together -- like, embarrassingly hacked together lol

      • cheeze 4 years ago

        This is true about almost any company. Closed source generally means you can have lower standards.

      • dijit 4 years ago

        You’re being downvoted for being overly negative, but the ops code is of (literally) shockingly poor quality.

        This leak has made me understand clearly that code quality is not what makes a product great.

        I guess that’s something.

        The jenkinsfiles are mostly nice and clean though. I’ve definitely seen worse of those.

        • peterkos 4 years ago

          Oops, didn't mean to be too too negative. I say embarrassing in the sense of, I've definitely shoved out awful code because something needed to get out(tm). And with large companies, deadlines that cause that situation are inevitable.

          But I also say it like that because, well, I've seen code that causes (objectively easy-to-fix) crashes but still ships because of one reason or another: laziness, politics, inexperience. It's a part of software engineering I'm still trying to accept.

      • phgn 4 years ago

        Yep, there are lots of small services that don't seem production ready in the source code. Though admittedly we don't know which of those are deprecated.

    • Arnavion 4 years ago

      Well, you know what they say, "Self help is the best help."

    • wesleytodd 4 years ago

      I hear Netflix has a good tech blog ;)

  • jedberg 4 years ago

    Hah. This is like when reddit does something people don't like and there is a huge thread about it ... on reddit.

  • phgn 4 years ago

    It is really fun to go through the source code. You'll find interesting architecture diagrams, documentation, etc. It's like starting a new job and being amazed at how a service you actually use was built.

    Everyone interested, just download the code :)

  • onnnon 4 years ago

    Channel is gone, banned?

  • echelon 4 years ago

    It just got disconnected.

    The chat had a few Amazon insiders, and it was interesting to read their perspectives.

  • mawaldne 4 years ago

    This no longer works. Guy got banned I think.

  • Orphis 4 years ago

    And banned

  • Avery3R 4 years ago

    got banned

  • Nickoladze 4 years ago

    aaaand it's gone

mastermojo 4 years ago

There's something about this sentence that I find hilarious:

The download was posted to 4chan today, described by its unidentified source as “part one” of “an extremely poggers leak,”

_qbjt 4 years ago

More discussion here: https://news.ycombinator.com/item?id=28770590

rasz 4 years ago

> including its source code

This will help with ad preroll blockers.

I would love to see someone look deep into Twitch's recommendation system - last time I tested, the thing they call "Feedback" was a rolling buffer that won't let you exclude more than ~100 things; adding more simply removed the oldest entries, and it started spamming you with things you had already excluded in the past. This looked like a performance optimization (fewer things to track per user).
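
The behavior described above matches what a fixed-size rolling buffer would do. A minimal sketch, assuming a capacity of 100 (the class and channel names are hypothetical, and this is a guess at the observed behavior, not the leaked implementation):

```python
from collections import deque

class FeedbackBuffer:
    """Hypothetical model of a per-user exclusion list with a fixed capacity."""

    def __init__(self, capacity: int = 100) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._items = deque(maxlen=capacity)

    def exclude(self, item: str) -> None:
        self._items.append(item)

    def is_excluded(self, item: str) -> bool:
        return item in self._items

buf = FeedbackBuffer(capacity=100)
for i in range(101):
    buf.exclude(f"channel-{i}")

print(buf.is_excluded("channel-0"))    # the oldest exclusion has been evicted
print(buf.is_excluded("channel-100"))  # the newest exclusion is still tracked
```

Under this model the 101st exclusion pushes out the first, which would explain old exclusions showing up in recommendations again.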

  • mariusor 4 years ago

    This won't help with preroll ads because the video segments themselves are replaced in the stream data. They're not ads, but it's not the stream either.

    You get a "twitch commercial break in progress" video for the time the ads are playing.

    You can check this by loading a stream with MPV.

    • rasz 4 years ago

      aaand new ad bypass dropped 4 hours ago :)

      >You can check this by loading a stream with MPV

      I watch all of my Twitch using mplayer. "Magic incantations" when generating the access token are what produce an ad-free .m3u8. For example, early methods involved setting origin and/or referrer headers to internal Amazon systems.

DavidPeiffer 4 years ago

I'd be interested if someone could get their own instance of Twitch up and running from this leak. Someone mentioned internal API's, which would have to be reworked to avoid detection, but it'd be interesting to host it on AWS just to see how long it takes to get shut down.

How would current AWS policies hold up? Obviously the code would be illegally acquired, but do they have detection mechanisms in place?

  • manquer 4 years ago

    Even with source code it is hard, if not impossible, to run a service. You would need well-written documentation that explains the various options and the error codes you could potentially get.

    Many times there is some magic command only one guy knows and he will share with you on slack.

    Rubbing a service of any complexity takes years of institutional knowledge.

    • BugWatch 4 years ago

      Please don't rub the services, it causes unnecessary friction, and wear & tear.

  • ijcd 4 years ago

    100s of services and databases to work out and sort through. Good luck building a global real-time video CDN too. You could build your own faster. Microservice architectures mirror the org that built them. You wouldn’t do it the same way for yourself.

personjerry 4 years ago

The top streamers' earnings were also leaked: https://www.twitchearnings.com/

ChrisArchitect 4 years ago

lots of discussion and speculation from a few hours ago here:

https://news.ycombinator.com/item?id=28770590

marto1 4 years ago

We're just walking into a future where these kinds of leaks happen every other day, aren't we?

luis8 4 years ago

I wonder how often these "hacks" are just an engineer leaking the info.

fhood 4 years ago

Hang on, is this just a repo dump or not? Because it looks like a repo dump, in which case I would be very surprised if any passwords or other personal information is included, at least at a reasonable scale.

noncoml 4 years ago

Anybody took a peek?

What language, and framework if they use one, do they use?

doctorshady 4 years ago

Archive of the original 4chan post from this morning: https://archive.is/8rQNK

imwillofficial 4 years ago

Is this the first time actual Amazon infrastructure has been hacked? Has Amazon been hacked previous to this? (Not talking about insecure AWS accounts)

iuri1 4 years ago

Since the main leaked files are from GitHub, I'm assuming they got it through one of the many reported GitHub auth flaws which don't get fixed and allow access to private repositories. Or, less likely, via someone getting sloppy with their laptop.

Now I wonder if the commit history has database dumps or sensitive information, which is a common practice, or if any twitch servers have been accessed through a breach or privileged information found in some of their source code.

frays 4 years ago

As an avid Twitch streamer, what do I need to do to protect myself?

  • INTPenis 4 years ago

    Change your password obviously, maybe even reset your 2FA if those codes are in the leak.

    And if you want to be perfectly safe, don't visit twitch. Because if that source code has any vulnerabilities they might be exploited against twitch visitors as we speak.

  • ALittleLight 4 years ago

    Also change any account with a password that's the same as your twitch account. Once they know your twitch password they will try it on your related accounts.

  • shapefrog 4 years ago

    Report your earnings on your tax.

andrewstuart 4 years ago

What language is the main website written in?

rawoke083600 4 years ago

At least we know their backups were 'complete'! This hack seems to include everything and the kitchen sink!

1vuio0pswjnm7 4 years ago

[deleted]

anthk 4 years ago

From banned usernames, "Jesus".

Yep. From Mexico to Patagonia and Iberia, let's screw a few million users.

runawaybottle 4 years ago

Does it take a genius to figure out how to build twitch? It’s a modern crud app with video streaming.

  • kabdib 4 years ago

    I figure you could "build a Steam" in a couple of years, with the right engineers hitting the main features. There's very little magic at the technology level, and you can make life simpler and forget about minor things like the hardware survey or the pretty graphs. I'm not saying this is trivial, but it's definitely doable.

    This is a far different statement than "You can build something and compete with Steam in a couple of years". Most of the really hard problems are not technical. Success ain't gonna happen without a bunch of pain, sweat, and strategic stumbles on the part of the competition.

    • runawaybottle 4 years ago

      Sir (Madame?), I ask you one simple question:

      Was Twitch built in 10 years, or over just a few?

      Steam was built since I was in FUCKING high school. Im old now, well over 30.

      Apples, and blueberries.

      Bluebarry, Drewbarry, tomato, ToMaHtoH.

      Fuck their stupid ass streaming code, it’s a giant crud app, only their devops team can take credit for scaling, everyone else is not worth a shit, sorry, thats life, I gotta Leetcode too, and ur code isn’t worth me reading it, leaked or not).

      • dijit 4 years ago

        Based on what I read of the ops code… don’t give them too much credit.

        The thing I learned most from this leak is that the technology side plays very little part in the business being successful or not.

  • o10449366 4 years ago

    This is such a Hacker News comment.

    It's just a crud app - why do they need more than 10 employees?

  • namrog84 4 years ago

    A lot of the secret sauce of such things are not that secret but just take a lot of work.

    Building and maintaining infrastructure simply takes a lot of people, time, relationships and whatnot.

    They get good at it over time, which I guess you could consider some secret sauce, but there isn't some secret code that makes the whole thing way better and that will now lead to tons of competitors.

  • ashtonkem 4 years ago

    Everything is easy to build until a small nation state’s worth of people want to use it at once.

    • ThePadawan 4 years ago

      I work in a small nation state.

      That doesn't stop CV-hungry engineers from finding ways to overcomplicate it.

      (I do agree with you on this topic in general)

      • ashtonkem 4 years ago

        You completely misread me.

        • ThePadawan 4 years ago

          Oh I'm sorry, here's what I read:

          "Building apps is easy as long as you don't have millions of users. For that you have to actually think about bottlenecks, the larger architecture etc."

          (I agree with that)

          What I wanted to express is that lots of engineers I personally know instead say

          "Building apps involves thinking about every bottleneck in advance and optimizing for every possible user scenario and a global user base, regardless if the number of users is only ~100."

          • oceanplexian 4 years ago

            "Building apps involves thinking about every bottleneck in advance and optimizing for every possible user scenario and a global user base, regardless if the number of users is only ~100."

            I would advocate the exact opposite. If you need to scale to X users focus on making a great platform for X users, even if it’s only 100. If you try to over-engineer instead you’ll prematurely optimize and will make poor decisions that’ll come back to bite you when you actually DO need the scale and the requirements change.

  • lwansbrough 4 years ago

    Just stream the video, it’s easy!

  • NelsonMinar 4 years ago

    I mean doing Youtube is even easier; it's just a wrapper around HTML5 video.

  • lm28469 4 years ago

    Everything is just a crud app with a few extra steps.... yet you're not Zuckerberg or Dorsey

    • nullifidian 4 years ago

      One shouldn't aspire to be a Zuckerberg/Dorsey.

      • lm28469 4 years ago

        I personally don't, many people do though. I used them because Facebook and Twitter could be easily summed up as "crud app"

    • runawaybottle 4 years ago

      I’m so misread. Twitch is a lot of luck, and so are all of these companies. Show me the source code for luck. I don’t give a fuck if you leaked a video streaming crud app code lol.

      • notyourwork 4 years ago

        This perspective is immature at best, but genuinely ignorant of how tech works at twitch scale.

      • holler 4 years ago

        You're missing the _hard work_ part. Sure there's always an element of "luck" in any story of success, but that mostly has to do with timing, and is much less weighted than the perseverance and hard work of the people building it.

        Twitch is a full-featured, very mature application with many moving parts outside of just the video streaming, and building all those parts took an incredible amount of time and effort.

        • runawaybottle 4 years ago

          It’s just luck. I mean, if I was a storyteller, what story would I have to tell if there was no story.

          They hit.

          It’s sort of like we all hold Golden dice, so we marveled, by our own eyes, at the gold.

          Dealer: You rolling those?

          Us: no, it’s gold.

          They fucking risked it. It’s not a engineering feat, we’re all a bunch of pussies.

          Twitch is easiest site to build, you might as well show me a todo app (which will be sieged and dismantled), scale is solved, we will eat your applications, the barbarians.

          Rome falls.

          • lm28469 4 years ago

            You'll be rolling dice for a long time if that's all you need to build a Twitch clone ;)

      • lm28469 4 years ago

        It's a very polished, state of the art crud app serving millions of people, it's always interesting to see how it's made.

        I personally don't give a single fuck but I can see the appeal for some people.

        It's a bit like the great pyramids: it's just a big pile of rocks, but we'd be really interested in knowing exactly how they made these big piles.

  • Zababa 4 years ago

    You don't need a genius. You need a few good people, and a lot of hands. I think the best way to look at things like Twitch is to compare them to cathedrals, bridges, things like that. You might be able to have the idea and sketch the plans by yourself, but it's physically impossible to build it yourself.

  • throw_m239339 4 years ago

    Like all things web, the problem is scaling the platform and handling moderation/security. It wouldn't be hard to build a toy Twitch clone, no. But it takes tons of people and money to scale it and secure it. And even with all the security, they still got hacked...

  • mdoms 4 years ago
  • decebalus1 4 years ago

    This reminds me of the Albertsons guy on Blind who inadvertently created a meme when he said that Facebook could be rewritten with a small cluster of Oracle dbs. The meme is that Albertsons people are so elite, they work and think in a higher level of existence, way above the scalability bs us commoners are accustomed to.

  • bradjohnson 4 years ago

    Right, just like a plane is a car with wings.
