Meet PassGAN, the supposedly “terrifying” AI password cracker that’s mostly hype

WANT CHEESE WITH THAT NOTHINGBURGER?

AI cracking is on par with conventional methods, but you’d be forgiven for thinking otherwise.

Credit: Aurich Lawson | Getty Images

By now, you’ve probably heard about a new AI-based password cracker that can compromise your password in seconds by using artificial intelligence instead of more traditional methods. Some outlets have called it “terrifying,” “worrying,” “alarming,” and “savvy.” Other publications have fallen over themselves to report that the tool can crack any password with up to seven characters—even if it has symbols and numbers—in under six minutes.

As with so many things involving AI, the claims are served with a generous portion of smoke and mirrors. PassGAN, as the tool is dubbed, performs no better than more conventional cracking methods. In short, anything PassGAN can do, these more tried and true tools do as well or better. And like so many of the non-AI password checkers Ars has criticized in the past, the researchers behind PassGAN draw password advice from their experiment that undermines real security.

Teaching a machine to crack

PassGAN, a shortened combination of the words “Password” and “generative adversarial networks,” is an approach that debuted in 2017. It uses machine learning algorithms running on a neural network in place of conventional methods devised by humans. These GANs generate password guesses after autonomously learning the distribution of passwords by processing the spoils of previous real-world breaches. These guesses are used in offline attacks made possible when a database of password hashes leaks as a result of a security breach.

An overview of a generative adversarial network.

Conventional password guessing uses lists of words numbering in the billions taken from previous breaches. Popular password-cracking applications like Hashcat and John the Ripper then apply “mangling rules” to these lists to enable variations on the fly.

When a word such as “password” appears in a word list, for instance, the mangling rules transform it into variations like “Password” or “p@ssw0rd” even though they never appear directly in the word list. Examples of real-world passwords cracked using mangling include: “Coneyisland9/,” “momof3g8kids,” “Oscar+emmy2,” “k1araj0hns0n,” “Sh1a-labe0uf,” “Apr!l221973,” “Qbesancon321,” “DG091101%,” “@Yourmom69,” “ilovetofunot,” “windermere2313,” “tmdmmj17,” and “BandGeek2014.” While these passwords may appear to be sufficiently long and complex, mangling rules make them extremely easy to guess.
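To make the idea concrete, here is a minimal sketch of rule-based mangling in Python. It is not how Hashcat or John the Ripper implement their rule engines; the substitution table, the suffix list, and the function name `mangle` are all hypothetical and far smaller than any real rule set.

```python
from typing import Iterator

# Hypothetical substitution table; real rule sets contain thousands of rules.
LEET = str.maketrans({"a": "@", "o": "0"})

# Hypothetical suffixes of the kind people commonly append.
SUFFIXES = ["", "1", "123", "!", "2014"]


def mangle(word: str) -> Iterator[str]:
    """Yield rule-based variations of a single word list entry."""
    bases = [
        word,                              # unchanged
        word.capitalize(),                 # password -> Password
        word.upper(),                      # password -> PASSWORD
        word.translate(LEET),              # password -> p@ssw0rd
        word.capitalize().translate(LEET), # password -> P@ssw0rd
    ]
    seen = set()
    for base in bases:
        for suffix in SUFFIXES:
            candidate = base + suffix
            if candidate not in seen:  # skip duplicates across rules
                seen.add(candidate)
                yield candidate


if __name__ == "__main__":
    for guess in mangle("password"):
        print(guess)
```

A single entry fans out into dozens of guesses this way, which is why a word list plus rules covers passwords that never appear in the list itself.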

These rules and lists run on clusters that specialize in parallel computing, meaning they can perform repetitive tasks like cranking out large numbers of password guesses much faster than CPUs can. When a site protects passwords with a poorly suited algorithm, meaning a fast hash that was never designed for password storage, these cracking rigs can transform a plaintext word such as “password” into a hash like “5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8” billions of times each second.
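The hash quoted above is what unsalted SHA-1 produces for “password,” although the text doesn't name the algorithm. The sketch below, using Python's standard `hashlib`, shows the basic loop of an offline attack: hash each guess and compare it with the leaked value. The helper names `sha1_hex` and `crack` are illustrative only.

```python
import hashlib
from typing import Iterable, Optional


def sha1_hex(candidate: str) -> str:
    """Return the unsalted SHA-1 hex digest of a candidate password."""
    return hashlib.sha1(candidate.encode("utf-8")).hexdigest()


def crack(target_hash: str, guesses: Iterable[str]) -> Optional[str]:
    """Return the first guess whose hash matches the target, if any."""
    for guess in guesses:
        if sha1_hex(guess) == target_hash:
            return guess
    return None


if __name__ == "__main__":
    leaked = "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"
    print(crack(leaked, ["letmein", "123456", "password"]))
```

A single CPU core running this loop is slow; the point is that a fast hash leaves nothing between the attacker and the plaintext except the number of guesses per second the hardware can manage.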

Another technique that makes word lists much more powerful is known as a combinator attack. As its name suggests, this attack combines two or more words in the list. In a 2013 exercise, password-cracking expert Jens Steube was able to recover the password “momof3g8kids” because he already had “momof3g” and “8kids” in his lists.
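A combinator attack is simple enough to sketch in a few lines. This is an illustration, not Steube's tooling; the function name `combinator` and the tiny sample lists are made up for the example.

```python
from itertools import product
from typing import Iterable, Iterator


def combinator(left: Iterable[str], right: Iterable[str]) -> Iterator[str]:
    """Yield every concatenation of one word from each list."""
    for first, second in product(left, right):
        yield first + second


if __name__ == "__main__":
    # Hypothetical fragments of the kind found in large word lists.
    left_words = ["momof3g", "ilove", "winder"]
    right_words = ["8kids", "tofunot", "mere2313"]
    for guess in combinator(left_words, right_words):
        print(guess)
```

The number of guesses is the product of the two list sizes, so the attack grows quickly but stays far smaller than brute forcing a 12-character string.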

Password cracking also relies on a technique called brute force, which, despite its misuse as a generic term for cracking, is distinctly different from cracks that use words from a list. Rather, brute force cracking tries every possible combination for a password of a given length. For a password up to six characters, it starts by guessing “a” and runs through every possible string until it reaches “//////.”
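As a rough illustration, exhaustive search can be written as a generator that walks every string of every length up to a limit. The character ordering is an assumption here: the sequence described above starts at “a” and ends at “/,” but this sketch simply uses whatever order the caller passes in.

```python
from itertools import product
from typing import Iterator


def brute_force(charset: str, max_len: int) -> Iterator[str]:
    """Yield every string over charset, from length 1 up to max_len."""
    for length in range(1, max_len + 1):
        for combo in product(charset, repeat=length):
            yield "".join(combo)


if __name__ == "__main__":
    # A two-character alphabet keeps the output readable.
    print(list(brute_force("ab", 3)))
```

Nothing in this loop knows anything about how people choose passwords, which is what separates brute force from word list attacks.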

The number of possible combinations for passwords of six or fewer characters is small enough to complete in seconds for the kinds of weaker hashing algorithms Home Security Heroes seems to envision in its PassGAN write-up.

Brute forcing passwords of seven or more characters is more intensive because the keyspace, meaning the number of possible combinations, is orders of magnitude greater. To make the keyspace manageable, crackers augment brute force attacks with what are known as Markov chains. Instead of trying every possible combination, a Markov attack performs probabilistically ordered, per-position brute force attacks. Think of it as an “intelligent brute force” that uses statistics to check more likely passwords first.
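The growth in keyspace is easy to work out. The sketch below assumes an alphabet of 95 printable ASCII characters (upper and lowercase letters, digits, symbols, and space); that figure is an assumption for illustration, and a different character set changes the numbers but not the shape of the curve.

```python
CHARSET_SIZE = 95  # assumption: printable ASCII, including space


def keyspace(charset_size: int, length: int) -> int:
    """Number of strings of exactly `length` characters."""
    return charset_size ** length


def keyspace_up_to(charset_size: int, max_length: int) -> int:
    """Number of strings of length 1 through max_length."""
    return sum(charset_size ** n for n in range(1, max_length + 1))


if __name__ == "__main__":
    for n in range(6, 10):
        print(n, f"{keyspace(CHARSET_SIZE, n):,}")
```

Each added character multiplies the work by the size of the alphabet, so going from six to eight characters with this alphabet means 95 × 95 = 9,025 times as many guesses.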

Where a classic brute force tries “aaa,” “aab,” “aac,” a Markov attack prioritizes guesses based on their likelihood. Certain characters are more or less likely to appear in the first, second, third, or higher position of a string. The character “z,” for instance, may not appear often in the second or third positions, whereas the character “e” does. Yet another type of brute force is known as a mask attack. It allows the cracker to select which characters appear in which positions.
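The per-position idea can be sketched with simple frequency counts. This is a deliberate simplification: it scores each position independently and sorts the entire keyspace, which only works at toy sizes, whereas production Markov modes also condition on neighboring characters and generate guesses lazily. The sample passwords and function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log
from typing import List, Sequence


def position_counts(samples: Sequence[str], length: int) -> List[Counter]:
    """Count how often each character appears at each position."""
    counts = [Counter() for _ in range(length)]
    for sample in samples:
        if len(sample) == length:
            for i, ch in enumerate(sample):
                counts[i][ch] += 1
    return counts


def ordered_guesses(samples: Sequence[str], charset: str, length: int) -> List[str]:
    """Return every candidate of the given length, most likely first."""
    counts = position_counts(samples, length)
    totals = [sum(c.values()) for c in counts]

    def score(candidate: str) -> float:
        # Sum of log-probabilities with add-one smoothing, so characters
        # never seen at a position are unlikely rather than impossible.
        return sum(
            log((counts[i][ch] + 1) / (totals[i] + len(charset)))
            for i, ch in enumerate(candidate)
        )

    candidates = ("".join(c) for c in product(charset, repeat=length))
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    training = ["bed", "bet", "ben", "zed"]  # hypothetical leaked samples
    print(ordered_guesses(training, "bdentz", 3)[:5])
```

The search still covers the whole keyspace eventually; the statistics only change the order, so likely strings are tried long before unlikely ones.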

Brute force attacks can also be combined with word attacks. This hybrid attack may append all possible two-character strings containing digits or symbols to the end of each word in the list, followed by all three-character strings containing digits or symbols, and so on.
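A hybrid attack of the kind just described might be sketched like this. The suffix alphabet and the function name `hybrid` are assumptions for the example; real tools express the same idea with a word list plus a mask.

```python
from itertools import product
from typing import Iterable, Iterator


def hybrid(words: Iterable[str], suffix_charset: str,
           min_len: int, max_len: int) -> Iterator[str]:
    """Append every suffix of min_len..max_len characters to each word."""
    word_list = list(words)  # reused once per suffix length
    for length in range(min_len, max_len + 1):
        for word in word_list:
            for tail in product(suffix_charset, repeat=length):
                yield word + "".join(tail)


if __name__ == "__main__":
    # Hypothetical suffix alphabet: digits plus a few symbols.
    guesses = hybrid(["windermere", "BandGeek"], "0123456789!%/", 2, 3)
    for _ in range(5):
        print(next(guesses))
```

All two-character suffixes are exhausted for every word before any three-character suffix is tried, matching the order described above.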

Cat-and-mouse neural networking

PassGAN uses none of these methods. Instead, it relies on a neural network, a type of data structure loosely inspired by networks of biological neurons, that is trained to interpret and analyze data in a way that’s similar to how a human mind would. These networks are organized in layers, with the outputs of one layer connected to the inputs of the next.

As explained in the 2017 paper introducing PassGAN:

To learn the generative model, GANs use a cat-and-mouse game, in which a deep generative network (G) tries to mimic the underlying distribution of the samples, while a discriminative deep neural network (D) tries to distinguish between the original training samples (i.e., “true samples”) and the samples generated by G (i.e., “fake samples”). This adversarial procedure forces D to leak the relevant information about the training data. This information helps G to adequately reproduce the original data distribution. PassGAN leverages this technique to generate new password guesses.
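For readers who want the cat-and-mouse game in symbols, the textbook GAN objective from the general GAN literature is shown below. It is offered only as background: the PassGAN paper's own training objective may differ from this form. Here $G$ and $D$ are the generator and discriminator named in the quote, $p_{\text{data}}$ is the distribution of real training passwords, and $p_z$ is the noise distribution fed to $G$.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is rewarded for scoring real samples high and generated samples low, while G is rewarded for producing samples that D mistakes for real ones.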

PassGAN was an exciting experiment that helped usher in the use of AI-based password candidate generators, but its time in the sun has come and gone, password-cracking expert and Senior Principal Engineer at Yahoo Jeremi Gosney said. Gosney added that a different neural networking method for guessing passwords, introduced in 2016, performs slightly better than PassGAN. A runner-up is research from Matt Weir that uses a machine-learning model known as PCFGs, short for “probabilistic context-free grammars.”

“But even as the leading AI password candidate generator, their cracker is about on par with Markov generators (not a significant improvement),” he wrote of the 2016 work in an online interview. Referring to the overall results of the PassGAN tool implemented by Home Security Heroes, he wrote, “Unfortunately, its performance falls well short of existing techniques, including statistical candidate generators like Markov, probabilistic candidate generators like PCFGs, wordlists with mangling rules, and for short inputs, even dumb brute force.”

All of these nuances are lost on the Home Security Heroes team that demonstrated the PassGAN tool. They trained it on 15.7 million passwords taken solely from the RockYou breach, a tiny and outdated sliver of the overall corpus of available samples today. By the team's account, the tool can crack 81 percent of them in less than a month, 71 percent in less than a day, and 65 percent in less than an hour. It can also guess any seven-character password in six minutes or less.

It’s impressive that a machine can achieve that level of performance, and therein lies the value of the original PassGAN research. But compared to what’s possible through conventional means, these results are hardly remarkable. The chances that PassGAN will ever replace more conventional password cracking are infinitesimally small.

“Ignoring the fact that these would all crack instantly if you just used the Rockyou wordlist, as you know from our previous tangos, we can destroy Rockyou-type breaches (raw MD5, no password complexity), usually reaching 80 percent within the first few hours,” Gosney told me. “So these numbers are neither impressive nor exciting.”

Be very, very afraid

Instead of putting PassGAN into context, the Home Security Heroes write-up turns it into the next super-scary security threat. Its authors write:

PassGAN represents a concerning advancement in password cracking techniques. This latest approach uses Generative Adversarial Network (GAN) to autonomously learn the distribution of real passwords from actual password leaks, eliminating the need for manual password analysis. While this makes password cracking faster and more efficient, it is a serious threat to your online security.

PassGAN can generate multiple password properties and improve the quality of predicted passwords, making it easier for cybercriminals to crack your passwords and gain access to your personal data. As such, it is crucial to regularly update your passwords to protect yourself from this dangerous technology.

Specifically, the post advises people to change their passwords every three to six months, despite explicit guidance otherwise from technologists at the Federal Trade Commission, Microsoft, and NIST. Those entities all say that regular password changes are more harmful than helpful to security because the effort required to change scores of passwords multiple times per year encourages a person to cut potentially costly corners. Contrary to the advice in the write-up, there’s no reason to change sufficiently strong passwords unless there’s evidence that they have been compromised.

Other examples in the post dress up mediocre performance as something to worry about. “PassGAN took a mere six minutes to figure out a password with seven characters, even if it contained uppercase and lowercase letters, numbers, and symbols.” In reality, Gosney said, this is only slightly better than a brute force attack. And as explained earlier, crackers going after human-generated passwords would use still faster methods such as brute force with Markov rules or a word list with rules.
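Some back-of-the-envelope arithmetic puts the six-minute figure in context. The sketch below computes the guess rate an exhaustive search would need to cover every seven-character password in that time. It assumes a 95-character printable ASCII alphabet and counts only passwords of exactly seven characters; both are assumptions for illustration, not figures from the write-up.

```python
def required_rate(charset_size: int, length: int, seconds: float) -> float:
    """Guesses per second needed to exhaust the keyspace in `seconds`."""
    return charset_size ** length / seconds


if __name__ == "__main__":
    rate = required_rate(charset_size=95, length=7, seconds=6 * 60)
    print(f"{rate:.3e} guesses per second")
```

Under those assumptions the answer comes to roughly 190 billion guesses per second. Whether a given rig can reach that depends entirely on the hashing algorithm being attacked, which is why the same password can fall in minutes under a fast hash and hold up far longer under a slow one.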

As a final embarrassment, Home Security Heroes’ password strength checker is nothing short of abysmal. Remember “momof3g8kids,” one of the many passwords mentioned earlier that was cracked in minutes to hours using traditional methods? The checker says PassGAN would need 14 billion years to guess it. The same checker says it would take only 187 million years to crack the password “2HdmYfcn!H9VhV,” which, by any objective measure, is vastly more secure.

So to all the people saying PassGAN represents a new threat to password security… no. PassGAN was an interesting experiment with minimal lasting benefit other than showing it’s possible to build a working AI-based password candidate generator that doesn’t rely on humans. The only notable or concerning thing about PassGAN these days is the hype and the counterproductive advice it’s generating.

Dan Goodin is Senior Security Editor at Ars Technica, where he oversees coverage of malware, computer espionage, botnets, hardware hacking, encryption, and passwords. In his spare time, he enjoys gardening, cooking, and following the independent music scene. Dan is based in San Francisco. Follow him on Mastodon and Bluesky. Contact him on Signal at DanArs.82.
