WordPress passwords, explained and cracked

If you enjoy going to WordCamps as I do, you have probably heard this already: "WordPress password hashing is not safe", or in its more technical version: "...because it is md5 based".

True or not, strong password hashing is crucial for a large ecosystem like the WordPress one, which has always been a juicy target for hackers. So, I decided to take a closer look at the hashing system and try to crack WordPress hashes from scratch!

Understanding WordPress password hashes

I started doing some googling and found that most of the information out there is generic and confusing. Lots of references to the PHP libraries used (portable hash from phpass), but nothing concrete.

I decided to take a different approach starting from an assumption:

hashing is a one-way process, but WordPress is somehow able to authenticate users by matching their password input with the hash stored in the database

From there, I started checking the code and found the first interesting function: wp_check_password($password,$hash) which compares the plain text password with the hash and returns true if they match.

// presume the new style phpass portable hash.
if ( empty( $wp_hasher ) ) {
    require_once ABSPATH . WPINC . '/class-phpass.php';
    // By default, use the portable hash from phpass.
    $wp_hasher = new PasswordHash( 8, true );
}

$check = $wp_hasher->CheckPassword( $password, $hash );

/** This filter is documented in wp-includes/pluggable.php */
return apply_filters( 'check_password', $check, $password, $hash, $user_id );

Following the code quickly leads us to CheckPassword, crypt_private and encode64, which is basically where the magic happens.

Long story short, crypt_private($password, $stored_hash) re-hashes the password before it gets compared to the stored hash. If they match, the password is correct and authentication goes on. This means that we can also use that function to crack the hash.

Hash anatomy

This is a WordPress hash:

$P$BnPVO4gP9JUMSAM1WlLTHPdH6EDj4e1

For simplicity, we will assume the site uses PHP>5 and the newest phpass portable hash, which is the most common setup.

The first 3 characters $P$ are an ID, telling the system which kind of hash we have.

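Character number 3 (counting from 0) determines how many times md5() has to process the input string. Its position in the 64-character alphabet used by phpass (./0-9A-Za-z) is the base-2 logarithm of the number of rounds: here B sits at position 13, so the loop runs 2^13 = 8192 times.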

Chars from 4 to 11 (nPVO4gP9) are the salt, which is a random string prepended to the password before hashing, so that two users with the same password do not end up with the same hash. For example, if your password is admin, it gets turned to nPVO4gP9admin and then hashed.

The remaining part of the hash, JUMSAM1WlLTHPdH6EDj4e1, is the actual digest: the md5 result of salt+password, passed through an undocumented encode64 function, which performs some bitwise operations on the input bytes and returns a 22-character output.

Not that clear, uh? Stick with me.

So far we know:

the first part of the hash is a fixed id
the second is a single char used as a counter, also fixed
the third is the salt, tied to the password
the last is the digest, generated from 'salt+pass' and processed by the encode64 function

[$P$] [B] [nPVO4gP9] [JUMSAM1WlLTHPdH6EDj4e1]
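
To make the four parts concrete, here is a minimal Go sketch that splits the example hash along the boundaries described above. The names splitHash and hashParts are made up for illustration (they are not part of WordPress or phpass), and the sketch assumes a 34-character hash with the $P$ prefix only.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// itoa64 is the 64-character alphabet used by phpass portable hashes.
const itoa64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// hashParts holds the four fields of a portable hash.
// The type and field names are illustrative only.
type hashParts struct {
	ID     string // fixed identifier, "$P$"
	Rounds int    // 2^(position of the counter char in itoa64)
	Salt   string // 8 characters
	Digest string // 22 characters produced by encode64
}

// splitHash cuts a "$P$" hash into its parts.
// Assumption: only the 34-character "$P$" format is handled here.
func splitHash(hash string) (hashParts, error) {
	if len(hash) != 34 || !strings.HasPrefix(hash, "$P$") {
		return hashParts{}, errors.New("not a $P$ portable hash")
	}
	log2 := strings.IndexByte(itoa64, hash[3])
	if log2 < 0 {
		return hashParts{}, errors.New("invalid counter character")
	}
	return hashParts{
		ID:     hash[:3],
		Rounds: 1 << uint(log2),
		Salt:   hash[4:12],
		Digest: hash[12:],
	}, nil
}

func main() {
	p, err := splitHash("$P$BnPVO4gP9JUMSAM1WlLTHPdH6EDj4e1")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Prints: $P$ 8192 nPVO4gP9 JUMSAM1WlLTHPdH6EDj4e1
	fmt.Printf("%s %d %s %s\n", p.ID, p.Rounds, p.Salt, p.Digest)
}
```

The counter B is at position 13 in the alphabet, which is where the 8192 rounds come from.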

So, we can re-write the logic (salt+password hashed X times and passed through encode64) to perform a dictionary or brute-force attack. If a candidate password produces the same 'last part' of the hash, that is a successful hash crack!

In a real-life scenario, hackers would be more interested in finding weak passwords because they are more likely to be reused, unlike random ones, which are often generated by a password manager and therefore site-specific.

Rewriting encode64 using golang

I decided to go with Golang because it is very fast, and we need all that speed to compute large password dictionaries.

Just like in the WordPress hashing procedure, the script takes the hash and isolates the salt. Then, for every password in the dictionary, it composes the input string (salt+pass), md5-hashes it, and re-hashes the result together with the password X more times. We have already seen how X is determined:

itoa64 := "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
hashloop := 1 << strings.Index(itoa64, string(hash[3])) // 4th char in the hash (index 3)
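
The repeated hashing itself can be sketched as follows. This is my reading of the loop in phpass's crypt_private (one initial md5 of salt+password, then one md5 of previous-digest+password per round), not a copy of the script's code; the function name stretch is hypothetical.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// stretch is a hypothetical helper mirroring the phpass loop:
// start from md5(salt + password), then repeat
// md5(previous digest + password) `rounds` times.
func stretch(salt, password string, rounds int) [16]byte {
	sum := md5.Sum([]byte(salt + password))
	pw := []byte(password)
	buf := make([]byte, 0, len(sum)+len(pw))
	for i := 0; i < rounds; i++ {
		buf = append(buf[:0], sum[:]...) // raw 16-byte digest, not hex
		buf = append(buf, pw...)
		sum = md5.Sum(buf)
	}
	return sum
}

func main() {
	// Salt and round count taken from the example hash; the candidate
	// password is just one entry a dictionary might contain.
	sum := stretch("nPVO4gP9", "admin", 8192)
	fmt.Printf("%x\n", sum[:])
}
```

Note that each round works on the raw 16 bytes of the previous digest, which is why the PHP code calls md5() with its second argument set to TRUE.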

Once that is done, it performs a series of bitwise operations on the bytes of the digest. PHP obtains them using ord(), which we do not need in Golang as we already have byte values:

itoa64[hashedpass[0] & 0x3f]

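At this point, I realized that we do not even need to compute the whole string. encode64 turns every 3 bytes of the digest into 4 characters, and the first character of each group (char n. 0 / 4 / 8 ...) depends on a single byte only. It is enough to match those few characters to be almost certain about the whole password, and it makes the script more performant! Strictly speaking this is a partial comparison, so a full comparison is what confirms a candidate beyond doubt.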
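
For reference, this is a minimal Go sketch of both ideas: a port of encode64 as I read it in class-phpass.php, and the shortcut that only looks at the first character of each 4-character group. quickMatch is a hypothetical name, and the digest used in main is made up purely to exercise the functions.

```go
package main

import "fmt"

const itoa64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// encode64 packs every 3 input bytes into a 24-bit value (first byte in
// the lowest bits) and emits it 6 bits at a time, lowest bits first.
// A 16-byte md5 digest becomes 22 characters.
func encode64(input []byte) string {
	out := make([]byte, 0, (len(input)*8+5)/6)
	for i := 0; i < len(input); i += 3 {
		v := uint32(input[i])
		out = append(out, itoa64[v&0x3f])
		if i+1 < len(input) {
			v |= uint32(input[i+1]) << 8
		}
		out = append(out, itoa64[(v>>6)&0x3f])
		if i+1 >= len(input) {
			break
		}
		if i+2 < len(input) {
			v |= uint32(input[i+2]) << 16
		}
		out = append(out, itoa64[(v>>12)&0x3f])
		if i+2 >= len(input) {
			break
		}
		out = append(out, itoa64[(v>>18)&0x3f])
	}
	return string(out)
}

// quickMatch (hypothetical name) checks only characters 0, 4, 8, ... of
// the encoded digest, i.e. the low 6 bits of digest bytes 0, 3, 6, ...
// For a 16-byte digest that is 6 characters, or 36 bits.
func quickMatch(digest []byte, encoded string) bool {
	for i, j := 0, 0; i < len(digest) && j < len(encoded); i, j = i+3, j+4 {
		if itoa64[digest[i]&0x3f] != encoded[j] {
			return false
		}
	}
	return true
}

func main() {
	// A made-up 16-byte digest (0, 1, ..., 15), not a real password hash.
	digest := make([]byte, 16)
	for i := range digest {
		digest[i] = byte(i)
	}
	encoded := encode64(digest)
	fmt.Println(encoded, len(encoded))
	fmt.Println("quick match:", quickMatch(digest, encoded))
	fmt.Println("full match: ", encode64(digest) == encoded)
}
```

A candidate that passes quickMatch can then be confirmed by comparing the full encode64 output with the last 22 characters of the stored hash.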

And BOOM, we cracked the hash:

wphashcrash '$P$BnPVO4gP9JUMSAM1WlLTHPdH6EDj4e1' Dev/wphashcrash/dict.txt
$P$BnPVO4gP9JUMSAM1WlLTHPdH6EDj4e1 admin
2020/05/30 12:44:15 Executed in 0.000221

The code is, as usual, on GitHub: https://github.com/francescocarlucci/wphashcrash. At the moment, the script is a POC and takes only a single password hash and the path to the password dictionary as input.

I already outlined in the readme some improvements that would be nice to have and I'll probably add them in the future.

If you really wanna try to hack a list of hashes, you can fork it or just wrap the script in a bash for loop.

Security considerations

At this point, I think it is clear enough that if an attacker gets access to the database of a WordPress site, they can basically crack every weak password.

This is unfortunate, especially because:

SQL Injection is not that rare in WordPress plugins
RCE will also most likely grant access to the DB
The WordPress ecosystem is a huge territory for freelance work, and it is very common to find old/unused wp-admin and FTP accounts, created for temporary needs

This last point, combined with user enumeration, which is a common WordPress issue, leaves the door open to gaining access to the DB using stolen/pwned admin credentials.

On top of this, many websites do not enforce strong passwords so as not to hurt the user experience, and I have direct experience with each of the mentioned points.

Once a website is compromised, having weak hashes stored allows attackers to compromise its users' accounts on other sites, if those users tend to re-use passwords, and we know they do.

Do we have solutions?

My research did not go that far; I only wanted to see how to break WordPress hashes.

Bcrypt is known to be a stronger hashing method compared to md5, and there is an existing plugin that uses bcrypt and replaces all the core functions needed to handle passwords:

wp_check_password()
wp_hash_password()
wp_set_password()

Of course, you can write your own solution as a developer, but I believe this is an issue to fix at the core level in order to reach wide adoption.

Final note

There are many tools to crack hashes, like Hashcat and John the Ripper, which may be even more performant, but again, the scope of this research is to understand the WordPress hash structure and crack it from scratch.

Thanks for reading.

Francesco