How long to brute force a salted SHA-512 hash? (sa

2020-01-25 04:16发布

问题:

Here is an algorithm in Java:

public String getHash(String password, String salt) throws Exception {
    String input = password + salt;
    MessageDigest md = MessageDigest.getInstance(SHA-512);
    byte[] out = md.digest(input.getBytes());
    return HexEncoder.toHex(out);
}

Assume the salt is known. I want to know the time to brute force for when the password is a dictionary word and also when it is not a dictionary word.

回答1:

In your case, breaking the hash algorithm is equivalent to finding a collision in the hash algorithm. That means you don't need to find the password itself (which would be a preimage attack), you just need to find an output of the hash function that is equal to the hash of a valid password (thus "collision"). Finding a collision using a birthday attack takes O(2^(n/2)) time, where n is the output length of the hash function in bits.

SHA-2 has an output size of 512 bits, so finding a collision would take O(2^256) time. Given there are no clever attacks on the algorithm itself (currently none are known for the SHA-2 hash family) this is what it takes to break the algorithm.

To get a feeling for what 2^256 actually means: currently it is believed that the number of atoms in the (entire!!!) universe is roughly 10^80 which is roughly 2^266. Assuming 32 byte input (which is reasonable for your case - 20 bytes salt + 12 bytes password) my machine takes ~0,22s (~2^-2s) for 65536 (=2^16) computations. So 2^256 computations would be done in 2^240 * 2^16 computations which would take

2^240 * 2^-2 = 2^238 ~ 10^72s ~ 3,17 * 10^64 years

Even calling this millions of years is ridiculous. And it doesn't get much better with the fastest hardware on the planet computing thousands of hashes in parallel. No human technology will be able to crunch this number into something acceptable.

So forget brute-forcing SHA-256 here. Your next question was about dictionary words. To retrieve such weak passwords rainbow tables were used traditionally. A rainbow table is generally just a table of precomputed hash values, the idea is if you were able to precompute and store every possible hash along with its input, then it would take you O(1) to look up a given hash and retrieve a valid preimage for it. Of course this is not possible in practice since there's no storage device that could store such enormous amounts of data. This dilemma is known as memory-time tradeoff. As you are only able to store so many values typical rainbow tables include some form of hash chaining with intermediary reduction functions (this is explained in detail in the Wikipedia article) to save on space by giving up a bit of savings in time.

Salts were a countermeasure to make such rainbow tables infeasible. To discourage attackers from precomputing a table for a specific salt it is recommended to apply per-user salt values. However, since users do not use secure, completely random passwords, it is still surprising how successful you can get if the salt is known and you just iterate over a large dictionary of common passwords in a simple trial and error scheme. The relationship between natural language and randomness is expressed as entropy. Typical password choices are generally of low entropy, whereas completely random values would contain a maximum of entropy.

The low entropy of typical passwords makes it possible that there is a relatively high chance of one of your users using a password from a relatively small database of common passwords. If you google for them, you will end up finding torrent links for such password databases, often in the gigabyte size category. Being successful with such a tool is usually in the range of minutes to days if the attacker is not restricted in any way.

That's why generally hashing and salting alone is not enough, you need to install other safety mechanisms as well. You should use an artificially slowed down entropy-enducing method such as PBKDF2 described in PKCS#5 and you should enforce a waiting period for a given user before they may retry entering their password. A good scheme is to start with 0.5s and then doubling that time for each failed attempt. In most cases users don't notice this and don't fail much more often than three times on average. But it will significantly slow down any malicious outsider trying to attack your application.



回答2:

I want to know the time to brute force for when the password is a dictionary word and also when it is not a dictionary word.

Dictionary password

Ballpark figure: there are about 1,000,000 English words, and if a hacker can compute about 10,000 SHA-512 hashes a second (update: see comment by CodesInChaos, this estimate is very low), 1,000,000 / 10,000 = 100 seconds. So it would take just over a minute to crack a single-word dictionary password for a single user. If the user concatenates two dictionary words, you're in the area of a few days, but still very possible if the attacker is cares enough. More than that and it starts getting tough.

Random password

If the password is a truly random sequence of alpha-numeric characters, upper and lower case, then the number of possible passwords of length N is 60^N (there are 60 possible characters). We'll do the calculation the other direction this time; we'll ask: What length of password could we crack given a specific length of time? Just use this formula:

N = Log60(t * 10,000) where t is the time spent calculating hashes in seconds (again assuming 10,000 hashes a second).

1 minute:    3.2
5 minute:    3.6
30 minutes:  4.1
2 hours:     4.4
3 days:      5.2

So given a 3 days we'd be able to crack the password if it's 5 characters long.

This is all very ball-park, but you get the idea. Update: see comment below, it's actually possible to crack much longer passwords than this.

What's going on here?

Let's clear up some misconceptions:

  • The salt doesn't make it slower to calculate hashes, it just means they have to crack each user's password individually, and pre-computed hash tables (buzz-word: rainbow tables) are made completely useless. If you don't have a precomputed hash-table, and you're only cracking one password hash, salting doesn't make any difference.

  • SHA-512 isn't designed to be hard to brute-force. Better hashing algorithms like BCrypt, PBKDF2 or SCrypt can be configured to take much longer to compute, and an average computer might only be able to compute 10-20 hashes a second. Read This excellent answer about password hashing if you haven't already.

  • update: As written in the comment by CodesInChaos, even high entropy passwords (around 10 characters) could be bruteforced if using the right hardware to calculate SHA-512 hashes.


Notes on accepted answer:

The accepted answer as of September 2014 is incorrect and dangerously wrong:

In your case, breaking the hash algorithm is equivalent to finding a collision in the hash algorithm. That means you don't need to find the password itself (which would be a preimage attack)... Finding a collision using a birthday attack takes O(2^n/2) time, where n is the output length of the hash function in bits.

The birthday attack is completely irrelevant to cracking a given hash. And this is in fact a perfect example of a preimage attack. That formula and the next couple of paragraphs result in dangerously high and completely meaningless values for an attack time. As demonstrated above it's perfectly possible to crack salted dictionary passwords in minutes.

The low entropy of typical passwords makes it possible that there is a relatively high chance of one of your users using a password from a relatively small database of common passwords...

That's why generally hashing and salting alone is not enough, you need to install other safety mechanisms as well. You should use an artificially slowed down entropy-enducing method such as PBKDF2 described in PKCS#5...

Yes, please use an algorithm that is slow to compute, but what is "entropy-enducing"? Putting a low entropy password through a hash doesn't increase entropy. It should preserve entropy, but you can't make a rubbish password better with a hash, it doesn't work like that. A weak password put through PBKDF2 is still a weak password.



回答3:

There isn't a single answer to this question as there are too many variables, but SHA2 is not yet really cracked (see: Lifetimes of cryptographic hash functions) so it is still a good algorithm to use to store passwords in. The use of salt is good because it prevents attack from dictionary attacks or rainbow tables. Importance of a salt is that it should be unique for each password. You can use a format like [128-bit salt][512-bit password hash] when storing the hashed passwords.

The only viable way to attack is to actually calculate hashes for different possibilities of password and eventually find the right one by matching the hashes.

To give an idea about how many hashes can be done in a second, I think Bitcoin is a decent example. Bitcoin uses SHA256 and to cut it short, the more hashes you generate, the more bitcoins you get (which you can trade for real money) and as such people are motivated to use GPUs for this purpose. You can see in the hardware overview that an average graphic card that costs only $150 can calculate more than 200 million hashes/s. The longer and more complex your password is, the longer time it will take. Calculating at 200M/s, to try all possibilities for an 8 character alphanumberic (capital, lower, numbers) will take around 300 hours. The real time will most likely less if the password is something eligible or a common english word.

As such with anything security you need to look at in context. What is the attacker's motivation? What is the kind of application? Having a hash with random salt for each gives pretty good protection against cases where something like thousands of passwords are compromised.

One thing you can do is also add additional brute force protection by slowing down the hashing procedure. As you only hash passwords once, and the attacker has to do it many times, this works in your favor. The typical way to do is to take a value, hash it, take the output, hash it again and so forth for a fixed amount of iterations. You can try something like 1,000 or 10,000 iterations for example. This will make it that many times times slower for the attacker to find each password.