How to generate a random alpha-numeric string?

2018-12-31 00:43发布

I've been looking for a simple Java algorithm to generate a pseudo-random alpha-numeric string. In my situation it would be used as a unique session/key identifier that would "likely" be unique over 500K+ generation (my needs don't really require anything much more sophisticated). Ideally, I would be able to specify a length depending on my uniqueness needs. For example, a generated string of length 12 might look something like "AEYGF7K0DM1X".

30条回答
孤独总比滥情好
2楼-- · 2018-12-31 01:06

This is easily achievable without any external libraries.

1. Cryptographic Pseudo Random Data Generation

First you need a cryptographic PRNG. Java has SecureRandom for that typically uses the best entropy source on the machine (e.g. /dev/random) . Read more here.

SecureRandom rnd = new SecureRandom();
byte[] token = new byte[byteLength];
rnd.nextBytes(token);

Note: SecureRandom is the slowest, but most secure way in Java of generating random bytes. I do however recommend NOT considering performance here since it usually has no real impact on your application unless you have to generate millions of tokens per second.

2. Required Space of Possible Values

Next you have to decide "how unique" your token needs to be. The whole and only point of considering entropy is to make sure that the system can resist brute force attacks: the space of possible values must be so large that any attacker could only try a negligible proportion of the values in non-ludicrous time1. Unique identifiers such as random UUID have 122bit of entropy (ie. 2^122 = 5.3x10^36) - the chance of collision is "*(...) for there to be a one in a billion chance of duplication, 103 trillion version 4 UUIDs must be generated2". We will choose 128 bit since it fits exactly into 16 bytes and is seen as highly sufficient for being unique for basically every, but the most extreme, use cases and you don't have to think about duplicates. Here is a simple comparison table of entropy including simple analysis of the birthday problem.

comparison of token sizes

For simple requirements 8 or 12 byte length might suffice, but with 16 bytes you are on the "safe side".

And that's basically it. Last thing is to think about encoding so it can be represented as a printable text (read, a String).

3. Binary to Text Encoding

Typical encodings include:

  • Base64 every character encodes 6bit creating a 33% overhead. Unfortunatly there is no standard implementation in the JDK (7 and below - there is in Android and Java 8+). But numerous libraries exist that add this. The downside is, that standard Base64 is not safe for eg. urls and as filename in most file systems requiring additional encoding (e.g. url encoding) or the URL safe version of Base64 is used. Example encoding 16 bytes with padding: XfJhfv3C0P6ag7y9VQxSbw==

  • Base32 every character encodes 5bit creating a 40% overhead. This will use A-Z and 2-7 making it reasonably space efficient while being case-insensitive alpha-numeric. There is no standard implementation in the JDK. Example encoding 16 bytes without padding: WUPIL5DQTZGMF4D3NX5L7LNFOY

  • Base16 (hex) every character encodes 4bit requiring 2 characters per byte (ie. 16 byte create a string of length 32). Therefore hex is less space efficient than Base32 but is safe to use in most cases (url) since it only uses 0-9 and A to F. Example encoding 16 bytes: 4fa3dd0f57cb3bf331441ed285b27735. See a SO discussion about converting to hex here.

Additional encodings like Base85 and the exotic Base122 exist with better/worse space efficiency. You can create your own encoding (which basically most answers in this thread do) but I would advise against it, if you don't have very specific requirements. See more encoding schemes in the Wikipedia article.

4. Summary and Example

  • Use SecureRandom
  • Use at least 16 bytes (2^128) of possible values
  • Encode according to your requirements (usually hex or base32 if you need it to be alpha-numeric)

Don't

  • ... use your home brew encoding: better maintainable and readable for others if they see what standard encoding you use instead of weird for loops creating chars at a time.
  • ... use UUID: you are wasting 6bits of entropy and have verbose string representation

Example: Hex Token Generator

public static String generateRandomHexToken(int byteLength) {
    SecureRandom secureRandom = new SecureRandom();
    byte[] token = new byte[byteLength];
    secureRandom.nextBytes(token);
    return new BigInteger(1, token).toString(16); //hex encoding
}

//generateRandomHexToken(16) -> 2189df7475e96aa3982dbeab266497cd

Example: Tool

If you want a ready-to-use cli tool you may use dice: https://github.com/patrickfav/dice

查看更多
人气声优
3楼-- · 2018-12-31 01:10

I found this solution that generates a random hex encoded string. The provided unit test seems to hold up to my primary use case. Although, it is slightly more complex than some of the other answers provided.

/**
 * Generate a random hex encoded string token of the specified length
 *  
 * @param length
 * @return random hex string
 */
public static synchronized String generateUniqueToken(Integer length){ 
    byte random[] = new byte[length];
    Random randomGenerator = new Random();
    StringBuffer buffer = new StringBuffer();

    randomGenerator.nextBytes(random);

    for (int j = 0; j < random.length; j++) {
        byte b1 = (byte) ((random[j] & 0xf0) >> 4);
        byte b2 = (byte) (random[j] & 0x0f);
        if (b1 < 10)
            buffer.append((char) ('0' + b1));
        else
            buffer.append((char) ('A' + (b1 - 10)));
        if (b2 < 10)
            buffer.append((char) ('0' + b2));
        else
            buffer.append((char) ('A' + (b2 - 10)));
    }
    return (buffer.toString());
}

@Test
public void testGenerateUniqueToken(){
    Set set = new HashSet();
    String token = null;
    int size = 16;

    /* Seems like we should be able to generate 500K tokens 
     * without a duplicate 
     */
    for (int i=0; i<500000; i++){
        token = Utility.generateUniqueToken(size);

        if (token.length() != size * 2){
            fail("Incorrect length");
        } else if (set.contains(token)) {
            fail("Duplicate token generated");
        } else{
            set.add(token);
        }
    }
}
查看更多
听够珍惜
4楼-- · 2018-12-31 01:10

Here is the one line code by AbacusUtil

String.valueOf(CharStream.random('0', 'z').filter(c -> N.isLetterOrDigit(c)).limit(12).toArray())

Random doesn't mean it must be unique. to get unique strings, using:

N.uuid() // e.g.: "e812e749-cf4c-4959-8ee1-57829a69a80f". length is 36.
N.guid() // e.g.: "0678ce04e18945559ba82ddeccaabfcd". length is 32 without '-'
查看更多
梦醉为红颜
5楼-- · 2018-12-31 01:11

If you're happy to use Apache classes, you could use org.apache.commons.text.RandomStringGenerator (commons-text).

Example:

RandomStringGenerator randomStringGenerator =
        new RandomStringGenerator.Builder()
                .withinRange('0', 'z')
                .filteredBy(CharacterPredicates.LETTERS, CharacterPredicates.DIGITS)
                .build();
randomStringGenerator.generate(12); // toUpperCase() if you want

Since commons-lang 3.6, RandomStringUtils is deprecated.

查看更多
几人难应
6楼-- · 2018-12-31 01:11

A short and easy solution, but uses only lowercase and numerics:

Random r = new java.util.Random ();
String s = Long.toString (r.nextLong () & Long.MAX_VALUE, 36);

The size is about 12 digits to base 36 and can't be improved further, that way. Of course you can append multiple instances.

查看更多
ら面具成の殇う
7楼-- · 2018-12-31 01:11

An alternative in Java 8 is:

static final Random random = new Random(); // Or SecureRandom
static final int startChar = (int) '!';
static final int endChar = (int) '~';

static String randomString(final int maxLength) {
  final int length = random.nextInt(maxLength + 1);
  return random.ints(length, startChar, endChar + 1)
        .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
        .toString();
}
查看更多
登录 后发表回答