How to return a random character from a regex pattern

Posted 2019-08-15 10:10

Question:

I would like to know whether it is possible to return a single random character from a regex pattern, ideally with concise code.

Here is my case.

I have created some Regex patterns contained in an Enum:

import random
from _operator import invert
from enum import Enum
import re

class RegexExpression(Enum):
    LOWERCASE = re.compile('a-z')
    UPPERCASE = re.compile('A-Z')
    DIGIT = re.compile('\d')
    SYMBOLS = re.compile('\W')

I want each of these to be expanded into a string containing all the characters that the regex matches, for use in the method below:

def create_password(symbol_count, digit_count, lowercase_count, uppercase_count):
    pwd = ""
    for i in range(1, symbol_count):
        pwd.join(random.choice(invert(RegexExpression.SYMBOLS.value)))
    for i in range(1, digit_count):
        pwd.join(random.choice(invert(RegexExpression.DIGIT.value)))
    for i in range(1, lowercase_count):
        pwd.join(random.choice(invert(RegexExpression.LOWERCASE.value)))
    for i in range(1, uppercase_count):
        pwd.join(random.choice(invert(RegexExpression.UPPERCASE.value)))
    return pwd

I have tried several things, but the only options I can find are an Enum containing long regex patterns, or strings like in the example below:

LOWERCASE = "abcdefghijklmnopqrstuvwxyz"

... And so on with the other variables in use.

Any suggestions or solutions to this scenario?

--EDIT--

Mad Physicist provided the solution to my issue. Thanks a lot! Here is the working code:

import random
import string

def generate_password(length):
    # Split length into four random counts, each at least 1 (requires length >= 4).
    a = random.randint(1, length - 3)          # lowercase letters
    b = random.randint(1, length - a - 2)      # uppercase letters
    c = random.randint(1, length - a - b - 1)  # digits
    d = length - a - b - c                     # punctuation takes the remainder

    pwd = ""
    for i in range(a):
        pwd += random.choice(string.ascii_lowercase)
    for i in range(b):
        pwd += random.choice(string.ascii_uppercase)
    for i in range(c):
        pwd += random.choice(string.digits)
    for i in range(d):
        pwd += random.choice(string.punctuation)

    # Shuffle so the character classes are not grouped together.
    pwd = ''.join(random.sample(pwd, len(pwd)))
    return pwd

Answer 1:

The string module has all the definitions you want.

  • Instead of RegexExpression.LOWERCASE use string.ascii_lowercase
  • Instead of RegexExpression.UPPERCASE use string.ascii_uppercase
  • Instead of RegexExpression.DIGIT use string.digits
  • RegexExpression.SYMBOLS is probably closest to string.punctuation

Regex is not really suitable for this task. Regular expressions are used to check whether a character belongs to a class, not to list the members of that class. I'm not aware of a good way to inspect the contents of a character class without getting into source code or implementation details.
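As a minimal sketch of how the four constants above could replace the enum, here is one possible rewrite of the create_password function from the question. It assumes each count is the exact number of characters wanted (so it loops count times rather than over range(1, count)), collects characters in a list because str.join returns a new string instead of modifying pwd, and shuffles the result, which the original function did not do.

```python
import random
import string

def create_password(symbol_count, digit_count, lowercase_count, uppercase_count):
    # Pair each character pool from the string module with its requested count.
    pools = [
        (string.punctuation, symbol_count),
        (string.digits, digit_count),
        (string.ascii_lowercase, lowercase_count),
        (string.ascii_uppercase, uppercase_count),
    ]
    chars = []
    for pool, count in pools:
        # random.choices draws `count` characters with replacement.
        chars.extend(random.choices(pool, k=count))
    # Assumption of this sketch: shuffle so the classes are not grouped together.
    random.shuffle(chars)
    return "".join(chars)

print(create_password(2, 3, 4, 5))
```

The table of (pool, count) pairs keeps the four cases in one loop; adding another character class would only need one more entry.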



Answer 2:

There's a recipe in the documentation for the secrets module that may be a better approach:

https://docs.python.org/3.6/library/secrets.html#recipes-and-best-practices

from secrets import choice
import string
alphabet = string.ascii_letters + string.digits
while True:
    password = ''.join(choice(alphabet) for i in range(10))
    if (any(c.islower() for c in password)
            and any(c.isupper() for c in password)
            and sum(c.isdigit() for c in password) >= 3):
        break

print(password)
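The recipe above only draws letters and digits. A possible adaptation to the question, which also wants symbols, might look like the sketch below. The function name generate_secure_password, the default length of 12, and the rule of at least one character per class are assumptions made for illustration; they do not come from the recipe.

```python
import string
from secrets import choice

def generate_secure_password(length=12):
    # Hypothetical helper: four classes need at least four characters.
    if length < 4:
        raise ValueError("length must be at least 4")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        # Draw every character with secrets.choice, as in the recipe.
        password = ''.join(choice(alphabet) for _ in range(length))
        # Retry until each class appears at least once.
        if (any(c.islower() for c in password)
                and any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in string.punctuation for c in password)):
            return password

print(generate_secure_password())
```

Because whole candidates are drawn and rejected, no separate shuffle step is needed.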


Answer 3:

If you 100% insist on using regex, you need a function to convert arbitrary character classes into strings. I'm sure there's an easier way to do this, but here is a general purpose routine:

from operator import methodcaller
from re import finditer

UNICODE_MAX = 0xFFFF
UNICODE = ''.join(map(chr, range(UNICODE_MAX + 1)))
ASCII = UNICODE[:128]

def class_contents(pattern, unicode=True, printable=True):
    base = UNICODE if unicode else ASCII
    result = map(methodcaller('group'), finditer(pattern, base))
    if printable:
        result = filter(str.isprintable, result)
    return ''.join(result)

Now you can just apply this function to your enum values to get the string of available characters.

Here is an IDEOne link to demo the results: https://ideone.com/Rh4xKI. Notice that the regexes for LOWERCASE and UPPERCASE need to be surrounded by square brackets; otherwise they match the literal three-character strings 'a-z' and 'A-Z' rather than character classes.
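For reference, a short usage sketch of the function above, applied to bracketed versions of the patterns from the question. It repeats the definitions so that it runs on its own, and it passes unicode=False to restrict the result to ASCII; that restriction and the module-level names LOWERCASE, UPPERCASE, DIGIT and SYMBOLS are choices made for this demo only.

```python
import random
from operator import methodcaller
from re import finditer

UNICODE_MAX = 0xFFFF
UNICODE = ''.join(map(chr, range(UNICODE_MAX + 1)))
ASCII = UNICODE[:128]

def class_contents(pattern, unicode=True, printable=True):
    # Match the pattern against every candidate character and keep the hits.
    base = UNICODE if unicode else ASCII
    result = map(methodcaller('group'), finditer(pattern, base))
    if printable:
        result = filter(str.isprintable, result)
    return ''.join(result)

# Square brackets turn 'a-z' and 'A-Z' into character classes.
LOWERCASE = class_contents('[a-z]', unicode=False)
UPPERCASE = class_contents('[A-Z]', unicode=False)
DIGIT = class_contents(r'\d', unicode=False)
SYMBOLS = class_contents(r'\W', unicode=False)

print(LOWERCASE)  # abcdefghijklmnopqrstuvwxyz
print(DIGIT)      # 0123456789
print(random.choice(SYMBOLS))
```

With unicode=False, SYMBOLS contains the space character and omits the underscore, because \W excludes word characters and the underscore counts as one. With the default unicode=True, \d and \W also match many non-ASCII characters.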