Following up on "Regular expression to match hostname or IP Address?" and using "Restrictions on valid host names" as a reference, what is the most readable, concise way to match/validate a hostname/FQDN (fully qualified domain name) in Python? I've answered with my attempt below; improvements welcome.
I like the thoroughness of Tim Pietzcker's answer, but I prefer to offload some of the logic from regular expressions for readability. Honestly, I had to look up the meaning of those (? "extension notation" parts. Additionally, I feel the "double-negative" approach is more obvious in that it limits the responsibility of the regular expression to just finding any invalid character. I do like that re.IGNORECASE allows the regex to be shortened.
So here's another shot; it's longer but it reads kind of like prose. I suppose "readable" is somewhat at odds with "concise". I believe all of the validation constraints mentioned in the thread so far are covered:
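A minimal sketch of the "double-negative" approach described above, not necessarily the exact code from the thread. The function name is_valid_hostname, the 253-character limit, and the choice to reject empty names are assumptions made for illustration; re.ASCII is added so that \d and case-insensitive matching stay within ASCII.

```python
import re

# Matches any single character that is NOT an ASCII letter, digit, or hyphen.
# re.ASCII keeps \d and case folding from accepting non-ASCII characters.
DISALLOWED = re.compile(r"[^A-Z\d-]", re.IGNORECASE | re.ASCII)


def is_valid_hostname(hostname):
    """Return True if hostname looks like a valid hostname/FQDN (hypothetical helper)."""
    if hostname.endswith("."):
        hostname = hostname[:-1]  # a single trailing dot is legal; strip exactly one
    if len(hostname) > 253:  # assumed limit on the textual form
        return False
    return all(
        label  # each label is non-empty (this also rejects the empty string)
        and len(label) <= 63  # label length is within range
        and not label.startswith("-")  # no leading hyphen
        and not label.endswith("-")  # no trailing hyphen
        and not DISALLOWED.search(label)  # contains only legal characters
        for label in hostname.split(".")
    )
```

Here the regular expression only has to find one bad character; the length and hyphen rules are ordinary string checks.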
Per The Old New Thing, the maximum length of a DNS name is 253 characters. (One is allowed up to 255 octets, but 2 of those are consumed by the encoding.)
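To make the arithmetic concrete, here is a small sketch that counts the octets a name occupies in DNS wire format, where each label is preceded by one length octet and the name ends with a zero octet for the root label. The helper name wire_length is hypothetical, and the sketch assumes ASCII-only labels so that characters and octets coincide.

```python
def wire_length(hostname):
    """Octets used by hostname in DNS wire format (hypothetical helper, ASCII labels assumed)."""
    if hostname.endswith("."):
        hostname = hostname[:-1]  # the trailing dot is not part of any label
    labels = hostname.split(".")
    # One length octet plus the label itself, for every label,
    # then one terminating zero octet for the root label.
    return sum(1 + len(label) for label in labels) + 1


# A name of exactly 253 characters: three 63-character labels and one of 61.
longest = ".".join(["a" * 63, "b" * 63, "c" * 63, "d" * 61])
print(len(longest), wire_length(longest))
```

The dots in the textual form are replaced by length octets, so the wire form is always two octets longer than the text: one for the first label's length octet and one for the terminating zero.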
One could argue for accepting empty domain names, or not, depending on one's purpose.
ensures that each segment
It also avoids double negatives (not disallowed), and if hostname ends in a ".", that's OK, too. It will (and should) fail if hostname ends in more than one dot.
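A minimal sketch of the positive-match approach this answer describes, using the (? lookahead and lookbehind notation mentioned earlier in the thread. The function name is_valid_hostname_regex, the 253-character limit, and the use of re.ASCII and fullmatch are assumptions made for illustration, not necessarily what the original code did.

```python
import re

# (?!-)   : the label must not start with a hyphen
# [A-Z\d-]{1,63} : 1 to 63 letters, digits, or hyphens
# (?<!-)  : the label must not end with a hyphen
ALLOWED = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)", re.IGNORECASE | re.ASCII)


def is_valid_hostname_regex(hostname):
    """Return True if every dot-separated label matches ALLOWED (hypothetical helper)."""
    if hostname.endswith("."):
        hostname = hostname[:-1]  # strip exactly one dot from the right, if present
    if len(hostname) > 253:  # assumed limit on the textual form
        return False
    # fullmatch requires the whole label to match, so no trailing "$" is needed.
    return all(ALLOWED.fullmatch(label) for label in hostname.split("."))
```

Because only one trailing dot is stripped, a name ending in two dots leaves an empty final label, which fails the {1,63} requirement.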