python - Validating a hostname string

Question

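Following up on "Regular expression to match hostname or IP address?" and using "Restrictions on valid host names" as a reference, what is the most readable, concise way to match/validate a hostname/FQDN (fully qualified domain name) in Python? I have answered below with my own attempt; improvements are welcome.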

score 54 · Accepted Answer

import re
def is_valid_hostname(hostname):
    if len(hostname) > 255:
        return False
    if hostname.endswith("."):  # endswith() is also safe for an empty string
        hostname = hostname[:-1] # strip exactly one dot from the right, if present
    # raw string, so that \d reaches the regex engine unchanged
    allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(x) for x in hostname.split("."))

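This ensures that each segment

contains at least one character and at most 63 characters
consists only of allowed characters
does not begin or end with a hyphen.

It also avoids double negatives (not disallowed), and if hostname ends in a ".", that is fine too. It will (and should) fail if hostname ends in more than one dot.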
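As a quick check of the behaviour described above, here is a minimal sketch that repeats the function and runs it on a handful of made-up sample names. The pattern is the one from the answer; compiling it once at module level, the raw-string form, and the sample inputs are my own choices for illustration.

```python
import re

# Same pattern as in the answer, compiled once; the raw string keeps \d intact.
ALLOWED = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)

def is_valid_hostname(hostname):
    if len(hostname) > 255:
        return False
    if hostname.endswith("."):
        hostname = hostname[:-1]  # strip exactly one trailing dot
    return all(ALLOWED.match(label) for label in hostname.split("."))

# Illustrative inputs (not from the original thread).
samples = [
    "example.com",       # plain name
    "example.com.",      # one trailing dot is accepted
    "example.com..",     # two trailing dots leave an empty label
    "-bad.example",      # label starts with a hyphen
    "a" * 64 + ".com",   # label longer than 63 characters
    "",                  # empty string
]
for name in samples:
    print(repr(name), is_valid_hostname(name))
```

Only the first two samples should come back as valid; the others each violate one of the rules listed above.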

score 14 · Accepted Answer

Don't reinvent the wheel. You can use a library, e.g. validators. Or you can copy their code:

Installation

pip install validators

Usage

import validators
if validators.domain('example.com'):
    print('this domain is valid')

score 11 · Accepted Answer

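This is a stricter version of Tim Pietzcker's answer, with the following improvements:

Limit the length of the hostname to 253 characters (after stripping the optional trailing dot).
Limit the character set to ASCII (i.e. use [0-9] instead of \d).
Check that the TLD is not all-numeric.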

import re

def is_valid_hostname(hostname):
    if hostname.endswith("."):  # also safe when hostname is empty
        # strip exactly one dot from the right, if present
        hostname = hostname[:-1]
    if len(hostname) > 253:
        return False

    labels = hostname.split(".")

    # the TLD must be not all-numeric
    if re.match(r"[0-9]+$", labels[-1]):
        return False

    allowed = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(label) for label in labels)
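One edge case worth knowing about: in Python, `$` also matches just before a trailing newline, so a pattern ending in `$` used with `re.match` accepts a name such as "example.com\n". The sketch below is a variant of the function above that uses `re.fullmatch` instead. The name `is_valid_hostname_strict` is hypothetical, and this is only one possible way to tighten the check, not part of the original answer.

```python
import re

# Label pattern without the trailing "$"; fullmatch() anchors both ends.
LABEL = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)", re.IGNORECASE)

def is_valid_hostname_strict(hostname):
    if hostname.endswith("."):
        hostname = hostname[:-1]  # strip exactly one trailing dot
    if len(hostname) > 253:
        return False
    labels = hostname.split(".")
    # the TLD must not be all-numeric
    if re.fullmatch(r"[0-9]+", labels[-1]):
        return False
    return all(LABEL.fullmatch(label) for label in labels)

# "$" tolerates a final newline, fullmatch() does not.
print(bool(re.match(r"[a-z]+$", "com\n")))     # True
print(bool(re.fullmatch(r"[a-z]+", "com\n")))  # False
```

With this variant, "example.com\n" and dotted-quad strings such as "127.0.0.1" are both rejected, while "example.com" and "example.com." are still accepted.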

score 7 · Accepted Answer

According to The Old New Thing, the maximum length of a DNS name is 253 characters. (A name may occupy up to 255 octets, but 2 of those are consumed by the encoding.)

import re

def validate_fqdn(dn):
    if dn.endswith('.'):
        dn = dn[:-1]
    if len(dn) < 1 or len(dn) > 253:
        return False
    ldh_re = re.compile('^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$',
                        re.IGNORECASE)
    return all(ldh_re.match(x) for x in dn.split('.'))

Whether to accept the empty domain name is debatable, depending on your purpose.
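The "2 octets consumed by the encoding" can be made concrete with a little arithmetic. In the DNS wire format each label is preceded by a one-octet length, and the name ends with a zero-length root label, so a dotted name of N ASCII characters takes N + 2 octets. The helper below, `wire_length`, is a hypothetical name used only for this sketch, and it assumes ASCII labels and no name compression.

```python
def wire_length(name):
    """Octets needed for `name` in DNS wire format (ASCII labels, no compression)."""
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".") if name else []
    # each label costs one length octet plus its characters,
    # and the terminating root label costs one more octet
    return sum(1 + len(label) for label in labels) + 1

# Longest possible name: three 63-character labels and one 61-character label.
longest = ".".join(["a" * 63, "a" * 63, "a" * 63, "a" * 61])
print(len(longest), wire_length(longest))  # 253 255
print(len("example.com"), wire_length("example.com"))  # 11 13
```

Adding a single character to the last label pushes the wire form to 256 octets, which is why the text form is capped at 253.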

score 2 · Accepted Answer

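I like the thoroughness of Tim Pietzcker's answer, but I prefer to offload some of the logic from the regular expression for readability. Honestly, I had to look up the meaning of those (? "extension notation" parts. Additionally, I feel the "double negative" approach is more obvious, because it limits the responsibility of the regular expression to just finding any invalid character. I do like that re.IGNORECASE allows the regex to be shortened.

So here's another shot; it's longer, but it reads somewhat like prose. I suppose "readable" is somewhat at odds with "concise". I believe all of the validation constraints mentioned in the thread so far are covered: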


import re

def isValidHostname(hostname):
    if len(hostname) > 255:
        return False
    if hostname.endswith("."): # A single trailing dot is legal
        hostname = hostname[:-1] # strip exactly one dot from the right, if present
    disallowed = re.compile(r"[^A-Z\d-]", re.IGNORECASE)
    return all( # Split by labels and verify individually
        (label and len(label) <= 63 # length is within proper range
         and not label.startswith("-") and not label.endswith("-") # no bordering hyphens
         and not disallowed.search(label)) # contains only legal characters
        for label in hostname.split("."))

score 1 · Accepted Answer

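def is_valid_host(host):
    '''IDN compatible domain validator'''
    try:
        # In Python 3, encode() returns bytes; decode back to str for the regex
        host = host.encode('idna').decode('ascii').lower()
    except UnicodeError:
        return False  # the idna codec rejects empty or over-long labels
    if not hasattr(is_valid_host, '_re'):
        import re
        is_valid_host._re = re.compile(r'^([0-9a-z][-\w]*[0-9a-z]\.)+[a-z0-9\-]{2,15}$')
    return bool(is_valid_host._re.match(host))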
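To illustrate what the `idna` codec step does before the regex runs, here is a small sketch using only Python's built-in codec. The helper name `to_ascii` and the sample names are assumptions for illustration. Note that the built-in codec implements the older IDNA 2003 rules.

```python
def to_ascii(name):
    """Return the ASCII (punycode) form of `name`, or None if the codec rejects it."""
    try:
        return name.encode("idna").decode("ascii")
    except UnicodeError:
        # raised, for example, for an empty label or a label of 64+ characters
        return None

for name in ["example.com", "bücher.example", "a..b"]:
    print(repr(name), "->", repr(to_ascii(name)))
```

A non-ASCII label such as "bücher" becomes "xn--bcher-kva", which is why a validator applied after this step has to allow the double hyphen of the "xn--" prefix.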

score 1 · Accepted Answer

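Complementary to @TimPietzcker's answer. The underscore is a valid hostname character (but not for domain names). A double dash is commonly found in IDN punycode domains (e.g. xn--). The port number should be stripped. Here is a cleaned-up version of the code.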

import re

def is_valid_hostname(hostname):
    if len(hostname) > 255:
        return False
    hostname = hostname.rstrip(".")  # note: strips every trailing dot, not just one
    allowed = re.compile(r"(?!-)[A-Z\d_-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(x) for x in hostname.split("."))

# Convert your Unicode hostname to punycode (Python 3)
# and remove the port number from the hostname
hostname = "example.com:8080"  # sample input for illustration
normalise_host = hostname.encode("idna").decode().split(":")[0]
is_valid_hostname(normalise_host)
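Splitting on ":" works for simple "host:port" strings but breaks on bracketed IPv6 literals. As an alternative, here is a minimal sketch that lets the standard library's `urllib.parse.urlsplit` separate host and port. The helper name `split_host_port` and the sample inputs are assumptions for illustration, not part of the answer above.

```python
from urllib.parse import urlsplit

def split_host_port(netloc):
    """Split 'host[:port]' into (host, port); port is None when absent."""
    # Prefixing "//" makes urlsplit treat the input as a network location.
    parts = urlsplit("//" + netloc)
    return parts.hostname, parts.port

print(split_host_port("example.com:8080"))
print(split_host_port("example.com"))
print(split_host_port("[2001:db8::1]:443"))
```

`hostname` is returned in lower case and without the IPv6 brackets, and accessing `port` raises `ValueError` if the port is not a valid number, so callers may want to catch that.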

score -1 · Accepted Answer

Process each DNS label individually by excluding invalid characters and ensuring non-zero length.

import re

def isValidHostname(hostname):
    disallowed = re.compile(r"[^a-zA-Z\d\-]")
    return all(map(lambda x: len(x) and not disallowed.search(x), hostname.split(".")))

score -1 · Accepted Answer


I think this regex might help in Python: '^([a-zA-Z0-9]+(\.|\-))*[a-zA-Z0-9]+$'

answered 2019-08-09T11:04:29.240
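To see what this pattern accepts and rejects, here is a short sketch that runs it against a few made-up names. The sample inputs are my own. It suggests the pattern has no per-label length limit and rejects consecutive hyphens, which also excludes punycode names starting with "xn--".

```python
import re

# The pattern from the answer above, as a raw string.
pattern = re.compile(r'^([a-zA-Z0-9]+(\.|\-))*[a-zA-Z0-9]+$')

samples = [
    "example.com",
    "my-host.example.com",
    "-bad.com",               # leading hyphen
    "a..b",                   # empty label
    "xn--bcher-kva.example",  # punycode label with a double hyphen
    "a" * 64 + ".com",        # label longer than 63 characters
]
for name in samples:
    print(repr(name), bool(pattern.match(name)))
```

The first two names match, the next three do not, and the over-long label is accepted because the pattern never counts characters.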

score -3 · Accepted Answer

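If you are looking to validate the name of an existing host, the best way is to try to resolve it. You will never write a regular expression that provides that level of validation.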
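A minimal sketch of the "just try to resolve it" approach, using the standard library's `socket.getaddrinfo`. The function name `can_resolve` is hypothetical. The result depends on the local resolver and network, so a False here means "did not resolve right now", not "syntactically invalid".

```python
import socket

def can_resolve(hostname):
    """Return True if the local resolver finds at least one address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the lookup failed; UnicodeError: the name could not be IDNA-encoded
        return False
    return True

# ".invalid" is a reserved TLD that is not expected to resolve.
print(can_resolve("host.invalid"))
```

This performs a real lookup and can block for the resolver's timeout, so it fits validating a host you are about to contact better than bulk input validation.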
