python - Mypy Python 2: insist on unicode values rather than string values

Question

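Python 2 will implicitly convert str to unicode in some cases. Depending on what you try to do with the resulting value, this conversion will sometimes throw a UnicodeError. I don't know the exact semantics, but it's something I'd like to avoid.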
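To make the failure mode concrete: when a Python 2 str meets a unicode value, the str is decoded with the default encoding, which is ASCII unless it has been changed. The sketch below performs that decoding step explicitly so it also runs on Python 3, where the conversion never happens implicitly. The helper name implicit_py2_coercion is hypothetical and only illustrates the step.

```python
def implicit_py2_coercion(byte_string):
    """Mimic the decode Python 2 applies when a str (bytes) is mixed
    with a unicode value: decode using the default ASCII codec."""
    return byte_string.decode("ascii")


# Pure-ASCII bytes decode cleanly, so the conversion goes unnoticed.
ascii_result = implicit_py2_coercion(b"abc")

# Non-ASCII bytes (here the UTF-8 encoding of "café") make the same
# step raise UnicodeDecodeError, a subclass of UnicodeError.
try:
    implicit_py2_coercion(b"caf\xc3\xa9")
    raised = False
except UnicodeDecodeError:
    raised = True
```

This is why the error only shows up for some inputs: the code path is the same, but only non-ASCII data triggers it.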

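Is it possible to use a type other than unicode, or a command-line argument similar to --strict-optional (http://mypy-lang.blogspot.co.uk/2016/07/mypy-043-released.html), so that programs relying on this implicit conversion fail to type check?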

def returns_string_not_unicode():
    # type: () -> str
    return u"a"

def returns_unicode_not_string():
    # type: () -> unicode
    return "a"

In this example, only the function returns_string_not_unicode fails to type check.

$ mypy --py2 unicode.py
unicode.py: note: In function "returns_string_not_unicode":
unicode.py:3: error: Incompatible return value type (got "unicode", expected "str")

I would like both of them to fail to type check.

Edit:

type: () -> bytes seems to be treated the same way as str

def returns_string_not_unicode():
    # type: () -> bytes
    return u"a"

score 3 · Accepted Answer

This is, unfortunately, an ongoing and currently unresolved issue -- see https://github.com/python/mypy/issues/1141 and https://github.com/python/typing/issues/208.

A partial fix is to use typing.Text which is (unfortunately) currently undocumented (I'll work on fixing that though). It's aliased to str in Python 3 and to unicode in Python 2. It won't resolve your actual issue or cause the second function to fail to typecheck, but it does make it a bit easier to write types compatible with both Python 2 and Python 3.
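As a minimal sketch of how typing.Text reads in practice, the function below is annotated with a type comment so the same source is valid on both Python 2 and Python 3. The function name greet is hypothetical.

```python
from typing import Text


def greet(name):
    # type: (Text) -> Text
    # Text means unicode under Python 2 and str under Python 3, so this
    # signature says "text, not bytes" on either interpreter.
    return u"Hello, " + name


greeting = greet(u"world")
```

As noted above, this does not make mypy reject greet("world") in Python 2 mode; it only keeps the annotation portable.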

In the meantime, you can hack together a partial workaround by using the recently-implemented NewType feature -- it lets you define a pseudo-subclass with minimal runtime cost, which you can use to approximate the functionality you're looking for:

from typing import NewType, Text

# Tell mypy to treat 'Unicode' as a subtype of `Text`, which is
# aliased to 'unicode' in Python 2 and 'str' (aka unicode) in Python 3
Unicode = NewType('Unicode', Text)

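# A type comment is used instead of inline annotations so that the
# file is also valid Python 2 syntax.
def unicode_not_str(a):
    # type: (Unicode) -> Unicode
    return a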

# my_unicode is still the original string at runtime, but Mypy
# treats it as having a distinct type from `str` and `unicode`.
my_unicode = Unicode(u"some string")

unicode_not_str(my_unicode)      # typechecks
unicode_not_str("foo")           # fails
unicode_not_str(u"foo")          # fails, unfortunately
unicode_not_str(Unicode("bar"))  # works, unfortunately

It's not perfect, but if you're principled about when you promote a string to your custom Unicode type, you can approximate the type safety you're looking for with minimal runtime cost until the bytes/str/unicode issue is settled.
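One way to stay principled, sketched here under my own assumptions, is to funnel every promotion through a single helper that decodes bytes explicitly, so the rest of the code never calls Unicode(...) directly. The helper name to_unicode and the UTF-8 default are hypothetical choices, not part of mypy or typing.

```python
from typing import NewType, Text, Union

Unicode = NewType('Unicode', Text)


def to_unicode(value, encoding="utf-8"):
    # type: (Union[bytes, Text], str) -> Unicode
    """The single place where a raw string becomes a Unicode value."""
    if isinstance(value, bytes):
        # Decode explicitly with a named encoding instead of relying on
        # Python 2's implicit ASCII conversion.
        return Unicode(value.decode(encoding))
    return Unicode(value)


from_bytes = to_unicode(b"caf\xc3\xa9")
from_text = to_unicode(u"caf\xe9")
```

Under Python 2, bytes is an alias for str, so a plain "foo" literal takes the decoding branch; under Python 3 it is already text and passes straight through.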

~~Note that you'll need to install mypy from the master branch on Github to use NewType.~~

Note that NewType was added as of mypy version 0.4.4.
