python - Python print isn't using __repr__, __unicode__ or __str__ for my unicode subclass?

Question

Python's print isn't using __repr__, __unicode__ or __str__ for my unicode subclass when printing. Any clues as to what I am doing wrong?

Here is my code:

Using Python 2.5.2 (r252:60911, Oct 13 2009, 14:11:59)

>>> class MyUni(unicode):
...     def __repr__(self):
...         return "__repr__"
...     def __unicode__(self):
...         return unicode("__unicode__")
...     def __str__(self):
...         return str("__str__")
...
>>> s = MyUni("HI")
>>> s
__repr__
>>> print s
HI

I'm not sure if this is an accurate approximation of the above, but just for comparison:

>>> class MyUni(object):
...     def __new__(cls, s):
...         return super(MyUni, cls).__new__(cls)
...     def __repr__(self):
...         return "__repr__"
...     def __unicode__(self):
...         return unicode("__unicode__")
...     def __str__(self):
...         return str("__str__")
...
>>> s = MyUni("HI")
>>> s
__repr__
>>> print s
__str__

[EDITED...] It sounds like the best way to get a string object that passes isinstance(instance, basestring), offers control over the unicode return value, and has a unicode-style repr is...

>>> class UserUnicode(str):
...     def __repr__(self):
...         return "u'%s'" % super(UserUnicode, self).__str__()
...     def __str__(self):
...         return super(UserUnicode, self).__str__()
...     def __unicode__(self):
...         return unicode(super(UserUnicode, self).__str__())
...
>>> s = UserUnicode("HI")
>>> s
u'HI'
>>> print s
HI
>>> len(s)
2

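The __str__ and __repr__ above add nothing to this example, but the idea is to show the pattern explicitly so it can be extended as needed.

Just to prove that this pattern grants control: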

>>> class UserUnicode(str):
...     def __repr__(self):
...         return "u'%s'" % "__repr__"
...     def __str__(self):
...         return "__str__"
...     def __unicode__(self):
...         return unicode("__unicode__")
... 
>>> s = UserUnicode("HI")
>>> s
u'__repr__'
>>> print s
__str__

Thoughts?

score 10 · Accepted Answer

The problem is that print doesn't respect __str__ on unicode subclasses.

From PyFile_WriteObject, which print uses:

int
PyFile_WriteObject(PyObject *v, PyObject *f, int flags)
{
...
    if ((flags & Py_PRINT_RAW) &&
        PyUnicode_Check(v) && enc != Py_None) {
        char *cenc = PyString_AS_STRING(enc);
        char *errors = fobj->f_errors == Py_None ?
          "strict" : PyString_AS_STRING(fobj->f_errors);
        value = PyUnicode_AsEncodedString(v, cenc, errors);
        if (value == NULL)
            return -1;

PyUnicode_Check(v) returns true if v's type is unicode or a subclass of it. This code therefore encodes and writes unicode objects directly, without consulting __str__.
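To make the branch above easier to follow, here is a rough Python model of it. This is only a sketch under assumptions: model_print_payload, MyUni and Plain are made-up names, the Py_PRINT_RAW flag is ignored, and text_type is an alias I introduce so the same code runs on Python 2 (unicode) and Python 3 (str). It is not the real implementation, just the shape of the decision.

```python
# Rough model of the quoted C branch; an illustration, not CPython's code.
text_type = type(u"")  # unicode on Python 2, str on Python 3


def model_print_payload(v, encoding):
    """Return what this model would hand to the file object for v."""
    if isinstance(v, text_type) and encoding is not None:
        # Like PyUnicode_Check(v): subclasses pass the check too, and the
        # raw character data is encoded without ever calling v.__str__().
        return text_type.encode(v, encoding)
    # Anything else goes through the ordinary str() conversion.
    return str(v)


class MyUni(text_type):
    def __str__(self):
        return "__str__"


class Plain(object):
    def __str__(self):
        return "__str__"


payload_uni = model_print_payload(MyUni(u"HI"), "utf-8")    # override skipped
payload_plain = model_print_payload(Plain(), "utf-8")       # override used
```

In this model the unicode subclass yields the encoded bytes of "HI", while the plain object yields whatever its __str__ returns, which matches the difference seen in the question.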

Note that subclassing str and overriding __str__ works as expected:

>>> class mystr(str):
...     def __str__(self): return "str"
...     def __repr__(self): return "repr"
... 
>>> print mystr()
str

as does calling str or unicode explicitly:

>>> class myuni(unicode):
...     def __str__(self): return "str"
...     def __repr__(self): return "repr"
...     def __unicode__(self): return "unicode"
... 
>>> print myuni()

>>> str(myuni())
'str'
>>> unicode(myuni())
u'unicode'

I believe this could be construed as a bug in Python as currently implemented.
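Since the explicit conversions do honor the overrides, one possible workaround is to convert before printing instead of handing the subclass instance to print directly. A minimal sketch, assuming that is acceptable at the call site; shown is a hypothetical helper name, and text_type is an alias I introduce so the snippet runs on both Python 2 and 3:

```python
from __future__ import print_function  # same print syntax on Python 2 and 3

text_type = type(u"")  # unicode on Python 2, str on Python 3


class myuni(text_type):
    def __str__(self):
        return "str"


def shown(obj):
    """Hypothetical helper: force the str() conversion explicitly."""
    return str(obj)


result = shown(myuni())
print(result)
```

The value passed to print is now an ordinary string produced by __str__, so the unicode fast path in PyFile_WriteObject no longer sees the subclass instance.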

score 6 · Accepted Answer

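You are subclassing unicode.

It'll never call __unicode__ because the object already is unicode. What happens here instead is that the object is encoded to the stdout encoding: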

>>> s.encode('utf8')
'HI'

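except that it'll use direct C calls instead of the .encode() method. This is the default behaviour of print for unicode objects.

The print statement calls PyFile_WriteObject, which in turn calls PyUnicode_AsEncodedString when handling a unicode object. The latter then defers to an encoding function for the current encoding, and these use the Unicode C macros to access the data structures directly. You cannot intercept this from Python.

What you are looking for is an __encode__ hook, I guess. Since this is already a unicode subclass, print only needs to encode it; it does not need to convert it to unicode again, nor can it turn it into a string without encoding it explicitly. You'll have to take this up with the Python core developers to see whether an __encode__ hook makes sense.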
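If full control over what print shows matters more than being a real unicode instance, one option is to wrap the text instead of inheriting from it, much like the object-based comparison in the question. A sketch under assumptions: WrappedText is a hypothetical class, text_type is an alias I introduce so the code runs on Python 2 and 3, and the trade-off is that isinstance(w, basestring) is no longer true.

```python
from __future__ import print_function  # same print syntax on Python 2 and 3

text_type = type(u"")  # unicode on Python 2, str on Python 3


class WrappedText(object):
    """Hypothetical wrapper that holds text rather than subclassing it."""

    def __init__(self, data):
        self._data = text_type(data)

    def __len__(self):
        return len(self._data)

    def __unicode__(self):
        # Only consulted by unicode() on Python 2.
        return self._data

    def __str__(self):
        # Not a unicode subclass, so print falls back to the ordinary
        # str() conversion and this method is used.
        return "__str__"


w = WrappedText("HI")
print(w)
```

Any string methods you still need would have to be delegated to the wrapped value by hand, which is the price of stepping outside the unicode type.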
