python - Python 2.7 - IPython 'raw_input' and appending to a list - a 'u' is added before each item

Question

I am using Python 2.7 on Mac OS X Lion. I am using IPython with the Pandas 0.11.0, Numpy, and Statsmodels packages.

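I am writing a function that lets the user run a logistic regression on a file, specifying which variables to use when building the model, which variables should be converted to dummy variables, and which variable should be the dependent variable.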

For example, when I do the following:

cols_to_keep = []
print (df.columns)
i = eval(raw_input('How many of these variables would you like to use in logistic regression?: '))
while i != 0:
    i = i - 1
    print (df.columns)
    addTo = raw_input('Enter a variable for this list that you would like to keep and use in logistic regression.: ')
    cols_to_keep.append(addTo)

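I end up running into problems further down the line, specifically when I ask the user to pick the dependent variable from a list and then need to take that variable out of the list of training variables: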

print (df.columns)

dependent = raw_input('Which of these columns would you like to be the dependent variable?: ')
training.remove(dependent)

After inserting a print statement, I found that the variables added to the list of training variables look like this:

('these are the traing variables: ', ['access', u'age_age6574', u'age_age75plus', u'sex_male', u'stage_late', u'death_death'])

It seems that a u has been placed in front of each user-specified variable.

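My question is: why does this happen, and how can I fix or work around it so that when the user specifies the dependent variable, it is actually removed from the list? The same thing happens everywhere else the user specifies a variable that is added to a list, which is confusing if I need the user to look at the list.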

score 3 · Accepted Answer

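These are just unicode strings rather than byte strings. Nothing is wrong, and the contents of the strings are unaffected. The u'text' form is only there so that, when you look at the repr, you can tell byte strings and unicode strings apart in Python 2. If you print the string, you will not see any difference. This is reversed in Python 3, where "text" denotes a unicode string and b"bytes" denotes a byte string.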
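To illustrate that the u prefix is not part of the string's contents, here is a minimal sketch. It assumes ASCII-only column names; the list below is a shortened, hand-written stand-in for the asker's training list, and the dependent value is typed in directly rather than read with raw_input.

```python
# Shortened stand-in for the asker's list: a mix of byte and unicode strings.
training = ['access', u'age_age6574', u'sex_male', u'death_death']

# Stand-in for the name the user would type at the prompt.
dependent = 'death_death'

# Equality compares the text itself; the u prefix only appears in the repr.
print(u'death_death' == dependent)  # True

# list.remove() relies on that same equality, so the item is found and removed.
training.remove(dependent)
print(dependent in training)  # False

# Printing each string on its own shows no prefix.
for name in training:
    print(name)
```

If remove() raises ValueError in practice, the usual cause is that the typed name does not match an item exactly (for instance stray whitespace or a different spelling), not the u prefix.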

If you really want to force them to byte strings (unlikely to be necessary), you can do this:

def ensure_str(s):
    if isinstance(s, unicode):
        s = s.encode('utf-8')
    return s

s = ensure_str(raw_input("prompt >"))
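If the only concern is that the prefix confuses users when the list is shown to them, converting is probably not needed. One possible approach, sketched below, is to join the names into a single line before printing, since printing a string (rather than the list's repr) shows only its text. The helper name format_columns and the sample list are hypothetical, and ASCII-only names are assumed.

```python
def format_columns(columns):
    """Return the column names as one comma-separated line of plain text."""
    # join() uses each string's text, so no quotes or u prefix appear.
    return ', '.join(columns)


# Hypothetical sample mixing byte and unicode strings.
cols = ['access', u'age_age6574', u'sex_male']
print(format_columns(cols))  # access, age_age6574, sex_male
```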
