python - How do I encode/decode this file in Python?

Question

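I'm planning to make a small Python game that prints a random key (English) from a dictionary, and the user has to type the value (German). If the value is correct, it prints "correct" and continues. If the value is wrong, it prints "wrong" and breaks.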

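I thought this would be an easy job, but I got stuck along the way. My problem is that I don't know how to print the German characters. Say I have a file "dictionary.txt" with this text: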

cat:Katze
dog:Hund
exercise:Übung
solve:lösen
door:Tür
cheese:Käse

I have this code just to test what the output looks like:

# -*- coding: UTF-8 -*-
words = {} # empty dictionary
with open('dictionary.txt') as my_file:
  for line in my_file.readlines():
    if len(line.strip())>0: # ignoring blank lines
      elem = line.split(':') # split on ":"
      words[elem[0]] = elem[1].strip() # appending elements to dictionary
print words

Obviously the printed result is not what I expected:

    {'cheese': 'K\xc3\xa4se', 'door': 'T\xc3\xbcr',
     'dog': 'Hund', 'cat': 'Katze', 'solve': 'l\xc3\xb6sen',
     'exercise': '\xc3\x9cbung'}

So where do I add the encoding, and how do I do it?

Thank you!

score 5 · Accepted Answer

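You are looking at byte string values, printed as repr() results because they are contained in a dictionary. A string representation can be reused as a Python string literal, and it shows non-printable and non-ASCII characters as string escape sequences. Container values are always represented with repr() to ease debugging.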
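To see the container behaviour in isolation, here is a minimal sketch. It assumes Python 3, where `bytes` plays the role of the Python 2 byte string from the question; the variable names are only illustrative.

```python
# Python 3 sketch: bytes stand in for the Python 2 byte strings in the question.
value = 'K\u00e4se'.encode('utf-8')   # the UTF-8 bytes for "Käse"
words = {'cheese': value}

# Converting a container to text formats each item with repr(),
# so the non-ASCII bytes show up as \x escape sequences.
container_text = str(words)
item_text = repr(value)

print(container_text)
print(item_text)
```

The escapes are only a display form; the bytes inside the dictionary are unchanged.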

Thus, the string 'K\xc3\xa4se' contains two non-ASCII bytes with hex values C3 and A4, which together are the UTF-8 encoding of the U+00E4 code point.
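The byte pair can be checked directly. This is a small sketch that assumes Python 3 syntax (iterating over `bytes` yields integers there); the names are illustrative.

```python
# Encode U+00E4 (ä) as UTF-8 and look at the individual bytes.
encoded = '\u00e4'.encode('utf-8')
hex_bytes = [format(b, '02X') for b in encoded]
print(hex_bytes)

# Decoding the byte string from the question gives four characters, not five bytes.
decoded = b'K\xc3\xa4se'.decode('utf-8')
print(len(b'K\xc3\xa4se'), len(decoded))
```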

You should decode the values to unicode objects:

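words = {}  # empty dictionary, as in the question
with open('dictionary.txt') as my_file:
    for line in my_file:   # just loop over the file
        if line.strip(): # ignoring blank lines
            # Python 2: decode the byte string to unicode before splitting
            key, value = line.decode('utf8').strip().split(':')
            words[key] = value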

Or better still, use codecs.open() to decode the file as you read it:

import codecs

words = {}  # empty dictionary, as in the question
with codecs.open('dictionary.txt', 'r', 'utf8') as my_file:
    for line in my_file:   # lines are already decoded to unicode
        if line.strip(): # ignoring blank lines
            key, value = line.strip().split(':')
            words[key] = value
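If you are on Python 3, or on Python 2.6+ where `io.open()` is available, the same idea can be sketched with the `encoding` argument. The function name `load_words` is hypothetical, and splitting on the first colon only is an extra assumption about the file format.

```python
import io

def load_words(path, encoding='utf-8'):
    """Read 'english:german' lines into a dict of decoded text."""
    words = {}
    # io.open decodes each line while reading, so no .decode() call is needed.
    with io.open(path, 'r', encoding=encoding) as my_file:
        for line in my_file:
            line = line.strip()
            if not line:                     # ignoring blank lines
                continue
            key, value = line.split(':', 1)  # split on the first ":" only
            words[key] = value
    return words
```

Calling `load_words('dictionary.txt')` would then return a dictionary of text values rather than byte strings.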

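Printing the resulting dictionary will still use repr() results for the contents, so now you'll see u'cheese': u'K\xe4se' instead, because \xe4 is the escape code for Unicode code point 00E4, the ä character. Print individual words if you want the actual characters written to the terminal: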

print words['cheese']

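But now you can compare these values with other data that you have decoded, provided you know its correct encoding, manipulate them, and encode them again to whatever target codec you need. print will do this encoding automatically, for example, when printing unicode values to your terminal.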
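As a rough illustration of comparing and re-encoding, here is a sketch in Python 3 syntax. The helper `is_correct` is hypothetical, and it assumes the user's answer may arrive as raw bytes in a known encoding.

```python
def is_correct(expected, raw_answer, encoding='utf-8'):
    """Compare a decoded dictionary value with a user's answer."""
    # Assumption: answers may arrive as bytes; decode them before comparing.
    if isinstance(raw_answer, bytes):
        raw_answer = raw_answer.decode(encoding)
    return raw_answer.strip() == expected

expected = 'K\u00e4se'   # "Käse" as decoded text

print(is_correct(expected, b'K\xc3\xa4se\n'))                # UTF-8 input
print(is_correct(expected, b'K\xe4se', encoding='latin-1'))  # Latin-1 input

# Encoding again for a different target codec.
latin1_bytes = expected.encode('latin-1')
```

The comparison only works once both sides are decoded text; comparing the UTF-8 bytes with the Latin-1 bytes directly would fail even though they spell the same word.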

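You may want to read up on Unicode and Python:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
The Python Unicode HOWTO
Pragmatic Unicode by Ned Batchelder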

score -2 · Answer

This is how you should do it.

def game(guess, answer):
    # 'guess' replaces the original parameter name 'input',
    # which shadowed the built-in input() function.
    if guess == answer:
        return "You got it!"
    return "sorry, wrong answer"
