python - EDX set 5 - Caesar cipher

Question

I'm working on edX problem set 5, and I've run into a problem in my code:

# 6.00x Problem Set 5
#
# Part 1 - HAIL CAESAR!

import string
import random

WORDLIST_FILENAME = "words.txt"

# -----------------------------------
# Helper code
# (you don't need to understand this helper code)
def loadWords():
    """
    Returns a list of valid words. Words are strings of lowercase letters.

    Depending on the size of the word list, this function may
    take a while to finish.
    """
    print "Loading word list from file..."
    inFile = open(WORDLIST_FILENAME, 'r')
    wordList = inFile.read().split()
    print "  ", len(wordList), "words loaded."
    return wordList

def isWord(wordList, word):
    """
    Determines if word is a valid word.

    wordList: list of words in the dictionary.
    word: a possible word.
    returns True if word is in wordList.

    Example:
    >>> isWord(wordList, 'bat') returns
    True
    >>> isWord(wordList, 'asdf') returns
    False
    """
    word = word.lower()
    word = word.strip(" !@#$%^&*()-_+={}[]|\\:;'<>?,./\"")
    return word in wordList

def randomWord(wordList):
    """
    Returns a random word.

    wordList: list of words  
    returns: a word from wordList at random
    """
    return random.choice(wordList)

def randomString(wordList, n):
    """
    Returns a string containing n random words from wordList

    wordList: list of words
    returns: a string of random words separated by spaces.
    """
    return " ".join([randomWord(wordList) for _ in range(n)])

def randomScrambled(wordList, n):
    """
    Generates a test string by generating an n-word random string
    and encrypting it with a sequence of random shifts.

    wordList: list of words
    n: number of random words to generate and scramble
    returns: a scrambled string of n random words

    NOTE:
    This function will ONLY work once you have completed your
    implementation of applyShifts!
    """
    s = randomString(wordList, n) + " "
    shifts = [(i, random.randint(0, 25)) for i in range(len(s)) if s[i-1] == ' ']
    return applyShifts(s, shifts)[:-1]

def getStoryString():
    """
    Returns a story in encrypted text.
    """
    return open("story.txt", "r").read()


# (end of helper code)
# -----------------------------------


#
# Problem 1: Encryption
#
def buildCoder(shift):
    """
    Returns a dict that can apply a Caesar cipher to a letter.
    The cipher is defined by the shift value. Ignores non-letter characters
    like punctuation, numbers and spaces.

    shift: 0 <= int < 26
    returns: dict
    """
    dict={}
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    for l in range(len(upper)):
        dict[upper[l]] = upper[(l+shift)%len(upper)]
    for l in range(len(lower)):
        dict[lower[l]] = lower[(l+shift)%len(lower)]
    return dict


def applyCoder(text, coder):
    """
    Applies the coder to the text. Returns the encoded text.

    text: string
    coder: dict with mappings of characters to shifted characters
    returns: text after mapping coder chars to original text
    """
    new_text=''
    for l in text:
        if not(l in string.punctuation or l == ' ' or l in str(range(10))):
           new_text += coder[l]
        else:
           new_text += l            
    return new_text   

def applyShift(text, shift):
    """
    Given a text, returns a new text Caesar shifted by the given shift
    offset. Lower case letters should remain lower case, upper case
    letters should remain upper case, and all other punctuation should
    stay as it is.

    text: string to apply the shift to
    shift: amount to shift the text (0 <= int < 26)
    returns: text after being shifted by specified amount.
    """
    ### TODO.
    ### HINT: This is a wrapper function.
    coder=buildCoder(shift)
    return applyCoder(text,coder)

#
# Problem 2: Decryption
#
def findBestShift(wordList, text):
    """
    Finds a shift key that can decrypt the encoded text.

    text: string
    returns: 0 <= int < 26
    """
    ### TODO
    wordsFound=0
    bestShift=0

    for i in range(26):
        currentMatch=0
        encrypted=applyShift(text,i)
        lista=encrypted.split(' ')
        for w in lista: 
            if isWord(wordList,w):
                currentMatch+=1
        if currentMatch>wordsFound:
                currentMatch=wordsFound
                bestShift=i
    return bestShift

def decryptStory():
    """
    Using the methods you created in this problem set,
    decrypt the story given by the function getStoryString().
    Use the functions getStoryString and loadWords to get the
    raw data you need.

    returns: string - story in plain text
    """
    text = getStoryString()
    bestMatch = findBestShift(loadWords(), text)
    return applyShift(text, bestMatch)

#
# Build data structures used for entire session and run encryption
#

if __name__ == '__main__':
    wordList = loadWords()
    decryptStory()

s = 'Pmttw, ewztl!'
print findBestShift(wordList, s)

print decryptStory()

The problem is that the individual functions of the program behave differently from decryptStory. What is wrong with this code?

score 1 · Accepted Answer

Your first problem is that applyCoder doesn't work as written.

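buildCoder builds a dict with entries only for letters. But applyCoder tries to look up anything that is not in string.punctuation, or == ' ', or in str(range(10)). I think you wanted string.digits there (because str(range(10)) is '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'), but it will still blow up if you give it a newline, and a file named story.txt is almost guaranteed to contain one.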
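To see which characters slip past the original filter, here is a small sketch. It is written for Python 3, so it uses str(list(range(10))) to reproduce the text that Python 2's str(range(10)) gives; the helper name passes_filter and the shift-0 coder are made up for illustration and are not part of the question's code.

```python
import string

# In Python 2, str(range(10)) is the repr of a list, not a string of digits.
# list(...) reproduces the same text on Python 3.
digits_repr = str(list(range(10)))   # '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'

def passes_filter(ch):
    """Hypothetical helper: True if the question's applyCoder would do coder[ch]."""
    return not (ch in string.punctuation or ch == ' ' or ch in digits_repr)

# A coder shaped like the one buildCoder returns: letters only (shift 0 here).
coder = {ch: ch for ch in string.ascii_letters}

print(passes_filter('a'))    # True: letters are looked up, as intended
print(passes_filter('7'))    # False: digits are skipped, but only by accident
print(passes_filter('\n'))   # True: a newline is not skipped...

try:
    coder['\n']              # ...so the dict lookup raises
except KeyError as exc:
    print('KeyError:', repr(exc.args[0]))
```

Any character that is neither a letter nor caught by the filter (tabs, newlines, non-ASCII text) ends in the same KeyError.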

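The simple fix is to just check l in string.ascii_uppercase or l in string.ascii_lowercase. But there's an even better fix: instead of trying to come up with a complicated way to express the same filter in reverse, or repeating yourself, just try this: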

for l in text:
    new_text += coder.get(l, l)

This returns coder[l] if l is in the map, and the default value, l, if it is not.
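Put together, the whole of applyCoder can shrink to a couple of lines. The following is one possible sketch, not the course's reference solution: build_coder and apply_coder are renamed stand-ins for the question's buildCoder and applyCoder, and the join-based form replaces the += loop purely as a matter of taste.

```python
import string

def build_coder(shift):
    """Map each ASCII letter to the letter `shift` places later, wrapping around."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, letter in enumerate(alphabet):
            coder[letter] = alphabet[(i + shift) % len(alphabet)]
    return coder

def apply_coder(text, coder):
    """Translate mapped characters; pass everything else through unchanged."""
    return ''.join(coder.get(ch, ch) for ch in text)

# Punctuation, spaces, newlines and digits all survive untouched.
print(apply_coder("Pmttw, ewztl!\n42", build_coder(18)))
```

Because unmapped characters fall through to the default, there is no filter left to keep in sync with buildCoder.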

With that fixed, the function runs and successfully outputs something. But it doesn't output the right thing. Why?

Well, look at this:

if currentMatch>wordsFound:
    currentMatch=wordsFound
    bestShift=i

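So, every time you find a match better than the initial wordsFound value of 0, you... throw away the currentMatch value and leave wordsFound untouched. Surely you want wordsFound = currentMatch, not the other way around, right?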
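With the assignment turned around, the search loop might look like the sketch below. It is self-contained for illustration only: apply_shift and is_word are simplified stand-ins for the question's applyShift and isWord, the two-word word list is invented, and split() is called without an argument (an assumption on my part) so that newlines separate words too.

```python
import string

def apply_shift(text, shift):
    """Caesar-shift ASCII letters; leave every other character alone."""
    coder = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, letter in enumerate(alphabet):
            coder[letter] = alphabet[(i + shift) % 26]
    return ''.join(coder.get(ch, ch) for ch in text)

def is_word(word_list, word):
    """Simplified isWord: lowercase, strip surrounding punctuation, then look up."""
    return word.lower().strip(string.punctuation + ' ') in word_list

def find_best_shift(word_list, text):
    best_count = 0
    best_shift = 0
    for shift in range(26):
        candidate = apply_shift(text, shift)
        count = sum(1 for w in candidate.split() if is_word(word_list, w))
        if count > best_count:
            best_count = count   # remember the new high score
            best_shift = shift
    return best_shift

print(find_best_shift(['hello', 'world'], 'Pmttw, ewztl!'))   # 18
```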

With both of those problems fixed:

$ ln -s /usr/share/dict/words words.txt
$ echo -e "This is a test.\n\nIs it good enough? Let's see.\n" | rot13 > story.txt
$ python caesar.py
Loading word list from file...
   235886 words loaded.
Loading word list from file...
   235886 words loaded.
18
Loading word list from file...
   235886 words loaded.
This is a test. 

Here's some text. Is it enough? Let's see.

So, it's obviously doing some unnecessary repeated work somewhere, but otherwise, it works.

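Learning how to debug problems like this yourself is probably more important than getting the answer to this one problem, so here are some suggestions.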

I found the problem by adding some extra print statements. The important one is here:

if currentMatch>wordsFound:
    print i, currentMatch, wordsFound
    currentMatch=wordsFound
    bestShift=i

You'll see that wordsFound never changes from 0. And even after finding a shift with 18 matches, it picks a shift with 1 match as the best shift. Clearly, something is wrong.
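The effect is easy to reproduce in isolation. In the sketch below the per-shift match counts are invented (shift 18 gets 18 matches and a later shift gets 1), and pick_best is a hypothetical helper that runs the comparison loop either with the original, reversed assignment or with the corrected one, collecting the same three values the print statement shows.

```python
def pick_best(counts, buggy):
    """Return (best_shift, trace) for a list of per-shift match counts."""
    words_found = 0
    best_shift = 0
    trace = []
    for shift, current_match in enumerate(counts):
        if current_match > words_found:
            # Same information as the debugging print statement.
            trace.append((shift, current_match, words_found))
            if buggy:
                current_match = words_found   # original: throws the count away
            else:
                words_found = current_match   # fixed: keeps the high score
            best_shift = shift
    return best_shift, trace

# Invented data: shift 18 matches 18 words, shift 22 matches 1 by coincidence.
counts = [0] * 26
counts[18] = 18
counts[22] = 1

print(pick_best(counts, buggy=True))    # (22, [(18, 18, 0), (22, 1, 0)])
print(pick_best(counts, buggy=False))   # (18, [(18, 18, 0)])
```

In the buggy run the third column stays at 0, so the last shift with any match at all wins.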

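But I didn't know in advance where to put that one print. I added a dozen print lines all over the place. That is the simplest way to debug simple code.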

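For more complicated code, where there are too many places to print, you may want to write to a log file (ideally using logging) that you can parse after the fact. Or, better, use simpler input data and run it in the debugger and/or an interactive visualizer (like this one).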
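As a rough sketch of the logging route (written for Python 3), the snippet below records one line per shift in a file instead of printing. The logger name, the file name caesar_debug.log, the invented match counts and the best_shift_logged helper are all assumptions for illustration; only the logging calls themselves come from the standard library.

```python
import logging
import os
import tempfile

# Hypothetical log location; any writable path would do.
log_path = os.path.join(tempfile.gettempdir(), 'caesar_debug.log')

logger = logging.getLogger('caesar')
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler(log_path, mode='w')
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
logger.addHandler(handler)

def best_shift_logged(counts):
    """Pick the index of the largest count, logging every comparison."""
    words_found, best_shift = 0, 0
    for shift, current_match in enumerate(counts):
        logger.debug('shift=%d current=%d best_so_far=%d',
                     shift, current_match, words_found)
        if current_match > words_found:
            words_found, best_shift = current_match, shift
    return best_shift

counts = [0, 3, 1]   # invented match counts for shifts 0, 1 and 2
print(best_shift_logged(counts))

# Detach and close the handler so the file is flushed before reading it back.
logger.removeHandler(handler)
handler.close()

with open(log_path) as f:
    print(f.read())
```

Each log line has a fixed key=value shape, which keeps it easy to grep or parse later.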

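Or, better still, strip everything away until you find the part that doesn't work. For example, if you know shift 18 should be better than shift 12, try calling applyShift with 12 and with 18 and see what each returns.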
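A stripped-down check along those lines could be as small as the following sketch (Python 3). The apply_shift here is a compact stand-in for the question's applyShift, not the original function, and the sample string is the one from the question.

```python
import string

def apply_shift(text, shift):
    """Compact stand-in for the question's applyShift."""
    shifted = {}
    for alphabet in (string.ascii_uppercase, string.ascii_lowercase):
        for i, letter in enumerate(alphabet):
            shifted[letter] = alphabet[(i + shift) % 26]
    return ''.join(shifted.get(ch, ch) for ch in text)

s = 'Pmttw, ewztl!'
for shift in (12, 18):
    # repr() makes stray whitespace or newlines visible in the output.
    print(shift, repr(apply_shift(s, shift)))
```

Only the shift-18 line reads as English ('Hello, world!'), so if the search still reports a different shift, the bug must be in the comparison logic and not in the shifting.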

Even if these steps don't get you the answer, they will get you a better question to post on SO.
