I'm working on a homework assignment about map-reduce. I run this in one terminal:

ioannis@ioannis-desktop:~$ python hw3.py

Then, in another terminal:

ioannis@ioannis-desktop:~$ ls
a2.py                  la.py~                     stopwords.py
active_output          LTP Crafting Quality Code  stopwords.pyc
Desktop                mincemeat.py               Templates
Documents              mincemeat.pyc              test.py
Downloads              Music                      test.py~
Dropbox                NetBeansProjects           test.pyc
examples.desktop       NotFor                     Ubuntu One
Firefox_wallpaper.png  Pictures                   Videos
hw3.py                 Public                     vmware
hw3.py~                __pycache__                Web Intelligence and Big Data
ioannis@ioannis-desktop:~$ python mincemeat.py -p changeme localhost
error: uncaptured python exception, closing channel <__main__.Client connected localhost:11235 at 0x27748c0> 
(<type 'exceptions.NameError'>:global name 'allStopWords' is not defined 
 [/usr/lib/python2.7/asyncore.py|read|83] 
 [/usr/lib/python2.7/asyncore.py|handle_read_event|449] 
 [/usr/lib/python2.7/asynchat.py|handle_read|140]
 [mincemeat.py|found_terminator|96] 
 [mincemeat.py|process_command|194]
 [mincemeat.py|call_mapfn|170]
 [hw3.py|mapfn|35])
ioannis@ioannis-desktop:~$ 

hw3.py:

import mincemeat
import glob
from stopwords import allStopWords
text_files = glob.glob('/home/ioannis/Web Intelligence and Big Data/Week 3: Load - I/hw3data/hw3data/*')

def file_contents(file_name):
    f = open(file_name)
    try:     
        return f.read()
    except:
        print "exception!!!!!!"
    finally:
        f.close()

source = dict((file_name, file_contents(file_name))
    for file_name in text_files)

def mapfn(key, value):
    for line in value.splitlines():
            ........................
            ........................
            if word in allStopWords:
                continue        
            print(word)
        print(words_title)
        print("\n\n")

def reducefn(k, vs):
    result = sum(vs)
    return result

s = mincemeat.Server()
s.datasource = source
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print results

Why doesn't it work? As you can see, hw3.py and stopwords.py are both in the home directory!

1 Answer

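One potential gotcha when using mincemeat.py: your mapfn and reducefn functions don't have access to their enclosing environment, including imported modules. If you need an imported module inside one of these functions, put the import statement in the function body itself.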

https://github.com/michaelfairley/mincemeatpy#imports
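To see why a module-level import goes missing, here is a rough simulation. It assumes the function reaches the worker as marshaled bytecode that is rebuilt against the worker's own globals, which is my understanding of how mincemeat.py ships mapfn, not something I've verified line by line. The `ship` helper and the use of the `string` module are made up for the demo.

```python
import marshal
import string  # module-level import: lives only in this file's globals
import types


def mapfn(key, value):
    # `string` is looked up in the globals of whichever process runs this.
    return [word for word in value.split() if word not in string.punctuation]


def ship(fn):
    """Hypothetical helper: mimic sending fn to a worker (bytecode only)."""
    payload = marshal.dumps(fn.__code__)
    worker_globals = {"__builtins__": __builtins__}  # note: no `string` here
    return types.FunctionType(marshal.loads(payload), worker_globals, fn.__name__)


remote_mapfn = ship(mapfn)
try:
    remote_mapfn("doc", "hello , world")
except NameError as exc:
    # Same kind of failure as in the question's traceback.
    error_message = str(exc)
```

The rebuilt function carries its code but none of the names that hw3.py imported, so the lookup fails on the client even though stopwords.py sits right next to it.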

IOW: move the from stopwords import allStopWords statement to the top of the mapfn function.
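Something along these lines, as a sketch only. The question elides the real tokenization, so the `line.lower().split()` loop is a stand-in, and yielding `(word, 1)` pairs is my assumption based on reducefn summing its values.

```python
def mapfn(key, value):
    # Import inside the function so it also runs on the worker process.
    from stopwords import allStopWords

    for line in value.splitlines():
        # Stand-in tokenization; the original loop body is not shown.
        for word in line.lower().split():
            if word in allStopWords:
                continue
            yield word, 1
```

The client still has to be able to import stopwords, so start mincemeat.py from a directory where stopwords.py is on the path (your home directory, in this case).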

answered 2013-04-29T11:08:09.993