I am trying to use mrjob in a Python file and run it from the command line, but I keep getting the following error log:
C:\Users\Ni\Desktop>python si601lab6_sol.py pg1268.txt
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory c:\users\ni\appdata\local\temp\si601lab6_sol.Ni.20131019.052147.951000
writing to c:\users\ni\appdata\local\temp\si601lab6_sol.Ni.20131019.052147.951000\step-0-mapper_part-00000
Traceback (most recent call last):
  File "si601lab6_sol.py", line 29, in <module>
    BiGramFreqCount.run()
  File "C:\Python27\lib\site-packages\mrjob\job.py", line 500, in run
    mr_job.execute()
  File "C:\Python27\lib\site-packages\mrjob\job.py", line 518, in execute
    super(MRJob, self).execute()
  File "C:\Python27\lib\site-packages\mrjob\launch.py", line 146, in execute
    self.run_job()
  File "C:\Python27\lib\site-packages\mrjob\launch.py", line 207, in run_job
    runner.run()
  File "C:\Python27\lib\site-packages\mrjob\runner.py", line 458, in run
    self._run()
  File "C:\Python27\lib\site-packages\mrjob\sim.py", line 182, in _run
    self._invoke_step(step_num, 'mapper')
  File "C:\Python27\lib\site-packages\mrjob\sim.py", line 269, in _invoke_step
    working_dir, env)
  File "C:\Python27\lib\site-packages\mrjob\inline.py", line 155, in _run_step
    os.chdir(working_dir)
WindowsError: [Error 5] Access is denied: 'c:\\users\\ni\\appdata\\local\\temp\\si601lab6_sol.Ni.20131019.052147.951000\\job_local_dir\\0\\mapper\\0'
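The traceback ends in `os.chdir(working_dir)`, so one way to narrow this down might be a standalone script that builds a similar nested path under the temp directory and changes into it, with no mrjob involved. This is only a sketch: the function name `can_chdir_into_nested_temp_dir` and the `chdir_check.` prefix are made up for illustration, and the directory layout simply mirrors the path in the error message, not necessarily what mrjob does internally.

```python
import os
import shutil
import tempfile


def can_chdir_into_nested_temp_dir():
    """Create a nested directory under the temp dir and try to chdir into it.

    Hypothetical helper: the nested layout below only imitates the path
    shown in the error message.
    """
    original_cwd = os.getcwd()
    base = tempfile.mkdtemp(prefix="chdir_check.")
    nested = os.path.join(base, "job_local_dir", "0", "mapper", "0")
    try:
        os.makedirs(nested)
        os.chdir(nested)
        return True
    except OSError as exc:  # WindowsError is a subclass of OSError
        print("failed: %s" % exc)
        return False
    finally:
        # Go back before cleaning up, otherwise the directory is still in use.
        os.chdir(original_cwd)
        shutil.rmtree(base, ignore_errors=True)


if __name__ == "__main__":
    print(can_chdir_into_nested_temp_dir())
```

If this also reports an access error on the same machine, the cause is probably the permissions on the temp directory and not the job code.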
The Python file is really simple:
#!/usr/bin/python
# python si-601-lab-5.py input.txt
# Some code used courtesy of Dr. Yuhang Wang.
from mrjob.job import MRJob
import re


class BiGramFreqCount(MRJob):

    ### input: self, in_key, in_value
    def mapper(self, _, line):
        # Split once so the word list can be reused below.
        words = line.split()
        yield "chars", len(line)
        yield "words", len(words)
        yield "lines", 1
        ### Task2: replace this part and use bigram (this word and its next word) as the key
        ### and skip the last word in each line
        print('== mapper output ==')
        for word in words:
            print([word, 1])
            yield (word, 1)
        print('')

    ### input: self, in_key from mapper, in_value from mapper
    def reducer(self, key, values):
        yield (key, sum(values))


if __name__ == '__main__':
    BiGramFreqCount.run()
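For context, the bigram keying that the Task2 comment describes (each word paired with the next one, with the last word of a line never starting a pair) could look roughly like this in plain Python, without mrjob. The helper names `bigrams` and `count_bigrams` and the space-joined key format are assumptions for illustration, not part of the lab code.

```python
from collections import Counter


def bigrams(line):
    """Return (word, next_word) pairs for one line."""
    words = line.split()
    # zip stops at the shorter sequence, so the last word never starts a pair.
    return list(zip(words, words[1:]))


def count_bigrams(lines):
    """Count bigrams over several lines; bigrams never span two lines."""
    counts = Counter()
    for line in lines:
        for first, second in bigrams(line):
            # Assumed key format: the two words joined by a single space.
            counts[first + " " + second] += 1
    return counts


if __name__ == "__main__":
    print(count_bigrams(["a b a b", "a b"]))
```

In an mrjob mapper the same pairs would presumably be yielded as keys with a value of 1, leaving the summing to the reducer.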
I have been trying to fix this for hours without success... I hope someone can help. Thanks!