I have the same setup and code for running simhash on a Mac, and it works there.
But when I run it on Ubuntu, it fails with an error that points into the simhash implementation itself.
Have you encountered this problem?
    objs = [(str(k), Simhash(v)) for k, v in index_data.items()]
  File "/usr/local/lib/python2.7/dist-packages/simhash-1.1.2-py2.7.egg/simhash/__init__.py", line 30, in __init__
    self.build_by_text(unicode(value))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 34: ordinal not in range(128)
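
For what it's worth, the traceback suggests that `unicode(value)` is being called on a byte string containing a non-ASCII byte (0xf6), which Python 2 tries to decode with the default `ascii` codec. A possible workaround is to decode the values myself before handing them to `Simhash`. The sketch below is only an illustration: `to_text` is a hypothetical helper, not part of simhash, and `latin-1` is an assumed encoding (0xf6 is "ö" in Latin-1, but I have not confirmed what encoding the data actually uses).

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding="latin-1"):
    """Hypothetical helper: decode byte strings explicitly, pass text through.

    The default encoding is an assumption; replace it with the real
    encoding of the input data.
    """
    if isinstance(value, bytes):  # in Python 2, bytes is an alias for str
        return value.decode(encoding)
    return value


raw = b"sch\xf6n"  # sample byte string containing the offending byte 0xf6

# Decoding as ASCII fails, which is what the implicit conversion runs into.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc)

# Decoding with an explicit encoding succeeds and yields a text string.
text = to_text(raw)
print(repr(text))
```

With that in place, the failing line could become `objs = [(str(k), Simhash(to_text(v))) for k, v in index_data.items()]`, so that `Simhash` receives text that is already decoded.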