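When preprocessing data with Apache Beam IO, the snappy library is a good compression module, but the file conversion doesn't seem to work because it can't find the crc32c function in the library! I'm using snappy-0.5.2.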

The error looks like this:

INFO:tensorflow:Saver not created because there are no variables in the graph to restore
ERROR:root:Exception at bundle <apache_beam.runners.direct.bundle_factory._Bundle object at 0x7f1dd1d60e50>, due to an exception.
 Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/executor.py", line 312, in call
    side_input_values)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/executor.py", line 347, in attempt_call
    evaluator.process_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/transform_evaluator.py", line 551, in process_element
    self.runner.process(element)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 390, in process
    self._reraise_augmented(exn)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 388, in process
    self.do_fn_invoker.invoke_process(windowed_value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 281, in invoke_process
    self._invoke_per_window(windowed_value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 307, in _invoke_per_window
    windowed_value, self.process_method(*args_for_process))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/typehints/typecheck.py", line 63, in process
    return self.wrapper(self.dofn.process, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/typehints/typecheck.py", line 81, in wrapper
    result = method(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/iobase.py", line 965, in process
    self.writer.write(element)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsink.py", line 299, in write
    self.sink.write_record(self.temp_handle, value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsink.py", line 129, in write_record
    self.write_encoded_record(file_handle, self.coder.encode(value))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 235, in write_encoded_record
    _TFRecordUtil.write_record(file_handle, value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 97, in write_record
    struct.pack('<I', cls._masked_crc32c(encoded_length)),  #
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 77, in _masked_crc32c
    crc = crc32c_fn(value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 43, in _default_crc32c_fn
    _default_crc32c_fn.fn = snappy._crc32c  # pylint: disable=protected-access
AttributeError: 'module' object has no attribute '_crc32c' [while running 'WriteTrainData/Write/WriteImpl/WriteBundles']

I'd appreciate it if anyone could help me use snappy correctly with tensorflow! Thanks.

1 Answer

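I just ran into this problem; I think it's due to Beam being a bit careless about the versions of its optional test dependencies (in this case tensorflow and python-snappy).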

The offending code:

import snappy
snappy._crc32c

works with python-snappy version 0.5.1 but not with 0.5.2 (the latest version).
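To confirm which case applies on a given machine, the installed module can be probed for the private attribute. This is only a minimal sketch: the helper name `module_has_attr` is hypothetical, and it reports nothing beyond whether the attribute exists.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True if the module imports and exposes attr_name, else False."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module could not be imported at all.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # Beam's tfrecordio.py looks up snappy._crc32c (see the traceback above).
    print("snappy._crc32c available:", module_has_attr("snappy", "_crc32c"))
```

If this prints `False` while `import snappy` itself succeeds, the installed release most likely no longer exposes `_crc32c`.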

I got these Beam tests to pass by installing python-snappy 0.5.1 as follows:

pip install \
  --upgrade --ignore-installed \
  python-snappy==0.5.1 \
  --global-option=build_ext \
  --global-option="-I/usr/local/include" \
  --global-option="-L/usr/local/lib"

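On OSX I needed the three --global-option flags; otherwise it couldn't find my snappy headers (symptom: an error about #include <snappy-c.h>) and library files, which brew install snappy put in /usr/local/include and /usr/local/lib respectively.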

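The earlier part (--upgrade --ignore-installed) seems necessary to override pip's default of wanting to give me the latest version.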
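After reinstalling, it may help to check that the pin actually took effect. The following is a rough sketch and assumes Python 3.8 or newer (for `importlib.metadata`), which differs from the Python 2.7 environment in the traceback; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # No distribution with that name is installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version("python-snappy")
    if version is None:
        print("python-snappy is not installed")
    elif version == "0.5.1":
        print("python-snappy is pinned to 0.5.1 as intended")
    else:
        print("python-snappy is", version, "- the pin did not take effect")
```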

Answered 2018-05-27T23:46:38.897