I am trying to use the tf.Transform library (https://github.com/tensorflow/transform) to preprocess data with TensorFlow on Apache Beam (Google Dataflow).
Here is my setup:
conda create -n tftransform python=2.7
source activate tftransform
pip install tensorflow
pip install tensorflow-transform
pip install dill==0.2.6
git clone https://github.com/tensorflow/transform.git
cd transform/
python setup.py install # for good measure ...
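Since both a pip release and a source install from master end up in the same environment here, it may be worth checking which versions are actually installed. The following is only a sketch: the helper name installed_versions is my own, and the list of packages to inspect is an assumption based on the commands above.

```python
def _lookup(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    # Fallback for older interpreters such as the Python 2.7 env above.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def installed_versions(names):
    """Map each distribution name to its installed version (or None)."""
    return {name: _lookup(name) for name in names}


if __name__ == "__main__":
    # Assumed package list, taken from the setup commands above.
    packages = ["tensorflow", "tensorflow-transform", "apache-beam", "dill"]
    for name, version in sorted(installed_versions(packages).items()):
        print("%s: %s" % (name, version or "not installed"))
```

If the reported tensorflow-transform version is older than the code in the cloned repository, the example from master may not match the installed package.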
Then I tried to run simple_example (https://github.com/tensorflow/transform/blob/master/examples/simple_example.py):
python examples/simple_example.py
I get the following error:
AttributeError: 'DType' object has no attribute 'dtype'
(There is also a warning on import: No handlers could be found for logger "oauth2client.contrib.multistore_file")
Here is the stack trace:
Traceback (most recent call last):
File "examples/simple_example.py", line 64, in <module>
preprocessing_fn, tempfile.mkdtemp()))
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 439, in __ror__
result = p.apply(self, pvalueish, label)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/pipeline.py", line 249, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 162, in apply
return m(transform, input)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in apply_PTransform
return transform.expand(input)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/beam/impl.py", line 597, in expand
self._output_dir)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 439, in __ror__
result = p.apply(self, pvalueish, label)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/pipeline.py", line 249, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 162, in apply
return m(transform, input)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in apply_PTransform
return transform.expand(input)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/beam/impl.py", line 328, in expand
self._preprocessing_fn, input_schema)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/impl_helper.py", line 416, in run_preprocessing_fn
inputs = _make_input_columns(schema)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/impl_helper.py", line 218, in _make_input_columns
placeholders = schema.as_batched_placeholders()
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 87, in as_batched_placeholders
for key, column_schema in self.column_schemas.items()}
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 87, in <dictcomp>
for key, column_schema in self.column_schemas.items()}
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 133, in as_batched_placeholder
return self.representation.as_batched_placeholder(self)
File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 330, in as_batched_placeholder
return tf.placeholder(column.domain.dtype,
AttributeError: 'DType' object has no attribute 'dtype'
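Reading the last frame, the expression column.domain.dtype fails because column.domain is itself a DType rather than an object that carries a dtype attribute. The pure-Python sketch below reproduces that shape of failure with stand-in classes. FakeDType, FakeDomain, FakeColumn and placeholder_dtype are all hypothetical names, not part of tf.Transform, and the sketch does not claim to show how the library is actually structured. My unconfirmed guess is that the example from master passes a bare dtype where the installed release expects a wrapping object.

```python
class FakeDType(object):
    """Stand-in for a dtype object: it has a name but no `dtype` attribute."""

    def __init__(self, name):
        self.name = name


class FakeDomain(object):
    """Stand-in for an object that wraps a dtype (hypothetical)."""

    def __init__(self, dtype):
        self.dtype = dtype


class FakeColumn(object):
    """Stand-in for a column schema holding a `domain`."""

    def __init__(self, domain):
        self.domain = domain


def placeholder_dtype(column):
    # Mirrors the failing expression `column.domain.dtype` from the traceback.
    return column.domain.dtype


# Works: the domain wraps a dtype, so `.dtype` resolves.
good = FakeColumn(FakeDomain(FakeDType("float32")))
print(placeholder_dtype(good).name)

# Fails: the dtype was passed directly as the domain, so `.dtype` is missing.
bad = FakeColumn(FakeDType("float32"))
try:
    placeholder_dtype(bad)
except AttributeError as err:
    print("AttributeError: %s" % err)
```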
Is this library production-ready? How can I get this to work?