
I am trying to use the tf.Transform library (https://github.com/tensorflow/transform) to preprocess data for TensorFlow with Apache Beam (Google Dataflow).

Here is my setup:

    conda create -n tftransform python=2.7
    source activate tftransform
    pip install tensorflow
    pip install tensorflow-transform
    pip install dill==0.2.6
    git clone https://github.com/tensorflow/transform.git
    cd transform/
    python setup.py install # for good measure ...

Then I tried to run simple_example (https://github.com/tensorflow/transform/blob/master/examples/simple_example.py):

    python examples/simple_example.py

I get the following error:

    AttributeError: 'DType' object has no attribute 'dtype'

(There is also a warning on import: No handlers could be found for logger "oauth2client.contrib.multistore_file")

Here is the stack trace:

    Traceback (most recent call last):
      File "examples/simple_example.py", line 64, in <module>
        preprocessing_fn, tempfile.mkdtemp()))
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 439, in __ror__
        result = p.apply(self, pvalueish, label)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/pipeline.py", line 249, in apply
        pvalueish_result = self.runner.apply(transform, pvalueish)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 162, in apply
        return m(transform, input)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in apply_PTransform
        return transform.expand(input)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/beam/impl.py", line 597, in expand
        self._output_dir)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 439, in __ror__
        result = p.apply(self, pvalueish, label)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/pipeline.py", line 249, in apply
        pvalueish_result = self.runner.apply(transform, pvalueish)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 162, in apply
        return m(transform, input)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in apply_PTransform
        return transform.expand(input)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/beam/impl.py", line 328, in expand
        self._preprocessing_fn, input_schema)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/impl_helper.py", line 416, in run_preprocessing_fn
        inputs = _make_input_columns(schema)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/impl_helper.py", line 218, in _make_input_columns
        placeholders = schema.as_batched_placeholders()
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 87, in as_batched_placeholders
        for key, column_schema in self.column_schemas.items()}
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 87, in <dictcomp>
        for key, column_schema in self.column_schemas.items()}
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 133, in as_batched_placeholder
        return self.representation.as_batched_placeholder(self)
      File "/Users/XXX/anaconda/envs/tftransform/lib/python2.7/site-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 330, in as_batched_placeholder
        return tf.placeholder(column.domain.dtype,
    AttributeError: 'DType' object has no attribute 'dtype'

Is this library production-ready? How can I get this to work?


1 Answer


I ran the following:

    python setup.py bdist_wheel
    pip install ./dist/tensorflow_transform-0.1.6.dev0-py2-none-any.whl

This uninstalled tensorflow-transform-0.1.5 and installed tensorflow-transform-0.1.6.dev0.
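To check which release is actually installed before and after the reinstall, something like the sketch below may help. It is not part of tf.Transform: the helper names (`release_tuple`, `predates_fix`, `installed_version`) are made up for illustration, and the cutoff of 0.1.6 is only an assumption drawn from this thread, where 0.1.5 failed and 0.1.6.dev0 worked.

```python
import re

# Assumption taken from this thread only: 0.1.5 fails, 0.1.6.dev0 works.
FIRST_WORKING_RELEASE = (0, 1, 6)


def release_tuple(version):
    """Return the leading numeric components, e.g. '0.1.6.dev0' -> (0, 1, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def predates_fix(version):
    """True if the version is older than the first release that worked here."""
    return release_tuple(version) < FIRST_WORKING_RELEASE


def installed_version(dist_name="tensorflow-transform"):
    """Return the installed version string, or None if it cannot be found."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    try:
        import pkg_resources  # fallback for older interpreters such as 2.7
        return pkg_resources.get_distribution(dist_name).version
    except Exception:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("tensorflow-transform does not appear to be installed")
    elif predates_fix(found):
        print("installed %s predates 0.1.6; the example may fail" % found)
    else:
        print("installed %s" % found)
```

The comparison deliberately ignores suffixes such as `.dev0`, so a development build of 0.1.6 is treated the same as 0.1.6 itself.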

Running python examples/simple_example.py now works. I get the following result:

    [{'s_integerized': 0,
      'x_centered': -1.0,
      'x_centered_times_y_normalized': -0.0,
      'y_normalized': 0.0},
     {'s_integerized': 1,
      'x_centered': 0.0,
      'x_centered_times_y_normalized': 0.0,
      'y_normalized': 0.5},
     {'s_integerized': 0,
      'x_centered': 1.0,
      'x_centered_times_y_normalized': 1.0,
      'y_normalized': 1.0}]

Thanks to @elmer-garduno.

Answered 2017-03-21T08:50:17.123