Questions tagged [tensorflow-data-validation]


0 votes
1 answer
1929 views

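tensorflow-data-validation - ERROR: Could not find a version that satisfies the requirement tensorflow-data-validation (from versions: none)

I get this error with Python 3.7 on Windows 10 64-bit (supported). It seems there are Windows wheels only for Python 3.5 and 3.6...

Thanks in advance.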

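Collecting tensorflow-data-validation
Note: you may need to restart the kernel to use updated packages.

ERROR: Could not find a version that satisfies the requirement tensorflow-data-validation (from versions: none)
ERROR: No matching distribution found for tensorflow-data-validation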

Software and versions

- Windows 10 Enterprise, 64-bit
- import sys !{sys.executable} --version: Python 3.7.3 (Python 3.7 version, Anaconda 2019.07 for Windows Installer https://www.anaconda.com/distribution/ , Jupyter Notebook)
- Successfully installed pip-19.2.3 (pip --version)
- ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall. (pip install apache-beam)
- Successfully installed tensorboard-1.14.0 tensorflow-1.14.0 tensorflow-estimator-1.14.0 (pip install tensorflow)
- Successfully installed pyarrow-0.14.0 (pip install pyarrow==0.14.*)
- https://www.tensorflow.org/tfx/data_validation/install ( https://pypi.org/project/tensorflow-data-validation/ ), Supported platforms: TFDV is tested on the following 64-bit operating systems: Windows 7 or later.

Verbose

pip install -v tensorflow-data-validation
Collecting tensorflow-data-validation
Analyzing links from page https://pypi.org/simple/tensorflow-data-validation/
Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
[…]
Skipping link: none of the wheel's tags match: cp35-cp35m-win_amd64: files.pythonhosted.org/packages/d7/a1/b1f0c9c88713a60f206cf7bfaeb9391da1c9c8e3a6c98cd1420785688770.db 0-cp35-cp35m-win_amd64.whl#sha256=eeff482c69ae1e49d84bbbef7c2ca058735e1d12cd640b643853f5f5fb05bc70 (requires-python:>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<4)
Skipping link: none of the wheel's tags match: cp36-cp36m-win_amd64: files.pythonhosted.org/packages/bb/75/f3112982ca379481ae7706a94bf2755bd886fd4c8386e88ab978c5a0ae52/tensorflow_data_validation-0.14.0-cp36-cp36m-win_amd64.whl#sha256=611c23f718df87dcb6f34a6cf81d1a9699523254803607537e3d7e94e2c4712c (requires-python:>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<4)
Skipping link: none of the wheel's tags match: cp35-cp35m-win_amd64: files.pythonhosted.org/packages/77/13/d0a90ccde514a4547b5d2ce3268f683aa6d5fb9f185c2b4d9a7db15eafca/tensorflow_data_validation-0.14.1-cp35-cp35m-win_amd64.whl#sha256=df5eb52ef53ee9db901aed5a30db183f272cda0a8b4f6981d9843cb6c52fc58a (requires-python:>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<4)
Skipping link: none of the wheel's tags match: cp36-cp36m-win_amd64: files.pythonhosted.org/packages/54/3e/dec2c051d4a6dd04dcacfd73d4d02be3ad3cd56008ba2251e3bd8cc36adf/tensorflow_data_validation-0.14.1-cp36-cphl36=dm=d6 2cba18c385d7de8d346b8db4b9bfec38e8535e1371a6a7f2f375ea51264dfeb8 (requires-python:>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<4)
[...]
Given no hashes to check 0 links for project 'tensorflow-data-validation': discarding no candidates

PyPI download files

tensorflow_data_validation-0.14.1-cp35-cp35m-win_amd64.whl (1.7 MB) Wheel cp35 Aug 22, 2019
tensorflow_data_validation-0.14.1-cp36-cp36m-win_amd64.whl (1.7 MB) Wheel cp36 Aug 22, 2019
- https://pypi.org/project/tensorflow-data-validation/#modal-close
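
The "none of the wheel's tags match" messages can be read as a filename comparison: each wheel name ends in a Python tag, an ABI tag and a platform tag, and pip skips a wheel when none of them fit the running interpreter. The following is a simplified sketch of that idea, assuming only the two filenames listed on PyPI above. The helper names (parse_wheel_tags, compatible_wheels) are hypothetical, and the check ignores the ABI tag, so it is not pip's actual resolution logic.

```python
def parse_wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    # Wheel names look like: name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    return tuple(parts[-3:])


def compatible_wheels(wheels, python_tag, platform_tag):
    """Keep wheels whose Python and platform tags include the given ones.

    Simplification: the ABI tag is not checked here.
    """
    matches = []
    for wheel in wheels:
        py, _abi, plat = parse_wheel_tags(wheel)
        # Compressed tag sets are separated by dots, e.g. "py2.py3".
        if python_tag in py.split(".") and platform_tag in plat.split("."):
            matches.append(wheel)
    return matches


# The two Windows wheels listed for version 0.14.1 in the question.
WHEELS = [
    "tensorflow_data_validation-0.14.1-cp35-cp35m-win_amd64.whl",
    "tensorflow_data_validation-0.14.1-cp36-cp36m-win_amd64.whl",
]

for tag in ("cp35", "cp36", "cp37"):
    print(tag, compatible_wheels(WHEELS, tag, "win_amd64"))
```

With these two files, the cp37 row comes back empty, which is consistent with the "from versions: none" error on Python 3.7.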

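0 votes
1 answer
155 views

tensorflow - How do I see all possible options for schema metadata in TensorFlow?

I'm using TensorFlow Data Validation and trying to build a schema around my dataset. I've built the initial schemas and can view/edit them in Notepad, but I'm having trouble finding a resource that shows exactly which parameters I can set in the file for a given data type (e.g. minimum or maximum values, or the shape of the data).

Does anyone know of a good resource, or even a comprehensive example schema, that I could use to further edit my schema files?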

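0 votes
0 answers
111 views

python - How do I install tensorflow_data_validation on Python 3.7?

I use Anaconda with Python 3.7. I tried to install tensorflow_data_validation from the Anaconda prompt, but it gave me the following error: Could not find a version that satisfies the requirement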

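0 votes
2 answers
563 views

tensorflow - Understanding the L-infinity norm used in TFDV

I'm trying to use TensorFlow Data Validation to check for drift/skew in a dataset. It uses the L-infinity norm as the measure. I don't understand this concept. Can anyone explain how it is calculated and why a threshold of 0.01 is used here?

Image from the TensorFlow website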
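
For a categorical feature, the L-infinity distance between two datasets is the largest absolute difference between the normalized frequencies of any single value. A minimal sketch of that calculation follows, in plain Python rather than through the TFDV API. The feature values ("card", "cash") and the function names are made up for illustration. The 0.01 threshold is the value quoted in the question, used here only as an example cutoff.

```python
from collections import Counter


def normalized_frequencies(values):
    """Map each distinct value to its share of the dataset."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def l_infinity_distance(baseline, current):
    """Largest absolute per-value difference in normalized frequency."""
    p = normalized_frequencies(baseline)
    q = normalized_frequencies(current)
    # A value missing from one dataset has frequency 0 there.
    return max(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in set(p) | set(q))


# Hypothetical payment_type column in two datasets.
training = ["card"] * 50 + ["cash"] * 50   # 50% card, 50% cash
serving = ["card"] * 60 + ["cash"] * 40    # 60% card, 40% cash

distance = l_infinity_distance(training, serving)
threshold = 0.01  # example cutoff taken from the question

print("distance:", round(distance, 4))
print("exceeds threshold:", distance > threshold)
```

Here both values shift by ten percentage points, so the distance is about 0.1 and exceeds the example threshold. The threshold is a tunable setting: a smaller value flags smaller shifts in the distribution.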

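0 votes
1 answer
95 views

validation - TensorFlow data uniqueness validation

I'm using TensorFlow Data Validation and want to make sure a column has no duplicate values. But TensorFlow Data Validation doesn't seem to have anything like Deequ's isUnique uniqueness check. Is there a way to define uniqueness in a TensorFlow schema? I read the documentation here, but still couldn't find anything about uniqueness.

Thanks for your help.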
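
One workaround is to run the duplicate check as a separate step outside the schema. This sketch uses plain Python and none of the TFDV API. The column values and the find_duplicates helper are hypothetical.

```python
from collections import Counter


def find_duplicates(values):
    """Return a dict of value -> count for values seen more than once."""
    counts = Counter(values)
    return {value: count for value, count in counts.items() if count > 1}


# Hypothetical ID column that is expected to be unique.
ids = ["a1", "b2", "c3", "b2"]

duplicates = find_duplicates(ids)
if duplicates:
    print("column is not unique:", duplicates)
else:
    print("column is unique")
```

For data that does not fit in memory, the same counting idea would need a distributed or streaming implementation.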

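0 votes
3 answers
293 views

tensorflow-data-validation - How do I save TFDV statistics in the right format so they can be reloaded?

What confuses me is that there is a tfdv.load_statistics() function but no corresponding tfdv.write_statistics() function. How do I save statistics and then load them again?

For example:

I can save the string representation to a file, but that is not the format load_statistics expects.

Any pointers?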

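0 votes
0 answers
2043 views

spreadsheet - The contents of this cell violate its validation rule

I get this error in my sheet when I apply data validation.

The validation works fine, but this error keeps showing up.

I hope someone knows about this.

The data validation condition is that the number in the received column must not exceed the previously received value.

Here is the image link: https://drive.google.com/open?id=1yXFr3JVMX4S97cXgRAXq7EqpSlTluVJA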

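0 votes
3 answers
580 views

python-3.x - TensorFlow Data Validation does not detect anomalies in numeric features

I have been testing TensorFlow Data Validation (version 0.22.0) for use in my current ML pipeline, and I noticed that it does not report any anomalies in numeric features. For example,

Anomalies are detected only in FeatA and FeatB, which are categorical. In FeatC and FeatD, however, TFDV detects nothing.

The results are shown in this image

I also tried setting skew and drift comparators, but nothing changed. I guess this has to do with the auto-generated schema, which does not map a domain for the numeric features.

Does anyone know how to get TFDV working for numeric features?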
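
The guess about the missing domain can be illustrated with a small range check: if no minimum or maximum is declared for a numeric feature, there is nothing to compare the values against, so nothing is flagged. This is a plain-Python sketch of that reasoning, not TFDV code. The FeatC values and the out_of_range helper are hypothetical. In the schema format itself, numeric bounds are expressed through the int_domain and float_domain fields (min and max), which would have to be set explicitly if they are absent.

```python
def out_of_range(values, minimum=None, maximum=None):
    """Return values outside the declared bounds.

    With no bounds declared, every value is accepted.
    """
    flagged = []
    for value in values:
        too_low = minimum is not None and value < minimum
        too_high = maximum is not None and value > maximum
        if too_low or too_high:
            flagged.append(value)
    return flagged


# Hypothetical numeric feature containing one extreme value.
feat_c = [1.0, 2.5, 3.0, 250.0]

no_domain = out_of_range(feat_c)                               # no bounds declared
with_domain = out_of_range(feat_c, minimum=0.0, maximum=10.0)  # explicit bounds

print("without a domain:", no_domain)
print("with a domain:", with_domain)
```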

0 votes
0 answers
349 views

python-3.x - Unable to run TensorFlow Data Validation on Google Cloud Platform (Dataflow)

I have been trying to run TensorFlow Data Validation by following the Google docs.

I followed the same steps as at https://www.tensorflow.org/tfx/data_validation/install:

>pip install tensorflow-data-validation

>git clone https://github.com/tensorflow/data-validation

>cd data-validation

>pip install strip-hints

>python tensorflow_data_validation/tools/strip_type_hints.py tensorflow_data_validation/

>sudo docker-compose build manylinux2010

>sudo docker-compose run -e PYTHON_VERSION=${PYTHON_VERSION} manylinux2010

Updated with the correct path

I also followed the same procedure with python3, which gave me the .whl below

tensorflow_data_validation-0.23.0.dev0-cp37-cp37m-manylinux2010_x86_64.whl

I get 2 kinds of errors:

When using cp27

Error message from worker: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/dataflow_worker/batchworker.py", line 647, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/site-packages/dataflow_worker/executor.py", line 153, in execute
    test_shuffle_sink=self._test_shuffle_sink)
  File "/usr/local/lib/python2.7/site-packages/dataflow_worker/executor.py", line 118, in create_operation
    is_streaming=False)
  File "apache_beam/runners/worker/operations.py", line 1050, in apache_beam.runners.worker.operations.create_operation
    op = create_pgbk_op(name_context, spec, counter_factory, state_sampler)
  File "apache_beam/runners/worker/operations.py", line 856, in apache_beam.runners.worker.operations.create_pgbk_op
    return PGBKCVOperation(step_name, spec, counter_factory, state_sampler)
  File "apache_beam/runners/worker/operations.py", line 914, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__
    fn, args, kwargs = pickler.loads(self.spec.combine_fn)[:3]
  File "/usr/local/lib/python2.7/site-packages/apache_beam/internal/pickler.py", line 287, in loads
    return dill.loads(s)
  File "/usr/local/lib/python2.7/site-packages/dill/_dill.py", line 275, in loads
    return load(file, ignore, **kwds)
  File "/usr/local/lib/python2.7/site-packages/dill/_dill.py", line 270, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/usr/local/lib/python2.7/site-packages/dill/_dill.py", line 472, in load
    obj = StockUnpickler.load(self)
  File "/usr/local/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/local/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/usr/local/lib/python2.7/site-packages/dill/_dill.py", line 827, in _import_module
    return getattr(__import__(module, None, None, [obj]), obj)
  File "/usr/local/lib/python2.7/site-packages/tensorflow_data_validation/__init__.py", line 18, in <module>
    from tensorflow_data_validation.api.stats_api import GenerateStatistics
  File "/usr/local/lib/python2.7/site-packages/tensorflow_data_validation/api/stats_api.py", line 50, in <module>
    from tensorflow_data_validation import types
ImportError: cannot import name types

When using cp37

Error message from worker: Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 283, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 275, in loads
    return load(file, ignore, **kwds)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 270, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 472, in load
    obj = StockUnpickler.load(self)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 462, in find_class
    return StockUnpickler.find_class(self, module, name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/statistics/stats_impl.py", line 31, in <module>
    from tensorflow_data_validation import constants
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/__init__.py", line 39, in <module>
    from tensorflow_data_validation.statistics.generators.lift_stats_generator import LiftStatsGenerator
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/statistics/generators/lift_stats_generator.py", line 68, in <module>
    ('y', _YType)])
  File "/usr/local/lib/python3.7/typing.py", line 1448, in __new__
    return _make_nmtuple(typename, fields)
  File "/usr/local/lib/python3.7/typing.py", line 1341, in _make_nmtuple
    types = [(n, _type_check(t, msg)) for n, t in types]
  File "/usr/local/lib/python3.7/typing.py", line 1341, in <listcomp>
    types = [(n, _type_check(t, msg)) for n, t in types]
  File "/usr/local/lib/python3.7/typing.py", line 142, in _type_check
    raise TypeError(f"{msg} Got {arg!r:.100}.")
TypeError: NamedTuple('Name', [(f0, t0), (f1, t1), ...]); each t must be a type Got Any.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 647, in do_work
    work_executor.execute()
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 153, in execute
    test_shuffle_sink=self._test_shuffle_sink)
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 118, in create_operation
    is_streaming=False)
  File "apache_beam/runners/worker/operations.py", line 1050, in apache_beam.runners.worker.operations.create_operation
  File "apache_beam/runners/worker/operations.py", line 856, in apache_beam.runners.worker.operations.create_pgbk_op
  File "apache_beam/runners/worker/operations.py", line 914, in apache_beam.runners.worker.operations.PGBKCVOperation.__init__
  File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 275, in loads
    return load(file, ignore, **kwds)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 270, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 472, in load
    obj = StockUnpickler.load(self)
  File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 462, in find_class
    return StockUnpickler.find_class(self, module, name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/statistics/stats_impl.py", line 31, in <module>
    from tensorflow_data_validation import constants
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/__init__.py", line 39, in <module>
    from tensorflow_data_validation.statistics.generators.lift_stats_generator import LiftStatsGenerator
  File "/usr/local/lib/python3.7/site-packages/tensorflow_data_validation/statistics/generators/lift_stats_generator.py", line 68, in <module>
    ('y', _YType)])
  File "/usr/local/lib/python3.7/typing.py", line 1448, in __new__
    return _make_nmtuple(typename, fields)
  File "/usr/local/lib/python3.7/typing.py", line 1341, in _make_nmtuple
    types = [(n, _type_check(t, msg)) for n, t in types]
  File "/usr/local/lib/python3.7/typing.py", line 1341, in <listcomp>
    types = [(n, _type_check(t, msg)) for n, t in types]
  File "/usr/local/lib/python3.7/typing.py", line 142, in _type_check
    raise TypeError(f"{msg} Got {arg!r:.100}.")
TypeError: NamedTuple('Name', [(f0, t0), (f1, t1), ...]); each t must be a type Got Any.

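0 votes
0 answers
62 views

python-3.x - Keep getting a ValueError when training on tf.data.datasets and including validation_data

When I run the model.fit(...) method on my dataset, I keep getting an error like this:

However, I only get this error when I include validation_data in my model.fit(...) call. Any clarification on this would help. I don't mind sharing my Google Colab notebook to show how I am using tf.data.Dataset.

My code can be found on Google Colab.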