Questions tagged "tensorflow-datasets"

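0 votes

1 answer

55 views

tensorflow - TensorFlow unexpectedly caches subgraph results

I am currently trying to create an augmentation class for TF, as follows:

I have tried both dataset.map and tf.map_fn, and both return a consistent transformation, i.e.:

and

Both calls return different images with the same transformation applied to each of them.

The only way to get images back with random transformations is to call the transformation inside map():

or to move all of the random number generation into apply().

Does the placement of the random_*() calls matter in TF? I assumed that placement was irrelevant; does it only matter for map_fn?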
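The behaviour described above is consistent with the random parameters being sampled once, outside the function that is mapped over the images. The original code is not shown, so what follows is only a plain-Python analogy (no TensorFlow involved). The `Augmenter` class and the `apply` and `apply_fresh` methods are hypothetical names chosen for illustration. The sketch contrasts sampling once at construction time with sampling on every call.

```python
import random


class Augmenter:
    """Hypothetical stand-in for the augmentation class in the question."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        # Sampled ONCE here: every later call to apply() reuses this value.
        self.shift = self.rng.randint(-10, 10)

    def apply(self, pixel):
        # Same transformation for every element.
        return pixel + self.shift

    def apply_fresh(self, pixel):
        # Sampled on every call: each element gets its own transformation.
        return pixel + self.rng.randint(-10, 10)


aug = Augmenter(seed=0)
images = [100] * 50  # toy "images": one number each

fixed = [aug.apply(p) for p in images]
fresh = [aug.apply_fresh(p) for p in images]

print(len(set(fixed)))  # 1: all elements share the single sampled shift
print(len(set(fresh)))  # number of distinct results when sampling per call
```

Whether TensorFlow behaves this way for a particular graph depends on where the random ops are created, which is exactly what the question asks about.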

tensorflow tensorflow-datasets

2017-11-14T15:54:03.117

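0 votes

2 answers

1047 views

python - Generating data on the fly with the TensorFlow Dataset API

I have a function that produces feature and target tensors. For example:

How can I integrate this with TensorFlow's Dataset API for continuous training? Ideally, I would like to use a Dataset to set up batching, transformations, and so on.

Edit for clarification: the problem is that I do not just want to put x and t into my graph. I want to create a Dataset from them, so that I can reuse the same Dataset processing that I implemented for (ordinary) finite datasets, which I can load into memory and feed into the same graph using an initializable iterator.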
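The function from the question is not shown, so here is a minimal plain-Python sketch of the general pattern it describes: an endless generator of (features, target) pairs, grouped into batches. The names `sample_stream` and `batched` and the toy data are assumptions made for illustration. In TensorFlow 1.4 and later, a Python generator of this kind can be wrapped with tf.data.Dataset.from_generator, after which batching would normally be done by the Dataset itself; that step is not shown here.

```python
import itertools
import random


def sample_stream(seed=0):
    """Yield (features, target) pairs forever; hypothetical toy data."""
    rng = random.Random(seed)
    while True:
        x = [rng.random(), rng.random()]
        t = x[0] + x[1]  # toy target: the sum of the two features
        yield x, t


def batched(stream, batch_size):
    """Group consecutive samples from an endless stream into batches."""
    while True:
        batch = list(itertools.islice(stream, batch_size))
        xs = [x for x, _ in batch]
        ts = [t for _, t in batch]
        yield xs, ts


batches = batched(sample_stream(), batch_size=4)
xs, ts = next(batches)  # one batch of 4 feature vectors and 4 targets
```

Because the generator never ends, nothing has to be loaded into memory up front; each batch is produced only when it is requested.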

2017-11-15T22:57:05.590

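0 votes

1 answer

2199 views

python - How to extend a tf.data.Dataset with additional example transformations in TensorFlow

I want to double the size of an existing dataset that I use to train a neural network in TensorFlow by adding random noise to it. When I am done, I will have all of the existing examples plus noisy versions of all of them. I also want to interleave them as I transform them, so that they appear in this order: example 1 without noise, example 1 with noise, example 2 without noise, example 2 with noise, and so on. I am struggling to accomplish this with the Dataset API. I tried to do it using unbatch:

But I get the error message Shapes must be equal rank, but are 2 and 1. My guess is that TensorFlow is trying to build a single tensor out of the batch I return, but features and labels have different shapes, so that does not work. I could probably do this by creating two datasets and concatenating them, but I am worried that this would lead to very skewed training: I would train well for the first half, and then suddenly all of the data in the second half would have the new transformation. How can I do this on the fly, without writing the transformed examples to disk before feeding them to TensorFlow?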
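The ordering asked for above (original, noisy copy, next original, next noisy copy) can be sketched in plain Python as a generator that emits two items per input example. This is only an illustration of the intended order, not the asker's code; `with_noisy_copies`, `noise_scale`, and the toy data are hypothetical. In tf.data, the closest built-in tool for emitting several elements per input element is Dataset.flat_map, which is not shown here.

```python
import random


def with_noisy_copies(examples, noise_scale=0.1, seed=0):
    """Yield each (features, label) pair followed by a noisy copy of it."""
    rng = random.Random(seed)
    for features, label in examples:
        # First the untouched example...
        yield features, label
        # ...then the same example with Gaussian noise added to each feature.
        noisy = [f + rng.gauss(0.0, noise_scale) for f in features]
        yield noisy, label


data = [([1.0, 2.0], 0), ([3.0, 4.0], 1)]  # toy dataset
augmented = list(with_noisy_copies(data))

for features, label in augmented:
    print(label, features)
```

Features and labels stay paired as separate objects throughout, so their different shapes never have to be packed into one tensor.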

python tensorflow tensorflow-datasets

2017-11-16T18:46:49.163

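0 votes

1 answer

4606 views

python - How to return both predictions and labels using tf.estimator (with the predict or eval method)?

I am using TensorFlow 1.4.

I created a custom tf.estimator for classification, like this:

I can train it easily:

where input_fn is a function that reads data from tfrecords files using the tf.data.Dataset API.

Because I read the data from tfrecords files, I do not have the labels in memory when I make predictions.

My question is: how can I get both predictions and labels returned, either by the predict() method or by the evaluate() method?

There seems to be no way to have both. predict() does not have access (?) to the labels, and the predictions dictionary cannot be accessed with the evaluate() method.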

python tensorflow tensorflow-datasets

2017-11-17T11:16:01.453

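0 votes

1 answer

770 views

python - Multi-line text dataset in TensorFlow

tf.data.* has Dataset classes. There is a TextLineDataset, but I need one for multi-line text (between start/end tokens). Is there a way to use a different line-break delimiter for tf.data.TextLineDataset?

I am an experienced developer, but a Python newbie. I can read it, but my writing is limited. I am adapting the existing TensorFlow NMT tutorial to my own dataset. Most TFRecord tutorials deal with jpgs or other structured data.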
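One way to picture the multi-line records described above is a plain-Python generator that collects the lines between a start token and an end token. The question does not say what the tokens look like, so the `<s>` and `</s>` markers, the `multiline_records` name, and the sample text are all assumptions for illustration. A generator like this could then be handed to tf.data.Dataset.from_generator (available from TensorFlow 1.4), which is not shown here.

```python
import io


def multiline_records(lines, start="<s>", end="</s>"):
    """Yield the text found between each start/end marker pair."""
    record = None  # None means we are currently outside a record
    for raw in lines:
        line = raw.rstrip("\n")
        if line == start:
            record = []  # open a new record
        elif line == end:
            if record is not None:
                yield "\n".join(record)
            record = None  # close the record
        elif record is not None:
            record.append(line)  # inside a record: keep the line


sample = io.StringIO(
    "<s>\nfirst line\nsecond line\n</s>\nignored\n<s>\nanother record\n</s>\n"
)
records = list(multiline_records(sample))
print(records)  # ['first line\nsecond line', 'another record']
```

Lines outside any marker pair are skipped, and each yielded string keeps its internal line breaks.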

python tensorflow tensorflow-datasets

2017-11-18T14:31:58.353

0 votes

1 answer

3977 views

tensorflow - How to use tensorflow's Dataset API Iterator as an input of a (recurrent) neural network?

When using TensorFlow's Dataset API Iterator, my goal is to define an RNN that takes the iterator's get_next() tensors as its input (see (1) in the code).

However, simply defining the dynamic_rnn with get_next() as its input results in an error: ValueError: Initializer for variable rnn/basic_lstm_cell/kernel/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer.

Now I know that one workaround is to simply create a placeholder for next_batch and then eval() the tensor (because you can't pass the tensor itself) and pass it using feed_dict (see X and (2) in the code). However, if I understand it correctly, this is not an efficient solution as we first evaluate and then reinitialize the tensor.

Is there any way to either:

Define the dynamic_rnn directly on top of the output of the Iterator;

or:

Somehow directly pass the existing get_next() tensor to the placeholder that is the input of dynamic_rnn?

Full working example; the (1) version is what I would like to work but it doesn't, while (2) is the workaround that does work.

(Using tensorflow 1.4.0, Python 3.6.)

Thank you very much :)

tensorflow rnn tensorflow-datasets

2017-11-20T13:36:56.307

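0 votes

1 answer

1414 views

python - How to use my own images in TensorFlow?

I know this question has been asked before, but I have not yet found an answer I can use. I am new to Python and TensorFlow, but I managed to get my accuracy up to around 99.3% using the MNIST image set. Now I would like to try using my own images, but this is turning out to be harder than I expected.

I have read the tutorial pages on the TensorFlow website hundreds of times, but they make no sense to me, and whatever I try, I get warnings. I would like to figure it out myself, but does anyone know the easiest way to use my own images? Or are there any examples? I have been searching for them online, and it feels as if I have found 1000 of them, but none that explains it in a way I can understand.

Thanks in advance for your help.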

python python-3.x tensorflow image-recognition tensorflow-datasets

2017-11-20T14:00:03.987

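0 votes

2 answers

2842 views

performance - Is the TensorFlow Dataset API slower than queues?

I replaced the CIFAR-10 preprocessing pipeline in my project with the Dataset API approach, and this resulted in a performance drop of about 10-20%.

The preprocessing is fairly standard:
- read images from disk
- random crop and flip
- shuffle, batch
- feed to the model

Overall, I see that batch processing is now 15% faster, but every once in a while (or, more precisely, whenever I reinitialize the dataframe or expect a reshuffle) a batch is blocked for a long time (30 seconds), which adds up to slower processing per epoch.

This behaviour seems to be related to internal hashing. If I reduce N in ds.shuffle(buffer_size=N), the delays are shorter but proportionally more frequent. Removing shuffle altogether results in delays as if buffer_size were set to the dataset size.

Can someone explain the internal logic of the Dataset API when it comes to reading/caching? Is there any reason to expect the Dataset API to work faster than manually created queues?

I am using TF 1.3.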
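The link between buffer_size and the length of the stall is what one would expect from a bounded shuffle buffer in general: the buffer has to be filled before the first element can come out. The sketch below is a plain-Python simulation of that general idea, not TensorFlow's actual implementation. `shuffle_buffer`, `counted`, and the `reads` list are hypothetical names used only to count how many source elements are consumed before the first output.

```python
import random


def shuffle_buffer(source, buffer_size, seed=0):
    """Yield items from source in pseudo-random order via a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in source:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling: nothing can be emitted yet
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Source exhausted: drain the remaining items in random order.
    rng.shuffle(buffer)
    while buffer:
        yield buffer.pop()


reads = []  # records every element pulled from the source


def counted(n):
    for i in range(n):
        reads.append(i)
        yield i


stream = shuffle_buffer(counted(100), buffer_size=10)
first = next(stream)
reads_before_first_output = len(reads)
print(reads_before_first_output)  # 11: ten reads to fill the buffer, plus one
rest = list(stream)
```

In this model the start-up cost grows linearly with buffer_size, and it is paid again each time the pipeline is restarted.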

performance tensorflow tensorflow-datasets

2017-11-21T00:32:37.050

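0 votes

1 answer

903 views

tensorflow - The new Dataset API in TensorFlow 1.4 in distributed mode

Before the new Dataset API in TensorFlow 1.4, I used the following code to create a queue of filenames shared between the workers:

This code uses queues and queue runners, which is very ugly and confusing. But it offers the shared_name= option to create a queue shared between workers, so that they do not work on the same examples.

With the release of TensorFlow 1.4, input pipelines have become much easier to use, so I would like to update my program to use this new feature. However, I cannot find anywhere in the new documentation how to share a dataset between workers.

Is this done automatically, or is it not a feature?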
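One common way to keep workers off each other's examples, without a shared queue, is to give each worker a disjoint slice of the file list. The plain-Python sketch below shows that partitioning; the `shard` function, the file names, and the worker count are hypothetical. tf.data offers a Dataset.shard(num_shards, index) method built on the same idea. Note that this is a static split rather than a shared queue, so it is an alternative to the shared_name= approach, not a reproduction of it.

```python
def shard(filenames, num_workers, worker_index):
    """Return the files assigned to one worker: every num_workers-th file."""
    if not 0 <= worker_index < num_workers:
        raise ValueError("worker_index must be in [0, num_workers)")
    return filenames[worker_index::num_workers]


files = ["part-%05d.tfrecord" % i for i in range(10)]  # hypothetical names
shards = [shard(files, 3, w) for w in range(3)]

for w, s in enumerate(shards):
    print(w, s)
```

Every file lands in exactly one shard, so no two workers read the same examples as long as each worker uses its own index.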

tensorflow tensorflow-datasets

2017-11-24T18:13:25.730

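0 votes

1 answer

9414 views

tensorflow - iterator.get_next() causes terminate called after throwing an instance of 'std::system_error'

I am training resNet50 with TensorFlow on a shared server with the following specifications:

ubuntu 16.04
3 gtx 1080 gpus
tensorflow 1.3
python 2.7

But always after two epochs, during the third epoch, I run into this error:

Here is the code that converts the tfrecord into a dataset:

Here is the input pipeline:

After inserting some print messages into my code, I found that the following line causes this error:

However, I cannot solve it.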

tensorflow tensorflow-datasets

2017-11-26T17:46:27.807
