tensorflow - TensorFlow: Opening log data written by SummaryWriter

Question

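After working through the Summaries and TensorBoard tutorial, I have been able to save and view data with TensorBoard successfully. Is it possible to open this data with something other than TensorBoard?

By the way, my application is off-policy learning. I am currently saving each state-action-reward tuple using SummaryWriter. I know I could store and train on this data manually, but I thought it would be nice to use TensorFlow's built-in logging features to store and load it.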

score 44 · Accepted Answer

As of March 2017, the EventAccumulator tool has been moved from TensorFlow core to the TensorBoard backend. You can still use it to extract data from TensorBoard log files as follows:

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
event_acc = EventAccumulator('/path/to/summary/folder')
event_acc.Reload()
# Show all tags in the log file
print(event_acc.Tags())

# E. g. get wall clock, number of steps and value for a scalar 'Accuracy'
w_times, step_nums, vals = zip(*event_acc.Scalars('Accuracy'))
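For the off-policy use case in the question, the extracted scalars would probably need to end up in a plain file. Below is a minimal sketch, assuming the entries returned by Scalars() expose wall_time, step and value fields (as the unpacking above suggests). The helper names scalar_events_to_rows, write_rows_to_csv and export_scalar are hypothetical, and FakeScalarEvent is only a stand-in so the example runs without a log directory.

```python
import csv
from collections import namedtuple


def scalar_events_to_rows(events):
    """Turn ScalarEvent-like objects into plain (wall_time, step, value) tuples."""
    return [(e.wall_time, e.step, e.value) for e in events]


def write_rows_to_csv(rows, csv_path):
    """Write the rows to a CSV file, preceded by a header line."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["wall_time", "step", "value"])
        writer.writerows(rows)


def export_scalar(logdir, tag, csv_path):
    """Load one scalar tag from a log directory and dump it to CSV."""
    # Imported lazily so the helpers above also work without TensorBoard installed.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    acc = EventAccumulator(logdir)
    acc.Reload()
    write_rows_to_csv(scalar_events_to_rows(acc.Scalars(tag)), csv_path)


# Hypothetical sample data standing in for what Scalars('Accuracy') returns.
FakeScalarEvent = namedtuple("FakeScalarEvent", ["wall_time", "step", "value"])
sample = [FakeScalarEvent(1.0, 0, 0.5), FakeScalarEvent(2.0, 1, 0.75)]
rows = scalar_events_to_rows(sample)
```

With a real log directory, a call such as export_scalar('/path/to/summary/folder', 'Accuracy', 'accuracy.csv') would be the entry point.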

score 8 · Accepted Answer

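It's easy: the data can be exported to a .csv file from within TensorBoard under the Events tab, and that file can then be loaded into a Pandas DataFrame in Python, for example. Make sure you check the "Data download links" box.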
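Once a .csv file has been downloaded, reading it back is short. This is only a sketch: it assumes the export has a header row with "Wall time", "Step" and "Value" columns, which you should check against your own file. It uses the standard csv module so it has no dependencies; pandas.read_csv would work just as well. The function name read_tensorboard_csv and the sample data are hypothetical.

```python
import csv
import io


def read_tensorboard_csv(file_obj):
    """Parse a TensorBoard scalar CSV export into (steps, values) lists."""
    steps, values = [], []
    for row in csv.DictReader(file_obj):
        # int(float(...)) tolerates steps written either as "100" or "100.0".
        steps.append(int(float(row["Step"])))
        values.append(float(row["Value"]))
    return steps, values


# Hypothetical sample export; the header names are an assumption.
sample_csv = io.StringIO(
    "Wall time,Step,Value\n"
    "1500000000.0,0,0.25\n"
    "1500000010.0,100,0.5\n"
)
steps, values = read_tensorboard_csv(sample_csv)
```

For a file on disk, pass an open file handle instead of the StringIO object.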

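For a more automated approach, check out the TensorBoard README:

If you'd like to export data to visualize elsewhere (e.g. iPython Notebook), that's possible too. You can directly depend on the underlying classes that TensorBoard uses for loading data: python/summary/event_accumulator.py (for loading data from a single run) or python/summary/event_multiplexer.py (for loading data from multiple runs, and keeping it organized). These classes load groups of event files, discard data that was "orphaned" by TensorFlow crashes, and organize the data by tag.

As another option, there is a script, tensorboard/scripts/serialize_tensorboard.py. The script was set up to make "fake TensorBoard backends" for testing, so it is a bit rough around the edges.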

score 6 · Accepted Answer

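I believe the data are protobufs encoded in the RecordReader format. To get the serialized strings out of the files, you can use py_record_reader or build a graph with the TFRecordReader op, and then deserialize those strings into protobufs using the Event schema. If you get a working example, please update this question, since we seem to be missing documentation on this.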

score 2 · Accepted Answer

I did something along these lines for a previous project. As others have mentioned, the main ingredient is TensorFlow's event accumulator:

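# Since March 2017 this module lives in TensorBoard rather than tensorflow.python.summary
from tensorboard.backend.event_processing import event_accumulator as ea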

acc = ea.EventAccumulator("folder/containing/summaries/")
acc.Reload()

# Print tags of contained entities, use these names to retrieve entities as below
print(acc.Tags())

# E. g. get all values and steps of a scalar called 'l2_loss'
xy_l2_loss = [(s.step, s.value) for s in acc.Scalars('l2_loss')]

# Retrieve images; Images() returns a list of image events, so take the first one logged under this tag
img = acc.Images('generator/image/0')[0]
with open('img_{}.png'.format(img.step), 'wb') as f:
  f.write(img.encoded_image_string)

score 1 · Accepted Answer

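Chris Cundy's answer works well when your tfevent file contains fewer than 10000 data points. However, when you have a large file with more than 10000 data points, TensorBoard automatically samples them and gives you at most 10000 points. This is quite an annoying hidden behavior, since it is not well documented. See https://github.com/tensorflow/tensorboard/blob/master/tensorboard/backend/event_processing/event_accumulator.py#L186.

To get around it and fetch all the data points, a somewhat hacky approach is: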

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

class FalseDict(object):
    def __getitem__(self,key):
        return 0
    def __contains__(self, key):
        return True

event_acc = EventAccumulator('path/to/your/tfevents',size_guidance=FalseDict())
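A less hacky variant of the same idea might be to pass an ordinary dict. This is a sketch under the assumption that size_guidance maps tag-type strings such as "scalars" and "images" to a maximum count, with 0 meaning "keep everything", which is the value the FalseDict above returns for every key. The helper names unlimited_size_guidance and load_all_scalars are hypothetical.

```python
def unlimited_size_guidance(tag_types=("scalars",)):
    """Build a size_guidance dict asking for every event of the given tag types."""
    # 0 is the same value FalseDict hands back for any key.
    return {tag_type: 0 for tag_type in tag_types}


def load_all_scalars(path, tag):
    """Return every (step, value) pair for one scalar tag, without downsampling."""
    # Imported lazily so the dict helper works without TensorBoard installed.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    acc = EventAccumulator(path, size_guidance=unlimited_size_guidance())
    acc.Reload()
    return [(s.step, s.value) for s in acc.Scalars(tag)]


guidance = unlimited_size_guidance(("scalars", "images"))
```

Unlike FalseDict, this only lifts the limit for the tag types you name, which keeps memory use lower when the log also contains images or histograms.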

score 1 · Accepted Answer

You can also use tf.train.summary_iterator. To extract events from a ./logs folder where only the classic scalars lr, acc, loss, val_acc and val_loss are present, you can use this gist: tensorboard_to_csv.py

score 0 · Accepted Answer

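For tb versions >= 2.3, you can simplify converting tb events to a pandas DataFrame with tensorboard.data.experimental.ExperimentFromDev(). However, it requires you to upload your logs to the public TensorBoard.dev. There are plans to extend the functionality to locally stored logs in the future. https://www.tensorflow.org/tensorboard/dataframe_api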

score 0 · Accepted Answer

You can also iterate over a TensorBoard event file using EventFileLoader:

from tensorboard.backend.event_processing.event_file_loader import EventFileLoader

for event in EventFileLoader('path/to/events.out.tfevents.xxx').Load():
    print(event)
