cvat - Uploading a large dataset from FiftyOne to CVAT

Question

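I'm trying to upload around 15 GB of data from FiftyOne to CVAT using the "annotate" function in order to fix annotations. The task is divided into jobs of 50 samples. During the sample upload, I get an "Error 504 Gateway Time-Out" error. I can see the images in CVAT, but they don't have the current annotations. I tried uploading the annotations separately using the "task_id" and changing the "cvat.py" file in FiftyOne, but I couldn't load the changed annotations.

I can't break the upload into multiple tasks because all of the tasks would have the same name, which is inconvenient. As far as I understand, in order to update the dataset with "load_annotations", I have to upload it using the "annotate" function (unless there is another way).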

score 0 · Accepted Answer

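Update: This appears to be a limit in CVAT on the maximum size of its API requests. To work around it for the time being, we have added a task_size parameter to FiftyOne's annotate() method. It automatically breaks an annotation run into multiple tasks of at most task_size samples each, which avoids large data or annotation uploads.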
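A minimal sketch of how the task_size parameter might be used, assuming a FiftyOne version that includes it. The wrapper name annotate_in_tasks, the key "fix_labels", and the default of 50 are illustrative assumptions, not part of FiftyOne; only annotate(), label_field, and task_size come from this answer.

```python
def annotate_in_tasks(samples, anno_key, label_field, task_size=50):
    """Start a single annotation run that is split into several CVAT tasks.

    `samples` is assumed to be a FiftyOne Dataset or DatasetView.
    This wrapper is hypothetical; it only forwards to `annotate()`.
    """
    # `task_size` caps the number of samples uploaded per CVAT task,
    # so each upload request stays small
    return samples.annotate(
        anno_key,
        label_field=label_field,
        task_size=task_size,
    )
```

With a real dataset this would be called as annotate_in_tasks(dataset, "fix_labels", "ground_truth"), and the results loaded later with dataset.load_annotations("fix_labels", cleanup=True). Because everything belongs to one annotation run, a single key covers all of the tasks.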

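Previous answer:

The best way to manage this workflow right now is to break the annotations into multiple tasks and upload them all to a single CVAT project, so that they can be grouped and managed together.

For example: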

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart").clone()


# The label schema is automatically inferred from the existing labels
# Alternatively, it can be specified with the `label_schema` kwarg 
# when calling `annotate()`

label_field = "ground_truth"


# Upload batches of your dataset to different tasks
# all stored in the same project

project_name = "multiple_task_example"
anno_keys = []

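# Use ceiling division so a final partial batch is not skipped
batch_size = 50
num_batches = (len(dataset) + batch_size - 1) // batch_size

for i in range(num_batches):
    anno_key = "example_%d" % i
    view = dataset.skip(i * batch_size).limit(batch_size)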

    view.annotate(
        anno_key,
        label_field=label_field,
        project_name=project_name,
    )
    anno_keys.append(anno_key)


# Annotate in CVAT...


# Load all annotations and cleanup tasks/project when complete
anno_keys = dataset.list_annotation_runs()  
for anno_key in anno_keys:
    dataset.load_annotations(anno_key, cleanup=True)
    dataset.delete_annotation_run(anno_key)
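Note that the cleanup loop above loads and deletes every annotation run stored on the dataset. If the dataset also holds unrelated runs, a sketch like the following restricts the cleanup to keys that share a prefix. The helper name load_runs_with_prefix and the prefix convention are assumptions for illustration, not part of FiftyOne; it relies only on list_annotation_runs(), load_annotations(), and delete_annotation_run() as used above.

```python
def load_runs_with_prefix(dataset, prefix, cleanup=True):
    """Load and delete only the annotation runs whose key starts with `prefix`.

    `dataset` is assumed to be a FiftyOne Dataset; `prefix` is a hypothetical
    naming convention such as "example_" from the loop above.
    """
    loaded = []
    # Copy the key list first, since runs are deleted while looping
    for anno_key in list(dataset.list_annotation_runs()):
        if not anno_key.startswith(prefix):
            continue  # leave unrelated annotation runs untouched
        dataset.load_annotations(anno_key, cleanup=cleanup)
        dataset.delete_annotation_run(anno_key)
        loaded.append(anno_key)
    return loaded
```

For the example above, load_runs_with_prefix(dataset, "example_") would process only the runs created by the batching loop.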

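Uploading to existing tasks and the project_name parameter will be available in the next release. If you want to use them right away, you can install FiftyOne from source: https://github.com/voxel51/fiftyone#installing-from-source

We are working on further optimizations and stability improvements for large CVAT annotation jobs like yours.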
