google-cloud-automl - Train a Google Cloud AutoML model on multiple datasets

Question

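I want to train an AutoML model on GCP's Vertex AI using multiple datasets. I'd like to keep the datasets separate, since they come from different sources, I may want to train on them individually, and so on. Is that possible? Or do I need to create one dataset that contains both? In the web UI it looks like I can only select a single dataset.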

score 1 · Accepted Answer

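This is possible through the Vertex AI API, as long as your sources are in Google Cloud Storage. Provide a list of training data files in JSON or CSV format that follow the best practices for formatting training data. The files are imported into one dataset, so each source can stay in its own file.

See the code below for creating a dataset and importing data into it. For the code reference and more details, see the documentation.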

    from typing import List, Union
    from google.cloud import aiplatform

    def create_and_import_dataset_image_sample(
        project: str,
        location: str,
        display_name: str,
        # One URI or a list, e.g. ["gs://bucket/file1.csv", "gs://bucket/file2.csv"]
        src_uris: Union[str, List[str]],
        sync: bool = True,
    ):
        aiplatform.init(project=project, location=location)

        # Create the dataset and import every file listed in src_uris into it.
        ds = aiplatform.ImageDataset.create(
            display_name=display_name,
            gcs_source=src_uris,
            import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification,
            sync=sync,
        )

        # Block until the import finishes (only matters when sync=False).
        ds.wait()

        print(ds.display_name)
        print(ds.resource_name)
        return ds

Note: The links provided are for Vertex AI AutoML Image. If you follow them, you can switch to other AutoML products such as text, tabular, and video.
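To keep the sources logically separate while still importing them into one dataset, one option is to track the import files per source and flatten them only when building the `src_uris` list. The following is a minimal sketch of that idea in plain Python. The helper name `collect_source_uris`, the source labels, and the bucket paths are hypothetical and not part of the Vertex AI SDK.

```python
from typing import Dict, List


def collect_source_uris(sources: Dict[str, List[str]]) -> List[str]:
    """Flatten per-source import files into one ordered, de-duplicated list.

    Hypothetical helper (not an SDK function): it only prepares the list
    that would be passed as src_uris / gcs_source.
    """
    uris: List[str] = []
    seen = set()
    for source_name, files in sources.items():
        for uri in files:
            # Import files must live in Cloud Storage.
            if not uri.startswith("gs://"):
                raise ValueError(
                    f"{source_name}: {uri!r} is not a Cloud Storage URI"
                )
            if uri not in seen:
                seen.add(uri)
                uris.append(uri)
    return uris


if __name__ == "__main__":
    # Placeholder source labels and bucket paths, for illustration only.
    sources = {
        "vendor_a": ["gs://bucket/vendor_a/import.csv"],
        "vendor_b": ["gs://bucket/vendor_b/import.csv"],
    }
    print(collect_source_uris(sources))
```

The resulting list can be passed as `src_uris` to the function above. To train on a single source instead, pass only that source's entry, which creates a separate dataset for it.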
