azure - Suspiciously high RU cost for Azure Gremlin edge traversal (out() step)

Question

I have a strange problem: running an out() step over some edges nearly triples my RU cost. I hope someone can explain why, and what I can do to mitigate it.

I have a graph in CosmosDB with two vertex labels: 'Profile' and 'Score'. Each Profile has 0 or 1 Score vertices, connected via a 'ProfileHasAggregatedScore' edge. The partitionKey is the Profile's ID.

If I run the following query, the RU cost is currently:

g.V().hasLabel('Profile').out('ProfileHasAggregatedScore')
>78 RU (8 scores found)

For reference, the cost of getting all vertices (or edges) of one type is:

g.V().hasLabel('Profile')
>28 RU (110 profiles found)

g.E().hasLabel('ProfileHasAggregatedScore')
>11 RU (8 edges found)

g.V().hasLabel('AggregatedRating')
>11 RU (8 scores found)

And the cost of fetching a single vertex or edge is:

g.V('aProfileId').hasLabel('Profile')
>4 RU (1 found)

g.E('anEdgeId')
> 7 RU

g.V('aRatingId')
> 3.5 RU

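Can someone help me understand why a traversal that touches only a few vertices along the way (see the execution profile at the bottom) is more expensive than fetching everything? Is there anything I can do to prevent it? Adding a has() filter on the partitionKey does not seem to help. It seems odd that traversing/looking up 16 more elements (8 edges and 8 vertices) after finding the 110 vertices nearly triples the cost of the operation.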

(Note: with 1000 profiles, a single traversal along the edge to the Score vertices costs 2200 RU. That seems high, considering that the Azure team promotes it as scalable.)
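To make the comparison above concrete, here is a minimal Python sketch of the arithmetic. It uses only the RU figures quoted in this question; the variable names are made up for illustration. Note that the number of scores in the 1000-profile case is not stated, so the per-profile figure for that case is only a rough indicator.

```python
# RU figures copied from the measurements quoted in this question.
ru_all_profiles = 28.0  # g.V().hasLabel('Profile'), 110 results
ru_all_edges = 11.0     # g.E().hasLabel('ProfileHasAggregatedScore'), 8 results
ru_all_scores = 11.0    # g.V().hasLabel('AggregatedRating'), 8 results
ru_traversal = 78.0     # g.V().hasLabel('Profile').out('ProfileHasAggregatedScore')

# Naive expectation: the traversal costs about as much as reading each set once.
ru_separate_total = ru_all_profiles + ru_all_edges + ru_all_scores

# RU that the three separate reads do not account for.
ru_unexplained = ru_traversal - ru_separate_total

# How much more the traversal costs than just listing the profiles.
ratio_vs_profiles = ru_traversal / ru_all_profiles

# Per-profile cost at the two measured graph sizes.
ru_per_profile_110 = ru_traversal / 110
ru_per_profile_1000 = 2200.0 / 1000

print(f"separate reads total: {ru_separate_total} RU")
print(f"unexplained by separate reads: {ru_unexplained} RU")
print(f"traversal vs. profile scan: {ratio_vs_profiles:.2f}x")
print(f"RU per profile (110 profiles): {ru_per_profile_110:.2f}")
print(f"RU per profile (1000 profiles): {ru_per_profile_1000:.2f}")
```

So the three separate reads add up to 50 RU, which leaves 28 RU of the 78 RU traversal unaccounted for, and the per-profile cost appears to grow with the number of profiles rather than stay flat.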

Here is the execution profile in case it helps (most of the time seems to be spent finding the edges in the out() step):

[
  {
    "gremlin": "g.V().hasLabel('Profile').out('ProfileHasAggregatedScore').executionProfile()",
    "totalTime": 46,
    "metrics": [
      {
        "name": "GetVertices",
        "time": 13,
        "annotations": {
          "percentTime": 28.26
        },
        "counts": {
          "resultCount": 110
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 110,
            "size": 124649,
            "time": 2.47
          }
        ]
      },
      {
        "name": "GetEdges",
        "time": 26,
        "annotations": {
          "percentTime": 56.52
        },
        "counts": {
          "resultCount": 8
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 8,
            "size": 5200,
            "time": 6.22
          },
          {
            "fanoutFactor": 1,
            "count": 0,
            "size": 49,
            "time": 0.88
          }
        ]
      },
      {
        "name": "GetNeighborVertices",
        "time": 7,
        "annotations": {
          "percentTime": 15.22
        },
        "counts": {
          "resultCount": 8
        },
        "storeOps": [
          {
            "fanoutFactor": 1,
            "count": 8,
            "size": 6303,
            "time": 1.18
          }
        ]
      },
      {
        "name": "ProjectOperator",
        "time": 0,
        "annotations": {
          "percentTime": 0
        },
        "counts": {
          "resultCount": 8
        }
      }
    ]
  }
]
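As a rough way to read the profile above, the following Python sketch recomputes each operator's share of the total time and the time spent per returned result. The numbers are copied from the profile; the `summarize` helper and the tuple layout are hypothetical, written only for this illustration. Keep in mind that executionProfile() reports time and data size, not RU, so this is at best a proxy for where the RU cost goes.

```python
# (operator name, time, resultCount, total storeOps size) copied from the profile above.
operators = [
    ("GetVertices", 13, 110, 124649),
    ("GetEdges", 26, 8, 5200 + 49),  # two storeOps entries, sizes summed
    ("GetNeighborVertices", 7, 8, 6303),
    ("ProjectOperator", 0, 8, 0),    # no storeOps reported for this operator
]
total_time = 46  # "totalTime" from the profile


def summarize(ops, total):
    """Return per-operator share of total time and time per returned result."""
    rows = []
    for name, time, results, size in ops:
        rows.append({
            "name": name,
            "percent_time": round(100 * time / total, 2),
            "time_per_result": time / results,
            "size": size,
        })
    return rows


summary = summarize(operators, total_time)
slowest = max(summary, key=lambda row: row["percent_time"])

for row in summary:
    print(row)
print("largest share of time:", slowest["name"])
```

GetEdges takes the largest share of the time even though it returns only 8 results and reads far less data than GetVertices, which matches the impression that the out() step's edge lookup is the expensive part.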