tinkerpop3 - Why don't API calls work in Gremlin Python?

Question

In gremlin-python, I can do this:

for e in g.E().toList():
        print(e)

and get a result like:

e[11][4-created->3]
e[12][6-created->3]
e[7][1-knows->2]
e[8][1-knows->4]
e[9][1-created->3]
e[10][4-created->5]

According to

http://tinkerpop.apache.org/javadocs/3.4.3/core/org/apache/tinkerpop/gremlin/structure/Edge.html

Edge has an inVertex() accessor. Translating that idea into Python gives:

for e in g.E().toList():
        print (e.inVertex().id)

and the error

AttributeError: 'Edge' object has no attribute 'inVertex'

The same is true of many other "simple" API calls.

for e in g.E().toList():
        print(e.property('weight'))

also fails.

What is going on here, and what is the workaround?

score 3 · Accepted Answer

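In TinkerPop, graph elements (e.g. Vertex, Edge, VertexProperty) often go through a process of "detachment". Gremlin traversals that return graph elements from a remote source go through this process, and in these cases the elements are typically detached to a "reference". A reference carries just enough information to re-attach to the remote graph, and re-attachment only requires the id and label. Therefore, properties are not returned. This is the case for every language Gremlin supports, not just Python (though I will contradict that statement slightly in my final remarks at the very end).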

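Specifically with respect to Gremlin Language Variants like Python, these implementations of Gremlin do not have a full Gremlin Virtual Machine to process traversals, and it was never the intention to build a full graph structure on the Python side - only graph elements as references, to match what would be returned from a remote source. This also reduced the amount of code that needed to be maintained on the Python side, as TinkerPop could rely on standard primitives that exist in all programming languages, like Dictionary and List.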

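Technical history aside, returning references forces users to write better Gremlin that follows best practice. Users should specify exactly the data they want in their Gremlin traversal. Rather than: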

g.V().hasLabel('customer')

you would prefer:

g.V().hasLabel('customer').valueMap(true,'name')

or in 3.4.4:

g.V().hasLabel('customer').elementMap('name')

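which returns a less nested structure than valueMap(). elementMap() works really nicely for edges, and it replaces the more complex project()-based approach to getting the data you asked for from the edges in your question: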

gremlin> g.V().has('person','name','marko').elementMap()
==>[id:1,label:person,name:marko,age:29]
gremlin> g.V().has('person','name','marko').elementMap('name')
==>[id:1,label:person,name:marko]
gremlin> g.V().has('person','name','marko').properties('name').elementMap()
==>[id:0,key:name,value:marko]
gremlin> g.E(11).elementMap()
==>[id:11,label:created,IN:[id:3,label:software],OUT:[id:4,label:person],weight:0.4]
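
From Python, such a map arrives as a dictionary. The sketch below assumes that the id, label, IN and OUT entries are keyed by enum members (T.id, T.label, Direction.IN, Direction.OUT) rather than by strings. The T and Direction classes here are local stand-ins so that the example runs without a server, the sample map is built by hand from the console output above, and plain_keys is a hypothetical helper name, not part of gremlin-python.

```python
from enum import Enum


# Local stand-ins for gremlin-python's T and Direction enums (assumption:
# the real result uses enum members with these names as dictionary keys).
class T(Enum):
    id = 1
    label = 2


class Direction(Enum):
    IN = 1
    OUT = 2


def plain_keys(value):
    """Recursively replace enum keys with their names; leave string keys alone."""
    if isinstance(value, dict):
        return {getattr(k, "name", k): plain_keys(v) for k, v in value.items()}
    return value


# Hand-built to resemble the result of g.E(11).elementMap() shown above.
edge_map = {
    T.id: 11,
    T.label: "created",
    Direction.IN: {T.id: 3, T.label: "software"},
    Direction.OUT: {T.id: 4, T.label: "person"},
    "weight": 0.4,
}

flat = plain_keys(edge_map)
print(flat["IN"]["id"], flat["OUT"]["id"], flat["weight"])
```

Converting the keys to strings is only a convenience for printing or serialising; indexing the original dictionary with the enum members directly works just as well.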

This is really no different from SQL, where you likely wouldn't do:

SELECT * FROM customer

but instead:

SELECT name FROM customer

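Returning references and forcing users to be more explicit about what they return also resolves a huge problem with multi/meta-properties. If users return vertices and inadvertently return a "fat" vertex (e.g. a vertex with 1 million properties), it will have a significant impact on the server attempting to return it. By detaching to reference, the user doesn't get caught in that trap.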

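All that said, as of 3.4.3 there are still some inconsistencies in detachment: in some cases in Java, detachment works in ways other than detaching to reference. TinkerPop has been trying to become completely consistent in this approach, but in a manner that doesn't break existing code on existing release lines. This is perhaps not the answer you were looking for, but at least it helps explain some of the reasoning and history behind why things are the way they are.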

score 1 · Accepted Answer

toList() executes the Gremlin query and packs the results into a list. Therefore, you cannot continue the traversal with inVertex().

To get the in vertices, you should run:

for v in g.E().inV().toList():
        print(v)

To get the edge properties and the properties of both vertices in a single query, you can use project:

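from gremlin_python.process.graph_traversal import __

# Parentheses let the chained steps span several lines in Python.
(g.E().project("values", "in", "out")
    .by(__.valueMap(True))
    .by(__.inV().valueMap(True))
    .by(__.outV().valueMap(True))
    .toList())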
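
Each item returned by this traversal is a map with the three keys named in project(). As a minimal sketch of reading one, the example below uses a hand-built dictionary in place of a live result. The sample values are illustrative, the string keys "id" and "label" are a simplification (a live valueMap(True) result may key them differently), and first and describe are hypothetical helper names.

```python
# Hand-built stand-in for one row of the project() result above.
# String "id"/"label" keys are an assumption made to keep the sketch standalone.
row = {
    "values": {"id": 11, "label": "created", "weight": 0.4},
    "in": {"id": 3, "label": "software", "name": ["lop"]},
    "out": {"id": 4, "label": "person", "name": ["josh"]},
}


def first(value):
    """Unwrap a single-element list; valueMap() wraps vertex property values in lists."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def describe(row):
    """Summarise one projected edge as 'out -label-> in (weight=...)'."""
    edge, head, tail = row["values"], row["in"], row["out"]
    return "{} -{}-> {} (weight={})".format(
        first(tail["name"]), edge["label"], first(head["name"]), edge["weight"])


print(describe(row))
```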

score 0 · Accepted Answer

Looking at the source code at https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/gremlin_python/structure/graph.py (see below), the following attributes can be accessed directly:

For all elements:

e.id
e.label

For edges:

e.inV
e.outV

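The bad news is that the properties have to be retrieved first, so getting at the id, label and properties in a single Python statement is not easy.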

class Element(object):
    def __init__(self, id, label):
        self.id = id
        self.label = label

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.id == other.id

    def __hash__(self):
        return hash(self.id)


class Vertex(Element):
    def __init__(self, id, label="vertex"):
        Element.__init__(self, id, label)

    def __repr__(self):
        return "v[" + str(self.id) + "]"


class Edge(Element):
    def __init__(self, id, outV, label, inV):
        Element.__init__(self, id, label)
        self.outV = outV
        self.inV = inV

    def __repr__(self):
        return "e[" + str(self.id) + "][" + str(self.outV.id) + "-" + self.label + "->" + str(self.inV.id) + "]"
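
As a rough illustration of what this means for the loop in the question, the sketch below rebuilds a single edge by hand from trimmed copies of the classes above and reads the endpoint ids through the inV and outV attributes. No server is involved; the ids and labels are taken from the sample output in the question, and the sketch assumes that an edge returned by a real traversal exposes the same attributes, as the listing suggests.

```python
# Trimmed copies of the classes listed above, so the sketch runs on its own.
class Element(object):
    def __init__(self, id, label):
        self.id = id
        self.label = label


class Vertex(Element):
    def __init__(self, id, label="vertex"):
        Element.__init__(self, id, label)


class Edge(Element):
    def __init__(self, id, outV, label, inV):
        Element.__init__(self, id, label)
        self.outV = outV
        self.inV = inV


# Hand-built stand-in for one item of g.E().toList(): edge 11, 4 -created-> 3.
e = Edge(11, Vertex(4, "person"), "created", Vertex(3, "software"))

# inV and outV are plain attributes holding Vertex references, not methods.
print(e.id, e.label, e.outV.id, e.inV.id)

# These classes define no inVertex() or property() methods.
print(hasattr(e, "inVertex"), hasattr(e, "property"))
```

So in the question's loop, e.inV.id takes the place of e.inVertex().id, while property values such as weight still have to be requested in the traversal itself.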
