I want my application to connect to two remote Gremlin Server/JanusGraph Server instances. Both use the same Cassandra database, which should give me high availability.

<dependency>
    <groupId>org.janusgraph</groupId>
    <artifactId>janusgraph-core</artifactId>
    <version>0.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>gremlin-driver</artifactId>
    <version>3.2.6</version>
</dependency>

The gremlin.yaml file:

hosts: [127.0.0.1,192.168.2.57]
port: 8182
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}

My service class has several methods, and each of them connects through a Client object:

public class GremlinServiceConcrete implements GremlinService {
...
..
public Set<Long> getImpactedComponentsIds (...) throws GremlinServiceException {
..
        Cluster cluster = gremlinCluster.getCluster();
        Client client = null;
        Set<Long> impactedIds = Sets.newHashSet();
        try {
            client = cluster.connect();
            binding = Maps.newLinkedHashMap();
..

In the GremlinCluster class I call the driver:

public class GremlinCluster {

    public static final int MIN_CONNECTION_POOL_SIZE = 2;
    public static final int MAX_CONNECTION_POOL_SIZE = 20;
    public static final int MAX_CONTENT_LENGTH = 65536000;

    private static Logger logger = LoggerFactory.getLogger(GremlinCluster.class);

    private String server;
    private Integer port;

    private Cluster cluster;

    public GremlinCluster(String server, Integer port) throws FileNotFoundException {
        this.server = Objects.requireNonNull(server);
        this.port = Objects.requireNonNull(port);
        this.cluster = init();
    }

    private Cluster init() throws FileNotFoundException {
        GryoMapper.Builder kryo = GryoMapper.build().addRegistry(JanusGraphIoRegistry.getInstance());
        MessageSerializer serializer = new GryoMessageSerializerV1d0(kryo);
        Cluster cluster = Cluster.build(new File("conf/driver-gremlin.yaml")).port(port)
                .serializer(serializer)
                .minConnectionPoolSize(MIN_CONNECTION_POOL_SIZE)
                .maxConnectionPoolSize(MAX_CONNECTION_POOL_SIZE)
                .maxContentLength(MAX_CONTENT_LENGTH).create();

        logger.debug(String.format("New cluster connected at %s:%s", server, port));
        return cluster;
    }

    public Cluster getCluster() {
        return cluster;
    }

    public void destroy() {
        try {
            cluster.close();
        } catch (Exception e) {
            logger.debug("Error closing cluster connection: " + e.toString());
        }
    }

}

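The application works fine when it connects to only one server. When it connects to both servers, it runs very slowly. If I stop one server, failover does not work correctly. I suspect the servers are being connected to in session mode. The TinkerPop documentation does not spell out how the code differs between the two modes.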

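Correction: the slowness was caused by Eclipse's debug mode. The application sends requests to both Gremlin Servers, so that part of the cluster functionality works fine.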

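The faulty behavior occurs when a server goes down. The application sends its requests to the other server. When the downed server is started again, this is not detected and the connection to it is not re-established.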

Gremlin Server output: (screenshot not reproduced)

GremlinCluster is a Spring bean (beans-services.xml):

<bean id="gremlinCluster" class="[Fully qualified name].GremlinCluster" scope="singleton" destroy-method="destroy">
    <constructor-arg name="server"><value>${GremlinServerHost}</value></constructor-arg>
    <constructor-arg name="port"><value>${GremlinServerPort}</value></constructor-arg>
</bean>

And in the properties file:

GremlinServerHost=[Fully qualified name]/config/gremlin.yaml
GremlinServerPort=8182

The GremlinCluster class:

import java.util.Objects;

import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.MessageSerializer;
import org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0;
import org.apache.tinkerpop.gremlin.structure.io.gryo.GryoMapper;
import org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.File;
import java.io.FileNotFoundException;

public class GremlinCluster {

    public static final int MIN_CONNECTION_POOL_SIZE = 2;
    public static final int MAX_CONNECTION_POOL_SIZE = 20;
    public static final int MAX_CONTENT_LENGTH = 65536000;

    private static Logger logger = LoggerFactory.getLogger(GremlinCluster.class);

    private String server;
    private Integer port;

    private Cluster cluster;

    public GremlinCluster(String server, Integer port) throws FileNotFoundException {
        this.server = Objects.requireNonNull(server);
        this.port = Objects.requireNonNull(port);
        this.cluster = init();
    }

    private Cluster init() throws FileNotFoundException {
        GryoMapper.Builder kryo = GryoMapper.build().addRegistry(JanusGraphIoRegistry.getInstance());
        MessageSerializer serializer = new GryoMessageSerializerV1d0(kryo);
        Cluster cluster = Cluster.build(new File(server)).port(port)
                .serializer(serializer)
                .minConnectionPoolSize(MIN_CONNECTION_POOL_SIZE)
                .maxConnectionPoolSize(MAX_CONNECTION_POOL_SIZE)
                .maxContentLength(MAX_CONTENT_LENGTH).create();

        logger.debug(String.format("New cluster connected at %s:%s", server, port));
        return cluster;
    }

    public Cluster getCluster() {
        return cluster;
    }

    public void destroy() {
        try {
            cluster.close();
        } catch (Exception e) {
            logger.debug("Error closing cluster connection: " + e.toString());
        }
    }

}

And an example of a query method (GremlinServiceConcrete):

@Override
    public Long getNeighborsCount(List<Long> componentIds) throws GremlinServiceException {
        // Check argument is right
        if (componentIds == null || componentIds.isEmpty()) {
            throw new GremlinServiceException("Cannot compute neighbors count with an empty list as argument");
        }

        Cluster cluster = gremlinCluster.getCluster();
        Client client = null;
        try {
            client = cluster.connect();
            String gremlin = "g.V(componentIds).both().dedup().count()";
            Map<String, Object> parameters = Maps.newHashMap();
            parameters.put("componentIds", componentIds);

            if (logger.isDebugEnabled()) logger.debug("Submitting query [ " + gremlin + " ] with binding [ " + parameters + "]");

            ResultSet resultSet = client.submit(gremlin, parameters);
            Result result = resultSet.one();
            return result.getLong();

        } catch (Exception e) {
            throw new GremlinServiceException("Error retrieving how many neighbors do vertices " + componentIds + " have: " + e.getMessage(), e);

        } finally {
            if (client != null) try { client.close(); } catch (Exception e) { /* NPE because connection was not initialized yet */ }
        }
    }

gremlin-server.yaml:

host: 127.0.0.1
port: 8182
scriptEvaluationTimeout: 600000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/janusgraph-cassandra.properties
}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math,org.janusgraph.core.schema.Mapping],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - {
      className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0,
      config: {
        bufferSize: 819200,
        ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]
      }
    }
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 4096000
maxContentLength: 65536000
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 655360

janusgraph-cassandra.properties:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=192.168.2.57,192.168.2.70,192.168.2.77
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5
#storage.cassandra.replication-strategy-class=org.apache.cassandra.locator.NetworkTopologyStrategy
#storage.cassandra.replication-strategy-options=dc1,2,dc2,1
storage.cassandra.read-consistency-level=QUORUM
storage.cassandra.write-consistency-level=QUORUM
ids.authority.conflict-avoidance-mode=GLOBAL_AUTO

1 Answer

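If I understand correctly, you are saying that when one Gremlin Server goes down, requests start routing exclusively to the server that is still up, but when the downed server comes back online the client does not recognize that it has recovered, so all requests keep flowing to the one server that stayed up the whole time. If that is right, I can't reproduce your problem, at least not on Gremlin Server 3.3.0 (I would not expect different behavior on 3.2.x, since I am not aware of any real changes to the driver in 3.3.0 that are not also present in 3.2.x).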

Your code does not fully show how you are testing. For my test, I used the Gremlin Console and did this:

gremlin> cluster = Cluster.build().addContactPoint("192.168.1.7").addContactPoint("192.168.1.6").create()
==>/192.168.1.7:8182, localhost/127.0.0.1:8182
gremlin> client = cluster.connect()
==>org.apache.tinkerpop.gremlin.driver.Client$ClusteredClient@1bd0b0e5
gremlin> (0..<100000).collect{client.submit("1+1").all().get()}.toList();[]
java.util.concurrent.ExecutionException: java.nio.channels.ClosedChannelException
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> (0..<100000).collect{client.submit("1+1").all().get()}.toList();[]

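The ClosedChannelException shows where I killed one server. I then noted, from the Gremlin Server logs, how many requests had been submitted to the server that stayed online. Next I restarted the server I had killed and restarted the flow of requests in the Gremlin Console. When I looked at the request counts for both servers, both had increased, which means the driver was able to detect that the downed server had come back online.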
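The shape of that test can be sketched as a small simulation. This is a toy model only, not the TinkerPop driver's implementation: the class `FailoverSimulation`, its methods, and the uniform random choice among available hosts are all assumptions made for illustration. It shows what the per-server request counts should look like if routing does recover once a host is marked as up again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy model of the kill/restart test (hypothetical, NOT the driver's real logic).
class FailoverSimulation {
    private final Map<String, Boolean> available = new LinkedHashMap<>();
    private final Map<String, Integer> requestCounts = new LinkedHashMap<>();
    private final Random random;

    FailoverSimulation(List<String> hosts, long seed) {
        for (String host : hosts) {
            available.put(host, true);
            requestCounts.put(host, 0);
        }
        this.random = new Random(seed);
    }

    void markDown(String host) { available.put(host, false); }

    void markUp(String host) { available.put(host, true); }

    // Routes one request to a randomly chosen available host and returns that host.
    String submit() {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : available.entrySet()) {
            if (entry.getValue()) {
                candidates.add(entry.getKey());
            }
        }
        if (candidates.isEmpty()) {
            throw new IllegalStateException("no hosts available");
        }
        String chosen = candidates.get(random.nextInt(candidates.size()));
        requestCounts.merge(chosen, 1, Integer::sum);
        return chosen;
    }

    // Number of requests routed to the given host so far.
    int count(String host) { return requestCounts.get(host); }

    public static void main(String[] args) {
        FailoverSimulation sim =
                new FailoverSimulation(Arrays.asList("192.168.1.7", "192.168.1.6"), 42L);

        sim.markDown("192.168.1.6");                 // "kill" one server
        for (int i = 0; i < 1000; i++) sim.submit(); // everything goes to the survivor
        System.out.println("while down: .7=" + sim.count("192.168.1.7")
                + " .6=" + sim.count("192.168.1.6"));

        sim.markUp("192.168.1.6");                   // "restart" it
        for (int i = 0; i < 1000; i++) sim.submit(); // both counts should now grow
        System.out.println("after restart: .7=" + sim.count("192.168.1.7")
                + " .6=" + sim.count("192.168.1.6"));
    }
}
```

If the count for the restarted host stays flat in a real test while the other keeps growing, that would support the behavior described in the question; if both grow, reconnection is working.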

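It is not clear from your question how you determined that the driver was not reconnecting, but I notice that you also create and destroy the Cluster object in a way that looks like it happens on every request to the getImpactedComponentsIds application service. You should really create the Cluster object only once and re-use it. It is an expensive object to create, because it starts up a number of network resource pools. This create/destroy approach may be why you are not seeing the reconnection.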
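The create-once-and-re-use advice can be sketched in plain Java. Everything here is hypothetical: `ExpensiveResource` is only a stand-in for an object that owns connection pools (such as the driver's Cluster), and `SharedResourceHolder` is an illustrative name, not part of any library. With the real driver, the factory would presumably wrap a `Cluster.build()...create()` chain like the one in the question.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for an expensive, pool-owning object.
class ExpensiveResource implements AutoCloseable {
    // Counts constructions so the demo can show how many instances were built.
    static final AtomicInteger CREATED = new AtomicInteger();
    private volatile boolean closed;

    ExpensiveResource() { CREATED.incrementAndGet(); }

    String query(String script) {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        return "result of " + script;
    }

    @Override
    public void close() { closed = true; }
}

// Builds the resource lazily on first use and hands out the same instance afterwards.
class SharedResourceHolder implements AutoCloseable {
    private final Supplier<ExpensiveResource> factory;
    private ExpensiveResource instance;

    SharedResourceHolder(Supplier<ExpensiveResource> factory) {
        this.factory = factory;
    }

    synchronized ExpensiveResource get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    // Closes the shared instance once, typically at application shutdown.
    @Override
    public synchronized void close() {
        if (instance != null) {
            instance.close();
            instance = null;
        }
    }
}

class SharedResourceDemo {
    public static void main(String[] args) {
        SharedResourceHolder holder = new SharedResourceHolder(ExpensiveResource::new);
        for (int i = 0; i < 3; i++) {
            // Each "request" re-uses the same underlying resource.
            System.out.println(holder.get().query("g.V().count()"));
        }
        System.out.println("instances created: " + ExpensiveResource.CREATED.get());
        holder.close();
    }
}
```

A singleton-scoped Spring bean with a destroy-method plays the same role as the holder here, provided the bean itself is not re-created per request.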

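Thinking about this further: although I can imagine a scenario in which the create/destroy approach to Cluster makes it look as if reconnection is not happening, the load-balancing approach in the driver should select a host at random at creation time. So unless you are extremely unlucky and the random selection goes to the same host in every test you run, you should see it connect to the previously downed server at least some of the time.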
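To put a rough number on "extremely unlucky", here is a small calculation. It assumes that each selection is independent and uniform across hosts, which the answer does not state explicitly, and the class and method names are made up for illustration. Under that assumption, the chance that one specific host is never picked in n selections among k hosts is (1 - 1/k)^n.

```java
// Back-of-the-envelope estimate, assuming independent, uniform host selection
// (an assumption for illustration, not a documented property of the driver).
class UnluckySelection {

    // Probability that `picks` selections among `hosts` hosts never choose one specific host.
    static double probabilityNeverChosen(int hosts, int picks) {
        if (hosts < 2 || picks < 0) {
            throw new IllegalArgumentException("need at least 2 hosts and a non-negative pick count");
        }
        return Math.pow(1.0 - 1.0 / hosts, picks);
    }

    public static void main(String[] args) {
        // Two servers, as in the question; vary the number of freshly created clusters.
        for (int picks : new int[] {1, 5, 10, 20}) {
            System.out.printf("picks=%d -> P(restarted host never chosen)=%.8f%n",
                    picks, probabilityNeverChosen(2, picks));
        }
    }
}
```

With two hosts the probability halves on every additional selection, so after ten independent selections it is already below one in a thousand.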

answered 2017-11-29T13:09:29.430