抽象的
我正在尝试在 Docker (v1.13.0) 中设置 Titan/Cassandra/Gremlin-Server 堆栈。我面临的问题是尝试在默认端口上连接到 Gremlin-Server 的应用程序8182
正在报告错误(详情如下)。
首先,这里是一些相关的版本信息:
- 卡桑德拉 v2.2.8
- 泰坦 v1.0.0 (Hadoop 1)
- 小精灵 3.2.3
设置
设置发生在一个Dockerfile
为了可重现。它假定 Cassandra 容器已经存在,运行的 acassandra.yaml
已start_rpc
设置为true
.
Dockerfile
如下:
FROM openjdk:alpine
ENV TITAN 'titan-1.0.0-hadoop1'
RUN apk update && apk add bash unzip && rm -rf /var/cache/apk/* \
&& adduser -S -s /bin/bash -D srg \
&& wget -O /tmp/$TITAN.zip http://s3.thinkaurelius.com/downloads/titan/$TITAN.zip \
&& unzip /tmp/$TITAN.zip -d /opt && ln -s /opt/$TITAN /opt/titan \
&& rm /tmp/*.zip \
&& chown -R srg /opt/$TITAN/ \
&& /opt/titan/bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3
COPY conf/gremlin-server/* /opt/$TITAN/conf/gremlin-server/
USER srg
WORKDIR /opt/titan
EXPOSE 8182
CMD ["bin/gremlin-server.sh", "conf/gremlin-server/srg.yaml"]
精明的读者会注意到我正在将自定义配置文件复制到容器中,即 Gremlin-Server 配置文件 ( srg.yaml
) 和 Titan 图形属性文件 ( srg.properties
)。
srg.yaml
host: localhost
port: 8182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
graph: conf/gremlin-server/srg.properties
}
plugins:
- aurelius.titan
scriptEngines: {
gremlin-groovy: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI],
scripts: [scripts/empty-sample.groovy]},
gremlin-jython: {},
gremlin-python: {},
nashorn: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000},
gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
srg.properties
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandrathrift
storage.hostname=cassandra # refers to the linked container
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
# Start elasticsearch inside the Titan JVM
index.search.backend=elasticsearch
index.search.directory=db/es
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true
执行
容器使用以下命令运行: docker run -ti --rm=true --link test.cassandra:cassandra -p 8182:8182 titan
.
这是 Gremlin-Server 的日志输出:
0 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer -
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
297 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Configuring Gremlin Server from conf/gremlin-server/srg.yaml
439 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics ConsoleReporter configured with report interval=180000ms
448 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
557 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics JmxReporter configured with domain= and agentId=
561 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
1750 [main] INFO com.thinkaurelius.titan.core.util.ReflectiveConfigOptionLoader - Loaded and initialized config classes: 12 OK out of 12 attempts in PT0.148S
1972 [main] INFO com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager - Closed Thrift connection pooler.
1990 [main] INFO com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration - Generated unique-instance-id=ac1100031-ad2d5ffa52e81
2026 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Configuring index [search]
2386 [main] INFO org.elasticsearch.node - [Lunatik] version[1.5.1], pid[1], build[5e38401/2015-04-09T13:41:35Z]
2387 [main] INFO org.elasticsearch.node - [Lunatik] initializing ...
2399 [main] INFO org.elasticsearch.plugins - [Lunatik] loaded [], sites []
6471 [main] INFO org.elasticsearch.node - [Lunatik] initialized
6472 [main] INFO org.elasticsearch.node - [Lunatik] starting ...
6477 [main] INFO org.elasticsearch.transport - [Lunatik] bound_address {local[1]}, publish_address {local[1]}
6507 [main] INFO org.elasticsearch.discovery - [Lunatik] elasticsearch/u2StmRW1RsyEHw561yoNFw
6519 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.cluster.service - [Lunatik] master {new [Lunatik][u2StmRW1RsyEHw561yoNFw][ad2d5ffa52e8][local[1]]{local=true}}, removed {[Lunatik][kKyL9UE-R123LLZTTrsVCw][ad2d5ffa52e8][local[1]]{local=true},}, reason: local-disco-initial_connect(master)
6908 [main] INFO org.elasticsearch.http - [Lunatik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.3:9200]}
6909 [main] INFO org.elasticsearch.node - [Lunatik] started
6923 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.gateway - [Lunatik] recovered [0] indices into cluster_state
7486 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.cluster.metadata - [Lunatik] [titan] creating index, cause [api], templates [], shards [5]/[1], mappings []
8075 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Initiated backend operations thread pool of size 4
8241 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Configuring total store cache size: 94787290
8641 [main] INFO com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog - Loaded unidentified ReadMarker start time 2017-01-21T16:31:28.750Z into com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller@3520958b
8642 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] was successfully configured via [conf/gremlin-server/srg.properties].
8643 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized Gremlin thread pool. Threads in pool named with pattern gremlin-*
14187 [main] INFO com.jcabi.manifests.Manifests - 108 attributes loaded from 264 stream(s) in 185ms, 108 saved, 3371 ignored: ["Agent-Class", "Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Date", "Build-Host", "Build-Id", "Build-Java-Version", "Build-Jdk", "Build-Job", "Build-Number", "Build-Time", "Build-Timestamp", "Build-Version", "Built-At", "Built-By", "Built-OS", "Built-On", "Built-Status", "Bundle-ActivationPolicy", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-Category", "Bundle-ClassPath", "Bundle-Classpath", "Bundle-Copyright", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-Localization", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-NativeCode", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Change", "Class-Path", "Created-By", "DynamicImport-Package", "Eclipse-AutoStart", "Eclipse-BuddyPolicy", "Eclipse-SourceReferences", "Embed-Dependency", "Embedded-Artifacts", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Git-Commit-Branch", "Git-Commit-Date", "Git-Commit-Hash", "Git-Committer-Email", "Git-Committer-Name", "Gradle-Version", "Gremlin-Lib-Paths", "Gremlin-Plugin-Dependencies", "Gremlin-Plugin-Paths", "Ignore-Package", "Implementation-Build", "Implementation-Build-Date", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Java-Vendor", "Java-Version", "Main-Class", "Main-class", "Manifest-Version", "Maven-Version", "Module-Email", "Module-Origin", "Module-Owner", "Module-Source", "Originally-Created-By", "Os-Arch", "Os-Name", "Os-Version", "Package", "Premain-Class", "Private-Package", "Require-Bundle", "Require-Capability", "Scm-Connection", "Scm-Revision", "Scm-Url", "Specification-Title", "Specification-Vendor", "Specification-Version", "Tool", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "implementation-version", "mode", "package", "url", "version"]
14842 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-jython ScriptEngine
15540 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded nashorn ScriptEngine
16076 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-python ScriptEngine
16553 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-groovy ScriptEngine
17410 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor - Initialized gremlin-groovy ScriptEngine with scripts/empty-sample.groovy
17410 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized GremlinExecutor and configured ScriptEngines.
17419 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - A GraphTraversalSource is now bound to [g] with graphtraversalsource[standardtitangraph[cassandrathrift:[cassandra]], standard]
17565 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17566 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17808 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
17811 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
17958 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
17959 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Channel started at port 8182.
1/21/17 4:34:20 PM =============================================================
-- Meters ----------------------------------------------------------------------
org.apache.tinkerpop.gremlin.server.GremlinServer.errors
count = 0
mean rate = 0.00 events/second
1-minute rate = 0.00 events/second
5-minute rate = 0.00 events/second
15-minute rate = 0.00 events/second
180564 [metrics-logger-reporter-thread-1] INFO org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics - type=METER, name=org.apache.tinkerpop.gremlin.server.GremlinServer.errors, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
症状
到目前为止,一切似乎都按预期工作。日志表明我能够加载srg.properties
数据结构并将其绑定到名为graph
.
当我尝试通过导出的端口连接到 Gremlin-Server 实例时出现问题8182
,例如使用gremlin-python
:
# executed via python 3.6.0 on the host machine, i.e. not inside of Docker
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','graph'))
产生以下异常...
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
<ipython-input-10-59ad504f29b4> in <module>()
----> 1 g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/','g'))
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gremlin_python/driver/driver_remote_connection.py in __init__(self, url, traversal_source, username, password, loop, graphson_reader, graphson_writer)
41 self._password = password
42 if loop is None: self._loop = ioloop.IOLoop.current()
---> 43 self._websocket = self._loop.run_sync(lambda: websocket.websocket_connect(self.url))
44 self._graphson_reader = graphson_reader or GraphSONReader()
45 self._graphson_writer = graphson_writer or GraphSONWriter()
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/ioloop.py in run_sync(self, func, timeout)
455 if not future_cell[0].done():
456 raise TimeoutError('Operation timed out after %s seconds' % timeout)
--> 457 return future_cell[0].result()
458
459 def time(self):
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/concurrent.py in result(self, timeout)
235 return self._result
236 if self._exc_info is not None:
--> 237 raise_exc_info(self._exc_info)
238 self._check_done()
239 return self._result
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/util.py in raise_exc_info(exc_info)
HTTPError: HTTP 599: Stream closed
怀疑此库特有的问题:
1)尝试连接到websocket端口nc
$ nc -z -v localhost 8182
found 0 associations
found 1 connections:
1: flags=82<CONNECTED,PREFERRED>
outif lo0
src ::1 port 58627
dst ::1 port 8182
rank info not available
TCP aux info available
Connection to localhost port 8182 [tcp/*] succeeded!
2) 尝试使用不同的客户端库连接到 Gremlin-Server,即go-gremlin
测试用例:
package main
import (
"fmt"
"log"
"github.com/go-gremlin/gremlin"
)
func main() {
if err := gremlin.NewCluster("ws://localhost:8182/gremlin"); err != nil {
log.Fatal(err)
}
data, err := gremlin.Query(`graph.V()`).Exec()
if err != nil {
log.Fatalf("Query error: %s", err)
}
fmt.Println(string(data))
}
输出:
$ go run cmd/test/main.go
2017/01/21 14:47:42 Query error: unexpected EOF
exit status 1
当前的结论和问题
从前面的测试中,我得出的结论是这是一个应用程序级别的问题(即websocket或ws协议级别的问题,而不是主机或容器网络堆栈的问题)。实际上,nc
报告套接字连接成功,但在 Python 和 Go 客户端库中,表面上都抱怨来自服务器的不适当(空)响应。
我尝试/gremlin
在 gremlin-python 和 go-gremlin 中从 websocket URL 中删除路径,但无济于事。
我的问题是:我从这里去哪里?任何建议或诊断路径将不胜感激!