
I'm trying to load data into Elasticsearch using jdbc-river, but I'm getting the error below. Can anyone tell me what's going on?

org.elasticsearch.index.mapper.MapperParsingException: object mapping for [foo] tried to parse as object, but got EOF, has a concrete value been provided to it?
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:467)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:515)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:371)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:400)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:153)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
[2014-03-19 22:06:06,672][INFO ][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] bulk [11790] success [100 items] [15ms]

Here is the river definition:

curl -XPUT 'localhost:9200/_river/my_river/_meta' -d '{
              "type" : "jdbc"
              , "jdbc" : {
                    "url": "jdbc:postgresql://domainname.com:5432/myapp"
                    , "user": "user"
                    , "password": "passwd"
                    , "sql": "select * from foo"
                    , "index": "myapp"
                    , "type": "foo"
                    }
              }'

There is no Elasticsearch mapping yet; maybe that's the problem. My understanding is that it gets mapped automatically, but I'm willing to add a mapping if necessary.
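In case an explicit mapping does turn out to be needed, this is a minimal sketch of what I'd try, assuming the official elasticsearch-py client (our app is Python). The field names are hypothetical placeholders, since only the column types of foo are listed below, not their names:

# A minimal sketch, assuming the official elasticsearch-py client.
# Field names are hypothetical placeholders -- substitute the real columns
# of the foo table, using the types from the schema listed below.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# Create the target index with an explicit mapping for type "foo"
# before the river runs, instead of relying on dynamic mapping.
es.indices.create(
    index="myapp",
    body={
        "mappings": {
            "foo": {
                "properties": {
                    "some_flag":   {"type": "boolean"},
                    "some_count":  {"type": "integer"},
                    "some_amount": {"type": "double"},
                    "some_label":  {"type": "string"},
                    "created_at":  {"type": "date"},
                }
            }
        }
    },
)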

Postgres table schema ("data_type", "is_nullable"):

"integer";"YES"
"boolean";"NO"
"boolean";"YES"
"character varying";"NO"
"timestamp with time zone";"YES"
"text";"YES"
"boolean";"NO"
"integer";"YES"
"integer";"YES"
"numeric";"YES"
"text";"YES"
"integer";"YES"
"numeric";"YES"
"numeric";"YES"
"numeric";"YES"
"character varying";"YES"
"character varying";"YES"
"date";"YES"
"numeric";"YES"
"numeric";"YES"
"numeric";"YES"
"character varying";"YES"
"character varying";"YES"
"character varying";"YES"
"character varying";"YES"
"boolean";"YES"
"integer";"YES"
"character varying";"YES"
"timestamp with time zone";"NO"
"timestamp with time zone";"NO"
"boolean";"YES"
"integer";"YES"
"character varying";"YES"
"numeric";"YES"
"integer";"YES"
"character varying";"YES"
"character varying";"YES"
"integer";"YES"
"integer";"NO"
"integer";"NO"

1 Answer


I ended up not using the river. Instead, I posted the documents from our application server to Elasticsearch through its API, using the Python client (our application is written in Python). That worked well. I used Python's multiprocessing module with 20 processes to improve the load time; it indexed roughly 28,000 documents in a few minutes.
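For reference, here is a minimal sketch of that approach, assuming the official elasticsearch-py client and psycopg2. The connection details and index/type names follow the river definition above; batch size and the exact table layout are assumptions:

# A minimal sketch of the approach, assuming the elasticsearch-py client
# and psycopg2. Connection settings mirror the river definition above.
from multiprocessing import Pool

import psycopg2
from psycopg2.extras import RealDictCursor
from elasticsearch import Elasticsearch, helpers


def fetch_batches(batch_size=500):
    """Yield lists of row dicts selected from the foo table."""
    conn = psycopg2.connect(host="domainname.com", port=5432, dbname="myapp",
                            user="user", password="passwd")
    cur = conn.cursor(cursor_factory=RealDictCursor)
    cur.execute("select * from foo")
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows
    conn.close()


def index_batch(rows):
    """Bulk-index one batch of rows into myapp/foo."""
    es = Elasticsearch(["localhost:9200"])  # one client per worker process
    actions = (
        {"_index": "myapp", "_type": "foo", "_source": dict(row)}
        for row in rows
    )
    helpers.bulk(es, actions)
    return len(rows)


if __name__ == "__main__":
    pool = Pool(processes=20)  # 20 worker processes, as described above
    total = sum(pool.map(index_batch, fetch_batches()))
    pool.close()
    pool.join()
    print("indexed %d documents" % total)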

I hope this helps!

answered 2014-10-27T16:28:43.530