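- I have a Hadoop ingestion process for my data (as described at https://druid.apache.org/docs/latest/ingestion/hadoop.html)
- The current druid indexer version is 0.14.2-incubating
- The data is TSV files on GCS.
I previously used an older version of the druid indexer without any problems. The error appeared after upgrading to the new version.
Some details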
Here is the parser section of my spec:
"parser": {
  "parseSpec": {
    "dimensionsSpec": {
      "spatialDimensions": [
        {
          "dimName": "geo",
          "dims": ["latitude", "longitude"]
        }
      ],
      "dimensionExclusions": [],
      "dimensions": [
        "ip_address",
        "radius",
        "confidence"
      ]
    },
    "timestampSpec": {
      "format": "millis",
      "column": "ts"
    },
    "columns": [
      "ts",
      "ip_address",
      "latitude",
      "longitude",
      "radius",
      "confidence"
    ],
    "format": "tsv"
  },
  "type": "lzo"
}
},
This section causes the following error:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.druid.cli.CliHadoopIndexer.run(CliHadoopIndexer.java:116)
at org.apache.druid.cli.Main.main(Main.java:118)
Caused by: java.lang.IllegalArgumentException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3459)
at shade.com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:3378)
at org.apache.druid.segment.indexing.DataSchema.getParser(DataSchema.java:126)
at org.apache.druid.indexer.HadoopDruidIndexerConfig.verify(HadoopDruidIndexerConfig.java:591)
at org.apache.druid.indexer.HadoopDruidIndexerJob.<init>(HadoopDruidIndexerJob.java:49)
at org.apache.druid.cli.CliInternalHadoopIndexer.run(CliInternalHadoopIndexer.java:124)
at org.apache.druid.cli.Main.main(Main.java:118)
... 6 more
Caused by: shade.com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:399)
at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:231)
at shade.com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:135)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:442)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
at shade.com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:518)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:463)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:378)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
at shade.com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:42)
at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3454)
... 12 more
Caused by: java.lang.IllegalArgumentException: column[geo] not in columns.
at shade.com.google.common.base.Preconditions.checkArgument(Preconditions.java:148)
at org.apache.druid.data.input.impl.DelimitedParseSpec.verify(DelimitedParseSpec.java:119)
at org.apache.druid.data.input.impl.DelimitedParseSpec.<init>(DelimitedParseSpec.java:63)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at shade.com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125)
at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:227)
... 33 more
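As far as I can see, the spec parser is trying to find the dimension `geo` among the entries in `columns`, but `geo` is a spatial dimension built from `latitude` and `longitude`, not a column in the file!
This is a very painful problem and it is affecting production. Does anyone know how to fix this error?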
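To make clear what I think is happening, here is a minimal Python sketch of the check that the error message `column[geo] not in columns` seems to imply. It is only my guess from the stack trace, not Druid's actual code. The function name `missing_columns` is hypothetical, and the assumption that the spatial dimension's own name is validated like a regular dimension is mine.

```python
# Hypothetical model of the validation implied by "column[geo] not in columns".
# This is NOT Druid's implementation, only a guess based on the stack trace.

def missing_columns(parse_spec):
    """Return dimension names (regular and spatial) that are absent from `columns`."""
    dims_spec = parse_spec["dimensionsSpec"]
    names = list(dims_spec.get("dimensions", []))
    # Assumption: the spatial dimension's own name ("geo") is checked
    # against `columns` in the same way as a regular dimension name.
    names += [s["dimName"] for s in dims_spec.get("spatialDimensions", [])]
    columns = set(parse_spec["columns"])
    return [name for name in names if name not in columns]


# The parseSpec from this post, copied as a Python dict.
parse_spec = {
    "dimensionsSpec": {
        "spatialDimensions": [{"dimName": "geo", "dims": ["latitude", "longitude"]}],
        "dimensionExclusions": [],
        "dimensions": ["ip_address", "radius", "confidence"],
    },
    "timestampSpec": {"format": "millis", "column": "ts"},
    "columns": ["ts", "ip_address", "latitude", "longitude", "radius", "confidence"],
    "format": "tsv",
}

print(missing_columns(parse_spec))  # ['geo']
```

Under that assumption, `geo` is the only name reported as missing, which matches the exception above.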