
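Recently I started studying MongoDB performance using AIS data. I used a collection of 19 million documents whose fields have the correct types as described in the data definition. I also created a new geoloc field in the same collection from the coordinates (lon, lat), with type Point.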
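For reference, a minimal sketch of how such a field could be derived. It assumes the source documents carry numeric fields named `lon` and `lat`; the helper name `toGeoPoint` is hypothetical, and the index name `geoloc-field` is taken from the explain output further down. GeoJSON stores longitude first, then latitude.

```javascript
// Build a GeoJSON Point from a longitude/latitude pair.
// GeoJSON coordinate order is [longitude, latitude].
function toGeoPoint(lon, lat) {
  if (typeof lon !== 'number' || typeof lat !== 'number' ||
      Number.isNaN(lon) || Number.isNaN(lat)) {
    throw new TypeError('lon and lat must be numbers');
  }
  if (lon < -180 || lon > 180 || lat < -90 || lat > 90) {
    throw new RangeError('coordinates out of range');
  }
  return { type: 'Point', coordinates: [lon, lat] };
}

// In the mongo shell the field and index could then be created like this
// (assumes fields named lon and lat; update pipelines need MongoDB 4.2+):
//   db.nari_dynamic.updateMany({}, [
//     { $set: { geoloc: { type: 'Point', coordinates: ['$lon', '$lat'] } } }
//   ]);
//   db.nari_dynamic.createIndex({ geoloc: '2dsphere' }, { name: 'geoloc-field' });
```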

The query under investigation is:

db.nari_dynamic.explain('executionStats').aggregate(
  [
    {
      "$match": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "Polygon",
              "coordinates": [ [ [ -5.00, 45.00 ], [ +0.00, 45.00 ], [ +0.00, 50.00 ], [ -5.00, 50.00 ], [ -5.00, 45.00 ] ] ]
            }
          }
        }
      }
    },
    { "$group": { "_id": "$sourcemmsi", "PointCount": { "$sum": 1 }, "MinDatePoint": { "$min": { "date": "$t3" } }, "MaxDatePoint": { "$max": { "date": "$t3" } } } },
    { "$sort": { "_id": 1 } },
    { "$limit": 100 },
    { "$project": { "_id": 1, "PointCount": 1, "MinDatePoint": 1, "MaxDatePoint": 1 } }
  ],
  { explain: true }
)

During investigation and testing I found the following:

  1. Without any index: 94s
  2. With the 2dsphere index on geoloc: 280s

Here are the execution stats without the index:

{ stages: 
   [ { '$cursor': 
        { queryPlanner: 
           { plannerVersion: 1,
             namespace: 'mscdata.nari_dynamic',
             indexFilterSet: false,
             parsedQuery: 
              { geoloc: 
                 { '$geoWithin': 
                    { '$geometry': 
                       { type: 'Polygon',
                         coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
             queryHash: '6E2EAB94',
             planCacheKey: '6E2EAB94',
             winningPlan: 
              { stage: 'PROJECTION_SIMPLE',
                transformBy: { sourcemmsi: 1, t3: 1, _id: 0 },
                inputStage: 
                 { stage: 'COLLSCAN',
                   filter: 
                    { geoloc: 
                       { '$geoWithin': 
                          { '$geometry': 
                             { type: 'Polygon',
                               coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
                   direction: 'forward' } },
             rejectedPlans: [] } } },
     { '$group': 
        { _id: '$sourcemmsi',
          PointCount: { '$sum': { '$const': 1 } },
          MinDatePoint: { '$min': { date: '$t3' } },
          MaxDatePoint: { '$max': { date: '$t3' } } } },
     { '$sort': { sortKey: { _id: 1 }, limit: 100 } },
     { '$project': 
        { _id: true,
          PointCount: true,
          MaxDatePoint: true,
          MinDatePoint: true } } ],
  serverInfo: 
   { host: 'ubuntu16',
     port: 27017,
     version: '4.4.1',
     gitVersion: 'ad91a93a5a31e175f5cbf8c69561e788bbc55ce1' },
  ok: 1 }

Here are the execution stats with the index:

{ stages: 
   [ { '$cursor': 
        { queryPlanner: 
           { plannerVersion: 1,
             namespace: 'mscdata.nari_dynamic',
             indexFilterSet: false,
             parsedQuery: 
              { geoloc: 
                 { '$geoWithin': 
                    { '$geometry': 
                       { type: 'Polygon',
                         coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
             queryHash: '6E2EAB94',
             planCacheKey: 'F35B194B',
             winningPlan: 
              { stage: 'PROJECTION_SIMPLE',
                transformBy: { sourcemmsi: 1, t3: 1, _id: 0 },
                inputStage: 
                 { stage: 'FETCH',
                   filter: 
                    { geoloc: 
                       { '$geoWithin': 
                          { '$geometry': 
                             { type: 'Polygon',
                               coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
                   inputStage: 
                    { stage: 'IXSCAN',
                      keyPattern: { geoloc: '2dsphere' },
                      indexName: 'geoloc-field',
                      isMultiKey: false,
                      multiKeyPaths: { geoloc: [] },
                      isUnique: false,
                      isSparse: false,
                      isPartial: false,
                      indexVersion: 2,
                      direction: 'forward',
                      indexBounds: 
                       { geoloc: 
                          [ '[936748722493063168, 936748722493063168]',
                            '[954763121002545152, 954763121002545152]',
                            '[959266720629915648, 959266720629915648]',
                            '[960392620536758272, 960392620536758272]',
                            '[960674095513468928, 960674095513468928]',
                            '[960744464257646592, 960744464257646592]',
                            '[960762056443691008, 960762056443691008]',
                            '[960766454490202112, 960766454490202112]',
                            '[960767554001829888, 960767554001829888]',
                            '[960767828879736832, 960767828879736832]',
                            '[960767897599213568, 960767897599213568]',
                            '[960767914779082752, 960767914779082752]',
                            '[960767919074050048, 960767919074050048]',
                            '[960767920147791872, 960767920147791872]',
                            '[960767920416227328, 960767920416227328]',
                            '[960767920483336192, 960767920483336192]',
                            '[960767920500113408, 960767920500113408]',
                            '[960767920504307712, 960767920504307712]',
                            '[960767920505356288, 960767920505356288]',
                            '[960767920505618432, 960767920505618432]',
                            '[960767920505683968, 960767920505683968]',
                            '[960767920505683969, 960767920505716735]',
                            '[1345075088707977217, 1345075088708009983]',
                            '[1345075088708009984, 1345075088708009984]',
                            '[1345075088708075520, 1345075088708075520]',
                            '[1345075088708337664, 1345075088708337664]',
                            '[1345075088709386240, 1345075088709386240]',
                            '[1345075088713580544, 1345075088713580544]',
                            '[1345075088730357760, 1345075088730357760]',
                            '[1345075088797466624, 1345075088797466624]',
                            '[1345075089065902080, 1345075089065902080]',
                            '[1345075090139643904, 1345075090139643904]',
                            '[1345075094434611200, 1345075094434611200]',
                            '[1345075111614480384, 1345075111614480384]',
                            '[1345075180333957120, 1345075180333957120]',
                            '[1345075455211864064, 1345075455211864064]',
                            '[1345076554723491840, 1345076554723491840]',
                            '[1345080952770002944, 1345080952770002944]',
                            '[1345098544956047360, 1345098544956047360]',
                            '[1345168913700225024, 1345168913700225024]',
                            '[1345450388676935680, 1345450388676935680]',
                            '[1346576288583778304, 1346576288583778304]',
                            '[1351079888211148800, 1351079888211148800]',
                            '[1369094286720630784, 1369094286720630784]',
                            '[5116089176692883456, 5116089176692883456]',
                            '[5170132372221329408, 5170132372221329408]',
                            '[5179139571476070401, 5179702521429491711]',
                            '[5179702521429491713, 5180265471382913023]',
                            '[5180265471382913024, 5180265471382913024]',
                            '[5183643171103440896, 5183643171103440896]',
                            '[5187020870823968768, 5187020870823968768]',
                            '[5187020870823968769, 5187583820777390079]',
                            '[5187583820777390081, 5188146770730811391]',
                            '[5188146770730811393, 5197153969985552383]',
                            '[5206161169240293376, 5206161169240293376]',
                            '[5218264593238851584, 5218264593238851584]',
                            '[5218264593238851585, 5218405330727206911]',
                            '[5218546068215562240, 5218546068215562240]',
                            '[5218546068215562241, 5219109018168983551]',
                            '[5219671968122404864, 5219671968122404864]',
                            '[5220234918075826177, 5220797868029247487]',
                            '[5220797868029247488, 5220797868029247488]',
                            '[5220938605517602817, 5221079343005958143]',
                            '[5221079343005958144, 5221079343005958144]',
                            '[5260204364768739328, 5260204364768739328]' ] } } } },
             rejectedPlans: [] } } },
     { '$group': 
        { _id: '$sourcemmsi',
          PointCount: { '$sum': { '$const': 1 } },
          MinDatePoint: { '$min': { date: '$t3' } },
          MaxDatePoint: { '$max': { date: '$t3' } } } },
     { '$sort': { sortKey: { _id: 1 }, limit: 100 } },
     { '$project': 
        { _id: true,
          MinDatePoint: true,
          MaxDatePoint: true,
          PointCount: true } } ],
  serverInfo: 
   { host: 'ubuntu16',
     port: 27017,
     version: '4.4.1',
     gitVersion: 'ad91a93a5a31e175f5cbf8c69561e788bbc55ce1' },
  ok: 1 }
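To compare the two plans above at a glance, the stage chain and the number of index intervals can be pulled out of the `winningPlan` object. This is only a sketch: the helper names `planStages` and `indexIntervalCount` are hypothetical, the sample plans are abbreviated copies of the output above (the index bounds are truncated to two intervals), and the walk follows only the single `inputStage` links that appear in these plans.

```javascript
// Walk a winningPlan from explain output and list its stages from top to bottom.
// Only follows the single `inputStage` links seen in the plans above.
function planStages(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    stages.push(node.stage);
  }
  return stages;
}

// Count the intervals an index scan has to visit; 0 if the plan has no IXSCAN.
function indexIntervalCount(plan) {
  for (let node = plan; node; node = node.inputStage) {
    if (node.stage === 'IXSCAN') {
      return Object.values(node.indexBounds)
        .reduce((n, bounds) => n + bounds.length, 0);
    }
  }
  return 0;
}

// Abbreviated copies of the two winning plans shown above.
const withoutIndex = {
  stage: 'PROJECTION_SIMPLE',
  inputStage: { stage: 'COLLSCAN', direction: 'forward' }
};
const withIndex = {
  stage: 'PROJECTION_SIMPLE',
  inputStage: {
    stage: 'FETCH',
    inputStage: {
      stage: 'IXSCAN',
      indexName: 'geoloc-field',
      indexBounds: {
        geoloc: [
          '[936748722493063168, 936748722493063168]',
          '[954763121002545152, 954763121002545152]'
        ]
      }
    }
  }
};

console.log(planStages(withoutIndex)); // [ 'PROJECTION_SIMPLE', 'COLLSCAN' ]
console.log(planStages(withIndex));    // [ 'PROJECTION_SIMPLE', 'FETCH', 'IXSCAN' ]
console.log(indexIntervalCount(withIndex)); // 2 for this truncated sample
```

The indexed plan adds a FETCH stage on top of the IXSCAN, so every matching index key is followed by a separate document lookup.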

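Of course, I understand that this is more complicated because the query includes a grouping stage, but the general expectation is that an index makes a query faster rather than slower, unless the index causes sorting inside the engine, which, unlike with geoNear, should not be the case here.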

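Also, MongoDB provides a full analysis of how query and index improvements affect queries, but there is not much information about geoWithin. MongoDB states that results are not sorted with geoWithin, so I have not found the cause of the delay. https://www.mongodb.com/blog/post/geospatial-performance-improvements-in-mongodb-3-2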

Any ideas or suggestions as to why the query is slower with the index?


1 Answer


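After a lot of investigation, it appears that once a query requests more than 70% of the dataset (95% in this case), using the index is slower than not using it.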
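A rough way to check where a given query falls relative to that threshold is to compare the number of matching documents with the collection size. The helper names below are hypothetical, and the 0.7 cut-off is the empirical figure from this investigation, not a MongoDB constant. The counts in the example are illustrative (95% of 19 million), not measured values.

```javascript
// Fraction of the collection that a filter matches.
function selectivity(matched, total) {
  if (total <= 0) throw new RangeError('total must be positive');
  return matched / total;
}

// True when the filter matches more than the given fraction of the collection.
// The 0.7 default is the empirical figure quoted above, not a MongoDB setting.
function matchesMostOfCollection(matched, total, threshold = 0.7) {
  return selectivity(matched, total) > threshold;
}

// In the mongo shell the two counts could be obtained with
// (polygon being the $geometry document from the question):
//   const total = db.nari_dynamic.estimatedDocumentCount();
//   const matched = db.nari_dynamic.countDocuments(
//     { geoloc: { $geoWithin: { $geometry: polygon } } });

// Illustrative numbers only: 95% of a 19,000,000-document collection.
console.log(matchesMostOfCollection(18050000, 19000000)); // true
console.log(matchesMostOfCollection(1000000, 19000000));  // false
```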

This also happens with indexes other than geospatial ones, for example simple indexes on numeric or descriptive columns (ship_name, ship_number, or timestamp).

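This happens because the database has to search the index keys and then look up the matching documents as well, which leads to a higher execution time than reading the documents alone.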
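The effect can be illustrated with a deliberately simplified cost model. Everything here is an assumption for illustration: the unit costs are invented, not measured, and this is not how the MongoDB planner computes anything. The point is only that a per-match cost (index key plus document fetch) eventually exceeds a flat per-document scan cost as the matched fraction grows.

```javascript
// Toy cost model: a collection scan reads every document once;
// an index scan reads one index key per match and then fetches that document.
// All unit costs are invented for illustration.
const COST = { seqDoc: 1, indexKey: 0.5, fetchDoc: 1 };

function collScanCost(totalDocs) {
  return totalDocs * COST.seqDoc;
}

function indexScanCost(totalDocs, fraction) {
  const matched = totalDocs * fraction;
  return matched * (COST.indexKey + COST.fetchDoc);
}

// Cheaper plan under this toy model for a given matched fraction.
function cheaperPlan(totalDocs, fraction) {
  return indexScanCost(totalDocs, fraction) < collScanCost(totalDocs)
    ? 'index'
    : 'collscan';
}

const total = 19000000;
for (const fraction of [0.05, 0.5, 0.95]) {
  console.log(fraction, cheaperPlan(total, fraction));
}
```

With these made-up costs the index wins at 5% and 50% and loses at 95%; different unit costs move the crossover point but not the overall shape.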

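On the other hand, this should not happen, because the MongoDB query planner should be able to detect this case and skip the index, keeping the number of keys accessed low.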
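Until the planner handles this on its own, one possible workaround is to bypass the index explicitly with a hint when the query is known to cover most of the collection. This is a sketch under assumptions: `aggregateOptions` is a hypothetical helper, the 0.7 threshold is the empirical figure from above, and the use of `{ $natural: 1 }` as an aggregate hint to force a collection scan should be verified against the server version in use.

```javascript
// Choose aggregate() options based on how much of the collection the
// $geoWithin filter is expected to match. Hypothetical helper, not a MongoDB API.
function aggregateOptions(expectedFraction, threshold = 0.7) {
  if (expectedFraction < 0 || expectedFraction > 1) {
    throw new RangeError('expectedFraction must be between 0 and 1');
  }
  // { $natural: 1 } asks the server to scan the collection instead of an index.
  // An empty options object leaves the choice to the query planner.
  return expectedFraction > threshold ? { hint: { $natural: 1 } } : {};
}

// Usage in the mongo shell (pipeline as in the question):
//   db.nari_dynamic.aggregate(pipeline, aggregateOptions(0.95));
```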

The issue has been raised with MongoDB and can be found here:

https://jira.mongodb.org/browse/SERVER-53709

Answered on 2021-01-21T13:43:13.113