最近,我开始用 AIS Data 研究 MongoDB 的性能。我使用了一个包含 19m 文档的集合,这些文档具有定义中描述的正确字段类型。我还在同一个集合中从坐标 (lon,lat) 创建了一个新的 geoloc 字段,其类型为:(Point)。
正在调查的查询是:
db.nari_dynamic.explain('executionStats').aggregate
(
[
{
"$match": {
"geoloc": {
"$geoWithin": {
"$geometry": {
"type" : "Polygon" ,
"coordinates": [ [ [ -5.00, 45.00 ], [ +0.00, 45.00 ], [ +0.00, 50.00 ], [ -5.00, 50.00 ], [ -5.00, 45.00 ] ] ]
}}}}
},
{ "$group": {"_id": "$sourcemmsi", "PointCount": {"$sum" : 1}, "MinDatePoint": {"$min" : {"date": "$t3" }}, "MaxDatePoint": {"$max" : {"date": "$t3" }} }},
{ "$sort": {"_id":1} },
{ "$limit":100 },
{ "$project": {"_id":1, "PointCount":1, "MinDatePoint":1, "MaxDatePoint":1} }
],
{ explain:true}
)
在调查和测试过程中,我发现了以下内容:
- 无任何指标:94s
- 使用 geoloc-2dsphere 索引:280s
以下是执行统计数据: 没有索引
{ stages:
[ { '$cursor':
{ queryPlanner:
{ plannerVersion: 1,
namespace: 'mscdata.nari_dynamic',
indexFilterSet: false,
parsedQuery:
{ geoloc:
{ '$geoWithin':
{ '$geometry':
{ type: 'Polygon',
coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
queryHash: '6E2EAB94',
planCacheKey: '6E2EAB94',
winningPlan:
{ stage: 'PROJECTION_SIMPLE',
transformBy: { sourcemmsi: 1, t3: 1, _id: 0 },
inputStage:
{ stage: 'COLLSCAN',
filter:
{ geoloc:
{ '$geoWithin':
{ '$geometry':
{ type: 'Polygon',
coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
direction: 'forward' } },
rejectedPlans: [] } } },
{ '$group':
{ _id: '$sourcemmsi',
PointCount: { '$sum': { '$const': 1 } },
MinDatePoint: { '$min': { date: '$t3' } },
MaxDatePoint: { '$max': { date: '$t3' } } } },
{ '$sort': { sortKey: { _id: 1 }, limit: 100 } },
{ '$project':
{ _id: true,
PointCount: true,
MaxDatePoint: true,
MinDatePoint: true } } ],
serverInfo:
{ host: 'ubuntu16',
port: 27017,
version: '4.4.1',
gitVersion: 'ad91a93a5a31e175f5cbf8c69561e788bbc55ce1' },
ok: 1 }
以下是执行统计数据: 使用索引
{ stages:
[ { '$cursor':
{ queryPlanner:
{ plannerVersion: 1,
namespace: 'mscdata.nari_dynamic',
indexFilterSet: false,
parsedQuery:
{ geoloc:
{ '$geoWithin':
{ '$geometry':
{ type: 'Polygon',
coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
queryHash: '6E2EAB94',
planCacheKey: 'F35B194B',
winningPlan:
{ stage: 'PROJECTION_SIMPLE',
transformBy: { sourcemmsi: 1, t3: 1, _id: 0 },
inputStage:
{ stage: 'FETCH',
filter:
{ geoloc:
{ '$geoWithin':
{ '$geometry':
{ type: 'Polygon',
coordinates: [ [ [ -5, 45 ], [ 0, 45 ], [ 0, 50 ], [ -5, 50 ], [ -5, 45 ] ] ] } } } },
inputStage:
{ stage: 'IXSCAN',
keyPattern: { geoloc: '2dsphere' },
indexName: 'geoloc-field',
isMultiKey: false,
multiKeyPaths: { geoloc: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds:
{ geoloc:
[ '[936748722493063168, 936748722493063168]',
'[954763121002545152, 954763121002545152]',
'[959266720629915648, 959266720629915648]',
'[960392620536758272, 960392620536758272]',
'[960674095513468928, 960674095513468928]',
'[960744464257646592, 960744464257646592]',
'[960762056443691008, 960762056443691008]',
'[960766454490202112, 960766454490202112]',
'[960767554001829888, 960767554001829888]',
'[960767828879736832, 960767828879736832]',
'[960767897599213568, 960767897599213568]',
'[960767914779082752, 960767914779082752]',
'[960767919074050048, 960767919074050048]',
'[960767920147791872, 960767920147791872]',
'[960767920416227328, 960767920416227328]',
'[960767920483336192, 960767920483336192]',
'[960767920500113408, 960767920500113408]',
'[960767920504307712, 960767920504307712]',
'[960767920505356288, 960767920505356288]',
'[960767920505618432, 960767920505618432]',
'[960767920505683968, 960767920505683968]',
'[960767920505683969, 960767920505716735]',
'[1345075088707977217, 1345075088708009983]',
'[1345075088708009984, 1345075088708009984]',
'[1345075088708075520, 1345075088708075520]',
'[1345075088708337664, 1345075088708337664]',
'[1345075088709386240, 1345075088709386240]',
'[1345075088713580544, 1345075088713580544]',
'[1345075088730357760, 1345075088730357760]',
'[1345075088797466624, 1345075088797466624]',
'[1345075089065902080, 1345075089065902080]',
'[1345075090139643904, 1345075090139643904]',
'[1345075094434611200, 1345075094434611200]',
'[1345075111614480384, 1345075111614480384]',
'[1345075180333957120, 1345075180333957120]',
'[1345075455211864064, 1345075455211864064]',
'[1345076554723491840, 1345076554723491840]',
'[1345080952770002944, 1345080952770002944]',
'[1345098544956047360, 1345098544956047360]',
'[1345168913700225024, 1345168913700225024]',
'[1345450388676935680, 1345450388676935680]',
'[1346576288583778304, 1346576288583778304]',
'[1351079888211148800, 1351079888211148800]',
'[1369094286720630784, 1369094286720630784]',
'[5116089176692883456, 5116089176692883456]',
'[5170132372221329408, 5170132372221329408]',
'[5179139571476070401, 5179702521429491711]',
'[5179702521429491713, 5180265471382913023]',
'[5180265471382913024, 5180265471382913024]',
'[5183643171103440896, 5183643171103440896]',
'[5187020870823968768, 5187020870823968768]',
'[5187020870823968769, 5187583820777390079]',
'[5187583820777390081, 5188146770730811391]',
'[5188146770730811393, 5197153969985552383]',
'[5206161169240293376, 5206161169240293376]',
'[5218264593238851584, 5218264593238851584]',
'[5218264593238851585, 5218405330727206911]',
'[5218546068215562240, 5218546068215562240]',
'[5218546068215562241, 5219109018168983551]',
'[5219671968122404864, 5219671968122404864]',
'[5220234918075826177, 5220797868029247487]',
'[5220797868029247488, 5220797868029247488]',
'[5220938605517602817, 5221079343005958143]',
'[5221079343005958144, 5221079343005958144]',
'[5260204364768739328, 5260204364768739328]' ] } } } },
rejectedPlans: [] } } },
{ '$group':
{ _id: '$sourcemmsi',
PointCount: { '$sum': { '$const': 1 } },
MinDatePoint: { '$min': { date: '$t3' } },
MaxDatePoint: { '$max': { date: '$t3' } } } },
{ '$sort': { sortKey: { _id: 1 }, limit: 100 } },
{ '$project':
{ _id: true,
MinDatePoint: true,
MaxDatePoint: true,
PointCount: true } } ],
serverInfo:
{ host: 'ubuntu16',
port: 27017,
version: '4.4.1',
gitVersion: 'ad91a93a5a31e175f5cbf8c69561e788bbc55ce1' },
ok: 1 }
当然,我知道这更复杂,因为查询具有分组功能,但我们的想法是,通常,除非索引导致引擎内部的排序与 geoNear 不同,否则我们会更快而不是更慢地获得索引。
此外,如果查询和索引改进如何对查询产生影响,MongoDB 提供了完整的分析,但对于 geoWithin 的信息并不多。MongoDB 声明结果没有使用 GeoWithin 排序,所以我没有找到延迟的原因。 https://www.mongodb.com/blog/post/geospatial-performance-improvements-in-mongodb-3-2
任何想法或意见,为什么使用索引的查询较慢?