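I have a MongoDB collection of image metadata with the following fields: camera_name (str), photographer_name (str), resolution (str), image_size (int, in MB, rounded), and timestamp (10-digit UNIX timestamp).

I only want to run 2 queries:

  1. Given a camera_name, return the records whose timestamp is <= 1639457261 (an example UNIX timestamp). The records must be sorted in descending order.
  2. Given a camera name, photographer name, resolution, image size, and timestamp, retrieve the matching records, sorted by timestamp in descending order.

I created 2 indexes: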

  1. { "camera_name": 1, "timestamp": -1 }
  2. { "camera_name": 1, "photographer_name": 1, "resolution": 1, "image_size": 1, "timestamp": -1}
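
For reference, the two index definitions above could be created from Python roughly as follows. This is only a sketch: it assumes the pymongo driver and a collection handle supplied by the caller, and `ensure_indexes` is an illustrative helper name, not part of pymongo.

```python
# Index key specifications from the question: 1 = ascending, -1 = descending.
CAMERA_TS_INDEX = [("camera_name", 1), ("timestamp", -1)]
FULL_INDEX = [
    ("camera_name", 1),
    ("photographer_name", 1),
    ("resolution", 1),
    ("image_size", 1),
    ("timestamp", -1),
]


def ensure_indexes(collection):
    """Create both indexes on a pymongo-style collection.

    `collection` is assumed to expose create_index(keys), as
    pymongo.collection.Collection does. Returns the index names
    reported by the driver.
    """
    return [collection.create_index(keys) for keys in (CAMERA_TS_INDEX, FULL_INDEX)]
```

Creating an index only changes how the server locates documents; it does not change which documents a query matches.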

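The first index works, but when I run the query for the second index, no records are returned. I am sure matching records exist in the collection, and I expect at least 10 records from the second query, but it returns an empty list.

Is there something wrong with how the indexes are configured? Thanks.

Here is some sample data: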

{"camera_name": "Nikon", "photographer_name": "Aaron", "resolution": "1920x1080", "image_size": "3", "timestamp": 1397232415}
{"camera_name": "Nikon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "4", "timestamp": 1717286853}
{"camera_name": "Nikon", "photographer_name": "Beth", "resolution": "720x480", "image_size": "1", "timestamp": 1503582086}
{"camera_name": "Nikon", "photographer_name": "Aaron", "resolution": "1920x1080", "image_size": "4", "timestamp": 1500628458}
{"camera_name": "Nikon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "6", "timestamp": 1407580951}
{"camera_name": "Canon", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1166049453}
{"camera_name": "Canon", "photographer_name": "Paul", "resolution": "720x480", "image_size": "2", "timestamp": 1086317569}
{"camera_name": "Canon", "photographer_name": "Beth", "resolution": "720x480", "image_size": "1", "timestamp": 1400638926}
{"camera_name": "Canon", "photographer_name": "Aaron", "resolution": "720x480", "image_size": "1", "timestamp": 1345248762}
{"camera_name": "Canon", "photographer_name": "Paul", "resolution": "1920x1080", "image_size": "5", "timestamp": 1462360853}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "720x480", "image_size": "2", "timestamp": 1815298047}
{"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "3", "timestamp": 1666493455}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1846677247}
{"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1630996389}
{"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "2", "timestamp": 1816829362}

The queries I ran:

  1. camera_name=Nikon and timestamp<=1503582086 should return 4 records
  2. camera_name='Fuji', photographer_name='Beth', resolution='1920x1080', image_size='5' and timestamp<=1900000000 should return 2 records, but I get 0 records
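
Written out as MongoDB filter documents, the two queries above might look like the sketch below. It assumes the pymongo driver; the helper names (`camera_query`, `full_query`, `find_newest_first`) are made up for illustration, and the question does not show the code that was actually executed.

```python
# Sort specification: newest timestamp first.
NEWEST_FIRST = [("timestamp", -1)]


def camera_query(camera_name, max_timestamp):
    """Filter for query 1: one camera, timestamp at or below a bound."""
    return {"camera_name": camera_name, "timestamp": {"$lte": max_timestamp}}


def full_query(camera_name, photographer_name, resolution, image_size, max_timestamp):
    """Filter for query 2: equality on four fields plus a timestamp bound.

    image_size is passed through unchanged, so its Python type (str or int)
    must be the same as the type stored in the documents.
    """
    return {
        "camera_name": camera_name,
        "photographer_name": photographer_name,
        "resolution": resolution,
        "image_size": image_size,
        "timestamp": {"$lte": max_timestamp},
    }


def find_newest_first(collection, query):
    """Run a filter against a pymongo-style collection, newest records first."""
    return list(collection.find(query).sort(NEWEST_FIRST))


# The two example queries from the list above.
q1 = camera_query("Nikon", 1503582086)
q2 = full_query("Fuji", "Beth", "1920x1080", "5", 1900000000)
```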
1 Answer

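Indexes do not "filter" results; they let you reach the data faster by scanning the index tree instead of scanning the raw documents.

This means that if the second query "returns nothing", it has nothing to do with any index you built; rather, the actual query you are using does not match any document in the database.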
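
One way to check this without involving indexes at all is to replay the filter over the sample documents in plain Python. The sketch below uses the five Fuji documents from the question and a hypothetical `run_query` helper. It illustrates one possible mismatch: the sample data stores image_size as a string ("5") although the field is described as an int, and MongoDB equality does not match the number 5 against the string "5". Whether the real query sends a number is an assumption; the question does not confirm it.

```python
# The five Fuji documents copied from the question's sample data
# (note that image_size is a string there).
DOCS = [
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "720x480", "image_size": "2", "timestamp": 1815298047},
    {"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "3", "timestamp": 1666493455},
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1846677247},
    {"camera_name": "Fuji", "photographer_name": "Beth", "resolution": "1920x1080", "image_size": "5", "timestamp": 1630996389},
    {"camera_name": "Fuji", "photographer_name": "Shane", "resolution": "720x480", "image_size": "2", "timestamp": 1816829362},
]


def run_query(docs, camera_name, photographer_name, resolution, image_size, max_timestamp):
    """Full scan with no index: equality on four fields, timestamp <= bound.

    Python's == treats "5" and 5 as different values, which mirrors how a
    MongoDB equality match treats a string versus a number.
    """
    hits = [
        d for d in docs
        if d["camera_name"] == camera_name
        and d["photographer_name"] == photographer_name
        and d["resolution"] == resolution
        and d["image_size"] == image_size
        and d["timestamp"] <= max_timestamp
    ]
    # Newest first, as the question requires.
    return sorted(hits, key=lambda d: d["timestamp"], reverse=True)


as_string = run_query(DOCS, "Fuji", "Beth", "1920x1080", "5", 1900000000)
as_int = run_query(DOCS, "Fuji", "Beth", "1920x1080", 5, 1900000000)
print(len(as_string), len(as_int))  # prints "2 0"
```

The same result would come back with or without an index, because the filter alone determines which documents match.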

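I will also mention that your second index could probably be smaller (depending on assumptions such as scale and data distribution), which would help update/insert performance and additionally reduce storage size. Judging by the look of the raw data, though, I don't think these are pressing concerns for you.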

Answered 2021-12-14T07:35:34.557