8

我有一个由 SQLAlchemy ORM 生成的查询。它应该检索特定课程的 stream_items,以及它们的所有部分——资源、内容文本块等,以及发布它们的用户。然而,这个查询似乎非常慢,在我们的生产数据库上花费几分钟,数据库中有大约 20,000 个用户,课程有 25 个左右的 stream_items,每个 stream_item 有几个内容文本块。请注意,数据库中除了用户之外几乎没有任何其他记录,因为我们导入了一堆用户但内容很少。

编辑:请注意,每个对象 id 都是 franklin_object 表的外键。

我试过查看查询,并确定了几个令人不安的位(查看 EXPLAIN 输出)

  1. 其中一项查找是“使用临时的;使用文件排序'。
  2. 用户表被命中两次,没有索引
  3. 内容文本块表被命中两次,没有索引

但是,我真的不知道该怎么办,尤其是后两个问题。

这是查询:

SELECT stream_item.id                               AS stream_item_id,
       franklin_object.id                           AS franklin_object_id,
       franklin_object.type                         AS franklin_object_type,
       franklin_object.uuid                         AS franklin_object_uuid,
       stream_item.parent_id                        AS stream_item_parent_id,
       stream_item.shown_at                         AS stream_item_shown_at,
       stream_item.author_id                        AS stream_item_author_id,
       stream_item.stream_sort_at                   AS stream_item_stream_sort_at,
       stream_item.PUBLIC                           AS stream_item_public,
       stream_item.created_at                       AS stream_item_created_at,
       stream_item.updated_at                       AS stream_item_updated_at,
       anon_1.content_text_block_text               AS anon_1_content_text_block_text,
       anon_2.resource_id                           AS anon_2_resource_id,
       anon_2.franklin_object_id                    AS anon_2_franklin_object_id,
       anon_2.franklin_object_type                  AS anon_2_franklin_object_type,
       anon_2.franklin_object_uuid                  AS anon_2_franklin_object_uuid,
       anon_2.resource_top_parent_resource          AS anon_2_resource_top_parent_resource,
       anon_2.resource_top_parent_id                AS anon_2_resource_top_parent_id,
       anon_2.resource_title                        AS anon_2_resource_title,
       anon_2.resource_url                          AS anon_2_resource_url,
       anon_2.resource_image                        AS anon_2_resource_image,
       anon_2.resource_created_at                   AS anon_2_resource_created_at,
       anon_2.resource_updated_at                   AS anon_2_resource_updated_at,
       franklin_object_1.id                         AS franklin_object_1_id,
       franklin_object_1.type                       AS franklin_object_1_type,
       franklin_object_1.uuid                       AS franklin_object_1_uuid,
       anon_1.content_text_block_id                 AS anon_1_content_text_block_id,
       anon_1.franklin_object_id                    AS anon_1_franklin_object_id,
       anon_1.franklin_object_type                  AS anon_1_franklin_object_type,
       anon_1.franklin_object_uuid                  AS anon_1_franklin_object_uuid,
       anon_1.content_text_block_position           AS anon_1_content_text_block_position,
       anon_1.content_text_block_franklin_object_id AS anon_1_content_text_block_franklin_object_id,
       anon_1.content_text_block_created_at         AS anon_1_content_text_block_created_at,
       anon_1.content_text_block_updated_at         AS anon_1_content_text_block_updated_at,
       anon_3.user_password                         AS anon_3_user_password,
       anon_3.user_auth_token                       AS anon_3_user_auth_token,
       anon_3.user_id                               AS anon_3_user_id,
       anon_3.franklin_object_id                    AS anon_3_franklin_object_id,
       anon_3.franklin_object_type                  AS anon_3_franklin_object_type,
       anon_3.franklin_object_uuid                  AS anon_3_franklin_object_uuid,
       anon_3.user_email                            AS anon_3_user_email,
       anon_3.user_auth_token_expiration            AS anon_3_user_auth_token_expiration,
       anon_3.user_active                           AS anon_3_user_active,
       anon_3.user_activation_token                 AS anon_3_user_activation_token,
       anon_3.user_first_name                       AS anon_3_user_first_name,
       anon_3.user_last_name                        AS anon_3_user_last_name,
       anon_3.user_image                            AS anon_3_user_image,
       anon_3.user_bio                              AS anon_3_user_bio,
       anon_3.user_aspirations                      AS anon_3_user_aspirations,
       anon_3.user_website                          AS anon_3_user_website,
       anon_3.user_resume                           AS anon_3_user_resume,
       anon_3.user_resume_name                      AS anon_3_user_resume_name,
       anon_3.user_primary_role                     AS anon_3_user_primary_role,
       anon_3.user_institution_id                   AS anon_3_user_institution_id,
       anon_3.user_birth_date                       AS anon_3_user_birth_date,
       anon_3.user_gender                           AS anon_3_user_gender,
       anon_3.user_graduation_year                  AS anon_3_user_graduation_year,
       anon_3.user_complete                         AS anon_3_user_complete,
       anon_3.user_masthead_y_position              AS anon_3_user_masthead_y_position,
       anon_3.user_masthead                         AS anon_3_user_masthead,
       anon_3.user_fb_access_token                  AS anon_3_user_fb_access_token,
       anon_3.user_fb_user_id                       AS anon_3_user_fb_user_id,
       anon_3.user_location                         AS anon_3_user_location,
       anon_3.user_created_at                       AS anon_3_user_created_at,
       anon_3.user_updated_at                       AS anon_3_user_updated_at,
       anon_4.content_text_block_text               AS anon_4_content_text_block_text,
       anon_4.content_text_block_id                 AS anon_4_content_text_block_id,
       anon_4.franklin_object_id                    AS anon_4_franklin_object_id,
       anon_4.franklin_object_type                  AS anon_4_franklin_object_type,
       anon_4.franklin_object_uuid                  AS anon_4_franklin_object_uuid,
       anon_4.content_text_block_position           AS anon_4_content_text_block_position,
       anon_4.content_text_block_franklin_object_id AS anon_4_content_text_block_franklin_object_id,
       anon_4.content_text_block_created_at         AS anon_4_content_text_block_created_at,
       anon_4.content_text_block_updated_at         AS anon_4_content_text_block_updated_at,
       anon_5.user_password                         AS anon_5_user_password,
       anon_5.user_auth_token                       AS anon_5_user_auth_token,
       anon_5.user_id                               AS anon_5_user_id,
       anon_5.franklin_object_id                    AS anon_5_franklin_object_id,
       anon_5.franklin_object_type                  AS anon_5_franklin_object_type,
       anon_5.franklin_object_uuid                  AS anon_5_franklin_object_uuid,
       anon_5.user_email                            AS anon_5_user_email,
       anon_5.user_auth_token_expiration            AS anon_5_user_auth_token_expiration,
       anon_5.user_active                           AS anon_5_user_active,
       anon_5.user_activation_token                 AS anon_5_user_activation_token,
       anon_5.user_first_name                       AS anon_5_user_first_name,
       anon_5.user_last_name                        AS anon_5_user_last_name,
       anon_5.user_image                            AS anon_5_user_image,
       anon_5.user_bio                              AS anon_5_user_bio,
       anon_5.user_aspirations                      AS anon_5_user_aspirations,
       anon_5.user_website                          AS anon_5_user_website,
       anon_5.user_resume                           AS anon_5_user_resume,
       anon_5.user_resume_name                      AS anon_5_user_resume_name,
       anon_5.user_primary_role                     AS anon_5_user_primary_role,
       anon_5.user_institution_id                   AS anon_5_user_institution_id,
       anon_5.user_birth_date                       AS anon_5_user_birth_date,
       anon_5.user_gender                           AS anon_5_user_gender,
       anon_5.user_graduation_year                  AS anon_5_user_graduation_year,
       anon_5.user_complete                         AS anon_5_user_complete,
       anon_5.user_masthead_y_position              AS anon_5_user_masthead_y_position,
       anon_5.user_masthead                         AS anon_5_user_masthead,
       anon_5.user_fb_access_token                  AS anon_5_user_fb_access_token,
       anon_5.user_fb_user_id                       AS anon_5_user_fb_user_id,
       anon_5.user_location                         AS anon_5_user_location,
       anon_5.user_created_at                       AS anon_5_user_created_at,
       anon_5.user_updated_at                       AS anon_5_user_updated_at,
       anon_6.stream_item_id                        AS anon_6_stream_item_id,
       anon_6.franklin_object_id                    AS anon_6_franklin_object_id,
       anon_6.franklin_object_type                  AS anon_6_franklin_object_type,
       anon_6.franklin_object_uuid                  AS anon_6_franklin_object_uuid,
       anon_6.stream_item_parent_id                 AS anon_6_stream_item_parent_id,
       anon_6.stream_item_shown_at                  AS anon_6_stream_item_shown_at,
       anon_6.stream_item_author_id                 AS anon_6_stream_item_author_id,
       anon_6.stream_item_stream_sort_at            AS anon_6_stream_item_stream_sort_at,
       anon_6.stream_item_public                    AS anon_6_stream_item_public,
       anon_6.stream_item_created_at                AS anon_6_stream_item_created_at,
       anon_6.stream_item_updated_at                AS anon_6_stream_item_updated_at
FROM   franklin_object
       INNER JOIN stream_item
               ON franklin_object.id = stream_item.id
       INNER JOIN (SELECT franklin_object.id                    AS franklin_object_id,
                          franklin_object.type                  AS franklin_object_type,
                          franklin_object.uuid                  AS franklin_object_uuid,
                          content_text_block.id                 AS content_text_block_id,
                          content_text_block.text               AS content_text_block_text,
                          content_text_block.position           AS content_text_block_position,
                          content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
                          content_text_block.created_at         AS content_text_block_created_at,
                          content_text_block.updated_at         AS content_text_block_updated_at
                   FROM   franklin_object
                          INNER JOIN content_text_block
                                  ON franklin_object.id = content_text_block.id) AS anon_1
               ON stream_item.id = anon_1.content_text_block_franklin_object_id
       LEFT OUTER JOIN contents_resources AS contents_resources_1
                    ON anon_1.content_text_block_id = contents_resources_1.content_id
       LEFT OUTER JOIN (SELECT franklin_object.id           AS franklin_object_id,
                               franklin_object.type         AS franklin_object_type,
                               franklin_object.uuid         AS franklin_object_uuid,
                               resource.id                  AS resource_id,
                               resource.top_parent_resource AS resource_top_parent_resource,
                               resource.top_parent_id       AS resource_top_parent_id,
                               resource.title               AS resource_title,
                               resource.url                 AS resource_url,
                               resource.image               AS resource_image,
                               resource.created_at          AS resource_created_at,
                               resource.updated_at          AS resource_updated_at
                        FROM   franklin_object
                               INNER JOIN resource
                                       ON franklin_object.id = resource.id) AS anon_2
                    ON anon_2.resource_id = contents_resources_1.resource_id
       LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_1
                    ON anon_1.content_text_block_id = contents_franklin_objects_1.content_id
       LEFT OUTER JOIN franklin_object AS franklin_object_1
                    ON franklin_object_1.id = contents_franklin_objects_1.franklin_object_id
       LEFT OUTER JOIN likers AS likers_1
                    ON stream_item.id = likers_1.post_id
       LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                               franklin_object.type       AS franklin_object_type,
                               franklin_object.uuid       AS franklin_object_uuid,
                               USER.id                    AS user_id,
                               USER.email                 AS user_email,
                               USER.password              AS user_password,
                               USER.auth_token            AS user_auth_token,
                               USER.auth_token_expiration AS user_auth_token_expiration,
                               USER.active                AS user_active,
                               USER.activation_token      AS user_activation_token,
                               USER.first_name            AS user_first_name,
                               USER.last_name             AS user_last_name,
                               USER.image                 AS user_image,
                               USER.bio                   AS user_bio,
                               USER.aspirations           AS user_aspirations,
                               USER.website               AS user_website,
                               USER.resume                AS user_resume,
                               USER.resume_name           AS user_resume_name,
                               USER.primary_role          AS user_primary_role,
                               USER.institution_id        AS user_institution_id,
                               USER.birth_date            AS user_birth_date,
                               USER.gender                AS user_gender,
                               USER.graduation_year       AS user_graduation_year,
                               USER.complete              AS user_complete,
                               USER.masthead_y_position   AS user_masthead_y_position,
                               USER.masthead              AS user_masthead,
                               USER.fb_access_token       AS user_fb_access_token,
                               USER.fb_user_id            AS user_fb_user_id,
                               USER.location              AS user_location,
                               USER.created_at            AS user_created_at,
                               USER.updated_at            AS user_updated_at
                        FROM   franklin_object
                               INNER JOIN USER
                                       ON franklin_object.id = USER.id) AS anon_3
                    ON anon_3.user_id = likers_1.user_id
       LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_2
                    ON franklin_object.id = contents_franklin_objects_2.franklin_object_id
       LEFT OUTER JOIN (SELECT franklin_object.id                    AS franklin_object_id,
                               franklin_object.type                  AS franklin_object_type,
                               franklin_object.uuid                  AS franklin_object_uuid,
                               content_text_block.id                 AS content_text_block_id,
                               content_text_block.text               AS content_text_block_text,
                               content_text_block.position           AS content_text_block_position,
                               content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
                               content_text_block.created_at         AS content_text_block_created_at,
                               content_text_block.updated_at         AS content_text_block_updated_at
                        FROM   franklin_object
                               INNER JOIN content_text_block
                                       ON franklin_object.id = content_text_block.id) AS anon_4
                    ON anon_4.content_text_block_id = contents_franklin_objects_2.content_id
       LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                               franklin_object.type       AS franklin_object_type,
                               franklin_object.uuid       AS franklin_object_uuid,
                               stream_item.id             AS stream_item_id,
                               stream_item.parent_id      AS stream_item_parent_id,
                               stream_item.shown_at       AS stream_item_shown_at,
                               stream_item.author_id      AS stream_item_author_id,
                               stream_item.stream_sort_at AS stream_item_stream_sort_at,
                               stream_item.PUBLIC         AS stream_item_public,
                               stream_item.created_at     AS stream_item_created_at,
                               stream_item.updated_at     AS stream_item_updated_at
                        FROM   franklin_object
                               INNER JOIN stream_item
                                       ON franklin_object.id = stream_item.id) AS anon_6
                    ON anon_6.stream_item_parent_id = franklin_object.id
       LEFT OUTER JOIN likers AS likers_2
                    ON anon_6.stream_item_id = likers_2.post_id
       LEFT OUTER JOIN (SELECT franklin_object.id         AS franklin_object_id,
                               franklin_object.type       AS franklin_object_type,
                               franklin_object.uuid       AS franklin_object_uuid,
                               USER.id                    AS user_id,
                               USER.email                 AS user_email,
                               USER.password              AS user_password,
                               USER.auth_token            AS user_auth_token,
                               USER.auth_token_expiration AS user_auth_token_expiration,
                               USER.active                AS user_active,
                               USER.activation_token      AS user_activation_token,
                               USER.first_name            AS user_first_name,
                               USER.last_name             AS user_last_name,
                               USER.image                 AS user_image,
                               USER.bio                   AS user_bio,
                               USER.aspirations           AS user_aspirations,
                               USER.website               AS user_website,
                               USER.resume                AS user_resume,
                               USER.resume_name           AS user_resume_name,
                               USER.primary_role          AS user_primary_role,
                               USER.institution_id        AS user_institution_id,
                               USER.birth_date            AS user_birth_date,
                               USER.gender                AS user_gender,
                               USER.graduation_year       AS user_graduation_year,
                               USER.complete              AS user_complete,
                               USER.masthead_y_position   AS user_masthead_y_position,
                               USER.masthead              AS user_masthead,
                               USER.fb_access_token       AS user_fb_access_token,
                               USER.fb_user_id            AS user_fb_user_id,
                               USER.location              AS user_location,
                               USER.created_at            AS user_created_at,
                               USER.updated_at            AS user_updated_at
                        FROM   franklin_object
                               INNER JOIN USER
                                       ON franklin_object.id = USER.id) AS anon_5
                    ON anon_5.user_id = likers_2.user_id
WHERE  stream_item.parent_id = 11
ORDER  BY stream_item.stream_sort_at DESC,
          anon_1.content_text_block_position,
          anon_6.stream_item_stream_sort_at DESC 

和解释输出:

ID   SELECT_TYPE   TABLE    POSSIBLY_KEYS KEY KEY_LEN REF ROWS EXTRA
1   PRIMARY <derived2>  ALL NULL    NULL    NULL    NULL    599 Using     temporary; Using filesort
1   PRIMARY stream_item eq_ref  PRIMARY,parent_id   PRIMARY 4   anon_1.content_text_block_franklin_object_id    1   Using where
1   PRIMARY contents_resources_1    ref content_id  content_id  5    anon_1.content_text_block_id   2   
1   PRIMARY <derived3>  ALL NULL    NULL    NULL    NULL    7   
1   PRIMARY contents_franklin_objects_1 ref content_id  content_id  5   anon_1.content_text_block_id    1   
1   PRIMARY franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.stream_item.id 1   Using where
1   PRIMARY franklin_object_1   eq_ref  PRIMARY PRIMARY 4   franklin.contents_franklin_objects_1.franklin_object_id 1   
1   PRIMARY likers_1    ref post_id post_id 5   franklin.stream_item.id 1
1   PRIMARY <derived4>  ALL NULL    NULL    NULL    NULL    136 
1   PRIMARY contents_franklin_objects_2 ref franklin_object_id  franklin_object_id  5   franklin.stream_item.id 1   
1   PRIMARY <derived5>  ALL NULL    NULL    NULL    NULL    599 
1   PRIMARY <derived6>  ALL NULL    NULL    NULL    NULL    608 
1   PRIMARY likers_2    ref post_id post_id 5   anon_6.stream_item_id   1   
1   PRIMARY <derived7>  ALL NULL    NULL    NULL    NULL    136 
7   DERIVED user    ALL PRIMARY NULL    NULL    NULL    133 
7   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.user.id    1   
6   DERIVED stream_item ALL PRIMARY NULL    NULL    NULL    709 
6   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.stream_item.id 1   
5   DERIVED content_text_block  ALL PRIMARY NULL    NULL    NULL    666 
5   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.content_text_block.id        1 
4   DERIVED user    ALL PRIMARY NULL    NULL    NULL    133 
4   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.user.id    1   
3   DERIVED resource    ALL PRIMARY NULL    NULL    NULL    7   
3   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.resource.id    1   
2   DERIVED content_text_block  ALL PRIMARY NULL    NULL    NULL    666 
2   DERIVED franklin_object eq_ref  PRIMARY PRIMARY 4   franklin.content_text_block.id  1   

如何更快地减少所有查询?还有什么其他方法可以加快速度?

franklin_objects 的设置方式是反模式吗?它的工作方式是 franklin_object 表有两列:id 和 type。然后每个类型都是一个表,主键是 franklin_object 的外键。

生成 sql 的代码类似于:

stream_item_query = StreamItem.query.options(db.joinedload('stream_items'),db.joinedload('contents_included_in'),db.joinedload('contents.resources'),db.joinedload('contents.objects'),db.subqueryload('likers'))

stream_items = stream_item_query.filter(StreamItem.parent_id == community_id).order_by(db.desc(StreamItem.stream_sort_at)).all()

4

1 回答 1

8

哇,这个有点伤脑筋。试图弄清楚查询在做什么,所有的表是什么,以及关系是乏味的。如果您有类似的经历,请首先提示您可能在这个单一查询中尝试做太多事情。

我的建议是重新考虑你的整个方法。

SQLAlchemy 是一个非常好的工具,我不会抨击它(或您选择的 mysql),但与大多数 ORM 工具一样,您需要考虑使用它们的成本。一个例子是这种franklin_object餐桌业务。这是反模式吗?是和否。从纯粹的面向对象的角度来看,这是有道理的。您可以通过在此表中查找来确定要查询哪些id表。从关系查询的角度来看,它的用途很小。我可以从您的查询中删除每个实例,franklin_object并且除了...中的列之外什么也不会丢失franklin_object。如果这是一个可行的选择,我会立即这样做。

让我们进一步研究这种联系franklin_object。查看子查询,它们都具有相同的形式:

  SELECT franklin_object.id           AS franklin_object_id,
         franklin_object.type         AS franklin_object_type,
         franklin_object.uuid         AS franklin_object_uuid,
         linked_table.id              AS linked_table_id,
         linked_table.col2            AS col2 --and more
  FROM   franklin_object
  INNER JOIN linked_table
         ON franklin_object.id = linked_table.id) AS anon_n

关于如何优化这部分查询,无论统计信息如何,数据库都没有太多信息可以继续。也许如果通过在子句中franklin_object指定限制,查询会更好。也许。typewhere

这对于 USER 表尤其成问题,因为该表有很多记录(如您所说)。由于您正在查询大部分列,并且优化器无法准确计算将检索多少行,因此执行全表扫描是有意义的。在你的情况下,两次。

另一个方面是所涉及的连接的绝对数量。如果我们取出所有franklin_object引用,仍然有 11 个连接。如果您的数据模型更具相关性,那并不可怕,但事实并非如此。生成的查询对数据库找出执行查询的最佳方式没有多大帮助,因此它做得不好。也许你可以通过提示等来缓解这种情况,但我敢打赌,从长远来看,这会咬你一口。

您正在使用 ORM 工具,所以请真正使用它。通过一次完成如此大的查询,您不会获得任何好处。为了性能,它可以分开一点。执行惰性检索以避免庞大、复杂的查询。我会说尝试,只是为了看看进展如何,懒洋洋地做每一件事。性能可能会好,我会说更好。不太好,甚至可能无法接受,但比在数据库翻腾时喝咖啡要好。

然后,开始将事物拼凑成更流线型的块。将逻辑上有意义的对象联系在一起,例如resourcecontents_resourcesstream_item另一个例子,likers和之间的连接user是重复的。做一个查询,让 SQLAlchemy 做它的事。

作为最后的手段,可以实现某种缓存机制。也许在某处非规范化表格。在一个变化缓慢、读取量大的系统上,您可以将这些表输入到另一个结构中,在该结构中查询是直接且快速的。也就是说,预先进行处理并将其存储在一个表中。

祝你好运

于 2012-07-20T18:56:31.510 回答