我在 Citus 上运行一个需要几个连接的查询。
https://explain.dalibo.com/plan/xCH
让我们不要考虑查询是 EAV 模型(实体属性值)这一事实,而是 sort/remote_scan 只需 100 个元素就需要大约 30 秒的事实。
PS:我也在没有 EAV 模型的情况下运行它,并且遇到了相同的 remote_scan 问题。
这对我来说看起来有点慢,因为其余的似乎都非常快。
Limit (cost=3821.93..3822.18 rows=100 width=64) (actual time=31798.709..31798.728 rows=100 loops=1)
Output: remote_scan."?column?", remote_scan."?column?_1"
-> Sort (cost=3821.93..4071.93 rows=100000 width=64) (actual time=31798.707..31798.715 rows=100 loops=1)
Output: remote_scan."?column?", remote_scan."?column?_1"
Sort Key: remote_scan."?column?_1", remote_scan."?column?"
Sort Method: top-N heapsort Memory: 35kB
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=100000 width=64) (actual time=31794.489..31794.942 rows=3200 loops=1)
Output: remote_scan."?column?", remote_scan."?column?_1"
-> Distributed Subplan 13_1
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=100000 width=274) (actual time=4.761..4.764 rows=26 loops=1)
Output: remote_scan.entity_id
Task Count: 32
Tasks Shown: One of 32
-> Task
Node: host=dbhost port=5433 dbname=EAVDB
-> Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 objects_eav (cost=0.56..3141.90 rows=388 width=21) (actual time=0.052..0.126 rows=1 loops=1)
Output: entity_id
Index Cond: ((objects_eav.attribute_group)::text = ''Author''::text)
Filter: ((objects_eav.json_value ->> ''UnixName''::text) = ANY (''{Brian}''::text[]))
Rows Removed by Filter: 134
Buffers: shared hit=58
Planning Time: 0.060 ms
Execution Time: 0.137 ms
Planning Time: 0.000 ms
Execution Time: 4.787 ms
-> Distributed Subplan 13_2
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=100000 width=274) (actual time=10.329..10.329 rows=0 loops=1)
Output: remote_scan.from_entity_id
Task Count: 32
Tasks Shown: One of 32
-> Task
Node: host=dbhost port=5433 dbname=EAVDB
-> Nested Loop (cost=1.25..3631.28 rows=1 width=21) (actual time=0.017..0.017 rows=0 loops=1)
Output: usertag.from_entity_id
Buffers: shared hit=10
-> Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 tag (cost=0.56..3052.29 rows=15 width=21) (actual time=0.017..0.017 rows=0 loops=1)
Output: tag.entity_id, tag.attribute_group, tag.attribute_version, tag.json_value, tag.changed_json_fields, tag.created_by, tag.created
Index Cond: ((tag.attribute_group)::text = ''Tag''::text)
Filter: ((tag.json_value ->> ''Name''::text) = ''Access''::text)
Rows Removed by Filter: 6
Buffers: shared hit=10
-> Index Scan using entity_relation_pkey_102072 on public.entity_relation_102072 usertag (cost=0.69..38.59 rows=1 width=21) (never executed)
Output: usertag.from_entity_id
Index Cond: (((usertag.from_entity_id)::text = (tag.entity_id)::text) AND ((usertag.relation_type)::text = ''UserTag''::text))
Filter: ((usertag.json_value ->> ''Value''::text) = ''private''::text)
Planning Time: 0.315 ms
Execution Time: 0.035 ms
Planning Time: 0.000 ms
Execution Time: 10.346 ms
Task Count: 32
Tasks Shown: One of 32
-> Task
Node: host=dbhost port=5433 dbname=EAVDB
-> Limit (cost=1099356.66..1099356.67 rows=1 width=64) (actual time=1917.101..1917.111 rows=100 loops=1)
Output: ((a.json_value ->> ''ID''::text)), ((a.json_value ->> ''RegistrationDate''::text))
Buffers: shared hit=614084 read=264 dirtied=2
-> Sort (cost=1099356.66..1099356.67 rows=1 width=64) (actual time=1917.100..1917.104 rows=100 loops=1)
Output: ((a.json_value ->> ''ID''::text)), ((a.json_value ->> ''RegistrationDate''::text))
Sort Key: ((a.json_value ->> ''RegistrationDate''::text)), ((a.json_value ->> ''ID''::text))
Sort Method: top-N heapsort Memory: 37kB
Buffers: shared hit=614084 read=264 dirtied=2
-> Nested Loop Semi Join (cost=1001.93..1099356.65 rows=1 width=64) (actual time=0.999..1911.488 rows=8222 loops=1)
Output: (a.json_value ->> ''ID''::text), (a.json_value ->> ''RegistrationDate''::text)
Buffers: shared hit=614084 read=264 dirtied=2
-> Nested Loop (cost=1001.13..1099349.91 rows=1 width=395) (actual time=0.905..1578.191 rows=14169 loops=1)
Output: a.json_value, a.entity_id, aei.entity_id
Inner Unique: true
Buffers: shared hit=535024 read=262 dirtied=2
-> Gather (cost=1000.56..1099345.22 rows=1 width=21) (actual time=0.316..130.349 rows=64054 loops=1)
Output: aei.entity_id
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=214973 read=43 dirtied=2
-> Parallel Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 aei (cost=0.56..1098345.12 rows=1 width=21) (actual time=0.370..541.166 rows=21351 loops=3)
Output: aei.entity_id
Index Cond: ((aei.attribute_group)::text = ''UserExtraInfo''::text)
Filter: ((((aei.json_value ->> ''Stat5''::text))::double precision > ''0.1''::double precision) AND (((aei.json_value ->> ''Stat5''::text))::double precision <= ''1000''::double precision) AND (((aei.json_value ->> ''NLogin''::text))::integer = 0) AND (((aei.json_value ->> ''Views''::text))::integer = 0) AND (((aei.json_value ->> ''Stat2''::text))::integer = 0) AND (((aei.json_value ->> ''Stat3''::text))::integer = 0) AND (((aei.json_value ->> ''Stat4''::text))::integer = 1))
Rows Removed by Filter: 494566
Buffers: shared hit=214973 read=43 dirtied=2
Worker 0: actual time=0.146..1461.372 rows=60358 loops=1
Buffers: shared hit=190199 read=41 dirtied=2
Worker 1: actual time=0.895..151.434 rows=3389 loops=1
Buffers: shared hit=19233 read=2
-> Index Scan using objects_eav_pkey_102040 on public.objects_eav_102040 a (cost=0.57..2.62 rows=1 width=374) (actual time=0.024..0.024 rows=0 loops=64054)
Output: a.entity_id, a.attribute_group, a.attribute_version, a.json_value, a.changed_json_fields, a.created_by, a.created
Index Cond: (((a.entity_id)::text = (aei.entity_id)::text) AND ((a.attribute_group)::text = ''User''::text))
Filter: ((NOT (hashed SubPlan 1)) AND ((a.json_value ->> ''RegistrationDate''::text) > ''2006-01-01''::text) AND ((a.json_value ->> ''RegistrationDate''::text) <= ''2020-06-01''::text) AND ((a.json_value ->> ''Referral''::text) = ANY (''{Google,Facebook}''::text[])) AND ((a.json_value ->> ''UserType''::text) = ''normal''::text) AND ((a.json_value ->> ''Status''::text) = ''Online''::text) AND ((a.json_value ->> ''LastLogin''::text) = ''1d''::text) AND ((a.json_value ->> ''Region''::text) = ''USA''::text))
Rows Removed by Filter: 1
Buffers: shared hit=320051 read=219
SubPlan 1
-> Function Scan on pg_catalog.read_intermediate_result intermediate_result_1 (cost=0.00..0.00 rows=1 width=274) (actual time=0.047..0.047 rows=0 loops=1)
Output: intermediate_result_1.from_entity_id
Function Call: read_intermediate_result(''13_2''::text, ''binary''::citus_copy_format)
-> Hash Semi Join (cost=0.80..6.30 rows=44 width=21) (actual time=0.023..0.023 rows=1 loops=14169)
Output: amp.from_entity_id
Hash Cond: ((amp.to_entity_id)::text = (intermediate_result.entity_id)::text)
Buffers: shared hit=79060 read=2
-> Index Only Scan using entity_relation_pkey_102072 on public.entity_relation_102072 amp (cost=0.69..5.47 rows=89 width=42) (actual time=0.022..0.022 rows=4 loops=14169)
Output: amp.from_entity_id, amp.relation_type, amp.to_entity_id, amp.to_entity_id_2
Index Cond: ((amp.from_entity_id = (a.entity_id)::text) AND (amp.relation_type = ''UserMaintainer''::text))
select a.json_value ->> 'ID', a.json_value ->> 'RegistrationDate'
from objects_eav a
where a.attribute_group = 'User'
and a.json_value ->> 'UserType' in ('normal')
and a.json_value ->> 'Status' in ('Online')
and a.json_value ->> 'LastLogin' = '1d'
and a.json_value ->> 'RegistrationDate' > '2006-01-01'
and a.json_value ->> 'Region' in ('USA')
and a.json_value ->> 'RegistrationDate' <= '2020-06-01'
and a.json_value ->> 'Referral' in ('Google', 'Facebook')
and a.entity_id in (select aei.entity_id
from objects_eav aei
where aei.attribute_group = 'UserExtraInfo'
and (aei.json_value ->> 'NLogin')::int = 0
and (aei.json_value ->> 'Views')::int in (0)
and (aei.json_value ->> 'Stat2')::int in (0)
and (aei.json_value ->> 'Stat3')::int in (0)
and (aei.json_value ->> 'Stat4')::int in (1)
and (aei.json_value ->> 'Stat5')::float > 0.100000
and (aei.json_value ->> 'Stat5')::float <= 1000.000000)
and a.entity_id not in (select UserTag.from_entity_id
from entity_relation UserTag
join objects_eav Tag on
Tag.attribute_group = 'Tag'
and UserTag.from_entity_id = Tag.entity_id
where UserTag.relation_type = 'UserTag'
and Tag.json_value ->> 'Name' = 'Access'
and UserTag.json_value ->> 'Value' = 'private')
and a.entity_id in (select amp.from_entity_id
from entity_relation amp
where amp.relation_type = 'UserMaintainer'
and amp.to_entity_id in (select entity_id
from objects_eav
where attribute_group = 'Author'
and json_value ->> 'UnixName' in
('Brian')))
ORDER BY a.json_value ->> 'RegistrationDate', a.json_value ->> 'ID'
limit 100;