I have the following table structure:
CREATE TABLE mytable (
id serial PRIMARY KEY,
data jsonb
);
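And the following data (partial, for brevity; note that the years are random and that the sales and expenses years do not line up with each other):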
INSERT INTO mytable (data)
VALUES
('{"employee": "Jim Romo",
"sales": [{"value": 10, "yr": "2012"}, {"value": 5, "yr": "2013"}, {"value": 40, "yr": "2014"}],
"expenses": [{"value": 2, "yr": "2007"}, {"value": 1, "yr": "2013"}, {"value": 3, "yr": "2014"}],
"product": "tv", "customer": "1", "updated": "20150501"
}'),
('{"employee": "Jim Romo",
"sales": [{"value": 10, "yr": "2012"}, {"value": 5, "yr": "2013"}, {"value": 41, "yr": "2014"}],
"expenses": [{"value": 2, "yr": "2009"}, {"value": 3, "yr": "2013"}, {"value": 3, "yr": "2014"}],
"product": "tv", "customer": "2", "updated": "20150312"
}'),
('{"employee": "Jim Romo",
"sales": [{"value": 20, "yr": "2012"}, {"value": 25, "yr": "2013"}, {"value": 33, "yr": "2014"}],
"expenses": [{"value": 8, "yr": "2012"}, {"value": 12, "yr": "2014"}, {"value": 5, "yr": "2009"}],
"product": "radio", "customer": "2", "updated": "20150311"
}'),
('{"employee": "Bill Baker",
"sales": [{"value": 1, "yr": "2010"}, {"value": 2, "yr": "2009"}, {"value": 3, "yr": "2014"}],
"expenses": [{"value": 3, "yr": "2011"}, {"value": 1, "yr": "2012"}, {"value": 7, "yr": "2013"}],
"product": "tv", "customer": "1", "updated": "20150205"
}'),
('{"employee": "Bill Baker",
"sales": [{"value": 10, "yr": "2010"}, {"value": 12, "yr": "2011"}, {"value": 3, "yr": "2014"}],
"expenses": [{"value": 4, "yr": "2011"}, {"value": 7, "yr": "2009"}, {"value": 4, "yr": "2013"}],
"product": "radio", "customer": "1", "updated": "20150204"
}'),
('{"employee": "Jim Romo",
"sales": [{"value": 22, "yr": "2009"}, {"value": 17, "yr": "2013"}, {"value": 35, "yr": "2014"}],
"expenses": [{"value": 14, "yr": "2011"}, {"value": 13, "yr": "2014"}, {"value": 8, "yr": "2013"}],
"product": "tv", "customer": "3", "updated": "20150118"
}');
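For each employee, I need to evaluate the most recently updated row and find the employees whose 2014 tv sales are greater than 30. From there, I need to filter further to employees whose average tv expenses are below 5. For the average, I need to take ALL of their tv expenses, not just those in the most recent row.
My expected output would be 1 row: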
 employee | customer | 2014 tv sales | 2013 avg tv expenses
----------+----------+---------------+----------------------
 Jim Romo | 1        | 40            | 4
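I can (sort of) do one or the other, but not both:
a. Get the rows with 2014 sales > 30 (but I can't restrict this to each employee's most recent "tv" sales ;( ):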
SELECT * FROM mytable WHERE (SELECT (a->>'value')::float FROM
(SELECT jsonb_array_elements(data->'sales') as a) as b
WHERE a @> json_object(ARRAY['yr', '2014'])::jsonb) > 30
b. Get the average 2013 expenses (this needs to be the average tv expenses):
SELECT avg((a->>'value')::numeric) FROM
(SELECT jsonb_array_elements(data->'expenses') as a FROM mytable) as b
WHERE a @> json_object(ARRAY['yr', '2013'])::jsonb
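For reference, here is a rough sketch of how the two pieces above might be combined. It is not a tuned or definitive solution and says nothing about performance on a large table. It assumes that `updated` is always a `YYYYMMDD` string (so text ordering matches date ordering), that "most recently updated row" means the latest row per employee regardless of product (which must then be a tv row), and that the average is taken over 2013 tv expenses only, as in the expected output header. The CTE names (`latest`, `tv_sales_2014`, `tv_expenses_2013`) are made up for illustration.

```sql
-- Sketch only; see the assumptions stated above.
WITH latest AS (
    -- Most recently updated row per employee (assumes YYYYMMDD text in "updated").
    SELECT DISTINCT ON (data->>'employee')
           data->>'employee' AS employee,
           data->>'customer' AS customer,
           data->>'product'  AS product,
           data->'sales'     AS sales
    FROM   mytable
    ORDER  BY data->>'employee', data->>'updated' DESC
),
tv_sales_2014 AS (
    -- Keep employees whose latest row is a tv row with 2014 sales > 30.
    SELECT l.employee, l.customer, (s->>'value')::numeric AS sales_2014
    FROM   latest l
    CROSS JOIN LATERAL jsonb_array_elements(l.sales) AS s
    WHERE  l.product = 'tv'
    AND    s->>'yr' = '2014'
    AND    (s->>'value')::numeric > 30
),
tv_expenses_2013 AS (
    -- Average 2013 expenses over ALL tv rows of each employee.
    SELECT m.data->>'employee'         AS employee,
           avg((e->>'value')::numeric) AS avg_expenses
    FROM   mytable m
    CROSS JOIN LATERAL jsonb_array_elements(m.data->'expenses') AS e
    WHERE  m.data->>'product' = 'tv'
    AND    e->>'yr' = '2013'
    GROUP  BY 1
)
SELECT t.employee,
       t.customer,
       t.sales_2014   AS "2014 tv sales",
       x.avg_expenses AS "2013 avg tv expenses"
FROM   tv_sales_2014 t
JOIN   tv_expenses_2013 x USING (employee)
WHERE  x.avg_expenses < 5;
```

On the sample data above, this should leave only Jim Romo (customer 1, 2014 tv sales of 40, average 2013 tv expenses of 4), since Bill Baker's latest row has 2014 sales of only 3.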
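EDIT: This could potentially be a very large table, so any comments on performance and indexing needs would be greatly appreciated, as I am new to both PostgreSQL and jsonb.
EDIT #2: I have tried both answers, but neither seems efficient for a large table ;(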