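I have a dataset that is populated with 0 to 48 measurements per day for each MAC address (one every half hour; for various reasons we sometimes cannot get every measurement). Normally I group by day and take the average of the measurements, but as the number of MAC addresses grows, we plan to build the average from fewer measurements. Here is an example of the query I run: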
select fmc.mac_address,
       inf.node,
       inf.uf,
       inf.cidade,
       date_trunc('day', fmc.data) as data,
       avg(inf.qoe) as qoe,
       avg(inf.qoe_download) as qoe_download,
       avg(inf.qoe_upload) as qoe_upload,
       avg(inf.qoe_packetloss) as qoe_packetloss,
       avg(inf.qoe_latency) as qoe_latency,
       avg(inf.qoe_jitter) as qoe_jitter
from fixa_medicoes_claro fmc
inner join public.inference_mac inf on fmc.mac_address = inf.mac
where fmc.data >= '2020-12-14'
  and fmc.mac_address in {}
group by fmc.mac_address,
         inf.node,
         inf.uf,
         inf.cidade,
         date_trunc('day', fmc.data)
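Now we want to query a smaller number of samples for each group, with one constraint: regardless of how many measurements a mac_address has on a given day, I want to query at most "n" of them, and they should be equally spaced in time. P.S.: the timestamp only records the day, so we don't know the hour/minute of a specific sample.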
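
To make the constraint concrete, here is a rough sketch of the kind of selection described above. It is not a tested solution. It assumes a hypothetical table `measurements` with one row per sample and a hypothetical column `seq` that preserves arrival order within a day (for example a serial id), since the stored timestamp carries no hour/minute. The literal 5 stands in for "n", and only `qoe` is averaged to keep it short.

```sql
-- Sketch only. "measurements", "day", "seq" and the value 5 (standing in for n)
-- are hypothetical names and values, not part of the original schema.
with bucketed as (
    select m.mac_address,
           m.day,
           m.seq,
           m.qoe,
           -- split each (mac_address, day) group into at most 5 ordered buckets
           ntile(5) over (partition by m.mac_address, m.day order by m.seq) as bucket
    from measurements m
),
ranked as (
    select b.*,
           -- position of each row inside its bucket
           row_number() over (partition by b.mac_address, b.day, b.bucket
                              order by b.seq) as pos
    from bucketed b
)
select r.mac_address,
       r.day,
       avg(r.qoe) as qoe
from ranked r
where r.pos = 1   -- keep the first row of every bucket: at most 5 rows per group
group by r.mac_address, r.day;
```

Because `ntile` produces buckets whose sizes differ by at most one row, the kept rows are roughly evenly spaced in row order, and a group with fewer than 5 rows keeps all of them. Even spacing in row order matches even spacing in time only when no half-hour measurements are missing, which this sketch cannot check without an intra-day timestamp.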