sql - Find the next closest number in PostgreSQL

Question

I'm running PostgreSQL 9.1.9 x64 with PostGIS 2.0.3 on Windows Server 2008 R2.

I have a table:

CREATE TABLE field_data.trench_samples (
   pgid SERIAL NOT NULL,
   trench_id TEXT,
   sample_id TEXT,
   from_m INTEGER
);

With some data in it:

INSERT INTO field_data.trench_samples (
   trench_id, sample_id, from_m
)
VALUES
   ('TR01', '1000001', 0),
   ('TR01', '1000002', 5),
   ('TR01', '1000003', 10),
   ('TR01', '1000004', 15),
   ('TR02', '1000005', 0),
   ('TR02', '1000006', 3),
   ('TR02', '1000007', 9),
   ('TR02', '1000008', 14);

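Now, what I'm interested in is the difference (a distance in meters, in this example) between a record's "from_m" and the "next" "from_m" for the same trench_id.

So, based on the data above, I'd like a query that produces the following table: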

pgid, trench_id, sample_id, from_m, to_m, interval
1, 'TR01', '1000001', 0, 5, 5
2, 'TR01', '1000002', 5, 10, 5
3, 'TR01', '1000003', 10, 15, 5
4, 'TR01', '1000004', 15, 20, 5
5, 'TR02', '1000005', 0, 3, 3
6, 'TR02', '1000006', 3, 9, 6
7, 'TR02', '1000007', 9, 14, 5
8, 'TR02', '1000008', 14, 19, 5

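Now, you're probably going to say: "Wait, how do we infer an interval length for the last sample in each trench, since there is no 'next' from_m to compare against?"

For the "ends" of the lines (sample_id 1000004 and 1000008), I'd like to reuse the interval length between the previous two samples.

Obviously, I have no idea how to tackle this in my current environment. Many thanks for your help.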

score 1 · Accepted Answer

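Here is how you can get the differences, reusing the previous interval for the last sample in each trench (the data shows this, though the text doesn't spell it out).

The logic is to apply lead() and lag() in two passes. First, lead() fetches the next from_m in the trench, which gives the interval; on the last row of each trench there is no next row, so the result is NULL. Then lag() fills in that boundary row by taking the interval from the previous row.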
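The first pass can be looked at in isolation with the sketch below. It inlines the TR02 rows from the question with VALUES so that it runs without the table, and the column name step_m is purely illustrative. The NULLs on the last row are what the lag() pass then has to fill in.

```sql
-- Sketch of the lead() pass only, on the TR02 rows from the question.
-- step_m is a hypothetical column name used for illustration.
select sample_id, from_m,
       lead(from_m) over (order by sample_id)          as to_m,
       lead(from_m) over (order by sample_id) - from_m as step_m
from (values ('1000005', 0), ('1000006', 3), ('1000007', 9), ('1000008', 14))
     as s(sample_id, from_m)
order by sample_id;
-- sample_id | from_m | to_m | step_m
-- 1000005   |      0 |    3 |      3
-- 1000006   |      3 |    9 |      6
-- 1000007   |      9 |   14 |      5
-- 1000008   |     14 | NULL |   NULL
```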

The rest is basically just arithmetic:

-- Inner query: lead() supplies the next from_m in the trench (NULL on the last row).
-- Outer query: lag() reuses the previous row's interval to fill in that last row.
select pgid, trench_id, sample_id, from_m,
       coalesce(to_m,
                from_m + lag(interval) over (partition by trench_id order by sample_id)
               ) as to_m,
       coalesce(interval, lag(interval) over (partition by trench_id order by sample_id)) as interval
from (select t.*,
             lead(from_m) over (partition by trench_id order by sample_id) as to_m,
             (lead(from_m) over (partition by trench_id order by sample_id) -
              from_m
             ) as interval
      from field_data.trench_samples t
     ) t;

Here is a SQLFiddle showing it working.
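The same idea can also be written with a CTE and a named window, so the interval is computed once and the window definition is not repeated. This is only a sketch: it assumes the table and data from the question, the names stepped and step_m are made up for illustration, and it orders by from_m rather than sample_id on the assumption that depth, not the sample label, defines what "next" means.

```sql
-- Sketch only: same lead()/lag() technique, restructured with a CTE.
-- Assumption: ordering by from_m defines the "next" sample in a trench.
with stepped as (
    select t.*,
           lead(from_m) over w - from_m as step_m  -- NULL on the last row of each trench
    from field_data.trench_samples t
    window w as (partition by trench_id order by from_m)
)
select pgid, trench_id, sample_id, from_m,
       from_m + coalesce(step_m, lag(step_m) over w) as to_m,
       coalesce(step_m, lag(step_m) over w)          as "interval"
from stepped
window w as (partition by trench_id order by from_m)
order by pgid;
```

A trench with only one sample has no previous interval to borrow, so both to_m and interval come back as NULL for it.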
