I have data in the following format:
Prog_Id  Low_latency  Max_Latency
A        1            4
A        -1           5
A        3            8
A        11           12
A        12           15
Now I would like the output to be:
Prog_Id  Low_latency  Max_Latency
A        -1           8
A        11           15
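Basically, I want to merge the overlapping ranges. Can anyone help me write the code? If there is a solution that uses the OVERLAPS clause, I can manage by using time in place of latency.
Thanks, Rishabh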
My initial answer did not always work. It now looks like this:
select distinct *
from (
  select
    t1.Prog_ID,
    min(least(t2.l, t1.Low_latency)) as Low_latency,
    max(greatest(t2.g, t1.Max_Latency)) as Max_Latency
  from yourtable t1 inner join (
    -- every pair of overlapping rows, widened to the range covering both
    select
      t1.Prog_ID,
      least(t1.Low_latency, t2.Low_latency) l,
      greatest(t1.Max_Latency, t2.Max_Latency) g
    from
      yourtable t1 inner join yourtable t2
      on t1.Prog_ID=t2.Prog_ID
      and t1.Low_latency<=t2.Max_Latency
      and t1.Max_Latency>=t2.Low_latency
  ) t2
  on t1.Prog_ID=t2.Prog_ID
  and t1.Low_latency<=t2.g
  and t1.Max_Latency>=t2.l
  group by t1.Prog_ID, t1.Low_latency, t1.Max_Latency
) s
Please see it here. It is MySQL code, but it can be converted to other DBMSs.
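It depends on which database server (DBMS) you use, but there is no simple solution. It might be possible with a stored procedure, but I would rather do it in a programming language (which language do you use?).
After running some tests with other people's queries, I found no way to do it in SQL.
Here is something similar to a map-reduce in Java: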
import java.sql.*;
import java.util.*;

public class YourData {
    double Low_latency;
    double Max_Latency;
    int Prog_Id;
    // add getter and setter here

    public YourData(int progId, double low, double max) {
        this.Prog_Id = progId;
        this.Low_latency = low;
        this.Max_Latency = max;
    }

    // If data overlaps this range, widen this range to cover both and return true.
    public boolean testOverlapping(YourData data) {
        if (this.Low_latency <= data.Max_Latency && data.Low_latency <= this.Max_Latency) {
            this.Low_latency = Math.min(this.Low_latency, data.Low_latency);
            this.Max_Latency = Math.max(this.Max_Latency, data.Max_Latency);
            return true;
        }
        return false;
    }
}

// conn is assumed to be an open java.sql.Connection
String sql = "SELECT t1.Prog_Id, t1.Low_latency, t1.Max_Latency FROM yourtable t1";

// Group the rows by Prog_Id
Map<Integer, List<YourData>> values = new HashMap<>();
try (Statement st = conn.createStatement(); ResultSet rs = st.executeQuery(sql)) {
    while (rs.next()) {
        YourData row = new YourData(rs.getInt("Prog_Id"),
                rs.getDouble("Low_latency"), rs.getDouble("Max_Latency"));
        values.computeIfAbsent(row.Prog_Id, k -> new ArrayList<>()).add(row);
    }
}

for (List<YourData> list : values.values()) {
    // Do the map-reduce for each Prog_Id: repeat until a full pass merges nothing
    boolean foundOverlapping;
    do {
        foundOverlapping = false;
        for (int i = 0; i < list.size(); i++) {
            YourData cur = list.get(i);
            // Walk backwards so that removing an element does not skip the next one
            for (int x = list.size() - 1; x >= 0; x--) {
                if (x != i && cur.testOverlapping(list.get(x))) {
                    foundOverlapping = true;
                    list.remove(x);
                    if (x < i) {
                        i--; // cur moved one position to the left
                    }
                }
            }
        }
    } while (foundOverlapping);
}
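As an alternative to merging pairs repeatedly, the rows for one Prog_Id can be sorted by Low_latency and merged in a single pass. This is only a sketch under assumptions: the class name MergeIntervals and the double[]{low, max} representation are made up for illustration, and ranges that merely touch (11..12 and 12..15) are treated as overlapping, which is what the expected output in the question suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergeIntervals {
    // Merges overlapping ranges of a single Prog_Id.
    // Each range is a double[]{Low_latency, Max_Latency} (hypothetical layout).
    public static List<double[]> merge(List<double[]> intervals) {
        List<double[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingDouble((double[] iv) -> iv[0]));

        List<double[]> merged = new ArrayList<>();
        for (double[] iv : sorted) {
            double[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && iv[0] <= last[1]) {
                // Starts inside (or at the end of) the previous range: extend it
                last[1] = Math.max(last[1], iv[1]);
            } else {
                // Gap before this range: start a new merged range
                merged.add(new double[] {iv[0], iv[1]});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<double[]> rows = Arrays.asList(
            new double[] {1, 4}, new double[] {-1, 5}, new double[] {3, 8},
            new double[] {11, 12}, new double[] {12, 15});
        for (double[] iv : merge(rows)) {
            System.out.println(iv[0] + " " + iv[1]);
        }
    }
}
```

With the sample rows from the question this should give the two ranges -1..8 and 11..15. Because the input is sorted first, each range only needs to be compared with the last merged one.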
Assuming you want to group by the lower latency in the pattern -infinity...9, 10...19, 20...29, you will need something like:
SELECT
Prog_Id,
MIN(Low_latency) AS Low_latency,
MAX(Max_Latency) AS Max_Latency
FROM
your_table_name
GROUP BY
Prog_Id,
IF(FLOOR(Low_latency/10)<0,0,FLOOR(Low_latency/10))
Obviously the last line will depend on the RDBMS used, but it should be very similar in most cases.
You may also want to add an ORDER BY clause.
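To make the bucketing in the GROUP BY expression concrete, here is a rough Java sketch of the same idea for the rows of one Prog_Id. The class name LatencyBuckets and the double[]{low, max} row layout are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

public class LatencyBuckets {
    // Mirrors IF(FLOOR(Low_latency/10)<0,0,FLOOR(Low_latency/10)):
    // everything below 10 lands in bucket 0, 10..19 in bucket 1, and so on.
    public static int bucket(double lowLatency) {
        return (int) Math.max(0, Math.floor(lowLatency / 10));
    }

    // Returns bucket -> {MIN(Low_latency), MAX(Max_Latency)} for one Prog_Id.
    // Each row is a double[]{Low_latency, Max_Latency} (hypothetical layout).
    public static Map<Integer, double[]> group(double[][] rows) {
        Map<Integer, double[]> groups = new TreeMap<>();
        for (double[] row : rows) {
            int key = bucket(row[0]);
            double[] g = groups.get(key);
            if (g == null) {
                groups.put(key, new double[] {row[0], row[1]});
            } else {
                g[0] = Math.min(g[0], row[0]);
                g[1] = Math.max(g[1], row[1]);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        double[][] rows = {{1, 4}, {-1, 5}, {3, 8}, {11, 12}, {12, 15}};
        for (Map.Entry<Integer, double[]> e : group(rows).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue()[0] + " " + e.getValue()[1]);
        }
    }
}
```

For the sample rows this should give -1..8 for bucket 0 and 11..15 for bucket 1. Note that fixed buckets are not the same thing as overlap merging: rows such as 8..12 and 11..15 overlap but fall into different buckets, so they would stay separate.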