0

桌子:

输入
日期数据,版本标识符
20130909 dsfcat 0 , 7
20130909 dosdfg 1 , 7
20130909 dsdfog 0 , 8
20130909 dsdfog 3 , 8
20130910 afsdfds 0 , 8
130910 afsdfds 2 , 8
输出
日期数据,版本标识符
20130909 dosdfg 0 , 7
20130909 dsdfog 0 , 8
20130910 afsdfds 0, 8
  1. 我有一张表,每天存储多个版本的数据。
  2. 每个数据不需要具有所有版本。主键是日期、标识符、版本

目标:对于每个日期的每个标识符(可以由多行组成),将最大版本设置为 0 并删除所有其他记录。

我试图避免使用连接。我可以用 perl、bash 脚本来补充它

编辑:1。也许我可以删除所有不是最高版本的值,然后将八位版本设置为 0。

4

2 回答 2

2

这可以完全通过 SQL 来完成:

sqlite> SELECT * FROM tbl;
20130909|dsfcat|0|7
20130909|dosdfg|1|7
20130909|dsdfog|0|8
20130909|dsdfog|3|8
20130910|afsdfds|0|8
20130910|afsdfds|2|8

sqlite> DELETE FROM
   ...>   tbl
   ...> WHERE
   ...>   Date||':'||identifier||':'||version NOT IN (
   ...>     SELECT
   ...>       Date||':'||identifier||':'||version
   ...>     FROM
   ...>       tbl
   ...>     GROUP BY
   ...>       identifier,Date
   ...>     HAVING
   ...>       version=MAX(version)
   ...>   );
sqlite> UPDATE tbl SET version=0;

sqlite> SELECT * FROM tbl;
20130909|dosdfg|0|7
20130909|dsdfog|0|8
20130910|afsdfds|0|8

这样做是删除其键字段与包含每个标识符/日期的最高版本号的记录相关联的键列表不匹配的所有记录,然后将其余记录更新为版本号为 0。

补充建议:

  1. I don't know how big this database is going to be, but you might want to add a "processed" field that indicates whether a given record has been through the daily pruning process or not; otherwise the set of data the query runs through will always contain already-processed records which will grow over time.
  2. If you changed your current primary key definition to a unique index and then added a single primary key field (Such as an auto incrementing number) it would free the query from needing to concatenate all the key fields together in the DELETE and sub-SELECT.
于 2013-10-05T00:35:35.793 回答
0

以下代码是否符合您的要求?

#!/usr/bin/perl

use strict;
use warnings;

open my $fh, "data.in" or die "$!";

my %table;
while (<$fh>) {
    chomp;
    my ($date, $data, $version, $identifier) = split;

    if (not defined $table{$date}{$identifier}{version}
            or $table{$date}{$identifier}{version} < $version) {
        $table{$date}{$identifier}{version} = $version;
        $table{$date}{$identifier}{data} = $data;
    }
}

close $fh;

foreach my $date (sort keys %table) {
    foreach my $identifier (sort keys $table{$date}) {
        my $data = $table{$date}{$identifier}{data};
        $table{$date}{$identifier}{version} = 0;
        my $version = $table{$date}{$identifier}{version};
        print "$date\t$data\t$version\t$identifier\n";
    }
}

如果您想删除除存储在哈希中的记录之外的所有记录,我认为您可以执行类似的操作

my $query = "DELETE FROM $table_name WHERE date = $date AND identifier = $identifier AND data != $data";

在内foreach循环内部。但是,我没有对此进行测试,所以我不确定它是否有效。

于 2013-10-04T22:55:38.343 回答