sql - How a WHERE condition is processed in SAS PROC SQL when connected to another database

Question

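I am working with a table that contains more than 30 million records. The table is on Sybase, and I am using SAS. There is a feed_key (numeric) variable that holds the timestamp of each record's entry. I want to extract the records that fall within a particular time range.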

    proc sql;
        connect to sybase (user="id" pass="password" server=concho);
        create table table1 as
        select * from connection to sybase
        (
            select a.feed_key as feed_key,
                   a.cm15,
                   a.country_cd,
                   a.se10,
                   convert(char(10), a.se10) as se_num,
                   a.trans_dt,
                   a.appr_deny_cd,
                   a.approval_cd,
                   a.amount
              from abc.xyz a
             where a.country_cd in ('ABC')
               and a.appr_deny_cd in ('0','1','6')
               and a.approval_cd not in ('123456')
               and a.feed_key > 12862298
        );
        disconnect from sybase;
    quit;

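Whether or not I include the feed_key condition, the query pulls the same number of records and takes almost the same time to run (16 minutes without the feed_key condition, 15 minutes with it).

Please explain how the WHERE clause works in this case.

I expected the feed_key condition to make the query run faster, since more than 80% of the records do not satisfy it...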

score 0 · Accepted Answer

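If you are getting the same number of records back, the query will take the same amount of time to process.

This is because I/O (transferring the data back to SAS and storing it) is the most time-consuming part of the operation. That is also why the lack of an index does not make much difference to the total time.

If you adjust the query so that it returns fewer rows, it will process faster.

You can confirm this by looking at the SAS log, which shows how much of the time was spent on CPU (the rest is I/O):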

    NOTE: PROCEDURE SQL used (Total process time):
          real time           11.07 seconds
          cpu time            1.67 seconds
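One way to check whether the feed_key condition really narrows the result is to count rows on the Sybase side, so that only a single row travels back to SAS. The following is a minimal sketch that reuses the connection options and filters from the question; the column aliases n_all and n_after_key are made up for illustration, and it assumes the Sybase server accepts a CASE expression inside SUM.

```sas
proc sql;
    connect to sybase (user="id" pass="password" server=concho);

    /* Only one row with two counts is returned to SAS,          */
    /* so transfer cost should be negligible for this check.     */
    /* n_all and n_after_key are hypothetical alias names.       */
    select * from connection to sybase
    (
        select count(*) as n_all,
               sum(case when a.feed_key > 12862298 then 1 else 0 end) as n_after_key
          from abc.xyz a
         where a.country_cd in ('ABC')
           and a.appr_deny_cd in ('0','1','6')
           and a.approval_cd not in ('123456')
    );

    disconnect from sybase;
quit;
```

If the two counts come back equal, the feed_key condition is not excluding any of the rows that pass the other filters, which would be consistent with the identical record counts and run times reported in the question.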
