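Over the past few days I have written some Hive statements, but when I combined them into a single statement I ran into a problem. The details are below (I am using a Hadoop cluster):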

When I run:

from tmp
  insert overwrite local directory '/tmp/out/jpg'
  select count(1) where logdate=0222 and req_uri regexp '\.(jpg|JPG)';

or

from tmp
  insert overwrite local directory '/tmp/out/jpg_hit'                 
  select count(1) where logdate=0222 and req_uri regexp '\.(jpg|JPG)' and hit_status="hit";

there is only one file under '/tmp/out/jpg' or '/tmp/out/jpg_hit', and that file contains the result (the two results are not equal).

But when I run:

from tmp
  insert overwrite local directory '/tmp/out/jpg'
  select count(1) where logdate=0222 and req_uri regexp '\.(jpg|JPG)'
  insert overwrite local directory '/tmp/out/jpg_hit'                 
  select count(1) where logdate=0222 and req_uri regexp '\.(jpg|JPG)' and hit_status="hit";

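there are many files under both '/tmp/out/jpg' and '/tmp/out/jpg_hit'. When I add up the numbers in the files of each directory, the two totals are equal to each other, and both equal one large number, which is wrong. How can I solve this problem?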
