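New Pig user here. I'm converting MySQL statements to Pig and ran into the following problem. I have two tables that need to be joined, and there is a calculation on the joined values. I think this must be a simple problem.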

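For example, my tables are machine1 and machinemeans. In Pig, I joined them, but I couldn't find the syntax in the manual for doing calculations in a join. Any suggestions?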

    select region, os, group, f.machine, f.machine_users, f.machine_tm,
    f.machine_users - g.users_per_machine outliers,
    f.machine_tm - g.tm_per_machine    outlying_tm,
    tm_per_machine/(f.machine_tm+1) factor
    from machine1 f
    inner join machinemeans g using(region, os, group)
    order by 4, 1, 2, 3

Thanks

Update: Thanks, WinnieNicklaus. I tried your suggestion, but I get a "Scalar has more than one row in the output" error. Here is my code.

    machine1 = LOAD 'S1' AS (
        block:chararray,
        region:chararray,
        os:chararray,
        group:int,
        machine:int,
        machine_users:int,
        machine_tm:float
    );

    machinemeans = LOAD 'S2' AS (
        region:chararray,
        os:chararray,
        group:int,
        tot_machines:int,
        tot_users:int,
        users_per_machine:float,
        tm_per_machine:float,
        tm_per_user:float,
        cnt_per_block:float,
        cnt_per_user:float
    );

    imbalance = FOREACH (JOIN machine1 BY (region, os, group),
                              machinemeans BY (region, os, group))
        GENERATE
            region, os, group,
            machine1.machine,
            machine1.machine_users,
            machine1.machine_tm,
            machine1.machine_users - machinemeans.users_per_machine,
            machine1.machine_tm - machinemeans.tm_per_machine;

1 Answer

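A single SQL query may require several Pig Latin statements. The calculation you're referring to isn't really part of the join in SQL; it belongs to the select clause, and select ... from ... in SQL corresponds roughly to Pig's FOREACH ... GENERATE .... So apply FOREACH to the result of the JOIN. For example: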

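    -- After a JOIN, refer to a column as relation::field.
    -- relation.field is a scalar lookup and fails when the relation has more than one row.
    result =
        FOREACH (
            JOIN table1 BY key1, table2 BY key2
        ) GENERATE
            table1::field1,
            table1::field2,
            table2::field3,
            table1::field4 - table2::field5;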

If you need to do a calculation to obtain the join keys but don't care about them afterwards, you can even do this:

JOIN table1 BY (field1+field4), table2 BY myUDF(field3);
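
As for the "Scalar has more than one row in the output" error reported in the question's update: in an expression, Pig reads relation.field as a scalar projection, which only works when the relation holds a single row. Columns of a joined relation are addressed with the :: disambiguation operator instead. The following is a rough sketch of the original MySQL query under that assumption, not a tested script. The load paths and schemas are copied from the question; the grp column name (in place of group, which is also a Pig keyword) and the joined and sorted aliases are chosen here for illustration only.

```pig
-- Sketch only: 'S1' and 'S2' and both schemas come from the question.
-- Assumption: the `group` column is renamed `grp` to avoid the Pig keyword.
machine1 = LOAD 'S1' AS (
    block:chararray, region:chararray, os:chararray, grp:int,
    machine:int, machine_users:int, machine_tm:float
);

machinemeans = LOAD 'S2' AS (
    region:chararray, os:chararray, grp:int,
    tot_machines:int, tot_users:int,
    users_per_machine:float, tm_per_machine:float, tm_per_user:float,
    cnt_per_block:float, cnt_per_user:float
);

-- Equivalent of: inner join ... using (region, os, group)
joined = JOIN machine1 BY (region, os, grp), machinemeans BY (region, os, grp);

-- The calculations go in GENERATE; relation::field picks a column of the join result.
imbalance = FOREACH joined GENERATE
    machine1::region        AS region,
    machine1::os            AS os,
    machine1::grp           AS grp,
    machine1::machine       AS machine,
    machine1::machine_users AS machine_users,
    machine1::machine_tm    AS machine_tm,
    machine1::machine_users - machinemeans::users_per_machine AS outliers,
    machine1::machine_tm - machinemeans::tm_per_machine       AS outlying_tm,
    machinemeans::tm_per_machine / (machine1::machine_tm + 1) AS factor;

-- Equivalent of: order by 4, 1, 2, 3
sorted = ORDER imbalance BY machine, region, os, grp;
```

Naming each output column with AS keeps the later ORDER statement free of :: prefixes.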
answered 2013-01-14T14:04:39.307