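I have a huge file with two tab-separated columns per line.
I have another file that contains a list of values, one per line.
I want to filter the first file so that I keep only the rows whose first column appears in the second file.
How can I do this in Pig?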
You can use an inner join:
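-- Two tab-separated columns per line
A = LOAD 'file1' USING PigStorage('\t') AS (f1:int, f2:int);
-- One value per line
B = LOAD 'file2' USING PigStorage(',') AS (f3:int);
-- Inner join keeps only rows of A whose f1 matches some f3 in B
C = JOIN A BY f1, B BY f3;
DUMP C;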
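A plain join returns three columns (f1, f2, f3) and repeats a row once for every duplicate value in the second file. The variant below is a minimal sketch of one way to tighten that up, not a tested drop-in: it assumes the value list is small enough to fit in memory (which is what makes a replicated join appropriate), keeps the int types used above, and writes to an output path named 'filtered' that is purely illustrative.

```pig
-- Sketch: keep only rows of file1 whose first column appears in file2.
A = LOAD 'file1' USING PigStorage('\t') AS (f1:int, f2:int);
B = LOAD 'file2' AS (f3:int);

-- Drop repeated keys so each row of A matches at most once.
K = DISTINCT B;

-- 'replicated' loads the last relation (K) into memory on each map task.
-- Assumption: the key list is small enough to fit in memory.
C = JOIN A BY f1, K BY f3 USING 'replicated';

-- Project away the join key from K, leaving the original two columns.
D = FOREACH C GENERATE A::f1 AS f1, A::f2 AS f2;

-- 'filtered' is a hypothetical output directory.
STORE D INTO 'filtered' USING PigStorage('\t');
```

If the value list turns out to be too large for memory, removing USING 'replicated' falls back to the default join, and the rest of the script stays the same.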