1

I have a list of file names stored in filenames.txt. Is it possible to load them all with a single LOAD command?

They are not in the same directory and do not share a naming pattern, so this is not like using /201308 to load 20130801.gz through 20130831.gz.

There are also too many files in the list for me to do something like this:

shell: pig -f script.pig -param input=/user/training/test/{20100810..20100812}

pig: temp = LOAD '$input' USING SomeLoader() AS (...);

Thanks in advance for insights!

4

1 Answer

2

If the number of files is reasonably small (e.g., the resulting command line fits within ARG_MAX), you can try concatenating the lines of the file into a single comma-separated string:

pig -param input="$(paste -sd, filenames.txt)" -f script.pig

script.pig:
A = LOAD '$input' ....

If it is an option for you, it would probably be better to list directories rather than individual files.
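
To make the ARG_MAX caveat concrete, here is a minimal shell sketch of the concatenation step with a rough size check. The function names join_paths and fits_arg_max are hypothetical helpers, not part of Pig, and the "half of ARG_MAX" margin is an arbitrary assumption (the environment also counts against the limit, and some systems cap the length of a single argument at a lower value).

```shell
#!/bin/sh
# Hypothetical helper: join the non-blank lines of a file with commas.
# paste -s joins all lines into one, so no trailing comma is produced.
join_paths() {
    grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Hypothetical helper: rough check that a string is well below ARG_MAX.
# The factor of 2 is an assumed safety margin, not a documented rule.
fits_arg_max() {
    limit=$(getconf ARG_MAX)
    [ "${#1}" -lt $((limit / 2)) ]
}
```

With these defined, something like input=$(join_paths filenames.txt) followed by fits_arg_max "$input" && pig -param input="$input" -f script.pig would only launch Pig when the list is short enough to pass on the command line.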

Answered 2013-09-24T10:55:14.427