Here is the data:
123,456,789,q,w,e,r,20120513
123,77,88,8,jj,oo,"ooo,\r\n""d,\r\ndf,123",20120514
123,77,88,8,jj,oo,ooo,20120514
I want to use a Pig script to replace these literal \r\n sequences with line breaks.
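To make the goal concrete, here is a minimal sketch of the result I am after. It is written in Python rather than Pig, purely as an illustration. It assumes the field holds the four literal characters \r\n (backslash, r, backslash, n), not real control characters, and that the loader applies standard CSV quoting. The helper name `unescape_crlf` is hypothetical.

```python
import csv
import io

# Third sample row, exactly as it appears in the input file.
# Raw string: the \r\n here are literal backslash sequences, not control characters.
SAMPLE = r'123,77,88,8,jj,oo,"ooo,\r\n""d,\r\ndf,123",20120514'


def unescape_crlf(field: str) -> str:
    """Replace each literal backslash-r-backslash-n sequence with one newline."""
    return field.replace("\\r\\n", "\n")


# Parse the row with standard CSV quoting: commas inside quotes stay in the
# field, and the doubled quote "" collapses to a single quote.
row = next(csv.reader(io.StringIO(SAMPLE)))
g = row[6]  # seventh column, named g in the Pig schema
result = unescape_crlf(g)
print(repr(result))
```

So for column g of the second data row, the output should contain two real line breaks where the literal \r\n sequences used to be.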
Pig Script:
REGISTER file:///usr/share/pig/contrib/piggybank/java/piggybank.jar;
DEFINE CSVLoader org.apache.pig.piggybank.storage.CSVLoader;
RAW = LOAD '/home/bannie/test/test.log'
USING CSVLoader() AS (
a: chararray,
b: chararray,
c: chararray,
d: chararray,
e: chararray,
f: chararray,
g: chararray,
h: chararray
);
C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\uxxxx') as max;
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000f') as max;
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000e') as max;
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000d') as max;
2013-06-03 17:53:42,629 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 55, column 32> mismatched input '(' expecting SEMI_COLON
Details at logfile: /home/bannie/pig_1370249955149.log
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000a') as max;
2013-06-03 17:53:47,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 55, column 32> mismatched input '(' expecting SEMI_COLON
Details at logfile: /home/bannie/pig_1370249955149.log
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000b') as max;
grunt> C = FOREACH RAW GENERATE REPLACE(g, '\\\\r\\\\n', '\u000c') as max;
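As the grunt session shows, '\u000d' and '\u000a' fail with ERROR 1200, while the neighbouring code points ('\u000b', '\u000c', '\u000e', '\u000f') are accepted. Does anyone know how to insert a line break as the replacement?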