Assuming that the field time looks like 2013-01-01T00:00:00.000Z, that piggybank.jar has already been imported, and that the command EXTRACT has been defined (DEFINE EXTRACT org.apache.pig.piggybank.evaluation.string.EXTRACT();), what's the best way to extract the fields year, month, day, hour, minute, and second? This is what I have so far:
data = FOREACH data GENERATE FLATTEN(EXTRACT(time, '(\\d+)-(\\d+)-(\\d+)T(\\d+):(\\d+):(\\d+)\\.(\\w+)'))
AS (
year: int,
month: int,
day: int,
hour: int,
minute: int,
second: int,
tail: chararray
);