我无法在某些领域使用我的 udf,但我可以在其他领域使用。如果我使用我的第一个字段,ipAddress
udf 会按预期工作。但是,如果我将其更改为date
1066 错误。这是我的脚本。
可以工作并调用 udf 的 Pig 脚本。
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(ip);
dump B;
Pig 脚本不起作用,并调用 udf
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(date);
dump B;
可以工作但不调用 udf 的 Pig 脚本
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE date;
dump B;
样本数据
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
Java UDF
package myudfs;
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;
public class HOUR extends EvalFunc<String>
{
@SuppressWarnings("deprecation")
public String exec(Tuple input) throws IOException {
if (input == null || input.size() == 0)
return " ";
try{
String str = (String)input.get(0);
return str.substring(0, 1);
}catch(Exception e){
throw WrappedIOException.wrap("Caught exception processing input row ", e);
}
}
}
错误
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
如果还有什么,请告诉我。我在本地运行时遇到此错误,并且在 map reduce 上运行。