
In Pig I am using a UDF that calls routines from an external jar file, and I also exported that jar from Eclipse. But when I run pig -x local script1.pig, it gives me an error about the external jar file's routines.

Please help!

Thanks.

Edit 1:

As asked in the comments, here is my code:

script1.pig:

    REGISTER ./csv2arff.jar;

    csvraw = LOAD 'sample' USING PigStorage('\n') as (c);

    arffraws = FOREACH csvraw GENERATE pighw2java.CSV2ARFF(c);

pighw2java.CSV2ARFF:

    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            System.out.println(">>> " + input.get(0).toString());
            // 1.1) csv to instances
            ByteArrayInputStream inputStream = new ByteArrayInputStream(input.get(0).toString().getBytes("UTF-8"));
            CSVLoader loader = new CSVLoader();    // <-- HERE IS ERROR
            .....
        }
    }

The error I get:

    java.lang.NoClassDefFoundError: weka/core/converters/CSVLoader
        at pighw2java.CSV2ARFF.exec(CSV2ARFF.java:24)
        at pighw2java.CSV2ARFF.exec(CSV2ARFF.java:1)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:305)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
    Caused by: java.lang.ClassNotFoundException: weka.core.converters.CSVLoader
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 14 more

1 Answer


You need to register all 3rd-party dependencies that your custom UDF uses, not just the UDF jar itself. Here:

    register '/path/to/weka.jar'
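
For example, a minimal sketch of script1.pig with both jars registered (the Weka jar path/name here is just an assumption, point it at wherever your actual Weka jar lives):

    -- register the UDF jar AND every jar the UDF depends on
    REGISTER ./csv2arff.jar;
    REGISTER ./weka.jar;    -- assumed path; replace with your real Weka jar

    csvraw = LOAD 'sample' USING PigStorage('\n') AS (c);
    arffraws = FOREACH csvraw GENERATE pighw2java.CSV2ARFF(c);

Alternatively, you could export the UDF from Eclipse as a single "fat" jar that already bundles the Weka classes, in which case the one REGISTER of csv2arff.jar is enough.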
answered 2012-11-13T23:24:48.833