
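I'm trying to test Giraph.

VertexId type: Text

Input: edge-based

If I use Text as the VertexId, I get an error. With LongWritable, everything works fine.

Questions: 1. Is it possible to use Text as the VertexId? 2. If so, what am I doing wrong?

Error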

14/10/15 14:59:28 INFO worker.InputSplitsCallable: call: Loaded 1 input splits in 0.08243016 secs, (v=0, e=12) 0.0 vertices/sec, 145.57777 edges/sec
14/10/15 14:59:28 ERROR utils.LogStacktraceCallable: Execution of callable failed
java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.giraph.utils.UnsafeArrayReads.readFully(UnsafeArrayReads.java:103)
    at org.apache.hadoop.io.Text.readFields(Text.java:265)
    at org.apache.giraph.utils.ByteStructVertexIdDataIterator.next(ByteStructVertexIdDataIterator.java:65)
    at org.apache.giraph.edge.AbstractEdgeStore.addPartitionEdges(AbstractEdgeStore.java:161)

Custom format:

public class TextDoubleTextEdgeInputFormat extends
    TextEdgeInputFormat<Text, DoubleWritable> {
  /** Splitter for endpoints */
  private static final Pattern SEPARATOR = Pattern.compile("[\t ]");

...

Main class

public class HelloWorld extends
        BasicComputation<Text, Text, NullWritable, NullWritable> {

    @Override
    public void compute(Vertex<Text, Text, NullWritable> vertex,
            Iterable<NullWritable> messages) {
        System.out.print("Hello world from the: " + vertex.getId().toString()
                + " who is following:");
        for (Edge<Text, NullWritable> e : vertex.getEdges()) {
            System.out.print(" " + e.getTargetVertexId());
        }
        System.out.println("");
        vertex.voteToHalt();

    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new GiraphRunner(), args));
    }
}

Test launcher

@Test
public void textDoubleTextEdgeInputFormatTest() throws Exception {

    String[] graph = { "1 2 1.0", "2 1 1.0", "1 3 1.0", "3 1 1.0",
            "2 3 2.0", "3 2 2.0", "3 4 2.0", "4 3 2.0", "3 5 1.0",
            "5 3 1.0", "4 5 1.0", "5 4 1.0" };

    GiraphConfiguration conf = new GiraphConfiguration();
    conf.setComputationClass(HelloWorld.class);
    conf.setEdgeInputFormatClass(TextDoubleTextEdgeInputFormat.class);
    // conf.setEdgeInputFormatClass(TextLongL6TextEdgeInputFormat.class);

    // conf.setVertexOutputFormatClass(IdWithValueTextOutputFormat.class);

    InternalVertexRunner.run(conf, null, graph);
}

1 Answer


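I have no experience with Giraph, but in Apache Spark GraphX the VertexId type is Long.

I wouldn't be surprised if Apache Spark GraphX reused design patterns from Apache Giraph. So my guess is that a Long type is the way to go, since your implementation with LongWritable works.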
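If the vertex labels really are strings, one way to follow that guess is to translate each label into a long before the edge list reaches Giraph, and then keep LongWritable as the vertex ID. Below is a minimal sketch of that preprocessing step in plain Java. It assumes the same "source target weight" line layout as the test graph in the question; the class `VertexIdEncoder` and its methods are hypothetical names made up for illustration and are not part of Giraph.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assigns a dense long ID to each distinct string label. */
final class VertexIdEncoder {
    // Insertion-ordered, so IDs are handed out in order of first appearance.
    private final Map<String, Long> ids = new LinkedHashMap<>();

    /** Returns the ID already assigned to the label, or assigns the next free one. */
    long idFor(String label) {
        Long existing = ids.get(label);
        if (existing != null) {
            return existing;
        }
        long next = ids.size();
        ids.put(label, next);
        return next;
    }

    /** Rewrites one "source target weight" line so both endpoints are numeric. */
    String encodeEdge(String line) {
        String[] parts = line.trim().split("[\t ]+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        // Source is looked up before target (Java evaluates left to right).
        return idFor(parts[0]) + " " + idFor(parts[1]) + " " + parts[2];
    }

    public static void main(String[] args) {
        VertexIdEncoder encoder = new VertexIdEncoder();
        String[] graph = { "alice bob 1.0", "bob alice 1.0", "alice carol 1.0" };
        for (String edge : graph) {
            System.out.println(encoder.encodeEdge(edge));
        }
    }
}
```

The encoded lines could then be fed to an edge input format keyed by LongWritable. To print the original names in the output, the label-to-ID map would have to be kept and inverted afterwards. This sidesteps the exception rather than explaining it.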

answered 2015-02-01T22:35:35.627