I am trying to run some Sum-Product Network (SPN) code found at http://alchemy.cs.washington.edu/spn/

When I try to run it on my Mac (version 10.8.4), I get the following error:

mpjrun.sh -np 1 eval.Run -d O
MPJ Express (0.40) is started in the multicore configuration
[Rank=0] *** Parameters ***
[Rank=0]    domain=O
[Rank=0]    numSumPerRegion=20
[Rank=0]    numComponentsPerVar=4
[Rank=0]    sparsePrior=1.0
[Rank=0]    baseResolution=4
[Rank=0]    numSlavePerClass=50
[Rank=0]    numSlaveGrp=1
[Rank=0] <TIME> init 1687 ms

mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        at mpi.Comm.Recv(Comm.java:1255)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: mpi.MPIException: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.Recv(Comm.java:1259)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        ... 6 more
    Caused by: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.recv(Comm.java:1317)
        at mpi.Comm.Recv(Comm.java:1255)
        ... 12 more
    Caused by: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        ... 13 more

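This happens with every value of np I have tried. I assume the problem is not in the SPN code but in how I am using MPJ Express. I have tried MPJ Express versions 0.40 and 0.37 and get the same result with both.

Thanks for your time.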

1 Answer

I ran into the same problem when running the code and found the answer in the SPN user guide. The command to run SPN is:

mpjrun.sh -np [NUM_PROCESSOR] -dev niodev -mx8000m eval.Run [SPN OPTIONS] > [LOG FILE]

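where NUM_PROCESSOR depends on the number of slave processes per image category and the number of slave groups. It should equal (numSlavePerCat + 1) × numSlaveGroup; both numSlavePerCat and numSlaveGroup are defined in common/Parameter.java. If you want to run on a machine that does not have that many processors, you can reduce numSlavePerCat.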
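As a rough illustration of the formula above, here is a small sketch that computes the -np value. The class MpjProcessCount and its method are hypothetical helpers, not part of the SPN distribution or MPJ Express. It also assumes that numSlavePerClass=50 and numSlaveGrp=1 in the question's log correspond to numSlavePerCat and numSlaveGroup.

```java
// Hypothetical helper for illustration only; not part of the SPN code or MPJ Express.
final class MpjProcessCount {

    // Applies the formula from the SPN user guide as quoted above:
    // (numSlavePerCat + 1) * numSlaveGroup
    static int requiredProcesses(int numSlavePerCat, int numSlaveGroup) {
        if (numSlavePerCat < 0 || numSlaveGroup < 1) {
            throw new IllegalArgumentException(
                "numSlavePerCat must be >= 0 and numSlaveGroup must be >= 1");
        }
        return (numSlavePerCat + 1) * numSlaveGroup;
    }

    public static void main(String[] args) {
        // Assumption: these mirror numSlavePerClass=50 and numSlaveGrp=1
        // printed in the question's parameter log.
        int numSlavePerCat = 50;
        int numSlaveGroup = 1;

        int np = requiredProcesses(numSlavePerCat, numSlaveGroup);

        // Prints: mpjrun.sh -np 51 -dev niodev -mx8000m eval.Run -d O
        System.out.println(
            "mpjrun.sh -np " + np + " -dev niodev -mx8000m eval.Run -d O");
    }
}
```

If those assumed values are right, the run in the question would need -np 51 rather than -np 1, which would be consistent with rank 0 trying to receive from a rank 1 that does not exist. With numSlavePerCat lowered to 3 and a single group, the same formula gives -np 4.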

Answered 2016-11-16T12:14:28.617