I am developing with Accelio on top of Soft-RoCE.
IB devices configured:
# ibv_devices
device node GUID
------ ----------------
rxe1 821f02fffef91598
rxe0 d6bed9fffebe94af
Error while running the Accelio client:
# xio_ow_client
=============================================
Server Address : 127.0.0.1
Server Port : 2061
Transport : rdma
Header Length : 32
Data Length : 32
Connection Index : 0
CPU Affinity : 0
Finite run : 0
=============================================
**** starting ...
session event: connection error. reason: No such device
# rping -c
rdma_resolve_route: No such device
So I checked the opensm status:
# /etc/init.d/opensmd status
opensm is stopped
# /etc/init.d/opensmd start
opensm start [FAILED]
# tail -f /var/log/opensm.log
Jul 09 15:04:45 655213 [AA4F3700] 0x03 -> OpenSM 3.3.7
Jul 09 15:04:45 692960 [AA4F3700] 0x80 -> OpenSM 3.3.7
Jul 09 15:04:45 693149 [AA4F3700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Jul 09 15:04:45 797977 [AA4F3700] 0x80 -> Entering DISCOVERING state
Jul 09 15:04:45 799152 [AA4F3700] 0x02 -> osm_vendor_bind: Binding to port 0xd6bed9fffebe94af
Jul 09 15:04:45 800414 [AA4F3700] 0x01 -> osm_vendor_bind: ERR 5426: Unable to register class 129 version 1
Jul 09 15:04:45 800422 [AA4F3700] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
Jul 09 15:04:45 800425 [AA4F3700] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR)
Jul 09 15:04:45 800430 [AA4F3700] 0x01 -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind
Jul 09 15:04:45 829702 [AA4F3700] 0x80 -> Exiting SM
I would appreciate any pointers that help me understand where I went wrong.