- connections from switch to IB nodes (switch -> nodes)
  - switch port [6] is connected to the [2] channel of IB node node2
  - switch port [5] is connected to the [1] channel of IB node node2
  - switch port [4] is connected to the [2] channel of IB node node1
  - switch port [3] is connected to the [1] channel of IB node node1
- connections from IB node node1 to switch (node -> switch)
  - the [1] IB channel is connected to switch port [3]
  - the [2] IB channel is connected to switch port [4]
- connections from IB node node2 to switch (node -> switch)
  - the [1] IB channel is connected to switch port [5]
  - the [2] IB channel is connected to switch port [6]
Do you see the logic of it? I think it's simple: each link is listed once from the switch side and once from the node side. It is also evident that the IB connections are full-duplex in our scenario.
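The two views in the listing above can be cross-checked mechanically. Here is a minimal Python sketch that uses only the port and channel numbers listed above. The dictionary and function names are made up for illustration and are not part of any IB tool.

```python
# Links as seen from the switch: switch port -> (node, channel).
# Values copied from the "switch -> nodes" part of the listing.
switch_to_node = {
    3: ("node1", 1),
    4: ("node1", 2),
    5: ("node2", 1),
    6: ("node2", 2),
}

# Links as seen from the nodes: (node, channel) -> switch port.
# Values copied from the two "node -> switch" parts of the listing.
node_to_switch = {
    ("node1", 1): 3,
    ("node1", 2): 4,
    ("node2", 1): 5,
    ("node2", 2): 6,
}


def views_agree(s2n, n2s):
    """Return True if both views describe exactly the same set of links."""
    forward = {(port, end) for port, end in s2n.items()}
    backward = {(port, end) for end, port in n2s.items()}
    return forward == backward


print(views_agree(switch_to_node, node_to_switch))
```

If a cable were plugged into a different port than one side reports, the two sets would differ and the check would fail.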
I'm going to skip the ibnodes command; its output is the same as without a running subnet manager. The next command, ibroute, produces the following nice forwarding table:
Unicast lids [0x0-0x5] of switch Lid 2 guid 0x00144f00006e9794 ():
Lid Out Port Destination Info
0x0001 003 : (Channel Adapter portguid 0x0003ba0001007ba9: 'node1 HCA-1')
0x0002 000 : (Switch portguid 0x00144f00006e9794: '')
0x0003 004 : (Channel Adapter portguid 0x0003ba0001007baa: 'node1 HCA-1')
0x0004 005 : (Channel Adapter portguid 0x0003ba0001003de5: 'node2 HCA-1')
0x0005 006 : (Channel Adapter portguid 0x0003ba0001003de6: 'node2 HCA-1')
5 valid lids dumped
It lists the assigned LIDs, the corresponding switch ports and the other ends of the connections. It's a classical routing table saying that LID X is reachable via switch port Y, with additional information about the entity owning that LID. For example, LID 1 is reachable via switch port 3 and belongs to the channel adapter of node node1.
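To make the "LID X is reachable via port Y" reading concrete, here is a small Python sketch that turns the table above into a lookup dictionary. It assumes the output looks exactly like the text pasted in this post (real ibroute output may be spaced differently), and the names `ROW` and `parse_forwarding_table` are my own, not part of any IB tool.

```python
import re

# The ibroute output from this post, pasted as-is.
IBROUTE_OUTPUT = """\
Unicast lids [0x0-0x5] of switch Lid 2 guid 0x00144f00006e9794 ():
Lid Out Port Destination Info
0x0001 003 : (Channel Adapter portguid 0x0003ba0001007ba9: 'node1 HCA-1')
0x0002 000 : (Switch portguid 0x00144f00006e9794: '')
0x0003 004 : (Channel Adapter portguid 0x0003ba0001007baa: 'node1 HCA-1')
0x0004 005 : (Channel Adapter portguid 0x0003ba0001003de5: 'node2 HCA-1')
0x0005 006 : (Channel Adapter portguid 0x0003ba0001003de6: 'node2 HCA-1')
5 valid lids dumped
"""

# One table row: hex LID, decimal out port, then the info in parentheses.
ROW = re.compile(r"^(0x[0-9a-fA-F]+)\s+(\d+)\s*:\s*\((.*)\)\s*$")


def parse_forwarding_table(text):
    """Map each LID (int) to a tuple (out_port, destination_info)."""
    table = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and summary lines do not match and are skipped
            lid = int(match.group(1), 16)
            table[lid] = (int(match.group(2)), match.group(3))
    return table


table = parse_forwarding_table(IBROUTE_OUTPUT)
port, info = table[1]
print(f"LID 1 -> switch port {port}: {info}")
```

The switch's own LID (2) maps to out port 0, while the four channel adapter ports map to switch ports 3 to 6.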
To make the final decision on whether the IB network is working, run the ibchecknet command. The output might say that we have 2 working IB HCAs, 3 IB nodes (two with an HCA and one switch) and 8 working IB ports (physically only four, but the network is full-duplex in our scenario).
# Checking Ca: nodeguid 0x0003ba0001003de4
# Checking Ca: nodeguid 0x0003ba0001007ba8
## Summary: 3 nodes checked, 0 bad nodes found
## 8 ports checked, 0 bad ports found
## 0 ports have errors beyond threshold
From now on, we have a working InfiniBand network and we are able to do this:
- ibping the nodes natively
- ping the nodes over IPoIB
- run unmodified network applications over IPoIB (e.g. NFS, FTP and so on)
- run native RDMA applications