Friday, July 25, 2008

RHEL and Infiniband - basic diagnostics

I am going to close the article series about Infiniband technology on the RHEL platform (check the previous posts 1, 2, 3) with posts devoted to IB troubleshooting. I would like to introduce some basic diagnostic steps for an IB environment which may help you uncover errors and misconfigurations.

Most of the trouble you may run into can be traced with the OFED diagnostic tools. Up to OFED 1.2 they ship in the openib-diags package; since version 1.3 it has been replaced by the infiniband-diags package. Let's take a look at the most useful ones:
  1. ibstat - shows IB device status such as firmware version, port states, their rate, GUIDs, LIDs ...
  2. ibnetdiscover - discovers the IB network topology
  3. ibroute - queries the forwarding table of an IB switch (similar to a routing table)
  4. ibnodes - shows the IB nodes in the topology
  5. ibchecknet - runs IB network validation
  6. ibping - pings an IB address
  7. sysfs - Linux virtual filesystem representing kernel structures; for IB there is the directory /sys/class/infiniband
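Before going further it is worth checking that the commands from the list are actually installed. Here is a minimal sketch of such a check. The function name missing_ib_tools is my own and not part of OFED; it only looks the names up in PATH and does not run any of them.

```shell
#!/bin/sh
# Print the name of every diagnostic command from the list above that
# cannot be found in PATH. Prints nothing when all of them are present.
# missing_ib_tools is a hypothetical helper name, not an OFED command.
missing_ib_tools() {
    for tool in ibstat ibnetdiscover ibroute ibnodes ibchecknet ibping; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_ib_tools
```

If the function prints some names, the diagnostics package is probably not installed; on RHEL you can ask rpm with rpm -q infiniband-diags (or rpm -q openib-diags on older OFED releases).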
The IB network is similar to other high-performance network technologies like Fibre Channel, and most of the problems are common to both. You may need to resolve connectivity issues, incompatibilities between firmware or higher-level software revisions, driver bugs and the like.

First, I would like to explain the usage of the last two tools, ibping and sysfs. They are simple enough and familiar from other fields. IB ping works in a client-server fashion: you need to run ibping in server mode on one side, and the other side acts as a client. The server answers the client's pings with pongs.
  1. Server mode - ibping -S -v
  2. Client mode - ibping -v SERVER_LID_ADDR
The -v argument only increases the verbosity level. The right LID address can be found with the ibnetdiscover command: run it, find the line for the server node and use the associated LID. I will explain it later. If the IB network is healthy, ibping should produce output like this on the server side (the server LID is 4, its hostname is node2):

ibwarn: [6795] ibping_serv: starting to serve...
ibwarn: [6795] ibping_serv: Pong: node2.(none)

The pongs have to be visible at the client side:

ibwarn: [17946] ibping: Ping..

Pong from node2.(none) (Lid 4): time 0.235 ms
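If you want to check the result from a script instead of reading it by eye, you can count the pong lines in the client output. This is only a sketch: the function name count_pongs is my own, and it assumes that every reply starts with "Pong from" as in the sample output above.

```shell
#!/bin/sh
# Count the "Pong from ..." lines in ibping client output read from stdin.
# count_pongs is a hypothetical helper; the line format is assumed to
# match the sample client output shown above.
count_pongs() {
    # n+0 makes awk print 0 instead of an empty line when nothing matched
    awk '/^Pong from /{n++} END{print n+0}'
}

# Feed it the sample client output from this post.
printf '%s\n' \
    'ibwarn: [17946] ibping: Ping..' \
    'Pong from node2.(none) (Lid 4): time 0.235 ms' | count_pongs
# prints 1
```

On a real client you would pipe the ibping output into the function, for example ibping -c 3 SERVER_LID_ADDR 2>&1 | count_pongs. I am assuming here that your ibping supports the -c option to stop after a given number of pings; check ibping -h if it does not.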

If you aren't able to see them, you should check the connectivity status of your IB HCA. One way to do it is via sysfs. Each IB HCA is represented by a subdirectory under the /sys/class/infiniband directory, where you can find a lot of useful information. For example, if you have a dual-port HCA from Mellanox, there should be the following entries for the port states:
  1. /sys/class/infiniband/mthca0/ports/1/state
  2. /sys/class/infiniband/mthca0/ports/2/state
The state file typically reports one of these three values (prefixed with a numeric code, for example 4: ACTIVE):
  1. DOWN - port is physically disconnected
  2. INIT - port is connected and it is initialized
  3. ACTIVE - port is online and it is working
For ibping to work, the ports of both nodes have to be in the ACTIVE state. If they are in the INIT state, the subnet manager may not be running. The DOWN state usually means a cable problem. By the way, there are other ways to check this with the help of the remaining tools. I am going to explore them next time.
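The sysfs check described above is easy to wrap in a few lines of shell. The following is a minimal sketch, not an official tool: the function name ib_port_report is my own, and the hints only repeat the rules of thumb from this post. It takes the sysfs directory as an optional argument so you can try it against a copy of the tree.

```shell
#!/bin/sh
# Print "<device> port <n>: <state> - <hint>" for every port found under
# the given base directory (default: /sys/class/infiniband).
# ib_port_report is a hypothetical helper name, not part of OFED.
ib_port_report() {
    base=${1:-/sys/class/infiniband}
    for f in "$base"/*/ports/*/state; do
        [ -r "$f" ] || continue      # glob matched nothing, or unreadable
        state=$(cat "$f")
        port=${f%/state}; port=${port##*/}   # .../ports/1/state -> 1
        dev=${f%/ports/*}; dev=${dev##*/}    # .../mthca0/ports/... -> mthca0
        case $state in
            *ACTIVE*) hint="online" ;;
            *INIT*)   hint="link is up; is a subnet manager running?" ;;
            *DOWN*)   hint="no link; check the cable" ;;
            *)        hint="state not covered in this post" ;;
        esac
        echo "$dev port $port: $state - $hint"
    done
}

ib_port_report
```

On a machine without any IB HCA the function prints nothing, because the glob does not match any state file.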
