Friday, July 18, 2008

RHEL and Infiniband - basic usage

As I written in the previous post, the /etc/init.d/openibd init script is in charge of starting Infiniband (IB) network. The script parses the /etc/ofed/openibd.conf configuration file where you can specify which ULPs should be initialized. By default, all ULPs I mentioned last time - ipoib, srp, sdp - are enabled.

The opensm IB network manager is controlled with the /etc/init.d/opensmd init script which is configurable via /etc/ofed/opensm.conf configuration file. You can turn on debugging here but it is not normally needed. It is more useful to enable verbose mode which increases the log verbosity level. The default log file is /var/log/osm.log. So, if something goes wrong enable verbose mode and check the log file.

After executing the init scripts, you should check the IB network state. The openibd script is started automatically during the system startup, while the opensm has to be enabled (with ntsysv or chkconfig). Follow this checklist:
  1. Is Mellanox HCA recognized?
    • check the output of lsmod | grep ib_mthca
    • check the output of dmesg
  2. Are appropriate ULPs loaded?
    • check the output of lsmod | grep ib_
      • should contain ib_ipoib, ib_srp, ib_sdp
  3. Is IB network initialized and working?
    • check the output of cat /sys/class/infiniband/mthca0/ports/X/state
      • should be ACTIVE
  4. Is ib0 network interface available?
    • check the output of ifconfig -a
If you passed all the checks you would be able to use IP protocol over IB network. I supposed you have two IB nodes in the IB network at least, both are configured the same way and both have passed the checks (like in the first article). To configure it follow the commands:
  1. assign an IP address to the nodes
    • run ifconfig ib0 IP_ADDR1 up at first node
    • run ifconfig ib0 IP_ADDR2 up at second node
  2. check the IPoIB functionality
    • run ping IP_ADDR2 from the first node
    • run ping IP_ADDR1 from the second node
So, wasn't it simple? If everything is working the ping should receive replies from the other side. Now, you can run any IP based application over IB - FTP, NFS and so on and utilize its benefits like high throughput and low latencies. Please, if you are interested in the topic leave me a comment.

3 comments:

Anonymous said...

Really nice and simple article for a layman to follow. Thanks for it, I could understand why my ib0 interface was not coming up (I had to load the ipoib module)!

In my system, /sys/class/infiniband/mthca0 directory itself if not being created. Is that normal?

Jerry said...

Great series of articles David! I have been battling with IB over the past few days and your clear, concise instructions are a godsend. Thank you!!

Brian Topping said...

Excellent pair of articles on getting IB setup on RHEL, thank you for making it so easy! Turns out I've had everything installed and configured now for 18 months, just missing the startup for OpenSM to get the interfaces live.

Can't wait to read more of your blog! The tag cloud looks juicy!