Monday, July 7, 2008

RHEL and Infiniband - software intro

Let's continue with an introduction to the software. As I wrote earlier, the switch is equipped with ALOM remote management. ALOM has a universal set of commands for platform-independent management (password, poweroff, setupsc, resetsc and so on) and a set of commands that are specific to the platform. In the case of our IB switch there are two such commands:
  1. setbp - sets the so-called blueprint of the switch
  2. showbp - shows the current blueprint
There are five predefined blueprints: 9 node, 12 node, 18 node, none and unmanaged.
The natural question is: what does a blueprint mean? According to the official documentation, it is something like a predefined configuration of the switch. You can change it with the setbp command, which asks whether you want to run the IB management software, how many hosts will be in the subnet and what the subnet identifier is. By default, two switches that keep their factory configuration will have the same subnet ID. The trouble is that if you intend to configure some level of redundancy between IB switches, they have to be in different subnets with different subnet IDs. I find this strange, because in the end I had to disable the IB management software; otherwise I wasn't able to see the nodes in the fabric. As we will see, the IB management software, including its IB subnet manager, doesn't seem to like the OFED included in the RHEL distro (I wrote more about RHEL and OFED here).

What about the servers? I preinstalled them with the CentOS 5.1 distribution (which is binary compatible with RHEL 5.1). The distribution contains the OFED implementation in version 1.2. The complete OFED implementation in CentOS is split into a set of RPM packages. The platform-dependent part of OFED, that is the kernel modules, is distributed with the kernel package. Here is a quick summary of the basic packages:
  1. kernel - contains IB hardware, IB core and IB ULP modules
    • ULP means Upper Level Protocol
    • everything is placed in the following directories:
      • /lib/modules/`uname -r`/kernel/drivers/infiniband/hw
      • /lib/modules/`uname -r`/kernel/drivers/infiniband/core
      • /lib/modules/`uname -r`/kernel/drivers/infiniband/ulp
    • currently only IB HCAs from Mellanox are supported
    • the supported ULPs are
      • ipoib - IP over IB driver
      • srp - IB SRP (SCSI RDMA Protocol) initiator driver
      • sdp - SDP (Sockets Direct Protocol) driver
  2. openib - this package contains a lot of useful documentation, but its important parts are the OFED configuration file /etc/ofed/openib.conf and the init script /etc/init.d/openibd, which takes care of activating and deactivating the IB network interfaces. Put simply, it loads the IB core modules and the ULP modules specified in the config file.
  3. openib-diags - this package contains diagnostic tools for IB debugging. I will introduce them later.
  4. opensm - here we have our IB subnet manager. The package provides the init script /etc/init.d/opensmd for starting it and the /etc/ofed/opensm.conf configuration file.
  5. libibverbs - this package provides a library allowing userspace programs direct hardware access.
  6. libibcommon, libibmad, libibumad, opensm-libs - and finally the library dependencies of the above packages.
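To make the directory layout from the kernel item more concrete, here is a minimal Python sketch that prints the module files found under the hw, core and ulp directories for the running kernel. The function names are mine and purely illustrative, and I am assuming the modules are stored as plain *.ko files; adjust the pattern if your kernel packages them differently.

```python
import os
from pathlib import Path


def infiniband_module_dirs(release=None, root="/lib/modules"):
    """Return the hw, core and ulp directories for a kernel release.

    Hypothetical helper: `release` defaults to the running kernel,
    i.e. the same string that `uname -r` prints.
    """
    release = release or os.uname().release
    base = Path(root) / release / "kernel" / "drivers" / "infiniband"
    return {part: base / part for part in ("hw", "core", "ulp")}


def list_modules(directory):
    """Return the sorted names of *.ko files below a directory.

    Assumption: modules are uncompressed *.ko files.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return []  # directory missing, e.g. no IB modules for this kernel
    return sorted(p.name for p in directory.rglob("*.ko"))


if __name__ == "__main__":
    for part, path in infiniband_module_dirs().items():
        print(f"{part}: {path}")
        for name in list_modules(path):
            print(f"  {name}")
```

Run on one of the servers, it should give a quick overview of which HCA drivers and ULP modules the installed kernel package actually ships.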
I should add that the OFED packages belong to the System Environment/Libraries RPM group and, apart from openib, libibverbs and of course the kernel package, they are not installed by default. That's all for now; next time I'm going to describe how to work with it.
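Since most of these packages are not installed by default, it can be handy to check a node before going further. The following is only a sketch under my own assumptions: it relies on `rpm -q <package>` exiting with a non-zero status when a package is not installed, and the helper names are hypothetical.

```python
import subprocess

# Package names taken from the summary in this post.
OFED_PACKAGES = [
    "openib", "openib-diags", "opensm", "libibverbs",
    "libibcommon", "libibmad", "libibumad", "opensm-libs",
]


def is_installed(package, run=subprocess.run):
    """Return True if `rpm -q <package>` exits with status 0."""
    result = run(["rpm", "-q", package],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def missing_packages(packages=OFED_PACKAGES, run=subprocess.run):
    """Return the packages from the list that rpm does not know about."""
    return [p for p in packages if not is_installed(p, run=run)]


if __name__ == "__main__":
    try:
        missing = missing_packages()
    except FileNotFoundError:
        # No rpm binary: this sketch only makes sense on RPM-based systems.
        print("rpm not found")
    else:
        if missing:
            print("not installed:", " ".join(missing))
        else:
            print("all listed OFED packages are installed")
```

The `run` parameter exists only so the check can be tried out without a real rpm database; on a CentOS or RHEL node the default is enough.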


Geraldo Maia said...

Hello David,
It is a great pleasure to be visiting your nice blog.
Best wishes from Brazil:

David Sumsky said...

Thanks for your opinion.