Tuesday, July 1, 2008

RHEL and Infiniband - hardware intro

In my two previous articles, I summarized a few facts about Infiniband support in RHEL distros and the protocols involved. You can read them via the following links: RHEL and Infiniband support and Infiniband, RDP, SDP.... Let's get more specific now.

My scenario was based on two Sun Fire X4200 M2 servers and one Infiniband (IB) switch, the Sun IB Switch 9P. The servers had Infiniband host channel adapters (HCA), Sun Dual Port 4x IB HCA, installed so they could communicate over the IB fabric. The switch provides nine IB-compliant ports at dual speeds of 4X/12X, which means that each port can deliver 10 or 30 Gbit/s of raw bandwidth. What surprised me was that the switch is managed just like Sun SPARC midrange servers. Yes, it is ALOM, and that is perfect because you can use the same interface and similar commands you are used to. By the way, the switch chassis looks like a regular Sun server.
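To make the 4X/12X figures concrete, here is a small sketch of the arithmetic. It assumes single data rate (SDR) links, where each lane signals at 2.5 Gbit/s, which is what the 10/30 Gbit numbers imply. The 8b/10b encoding overhead is not mentioned above; I add it only to show why the usable data rate is lower than the raw one. The function name is made up for illustration.

```python
# Raw vs. usable bandwidth of an Infiniband link, assuming SDR signalling.
SDR_LANE_GBIT = 2.5            # signalling rate of one lane in Gbit/s (SDR)
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding: 8 data bits per 10 line bits


def link_bandwidth(width):
    """Return (raw, data) bandwidth in Gbit/s for a link of `width` lanes."""
    raw = width * SDR_LANE_GBIT
    return raw, raw * ENCODING_EFFICIENCY


for width in (4, 12):
    raw, data = link_bandwidth(width)
    print(f"{width}X: raw {raw:g} Gbit/s, data {data:g} Gbit/s")
```

So a 4X port carries roughly 8 Gbit/s of payload out of its 10 Gbit/s raw rate, and a 12X port roughly 24 out of 30.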

The switch is equipped with an IB subnet manager (SM), which is required to initialize the IB hardware and to allow communication over the IB fabric. Each IB subnet has to have at least one SM, and each subnet has a unique identifier (ID) within the fabric. To be complete, the fabric is made up of the defined subnets. In my opinion, the IB SM works somewhat like an ARP cache and a DHCP server in LANs. Each HCA in a fabric is globally identified by a so-called node GUID, which is like a WWN in FC or a MAC address in a LAN. The switch has its own GUID as well. The ports of an HCA have so-called port GUIDs. Now, when an HCA or one of its ports wants to communicate with another one in the subnet, it needs to have a network address assigned. This address is called the LID, or local identifier, and the IB SM is in charge of assigning it to the members of the subnet. The conclusion is that LIDs are valid only inside their subnet, while GUIDs are globally unique; combined with the subnet prefix, a port GUID forms the global identifier (GID) that is routable across the subnets of the fabric.
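The relation between GUIDs, LIDs and the SM can be sketched as a toy model. This is not how a real subnet manager is implemented; the class and method names are hypothetical and the port GUIDs in the usage note are made up. The only facts taken from the IB architecture are that unicast LIDs are 16-bit values in the range 0x0001 to 0xBFFF and that a 128-bit GID is the 64-bit subnet prefix followed by the 64-bit port GUID.

```python
from dataclasses import dataclass, field

UNICAST_LID_MIN = 0x0001   # LID 0 is reserved
UNICAST_LID_MAX = 0xBFFF   # higher values are used for multicast


@dataclass
class ToySubnetManager:
    """Hypothetical, simplified model of an SM handing out LIDs in one subnet."""
    subnet_prefix: int
    next_lid: int = UNICAST_LID_MIN
    lid_by_port_guid: dict = field(default_factory=dict)

    def assign_lid(self, port_guid):
        """Give a port a subnet-local address, a bit like a DHCP lease."""
        if port_guid in self.lid_by_port_guid:
            return self.lid_by_port_guid[port_guid]
        if self.next_lid > UNICAST_LID_MAX:
            raise RuntimeError("unicast LID space exhausted")
        lid = self.next_lid
        self.next_lid += 1
        self.lid_by_port_guid[port_guid] = lid
        return lid

    def gid(self, port_guid):
        """Build the 128-bit GID: subnet prefix in the upper 64 bits, GUID in the lower."""
        return (self.subnet_prefix << 64) | port_guid
```

For example, with the default subnet prefix 0xFE80000000000000 and two invented port GUIDs, `ToySubnetManager(0xFE80000000000000).assign_lid(0x0003BA0001000001)` returns LID 1, and the next unknown port gets LID 2. The LID means nothing outside this subnet, while the GID stays unique across the fabric.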

But one thing confused me a bit. When you configure the switch, you need to remember to set its blueprint, otherwise you are asking for trouble. I'm going to write about it in the next part.
