dsumsky lines . . .: 03/2008

Friday, March 28, 2008

Quickly - Raw devices summary

Yesterday, I posted an article about configuring raw devices on the RHEL 5 platform. For completeness, I would like to add a few words about what raw devices on Linux are useful for.

A raw device lets an application perform raw I/O on an underlying block device. Raw I/O bypasses the kernel's block buffer cache, so each operation is sent directly to the underlying device. The application issuing raw I/O has to be aware of the physical layout of the device: every request has to be aligned both in memory and on disk. An operation has to begin at a sector boundary and its size has to be a multiple of the sector size. The restriction holds for both the memory buffer and the disk offset.
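To make the alignment rule a bit more concrete, here is a minimal Python sketch. It only does the arithmetic and does not touch any device. The 512-byte sector size and the helper names is_aligned and align_request are my assumptions for illustration; real devices may use a different sector size.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes; some devices use 4096


def is_aligned(offset, length, sector_size=SECTOR_SIZE):
    """Return True if the request starts on a sector boundary
    and covers a whole number of sectors."""
    return (
        offset % sector_size == 0
        and length > 0
        and length % sector_size == 0
    )


def align_request(offset, length, sector_size=SECTOR_SIZE):
    """Widen (offset, length) to the smallest sector-aligned range
    that still contains the original request."""
    start = (offset // sector_size) * sector_size               # round start down
    end = -(-(offset + length) // sector_size) * sector_size    # round end up
    return start, end - start


if __name__ == "__main__":
    print(is_aligned(1024, 4096))
    print(is_aligned(100, 1000))
    print(align_request(100, 1000))
```

A request for 1000 bytes at offset 100 is not acceptable as it stands; the application would have to read the enclosing aligned range and pick out the bytes it wants.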

The feature is useful for applications that require direct access to a device, such as DBMSs, which do their own data caching.

Thursday, March 27, 2008

Raw devices on RHEL 5

I found in a few discussion forums that many people are asking about raw device configuration in the new generation of RHEL. A lot of them think that support for raw devices was removed from the distro. They are right in the sense that Red Hat marked raw devices as deprecated and removed the related configuration file and init script. That can be quite confusing.

On the other hand, RHEL 5 introduced another way to achieve the same goal, via udev. I think it is more convenient than specifying raw device bindings in the /etc/sysconfig/rawdevices file, which is then read by the /etc/init.d/rawdevices init script. Here is a simple example of the old bindings:

/dev/raw/raw1 /dev/sdc1
/dev/raw/raw2 8 34

The previous entries are passed to the /bin/raw command and corresponding bindings are created:

/bin/raw /dev/raw/raw1 /dev/sdc1
/bin/raw /dev/raw/raw2 8 34

To query the created raw devices, use the raw -qa command. To become more familiar with the raw command, check the related man page. Here is its output:

/dev/raw/raw1: bound to major 8, minor 33
/dev/raw/raw2: bound to major 8, minor 34
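The major and minor numbers in this output can be mapped back to device names. Here is a small Python sketch that parses a line of raw -qa output and decodes the pair, assuming the usual Linux numbering for SCSI disks on major 8 (16 minors per disk, minor 0 of each group being the whole disk). The function names parse_raw_qa and sd_name are mine, chosen only for illustration.

```python
import re

SD_MAJOR = 8           # first SCSI disk major (sda..sdp)
MINORS_PER_DISK = 16   # minor 0 = whole disk, 1-15 = partitions


def parse_raw_qa(line):
    """Parse one line of `raw -qa` output into (raw_device, major, minor)."""
    m = re.match(r"^(\S+):\s+bound to major (\d+), minor (\d+)$", line.strip())
    if m is None:
        raise ValueError("unexpected line: %r" % line)
    return m.group(1), int(m.group(2)), int(m.group(3))


def sd_name(major, minor):
    """Translate a (major, minor) pair on major 8 into a name like 'sdc1'."""
    if major != SD_MAJOR:
        raise ValueError("this sketch handles major 8 only")
    if not 0 <= minor < 256:
        raise ValueError("minor out of range for major 8")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else name + str(part)


if __name__ == "__main__":
    for line in (
        "/dev/raw/raw1: bound to major 8, minor 33",
        "/dev/raw/raw2: bound to major 8, minor 34",
    ):
        raw_dev, major, minor = parse_raw_qa(line)
        print(raw_dev, "->", sd_name(major, minor))
```

Under that assumption, the binding given as 8 34 points to the second partition of the same disk, sdc2.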

So, how do you deal with raw devices the new way? The udev "device manager" is a powerful tool that can create device nodes on the fly in response to kernel-generated events. It can also run a set of defined commands when such an event happens.

RHEL 5 introduced additional udev rules stored in /etc/udev/rules.d/60-raw.rules. This file can contain as many rules as you have bindings in your /etc/sysconfig/rawdevices file. Let's convert our two previous bindings to the new format:

ACTION=="add", KERNEL=="sdc1",
RUN+="/bin/raw /dev/raw/raw1 %N"

ACTION=="add", ENV{MAJOR}=="8", ENV{MINOR}=="34",
RUN+="/bin/raw /dev/raw/raw2 %M %m"

Remember that udev rules cannot span multiple lines. The previous rules are split only because of their length (the RUN part is on the next line). If you added a new rule and want to activate it without a reboot, run the udevtrigger command. Before you run udevtrigger, check the rule with udevtest to see what is going on:

udevtest /block/sdc/sdc3

So, isn't it easy? The presented examples show that nothing fundamental changed. The /bin/raw command is simply no longer called from the init script but directly from the udev daemon. The syntax changed; the semantics remain the same.
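If you have many old bindings, the conversion can be scripted. The following Python sketch turns lines in the /etc/sysconfig/rawdevices format into single-line udev rules of the form used in the examples of this post. The function names binding_to_rule and convert are hypothetical, and the treatment of # comments is my assumption; check the generated rules with udevtest before relying on them.

```python
def binding_to_rule(line):
    """Convert one rawdevices binding into a single-line udev rule."""
    fields = line.split()
    if len(fields) == 2:
        # Form: <raw device> <block device>
        raw_dev, block_dev = fields
        kernel = block_dev.rsplit("/", 1)[-1]   # /dev/sdc1 -> sdc1
        return 'ACTION=="add", KERNEL=="%s", RUN+="/bin/raw %s %%N"' % (
            kernel, raw_dev)
    if len(fields) == 3:
        # Form: <raw device> <major> <minor>
        raw_dev, major, minor = fields
        return ('ACTION=="add", ENV{MAJOR}=="%s", ENV{MINOR}=="%s", '
                'RUN+="/bin/raw %s %%M %%m"' % (major, minor, raw_dev))
    raise ValueError("unrecognized binding: %r" % line)


def convert(text):
    """Convert the whole contents of a rawdevices file, skipping
    blank lines and anything after a # (assumed to be a comment)."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            rules.append(binding_to_rule(line))
    return rules


if __name__ == "__main__":
    old = "/dev/raw/raw1 /dev/sdc1\n/dev/raw/raw2 8 34\n"
    for rule in convert(old):
        print(rule)
```

Each rule is emitted on one line, which is what udev expects.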

RHEL and InfiniBand support

InfiniBand is a high-speed, low-latency switched fabric communications link. Its architecture specifies how to design interconnections between processor nodes (servers) and I/O nodes (storage devices). The interconnections are serial and bidirectional and have a point-to-point topology. More details are available e.g. at wikipedia.org.

This article is a quick overview of InfiniBand and its support across RHEL releases. An InfiniBand technology preview was first included in RHEL 4 update 3. It is based on the OFED (OpenFabrics Enterprise Distribution) implementation from OpenIB.org, which is a validated version of the open-source OpenFabrics software stack optimized for performance (with the help of RDMA). OFED is thoroughly tested and ready to be adopted by Linux vendors and their distros. It contains the required kernel modules and user-space libraries and tools.

Back to RHEL. More information about the first inclusion of the InfiniBand implementation in RHEL 4 update 3 is given in its release notes. The most important point is the notice that it is not supported for production environments because its API may change. The preview supports SDP (Sockets Direct Protocol), IPoIB (IP over InfiniBand) and SRP (SCSI RDMA Protocol) drivers. The implementation is split across a few RPM packages containing kernel modules, user-space libraries and so on. Check the release notes or this FAQ entry.

Next, update 4 of RHEL 4 contains an updated OFED, revision 1.0. It is still not supported in production. Release 1.0 supports a wider range of hardware and adds an iSCSI over InfiniBand driver. Check the release notes again.

RHEL 4 update 5 is the first release of RHEL that supports InfiniBand in production. OFED was updated to revision 1.1. This release supports only mthca-based InfiniBand HCAs (Host Channel Adapters), i.e. cards from Mellanox. Release notes are here. The latest update of RHEL 4, update 6, contains OFED updated to version 1.2. Read the release notes please.

And what about RHEL 5? It is quite similar. The initial release of RHEL 5 includes OFED 1.1. Its first update contains OFED 1.2. Both are considered stable and ready for production use. Their release notes are here for RHEL 5 and here for RHEL 5 update 1.

The latest stable revision of OFED is 1.3, released at the end of February. I'm sure it will be included in future updates of RHEL. Finally, here is a summary of the OFED revisions included in RHEL releases:

RHEL 4 update 3 - technology preview of OFED
RHEL 4 update 4 - technology preview of OFED 1.0
RHEL 4 update 5 - fully supported OFED 1.1
RHEL 4 update 6 - fully supported OFED 1.2
RHEL 5 - fully supported OFED 1.1
RHEL 5 update 1 - fully supported OFED 1.2