Initialising new Local Grid RAID

Putting a GPT partition-table on each RAID lun

When you create a single file-system on a whole RAID lun, a partition table describing that one partition is not strictly necessary. Doing without one also avoids the question of stripe/partition alignment (below), although commands like mkfs will then warn you along the lines of "do you really want to do this to the whole device?". But if you are going to partition, and the file-system is to exceed 2TB in size or the lun size exceeds 4TB, then the partition table has to be GPT rather than a traditional DOS/fdisk one.
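As a rough sketch of that size check: the helper name needs_gpt is purely illustrative, and it assumes 512-byte sectors, where a DOS partition entry can describe at most 2^32 sectors (2 TiB). The size in bytes could be read with blockdev --getsize64.

```shell
# Hypothetical helper: succeed if a partition of the given size is too big
# for a DOS/fdisk partition table (assumes 512-byte sectors, 2^32-sector limit).
# $1 = size in bytes, e.g. from: blockdev --getsize64 /dev/sdb
needs_gpt() {
    [ $(( $1 > 2199023255552 )) -eq 1 ]
}

if needs_gpt 5000000000000; then echo "use GPT"; else echo "DOS table would do"; fi
```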

First, you can check that the RAID lun you are about to initialise is the one you want, by exercising that bit of the RAID:

dd if=/dev/sdb of=/dev/null 

will exercise that part of the RAID corresponding to the lun assigned to /dev/sdb, and you should be able to tell from the activity lights that it's the one you want (or, for a lun which is only part of a RAIDset, that this lun at least belongs to that RAIDset and not some other one). Cancel the above command after your visual check.

So, for a RAID lun of exactly 5TB, currently at /dev/sdb, first put a GPT partition table on the device and then create a single 5TB partition on it. For RAIDs with our configuration parameters, I would use the following:

# parted /dev/sdb mklabel gpt
# parted /dev/sdb mkpart 1 2560s 5000gb

Notice that the start of the partition has been specified as 2560 sectors (512-byte sectors). The idea is to ensure that the start of the partition is aligned with the data stripes of the RAID. Our RAIDs have a stripe size of 128 KiB, so with a RAID6 over 10+2 disks (10 data disks), the stripe width is 10 x 128 KiB = 1280 KiB, which is 2560 sectors. Our earlier RAIDs may have relied on the filesystem internals checking the partition offset so as to align data stripes sensibly, but that is not sufficient according to this forum discussion on stripe/partition alignment when using the XFS filesystem.
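The arithmetic above can be reproduced in the shell, which is handy when a RAID has a different stripe size or disk count. This is only a sketch: the function name stripe_width_sectors is made up for illustration, and 512-byte sectors are assumed.

```shell
# Hypothetical helper: full-stripe width in 512-byte sectors.
# $1 = stripe size per disk in KiB, $2 = number of data disks
stripe_width_sectors() {
    echo $(( $1 * 1024 * $2 / 512 ))
}

stripe_width_sectors 128 10    # 128 KiB x 10 data disks: prints 2560
```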

Note that if you attempt to make a partition bigger than the device, parted will complain, e.g. "The location 5001gb is outside of the device /dev/sdb". You can list partitions thus:

# parted /dev/sdb print
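Once the partition exists, its start sector can be checked against the stripe width. A minimal sketch, assuming the start is read in sectors (for example with parted /dev/sdb unit s print); the function name is_stripe_aligned is hypothetical.

```shell
# Hypothetical helper: succeed if a partition start is a whole multiple
# of the stripe width.
# $1 = partition start in sectors, $2 = stripe width in sectors
is_stripe_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Start used on this page, against our 2560-sector stripe width.
if is_stripe_aligned 2560 2560; then echo "aligned"; else echo "misaligned"; fi
# A start at sector 34 would not line up with a 2560-sector stripe.
if is_stripe_aligned 34 2560; then echo "aligned"; else echo "misaligned"; fi
```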

Putting a file-system on each GPT partition

The type of file-system put on the RAIDs is xfs, as this is supported by the SL5 kernel (certainly from SL5.2 onwards). File-system type ext4 was not supported by early SL5 distros, and GRID machines don't currently run on a system more recent than SL5, so we can't yet take advantage of its performance benefits over xfs.

Until a file-system is labelled, great care needs to be taken to ensure that the scsi device /dev/sd* is the right one to format. Linux enumerates SCSI/SAS devices in a logical but not always obvious way, and the order depends on the cabling and on the number of switched-on RAIDs: swapping cables from one channel to another, or switching off one of the RAIDs and rebooting, alters the enumeration order, so the /dev/sd* names can all end up referring to different devices. If yet another RAID box were added, they could change again.

Once the file-systems are labelled, they can be referred to by LABEL=xyz in the /etc/fstab file, and there is no ambiguity.
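For illustration, an /etc/fstab line using a label might be generated as below. This is a sketch only: the helper name fstab_line, the mount point /disk/f16a and the "defaults 0 0" options are assumptions, not necessarily what our servers use.

```shell
# Hypothetical helper: print an /etc/fstab line that mounts an xfs
# file-system by label rather than by /dev/sd* name.
# $1 = file-system label, $2 = mount point (assumed for this example)
fstab_line() {
    printf 'LABEL=%s  %s  xfs  defaults  0 0\n' "$1" "$2"
}

fstab_line f16a /disk/f16a
# The device currently carrying a label can be looked up with: findfs LABEL=f16a
```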

Some information can be gleaned from /var/log/messages; a way of checking at a physical level would be, for example:

dd if=/dev/sdb1 of=/dev/null

and visually inspect the RAID to ensure that the expected section of the RAID is being exercised.

Formatting with xfs

Since the RAID in question is RAID 6 with 12 physical disks per RAIDset, the effective data width is 10 data disks. The stripe size is 128 KiB, which is the RAID firmware default. This gives su=128k and sw=10 in the following formatting command for the first RAID, with -L setting the file-system label:

 mkfs.xfs -d su=128k,sw=10 -L f16a /dev/sdb1
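After formatting, the geometry can be cross-checked against what xfs_info reports for the mounted file-system, which gives sunit and swidth in file-system blocks. A small sketch of the expected values, assuming the default 4096-byte block size; the function name xfs_geometry_blocks is hypothetical.

```shell
# Hypothetical helper: expected sunit and swidth in file-system blocks.
# $1 = su in KiB, $2 = sw (number of data disks), $3 = block size in bytes
xfs_geometry_blocks() {
    sunit=$(( $1 * 1024 / $3 ))
    echo "sunit=$sunit swidth=$(( sunit * $2 ))"
}

xfs_geometry_blocks 128 10 4096    # prints: sunit=32 swidth=320
```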

-- Original version LawrenceLowe - 22 Feb 2010

Topic revision: r4 - 27 Apr 2012 - 13:52:06 - LawrenceLowe
Computing.LocalGridRaidFormat moved from Computing.LocalGridRaidInitialise on 04 Mar 2010 - 15:45 by LawrenceLowe - put it back
 