Recovery notes… (how to save a misbehaving computer with RAID setup)

Long story short: I have a serious problem with my Dell Precision Workstation, which among other things has led to two of its three disks being replaced. That will kill any RAID array unless you’ve set up RAID6 (which needs at least four disks, so it isn’t an option with three). Before sending back the disks (zeroed out, of course), I managed to boot the workstation into a rescue mode (Ubuntu 9.04 without graphical interface, since it doesn’t like my dual dual-DVI video cards, but that’s another story). The disks were presented as raw disks by the BIOS, partitioned identically, and then set up with a mixture of JBOD, RAID1 (root and /var), RAID0 (/tmp) and RAID5 (/home). /home is where the meat is. So, from rescue mode, I attached a 1TB USB drive (not in eSATA mode, since that was part of what was misbehaving), and copied each raw disk in its entirety to an image file:

dd if=/dev/sda of=/media/disk/sda.img

So now I have three images of the disks – what next? Well, I guess start the RAID5 array in degraded mode using these images. Here’s what I did:

losetup /dev/loop0 /media/disk-2/sda.zotique.img
losetup /dev/loop1 /media/disk-2/sdb.zotique.img
fdisk -lu /dev/loop0   # inspect output: suspect that /dev/loop0p8 is the home directory, as it is the largest partition
fdisk -lu /dev/loop1   # inspect output: suspect that /dev/loop1p8 is the home directory, as it is the largest partition and equal in size to /dev/loop0p8
      Device Boot      Start         End      Blocks   Id  System
/dev/loop1p1              63     8401994     4200966   82  Linux swap / Solaris
/dev/loop1p2         8401995    16900379     4249192+  fd  Linux raid autodetect
/dev/loop1p3        16900380    17302004      200812+  fd  Linux raid autodetect
/dev/loop1p4        17302005   488279609   235488802+   f  W95 Ext'd (LBA)
/dev/loop1p5        17302068    59247719    20972826   fd  Linux raid autodetect
/dev/loop1p6        59247783    61352234     1052226   fd  Linux raid autodetect
/dev/loop1p7        61352298    63456749     1052226   fd  Linux raid autodetect
/dev/loop1p8        63456813   488279609   212411398+  fd  Linux raid autodetect

Now we need to start the MD RAID5 array. We note where partition 8 starts: sector 63456813 (fdisk -u reports 512-byte sectors), which is 63456813 × 512 = 32489888256 bytes. For each image we create another loop device that starts at that offset:

losetup -fo 32489888256 /dev/loop0
losetup -fo 32489888256 /dev/loop1
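
The offset passed to losetup above is just the start sector multiplied by the sector size. A minimal sketch of that arithmetic, assuming 512-byte sectors (which is what fdisk -u reports here); sector_to_bytes is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# Convert a partition's start sector (as printed by `fdisk -lu`) into the
# byte offset expected by `losetup -o`.
# Assumption: 512-byte sectors unless a second argument says otherwise.
# sector_to_bytes is a hypothetical helper name used for illustration.
sector_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# Start of partition 8 in the fdisk listing above.
offset=$(sector_to_bytes 63456813)
echo "$offset"    # 32489888256
```

The result can then be passed directly, for example as losetup -fo "$offset" /dev/loop0.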

And this is where things went bad: the two disk images turn out to be of different sizes, and the second command hangs. What’s up? Maybe a hint: sda was copied while hooked up to its native SATA port, while sdb was copied while sitting in a USB/eSATA dock, attached through USB. Can it matter? Or was there something bad on the disk?
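
One way to narrow this down is to check whether each image is at least long enough to contain partition 8, which ends at sector 488279609 in the listing above. The following is only a sketch under the same 512-byte-sector assumption; image_covers_partition is a hypothetical helper, and it says nothing about why an image might be short:

```shell
#!/bin/sh
# Succeed if the image file is large enough to hold a partition that ends
# at the given sector (inclusive), fail otherwise.
# Assumption: 512-byte sectors unless a third argument says otherwise.
# image_covers_partition is a hypothetical helper name used for illustration.
image_covers_partition() {
    img=$1
    end_sector=$2
    sector_size=${3:-512}
    # wc -c reports the file size in bytes; $(( )) strips any padding.
    actual=$(( $(wc -c < "$img") ))
    # The last sector is inclusive, so the image needs end_sector + 1 sectors.
    needed=$(( (end_sector + 1) * sector_size ))
    [ "$actual" -ge "$needed" ]
}
```

For instance, image_covers_partition /media/disk-2/sdb.zotique.img 488279609 would fail if the sdb image stops before the end of partition 8, which would point at a truncated copy rather than at losetup itself.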


