26 July 2009 by Lars
Recovery notes… (how to save a misbehaving computer with a RAID setup)
Long story short: I have a serious problem with my Dell Precision Workstation, which among other things has led to 2 out of 3 disks being replaced. That will kill any RAID array unless you’ve set up RAID6 (which needs at least four disks, so it wasn’t an option with three). Before sending back the disks (zeroed out, of course), I managed to boot the workstation into a rescue mode (Ubuntu 9.04 without graphical interface – it doesn’t like my dual dual-DVI video cards, but that’s another story).
The disks were presented as raw disks by the BIOS, partitioned identically, and then set up with a mixture of JBOD, RAID1 (root and /var), RAID0 (/tmp) and RAID5 (/home). /home is where the meat is. So, from rescue mode, I attached a 1TB USB drive (over plain USB rather than eSATA, since eSATA was part of what was misbehaving) and copied over each raw disk in its entirety, e.g. for sda:
dd if=/dev/sda of=/media/disk/sda.img
So now I have three images of the disks – what next? Well, I guess the next step is to start the RAID5 array in degraded mode from two of these images. Here’s what I did:
losetup /dev/loop0 /media/disk-2/sda.zotique.img
losetup /dev/loop1 /media/disk-2/sdb.zotique.img
fdisk -lu /dev/loop0
(Inspect the output. I suspect that /dev/loop0p8 is the home directory, since it is the largest partition.)
fdisk -lu /dev/loop1
(Inspect the output. I suspect that /dev/loop1p8 is the home directory: it is the largest partition, and equal to /dev/loop0p8.)
Device        Boot      Start        End      Blocks   Id  System
/dev/loop1p1               63    8401994     4200966   82  Linux swap / Solaris
/dev/loop1p2          8401995   16900379     4249192+  fd  Linux raid autodetect
/dev/loop1p3         16900380   17302004      200812+  fd  Linux raid autodetect
/dev/loop1p4         17302005  488279609   235488802+   f  W95 Ext'd (LBA)
/dev/loop1p5         17302068   59247719    20972826   fd  Linux raid autodetect
/dev/loop1p6         59247783   61352234     1052226   fd  Linux raid autodetect
/dev/loop1p7         61352298   63456749     1052226   fd  Linux raid autodetect
/dev/loop1p8         63456813  488279609   212411398+  fd  Linux raid autodetect
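Now we need to start the MD RAID5 array. We note where partition 8 starts: sector 63456813 (fdisk -u reports 512-byte sectors), which is 63456813 × 512 = 32489888256 bytes. For each image we create another loop device on top of the first one, offset to the start of that partition: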
losetup -fo 32489888256 /dev/loop0
losetup -fo 32489888256 /dev/loop1
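As a sanity check on the number passed to -o above, here is a minimal sketch of the arithmetic in shell. The variable names are mine, and the 512-byte sector size is an assumption that holds for these disks but should be checked against the "Units" line that fdisk prints.

```shell
# Start of partition 8, in sectors, taken from the fdisk -lu output.
start_sector=63456813
# Assumed sector size in bytes; confirm it in fdisk's "Units" line.
sector_size=512

# Byte offset to hand to losetup -o.
offset=$((start_sector * sector_size))
echo "$offset"
```

Computing the offset this way rather than by hand makes it harder to slip a digit when the same step is repeated for another partition.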
And this is where things went bad: the two disk images are of different sizes, and the second losetup command (the one on top of /dev/loop1) hangs. What’s up? Maybe a hint: sda was copied while hooked up to its native SATA port, while sdb was copied while sitting in a USB/eSATA dock, attached through USB. Can it matter? Or was there something bad on the disk?
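One thing that can be checked without touching the loop devices is whether each image is at least large enough to contain the last partition. The sketch below is only a diagnostic under my own assumptions, not a fix: image_covers_sector is a hypothetical helper name, it relies on GNU stat (-c %s) and 512-byte sectors, and the end sector 488279609 comes from the fdisk table above (so a complete image should be at least 249999160320 bytes).

```shell
# Hypothetical helper: succeeds if the image file is big enough to
# contain the given end sector (assumes 512-byte sectors, GNU stat).
#   $1 = path to the disk image
#   $2 = last sector that must be present (from fdisk -lu)
image_covers_sector() {
    img=$1
    end_sector=$2
    size=$(stat -c %s "$img")               # image size in bytes
    needed=$(( (end_sector + 1) * 512 ))    # bytes needed to include end_sector
    [ "$size" -ge "$needed" ]
}

# Example call (paths as used earlier in this post):
#   image_covers_sector /media/disk-2/sdb.zotique.img 488279609 || echo "image is too short"
```

If the sdb image turns out to be too short, that would point to the copy having stopped early (dd gives up at the first read error unless told otherwise) rather than to losetup itself, but that is a guess at this point.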