Migrate Linux RAID 1+LVM System To Larger Disks HOWTO

I have made this document for my own (future) use, but given the amount of internet searching I had to do to find the info, I thought I might just share it with the rest of the world.

Introduction

I have a software RAID 1 mirror (mdadm) in my computer. It works fine, and once it is set up, I never have to think about it. But then I run out of space, hard disk prices go down, and I migrate the system to new, larger disks. And I never remember how I set it up, nor how I migrated it the last time. So this time around I decided to document it for future use.

Equipment/Setup

This is how I have set up my computer:

Disks → Partitions → RAID mirror → Physical Volume → Volume Group → Logical Volume → File System
[Lower layer → Upper layer]

There are reasons why I have this setup. Using LVM makes the system agnostic of the underlying hardware, which is perfect when devices fail, are moved, etc. The system does not care, but keeps booting and running while waiting to be fixed. It is a home computer, so I really do not need a lot of different partitions for the various mount points. Keeping swap inside the mirror means there is not much to think about when a drive needs to be replaced; simply partition the drive and let mdadm do the rest. Done.
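
For reference, each layer of that stack can be inspected on the running system with the standard tools; the names in the comments follow the legend used in the recipes below:

  cat /proc/mdstat     # the RAID mirror and its member partitions
  pvs                  # the physical volume on top of /dev/md0
  vgs                  # the volume group (vg00)
  lvs                  # the logical volumes (root and swap)
  df -h /              # the file system on the root logical volume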

I have tested this setup with removed drives, trashed partition tables, power cut, etc., and it is really robust. I like that.

Approach

There are several ways to do this. I will list two of them. Both have their up- and downsides.

Alternative 1: Let mdadm do the job with the system up

The idea here is to let mdadm do the job for us. That way, we do not need to worry about where the superblocks reside, and so forth.

Skip to the cookie recipe section below if you want to go straight down to business.

We know from the man page that mdadm will set up mirrors from partitions of different sizes, choosing the smallest one as the resulting mirror size. We also know that mdadm is able to grow an array if there is free space in the partitions the mirror uses. Then the workflow is: add the new, larger partitions to the array one at a time, remove the old ones so that the new ones go from hot spares to active components and sync up, and finally grow the array to the size of the new partitions.

After that, all the other layers above (physical volume, logical volumes, file system) can be resized one after another.
The coolest thing about this approach is that everything is done while the system is up and running normally!
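
While this happens, the resync and the sizes involved can be followed with the usual tools, for example:

  cat /proc/mdstat            # resync progress and array state
  mdadm --detail /dev/md0     # component size vs. array size, sync status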

Cookie recipe

/dev/sda The first old disk
/dev/sdb The second old disk
/dev/sdc The first new (larger) disk
/dev/sdd The second new (larger) disk
   
/dev/md0 The RAID mirror
   
vg00 The volume group
   
/dev/mapper/vg00-swap The swap logical volume
/dev/mapper/vg00-root The root logical volume
  1. Create new, large partitions on both new disks (a command sketch covering this and all the following steps is given after the list).
  2. Make sure no old superblocks are found on the new disks!
  3. Add the first new partition to the array.
  4. Fail and remove one of the old disks from the RAID, so that the newly added partition changes from hot spare to active component and syncs up.
  5. Run LILO to get the MBR in order on the newly added disk.
  6. Add the second new partition to the array.
  7. Fail and remove the last of the old disks from the RAID, so that the second new partition changes from hot spare to active component and syncs up.
  8. Run LILO to get the MBR in order on the newly added disk.
  9. Extend the RAID array; now that mdadm sees two larger partitions, it is able to grow the mirror.
  10. Extend the Volume Group.
  11. Extend the Logical Volume (the -r flag resizes the file system to the new volume size).
  12. Set the reserved space to something other than the legacy default of 5 %, which is a lot of space on modern, large file systems.
  13. Done!
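
Putting the steps above together, a hedged command sketch could look like the one below. Device and volume names follow the legend above; the single-partition layout (sda1, sdb1, sdc1, sdd1), the partitioning tool and the order in which the old disks are pulled are assumptions, so treat this as a sketch rather than a script to paste in.

  # 1. Partition both new disks (fdisk, parted, etc.); one large partition each
  fdisk /dev/sdc
  fdisk /dev/sdd

  # 2. Check for leftover superblocks on the new partitions and wipe them if found
  mdadm --examine /dev/sdc1 /dev/sdd1
  mdadm --zero-superblock /dev/sdc1        # only if a superblock was reported

  # 3. Add the first new partition to the array; it becomes a hot spare
  mdadm /dev/md0 --add /dev/sdc1

  # 4. Fail and remove one old disk; the spare takes its place and resyncs
  mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
  cat /proc/mdstat                         # wait for the resync to finish

  # 5. Reinstall the boot loader so the newly added disk gets a valid MBR
  lilo

  # 6.-8. Repeat for the second new disk
  mdadm /dev/md0 --add /dev/sdd1
  mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
  cat /proc/mdstat                         # wait for the resync again
  lilo

  # 9. Grow the mirror to fill the new, larger partitions
  mdadm --grow /dev/md0 --size=max

  # 10. Grow the physical volume; the volume group picks up the new free space
  pvresize /dev/md0

  # 11. Grow the root logical volume and its file system (-r) in one go,
  #     here assuming root should take all of the new space
  lvextend -r -l +100%FREE /dev/mapper/vg00-root

  # 12. Lower the reserved space from the old 5 % default (ext file systems)
  tune2fs -m 1 /dev/mapper/vg00-root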


Alternative 2: Copy data to new mirror

The idea here is to create an entirely new RAID mirror and then copy everything from the old one to it, finishing by installing the boot loader to make it bootable.

Skip to the cookie recipe section below if you want to go straight down to business.

The benefits of this method are that the copying part is faster than a RAID sync, and that you get a newly created, unfragmented file system. The downsides are that your system will be offline while you copy the contents of the old file system to the new one, that you will need a rescue disk or live system for that, and that you need new names for your RAID array and volume group (and thus need to change /etc/fstab and /etc/lilo.conf in the new file system accordingly).
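
For example, the root and swap lines in /etc/fstab on the new file system would end up looking something like this, using the names from the legend below (the file system type and mount options are just placeholders):

  /dev/mapper/vg01-root  /     ext4  defaults  0 1
  /dev/mapper/vg01-swap  none  swap  sw        0 0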

Cookie recipe

/dev/sda The first old disk
/dev/sdb The second old disk
/dev/sdc The first new (larger) disk
/dev/sdd The second new (larger) disk
   
/dev/md0 The old RAID mirror
/dev/md1 The new RAID mirror
   
vg00 The old volume group
vg01 The new volume group
   
/dev/mapper/vg00-swap The old swap logical volume
/dev/mapper/vg00-root The old root logical volume
/dev/mapper/vg01-swap The new swap logical volume
/dev/mapper/vg01-root The new root logical volume
  1. Create new partitions on the new disks (a command sketch covering this and all the following steps is given after the list).
  2. Create the new RAID array.
  3. Create the physical volume.
  4. Create the Volume Group.
  5. Create the Logical Volumes.
  6. Create the file system(s) and the swap area.
  7. Boot a rescue system or live disk (RIPLinux, Debian Live, etc.).
  8. Assemble the arrays and activate the volume groups.
  9. Mount the old and the new file systems.
  10. Copy the data from the old array to the new one.
  11. Make sure that not only all data is copied, but also that the new mirror has been successfully synchronized!
    (cat /proc/mdstat)
  12. Run LILO or GRUB on the new array in order to make it bootable. For LILO, you might want to use the -r chroot parameter, or chroot first and run LILO afterwards; in either case you will need the /dev, /proc and /sys directories to be correctly populated.
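
Again, a rough command sketch for the steps above, using the names from the legend; the partition layout (one partition per disk), volume sizes, file system type and rsync flags are assumptions to adapt to your own system.

  # 1. Partition the new disks (type Linux RAID autodetect)
  fdisk /dev/sdc
  fdisk /dev/sdd

  # 2. Create the new mirror
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

  # 3.-5. Stack LVM on top of it
  pvcreate /dev/md1
  vgcreate vg01 /dev/md1
  lvcreate -L 4G -n swap vg01              # size is just an example
  lvcreate -l 100%FREE -n root vg01

  # 6. Create the file system and the swap area
  mkfs.ext4 /dev/mapper/vg01-root          # or whatever file system you prefer
  mkswap /dev/mapper/vg01-swap

  # 7. Boot the rescue or live system, then:

  # 8. Assemble the arrays and activate the volume groups
  mdadm --assemble --scan
  vgchange -ay

  # 9. Mount the old and the new root file systems
  mkdir -p /mnt/old /mnt/new
  mount /dev/mapper/vg00-root /mnt/old
  mount /dev/mapper/vg01-root /mnt/new

  # 10. Copy everything across, preserving permissions, links, ACLs and xattrs
  rsync -aHAX /mnt/old/ /mnt/new/

  # 11. Wait for the new mirror to finish synchronizing
  cat /proc/mdstat

  # 12. Edit /etc/fstab and /etc/lilo.conf on the new root (vg00 -> vg01,
  #     md0 -> md1), then run LILO against it, either with lilo -r or via chroot:
  mount --bind /dev  /mnt/new/dev
  mount --bind /proc /mnt/new/proc
  mount --bind /sys  /mnt/new/sys
  chroot /mnt/new lilo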

APPENDIX: Large Disk Strange Issue (and solution)

When I migrated from 1 TB disks to 2 TB disks, I ran into a strange problem that I had never seen before. Running LILO in order to install it correctly on the new set would fail with a “device-mapper: Sector outside mapped device?” error. This happened both when I had created a set with “modern” partitions starting at sector 2048 and when I used the old standard with partitions starting at sector 63. If you try to find info about that with Google (or even Bing...), you will see that there is not much to find apart from the LILO source code that produces the message.

I got the advice to try GRUB instead, but it would fail to install because /boot/grub/core.img was 771 bytes too big to fit before sector 63 (and at that point I did not feel much like redoing it all again to get the partitions back to starting at sector 2048). Since MBR is quite stone age, I will use GPT in the future, but at this point I did not feel much like fiddling around with that either.

My assumption was that, since the disks were clones and the only difference was the enlarged partitions/RAID set/Volume Group/Logical Volume/File System, the larger size itself was probably the issue. Thinking that the swap volume, which I had created first, would probably reside at the beginning of the disk, I shrank the swap volume, created a new logical volume for /boot in that space, rsynced the contents of /boot to it, mounted the new logical volume at /boot (and remembered to update /etc/fstab), ran lilo, and voilà, it worked. (For now?)
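
For the record, the workaround boiled down to something like the following (sizes are just examples, and it assumes LVM places the new boot volume in the extents freed from swap, which can be checked with lvdisplay -m):

  swapoff /dev/mapper/vg00-swap
  lvreduce -L -512M /dev/mapper/vg00-swap    # free some space near the start of the PV
  mkswap /dev/mapper/vg00-swap
  swapon /dev/mapper/vg00-swap

  lvcreate -L 512M -n boot vg00              # new boot volume in the freed space
  lvdisplay -m /dev/mapper/vg00-boot         # check that it sits in the low extents
  mkfs.ext4 /dev/mapper/vg00-boot            # assuming an ext file system
  mkdir /mnt/newboot
  mount /dev/mapper/vg00-boot /mnt/newboot
  rsync -a /boot/ /mnt/newboot/
  umount /mnt/newboot
  mount /dev/mapper/vg00-boot /boot          # and add the matching line to /etc/fstab
  lilo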

Please feel free to e-mail me if you have positive comments or concrete suggestions for updates.