*Coder Blog

Life, Technology, and Meteorology

Shrinking a Linux Software RAID Volume

I upgrade the disks in my servers a lot, and that often means replacing 3-4 drives at a time. Throwing the old drives out would be a huge waste, so I bring them back to my office and put them in a separate Linux file server with a ton of drive bays. I wrote about the file server previously.

In the file server, I configure the drives into multiple RAID 5 volumes. Right now I have three RAID volumes, each with four drives. Yesterday, one of the disks in an older volume went bad, so right now I’m running on 3 of 4 drives in a RAID 5. No data loss yet, which is good. Since this is an older RAID volume, I’ve decided not to replace the failed drive. Instead, I’ll just shrink the RAID from four disks to three. It was quite a hassle to figure out how to do this by researching online, so I thought I would document the entire process here, step by step, to save other people some time in the future. It should go without saying that you should have a recent backup of everything on the volume you are about to change.

  1. Make sure the old disk really is removed from the array. The device name shouldn’t show up in /proc/mdstat, and mdadm --detail should list it as “removed”. If not, be sure to mdadm --fail and mdadm --remove the device from the array first (example commands below).
    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6]... 
    md0 : active raid5 sdh2[1] sdj2[0] sdi2[3]
          1452572928 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
          
    unused devices: <none>
    # mdadm --detail /dev/md0
    /dev/md0:
            Version : 0.90
      Creation Time : Wed Apr  8 12:24:35 2009
         Raid Level : raid5
         Array Size : 1452572928 (1385.28 GiB 1487.43 GB)
      Used Dev Size : 484190976 (461.76 GiB 495.81 GB)
       Raid Devices : 4
      Total Devices : 3
    Preferred Minor : 0
        Persistence : Superblock is persistent
    
        Update Time : Tue Aug 16 13:33:25 2011
              State : clean, degraded
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 0
      Spare Devices : 0
    
             Layout : left-symmetric
         Chunk Size : 64K
    
               UUID : 02f177d1:cb919a65:cb0d4135:3973d77d
             Events : 0.323834
    
        Number   Major   Minor   RaidDevice State
           0       8      146        0      active sync   /dev/sdj2
           1       8      114        1      active sync   /dev/sdh2
           2       0        0        2      removed
           3       8      130        3      active sync   /dev/sdi2
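    If the device still shows up as an active member, fail it and remove it before continuing. These are standard mdadm manage-mode commands; /dev/sdX2 is a placeholder, so substitute your failed member:
    # mdadm /dev/md0 --fail /dev/sdX2
    # mdadm /dev/md0 --remove /dev/sdX2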
  2. Unmount the filesystem:
    # umount /dev/md0
  3. Run fsck on the filesystem:
    # e2fsck -f /dev/md0
  4. Shrink the filesystem, leaving yourself plenty of margin below the final array size. Here I resized the filesystem to 800 GB, which gives plenty of breathing room for a RAID 5 of three 500 GB drives (roughly 1 TB usable). We’ll expand the filesystem to fill the available space later. If you want to confirm how small the filesystem can go first, see the optional check below.
    # resize2fs /dev/md0 800G
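    As an optional sanity check (not part of the original write-up), dumpe2fs reports the filesystem’s current block count, free blocks, and block size, which you can use to confirm a safe shrink target:
    # dumpe2fs -h /dev/md0 | grep -Ei 'block count|free blocks|block size'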
  5. Now we need to actually reconfigure the array to use one less disk. To do this, we’ll first query mdadm to find out how big the new array needs to be. Then we’ll resize the array and reconfigure it for one fewer disk. First, query mdadm for a new size (replace -n3 with the number of disks in the new array):
    # mdadm --grow -n3 /dev/md0
    mdadm: this change will reduce the size of the array.
           use --grow --array-size first to truncate array.
           e.g. mdadm --grow /dev/md0 --array-size 968381952
  6. This gives our new size as 968381952 (in kibibytes). Use this value to truncate the array:
    # mdadm --grow /dev/md0 --array-size 968381952
  7. Now that the array has been truncated, we set it to reside on one fewer disk:
    # mdadm --grow -n3 /dev/md0 --backup-file /root/mdadm.backup
  8. Check to make sure the array is rebuilding. You should see something like this:
    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6]... 
    md0 : active raid5 sdh2[1] sdj2[0] sdi2[3]
          968381952 blocks super 0.91 level 5, 64k chunk, algorithm 2 [3/2] [UU_]
          [>....................]  reshape =  1.8% (9186496/484190976) 
                                      finish=821.3min speed=9638K/sec
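    Optionally, mdadm can block until the reshape finishes, which saves polling /proc/mdstat by hand:
    # mdadm --wait /dev/md0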
  9. At this point, you probably want to wait until the array finishes reshaping. However, Linux software RAID is smart enough to figure things out if you don’t want to wait. Run fsck again before expanding your filesystem back to its maximum size (resize2fs requires this):
    # e2fsck -f /dev/md0
  10. Now do the actual expansion so the filesystem uses the complete RAID volume (resize2fs will grow to the maximum size if no size is specified):
    # resize2fs /dev/md0
  11. (Optional) Run fsck one last time to make sure everything is still sane:
    # e2fsck -f /dev/md0
  12. Finally, remount the filesystem:
    # mount /dev/md0
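
One optional follow-up that isn’t part of the steps above: if your mdadm config file (/etc/mdadm/mdadm.conf or /etc/mdadm.conf, depending on the distribution) records a num-devices value for this array, you may want to regenerate its ARRAY line so the saved configuration matches the new three-disk layout. mdadm will print the current definition, which you can merge into the config file by hand:

    # mdadm --detail --scan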

Everything went smoothly for me while going through this process. I could have just destroyed the entire old array and created a new one, but this approach was easier and I didn't have to move a bunch of data around. If you are working with a larger array, going from 10 disks to 9 or something along those lines, the benefits of this process are even greater.

9 Comments

  1. Thanks for sharing this howto. It will probably save a lot of people some time…

  2. Thanks a lot for documenting this so clearly. Worked great on my Debian/OpenMediaVault. I also had been searching the Web for this fruitlessly until I found your blog.

  3. Worked great (although slightly different since I was using LVM physical volumes). In particular, appreciated the trick of getting “mdadm --grow -n3 /dev/md0” to print out the exact size required.

    I also was not aware of the --backup-file option, having never needed it before in other mdadm operations, so you likely saved me time there as well. Thanks.

  4. Great page.. well written!

    I wanted to go from 6 disks to 3 and wasn’t sure if I had to repeat this process stepping down one disk at a time.. So I made sure I had a good backup and tried to shrink directly down to 3 in one step.. worked perfectly!

    Thanks!

  5. Perhaps it’s not so obvious, so I should also mention my array was a raid5 array, so I could only fail and remove ONE disk in this procedure, even though I was dropping from 6 to 3 drives. Failing and removing more than 1 drive in a RAID5 will of course destroy the array and the data.

    You should only fail and remove the remaining drives (2 in this case) AFTER the array finishes reshaping.

  6. Just for the record: I wonder if this applies the same for hard disks with GPT? Any changes recommended?

  7. Max U:
    The partitioning scheme (GPT, MBR, etc) used for the individual disks shouldn’t matter when running through this process. The disk partitioning is at a lower level than the RAID volume and the filesystem on the RAID volume. All of the operations above modify the RAID volume and the filesystem on that volume. The underlying partitioning scheme of the individual disks isn’t touched.

  8. Hello,

    I have a RAID 10 with 16 disks of 8 TB each (approx. 58 TB) and would like to shrink this RAID to 8 disks; actual used space is approx. 4 TB.

    My main question is whether I must fail devices, or can I just use the resize + grow options?
    Can I specify which disks are the remaining ones?

    Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
    md0 : active raid10 sdc[16] sdj[9] sdg[6] sdp[15] sdi[8] sdn[13] sde[4] sdk[10] sdm[12] sdl[11] sdh[7] sdd[3] sdo[14] sdb[1] sdf[5] sda[0]
    62511161344 blocks super 1.2 512K chunks 2 far-copies [16/16] [UUUUUUUUUUUUUUUU]
    bitmap: 3/466 pages [12KB], 65536KB chunk

    It would be cool if I did not need to back up/restore all the data and rebuild the RAID from scratch.

    Any help highly appreciated.
    Thanks,
    holli

  9. It might be pretty difficult to drop disks from a volume of that size, especially since you’re using RAID 10.

    With a RAID 5/6, you can fail one disk at a time and shrink the array in small blocks. With RAID 10, you would have to shrink the filesystem enough to remove one of the mirror sets. Then fail those disks out of the RAID and recreate the array with fewer disks. I’m not sure that last step will work, because if you failed two disks from the same stripe out of the array, the whole volume would be in a failed state (unlike RAID 5/6 which would just be degraded).

    Since you don’t seem to have too much actual used storage, it would probably be a lot easier to copy the data to another volume and rebuild the array with the reduced number of disks you want to use.
