Proxmox VE 3.2 software RAID using MDADM

Intro

Proxmox is presently my GUI of choice for managing KVM. However, during Proxmox’s ISO install you are only asked which disk to install to, not what the partition layout should be. There are other articles on how to add software RAID, but for VE 3.2 there is no single place with complete instructions, because the partition table type changed from MSDOS to GPT.

These instructions only apply to fresh installs of Proxmox VE 3.2 from the ISO; if you upgraded (performed a dist-upgrade) you will have retained the 3.1 disk layout.

Proxmox VE 3.2 software raid

Fresh install of Proxmox VE 3.2

My setup

Install type: Bare metal
Proxmox version: 3.2-5a885216-5, built on 2014-05-06
Install disk: /dev/sda, a WD 1TB Black
Mirrored disk: /dev/sdb, a WD 1TB Black
RAID setup: mdadm --level=1 (mirror)

Step 1: Install and setup

Disk prep

Before you even start your install of Proxmox I highly recommend (unless you are sure about your disk history) that you boot into something like Kali Linux or SystemRescueCD and clear the two disks you intend to use for the OS mirror. When setting up RAID it is a good idea to use the exact same disk models, but it is not a requirement.

Once you boot into your live CD of choice, use parted or gparted to clear all partitions on the disks you want to use. A second and very important step before setting up RAID is making sure the disks don’t carry any leftover hardware or software RAID metadata. Rather than worry about whether the disks have this data, just clear the areas where it lives with the two commands below (after you have removed all partitions).

The following command clears the first 512*10 bytes of the disk. This is me being overzealous, as the RAID metadata is only supposed to live in the first 512 bytes.

dd if=/dev/zero of=/dev/sdX bs=512 count=10

Then, because some RAID metadata formats live in the last 512 bytes of the disk, clear that as well.

dd if=/dev/zero of=/dev/sdX bs=512 seek=$(( $(blockdev --getsz /dev/sdX) - 1 )) count=1

Make sure you run both of these commands for each sdX, where X is your drive letter (a/b/c/d, etc.). Now we can be certain no previous RAID data will cause major headaches and confusion with our install going forward.

Hint: if you find your /dev/md0 or /dev/md1 showing up as md126 or something similar after a reboot you probably didn’t clear the drives correctly.
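
If you know a disk has previously been a member of an md array (or you just want to be extra thorough), you can also clear RAID signatures directly rather than relying on dd alone; a reader comment further down describes needing exactly this. Substitute your own device names for the placeholders below:

# remove an old md superblock from a partition that used to be in an array
mdadm --zero-superblock /dev/sdXN
# or wipe all known filesystem and RAID signatures from the whole disk
wipefs -a /dev/sdX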

Proxmox Install

Boot from your ISO or burned disk and perform the install. Keep track of which /dev/sdX you install to; in my example I am using /dev/sda as my install disk. Once your installation is completed you’ll want to make sure you have internet connectivity.

There is a good chance you don’t have a Proxmox community subscription, so if you don’t have a license key follow the instructions at the link below to switch over to the “pve-no-subscription” repository.

https://pve.proxmox.com/wiki/Package_repositories#Proxmox_VE_No-Subscription_Repository
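
For reference, on my 3.2 (Debian wheezy based) install the change boiled down to disabling the enterprise repository and adding the no-subscription one. Treat the lines below as a sketch and double-check the wiki page above, since the file and suite names may differ on your system:

# comment out the enterprise repo, which needs a subscription key
sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
# add the no-subscription repo
echo "deb http://download.proxmox.com/debian wheezy pve-no-subscription" >> /etc/apt/sources.list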

Once you have completed these steps, run the following commands to bring your Proxmox install up to date:

apt-get update
apt-get dist-upgrade
apt-get upgrade

Now that Proxmox is up to date, you’ll need to install mdadm so you can build the RAID arrays later on.

apt-get install mdadm

During mdadm’s install you will be prompted; feel free to leave the answers at their defaults unless you know why you want to change them. Once this completes we can move on to the next stage.

Step 2: Move /boot and grub to a software raid mirror

Understanding Proxmox partition layout

The default Proxmox partitioning after a fresh install looks like this:

(parted) print
Model: ATA WDC WD1002FAEX-0 (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
 
Number  Start   End     Size    File system  Name     Flags 
1      1049kB  2097kB  1049kB               primary  bios_grub 
2      2097kB  537MB   535MB   ext3         primary  boot 
3      537MB   1000GB  1000GB               primary  lvm
 
(parted)

We can see from this listing that partitions 2 and 3 are the ones we need to mirror with mdadm. Remember that my setup uses /dev/sda and /dev/sdb; if you decided to use other disks, substitute your own device names.

Cloning the installed partitions into RAID

Because Proxmox VE 3.2 uses GPT instead of MSDOS, we have to use sgdisk instead of sfdisk. Run the following commands to prep your blank second disk:

sgdisk -R=/dev/sdb /dev/sda
sgdisk -t 2:fd00 /dev/sdb
sgdisk -t 3:fd00 /dev/sdb

The first command copies the partition table from sda to sdb, while the second and third set the partition types of /dev/sdb2 and /dev/sdb3 to Linux RAID instead of boot/LVM. If you want to learn more about partition flags and types, sgdisk --list-types shows all the types you can set. Your second disk’s partition table should now look like this:

(parted) select /dev/sdb
Using /dev/sdb
(parted) print
Model: ATA WDC WD1002FAEX-0 (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
 
Number  Start   End     Size    File system  Name     Flags 
1      1049kB  2097kB  1049kB               primary  bios_grub 
2      2097kB  537MB   535MB                primary  raid 
3      537MB   1000GB  1000GB               primary  raid
 
(parted)

Create your RAID array

Since you already installed mdadm earlier, creating the two RAID arrays is simple. However, we are still booted off the non-RAID /dev/sdaX partitions, so each array is created with a “missing” member that will later be replaced by the corresponding /dev/sdaX partition. The following commands will create the arrays:

mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb2
mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb3

During this step you will likely see a complaint similar to the following; it is safe to answer “y”:

mdadm: /dev/sdd2 appears to contain an ext2fs file system
    size=1950656K  mtime=Tue Aug 19 11:12:27 2014
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array?

Ultimately /dev/md0 is going to hold your /boot while /dev/md1 will hold your LVM partition.
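
Before moving on you can confirm that both arrays exist and are running degraded with only their /dev/sdbX member, which is exactly what we expect at this stage:

cat /proc/mdstat
mdadm --detail /dev/md0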

Install /boot to your newly created /dev/md0

We are going to format /dev/md0 as ext3, copy the contents of /boot to it, and set it up as our /boot mount.

mkfs.ext3 /dev/md0 
mkdir /mnt/tmp
mount /dev/md0 /mnt/tmp
cp -ax /boot/* /mnt/tmp
umount /mnt/tmp
rmdir /mnt/tmp
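
If you prefer UUID-based entries in fstab, this is a good moment to note the new filesystem’s UUID with blkid; it is optional, since the next step simply references /dev/md0 directly:

blkid /dev/md0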

Now we just need to fix /etc/fstab so it mounts /boot from /dev/md0 instead of /dev/sda2. Below is a copy of my modified fstab:

/dev/pve/root / ext3 errors=remount-ro 0 1
/dev/pve/data /var/lib/vz ext3 defaults 0 1
/dev/md0        /boot   ext3    defaults 0 1
#UUID=46b4d3d6-fdec-43b6-a4cb-3f8f8a9c6c10 /boot ext3 defaults 0 1
/dev/pve/swap none swap sw 0 0
proc /proc proc defaults 0 0

You can see I commented out the /dev/sda UUID entry and replaced it with /dev/md0. At this point we are ready to reboot, but I must stress that this work requires accuracy; if mistakes were made the system may not boot and you might need to start over. (One reader notes in the comments that his reboot only succeeded after he had also completed the “Tell GRUB to boot from /dev/md0” section below, so you may prefer to finish that section first.) Once your reboot is complete, the following command will show you where /boot was mounted from.

root@host:/# mount | grep boot
/dev/md0 on /boot type ext3 (rw,relatime,errors=continue,user_xattr,acl,barrier=0,data=ordered)

Above we are simply asking the system where /boot is mounted from; /dev/md0 is the answer we want.

Tell GRUB to boot from /dev/md0

Next we need to make a few changes to update and reinstall grub.

echo 'GRUB_DISABLE_LINUX_UUID=true' >> /etc/default/grub
echo 'GRUB_PRELOAD_MODULES="raid dmraid"' >> /etc/default/grub
echo raid1 >> /etc/modules
echo raid1 >> /etc/initramfs-tools/modules
grub-install /dev/sda
grub-install /dev/sdb
update-grub
update-initramfs -u

You shouldn’t get any errors during this process.
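
Depending on the answers you gave when mdadm was installed, the new arrays may not yet be listed in /etc/mdadm/mdadm.conf, and the initramfs needs those definitions to assemble the arrays at boot. A couple of commenters below had to do this explicitly on 3.3, so as a cautious extra step check the file and, if your arrays are missing, append them and rebuild the initramfs again:

# record the current array definitions and pick them up in the initramfs
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u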

Add /dev/sda2 into the raid array

Just like you did for /dev/sdb2 we need to convert /dev/sda2 to a raid partition by typing:

sgdisk -t 2:fd00 /dev/sda

Once this is completed you can add it to the /dev/md0 array.

mdadm --add /dev/md0 /dev/sda2

Once you run the above command mdadm will rebuild /dev/sda2 so that it mirrors /dev/sdb2.
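
/boot is small, so the /dev/md0 resync only takes a moment; you can watch it finish with:

watch cat /proc/mdstat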

Step 3: Move the 3rd LVM partition over to /dev/md1

The root partition for Proxmox is installed on a logical volume managed by LVM. Moving this isn’t as simple as copying, because it requires extra steps to create the physical volume on /dev/md1 and then remove /dev/sda3 from the volume group.

Create new LVM volume

Run the following commands to move the root filesystem’s LVM physical volume from /dev/sda3 to /dev/md1:

pvcreate /dev/md1
vgextend pve /dev/md1
pvmove /dev/sda3 /dev/md1

The move command takes a very long time (several hours for the 1TB Blacks that I used). If you’re running this over SSH and are concerned your session might time out, you can press Ctrl+Z and then type “bg && disown -a” to detach the task from your session.
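
As an alternative to backgrounding the job by hand, pvmove can daemonize itself if you start it with -b instead of running the foreground command above, and you can then poll its progress with lvs. This is only a suggestion; check man pvmove on your version before relying on it:

# run the move in the background and check on it later
pvmove -b /dev/sda3 /dev/md1
lvs -a    # the copy/sync percentage column shows how far the move has got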

Remove /dev/sda3 from the LVM volume

Once the pvmove above has completed we can safely remove /dev/sda3 from the volume group, which lets us add it to the /dev/md1 array.

vgreduce pve /dev/sda3
pvremove /dev/sda3
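
As a quick sanity check, pvs and vgs should now show the pve volume group living entirely on /dev/md1:

pvs
vgs pve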

Add /dev/sda3 to the /dev/md1 array

The final step is to add our /dev/sda3 partition to the /dev/md1 array with the following two commands:

sgdisk -t 3:fd00 /dev/sda
mdadm --add /dev/md1 /dev/sda3

At this point /dev/sda3 will start rebuilding to become a mirror of /dev/sdb3. If you made it this far, you are now running Proxmox VE 3.2 on software RAID!
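
The md1 rebuild covers the whole LVM partition, so expect it to take a while. Once /proc/mdstat shows [UU] for both arrays, the mirror is complete:

cat /proc/mdstat
mdadm --detail /dev/md1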

Conclusion: Proxmox VE 3.2 on software raid

The extra work is worth it

Proxmox is a great ISO install for running KVM or OpenVZ. However, the installer does not give you options like software RAID, I suspect to keep the install as simple as possible. While performing the above steps may seem like a burden, it gives you the safety of being able to lose one of your OS drives without having to introduce hardware RAID.

Don’t use this for VM storage

For a couple of reasons I would recommend storing your VM disk files elsewhere:

  1. It’s good practice to have OS and data storage independent of each other
  2. I find that mdadm mirrored storage is much slower for writes than a single disk

Below is a test on the array of two 1TB WD Blacks.

root@host:/tmp# dd if=/dev/zero of=./test.img bs=1M count=1024 conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 14.4757 s, 74.2 MB/s

Thanks

24 comments to Proxmox VE 3.2 software RAID using MDADM

  • Two small things we found:

    1) Missing / after boot in “cp -ax /boot* /mnt/tmp”. Should be “cp -ax /boot/* /mnt/tmp”
    2) Missing pve after vgreduce in “vgreduce /dev/sda3”. Should be “vgreduce pve /dev/sda3”

    • Thanks for noticing those mistakes. I have updated the tutorial to reflect both corrections. Additionally I will be adding a new node to my own production setup within the next day or so. I will make sure to follow this tutorial to the letter to verify its integrity and fix any other errors I find along the way.

  • Jacob Dall

    Thank you for a great tutorial!

    I encountered an issue on a completely fresh install of Proxmox 3.3 (both sda and sdb unpartitioned) in the ‘mdadm --create /dev/md0’ step, because an md0 array already existed (probably configured when mdadm was installed). So I had to do the following steps before continuing with the guide:

    1) mdadm --stop /dev/md0
    2) mdadm --remove /dev/md0

    Also, before the reboot, the following was required to allow success on the boot (otherwise the boot loader complained about an ext2 filesystem issue):

    1) mdadm --zero-superblock /dev/sda1
    2) remove the md0 array definition from mdadm.conf
    3) mdadm --detail --scan >> /etc/mdadm/mdadm.conf

    • I’m glad you ran into this issue, because I’ve had trouble reproducing the hurdles that come up when the disks used have previously been in a RAID array. It appears that wiping the first and last sectors of the disk isn’t always enough. I will add your steps to my disk prep section, as I do recall at one point having to do what you just described.

      I’m glad this tutorial works on 3.3; I haven’t performed a fresh 3.3 install myself, just upgrades.

  • macpip

    After a big headache I was able to discover that RAID1 was not recognized at boot on a new and fresh Proxmox 3.3 installation because of /etc/default/mdadm.

    SOLVED by changing in /etc/default/mdadm

    -AUTOSTART=false
    +AUTOSTART=true

    Thank you for this great guide.

  • macpip

    In addition, at the last reboot the server did not start, complaining it was unable to find /dev/pve/root because the initramfs is not aware of the RAID (thanks to Martin Dimov at http://www.cesararaujo.net/en/proxmox-v3-software-raid/).

    SOLVED by changing, again in /etc/default/mdadm

    -INITRDSTART='none'
    +INITRDSTART='all'

    before the last update-initramfs -u (in the “Tell GRUB to boot from /dev/md0” step)

  • Jens Hellermann

    Dear Mark,

    thank you for this great manual. I managed to setup the raid for Proxmox with its guidance.

    I failed twice, though, when rebooting after changing the fstab. Only after I worked my way through the next part, “Tell GRUB to boot from /dev/md0”, and then rebooted could I continue with the next steps.

    Maybe you could verify whether that step should be done before rebooting.

    With kind regards, Jens Hellermann.

  • Mark

    Is using md-raid 1 only ‘bad’ for performance reasons? Or does it bring stability issues as well?

    I’m building a slim, energy-efficient server with an Intel 1265Lv2 + 32GB RAM and 2x 3TB WD Red drives.

    Most of the VMs will be idling most of the time. A bunch of VMs, but mostly for personal usage (mail, www, shell-mess-about machine, pfSense … some testing VMs).

    I might add a 3rd disk for running fast VMs that back up nightly to one backup VM that lives on RAID1.

  • Walter

    Dear Mark,
    thank you for your great work! At the “Create new LVM volume” step I ran into an error:

    root@monschauer:~# pvcreate /dev/md1
    Can't initialize physical volume "/dev/md1" of volume group "pve1" without -ff
    root@monschauer:~# pvcreate /dev/md1 -ff
    Really INITIALIZE physical volume "/dev/md1" of volume group "pve1" [y/n]? y
    Can't open /dev/md1 exclusively. Mounted filesystem?

    md1 is not mounted

    Any idea what happens to me?

    Thank you, Walter

  • Dariusz

    On Proxmox 3.3

    When executing:
    # grub-install /dev/sdb
    /usr/sbin/grub-probe: error: no such disk.
    Auto-detection of a file system of /dev/md0 failed.
    Please report this together with the output of "/usr/sbin/grub-probe --device-map=/boot/grub/device.map --target=fs -v /boot/grub" to

    To get around the issue:
    grub-install --recheck /dev/sda
    grub-install --recheck /dev/sdb

    regards,

  • Paul Littlefield

    I finished the instructions and all seemed fine, but when I came to reboot the Grub menu has come up with File Not Found and now I have a grub rescue> prompt :(

  • Dan

    I’ve extended this method to a LVM on RAID10(mdadm) configuration.

    root@oliveto:~# lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sda 8:0 0 931.5G 0 disk
    ├─sda1 8:1 0 1M 0 part
    └─sda2 8:2 0 931.5G 0 part
    └─md0 9:0 0 1.8T 0 raid10
    ├─pve-root (dm-0) 253:0 0 96G 0 lvm /
    ├─pve-swap (dm-1) 253:1 0 15G 0 lvm [SWAP]
    ├─pve-data (dm-2) 253:2 0 32G 0 lvm /var/lib/vz
    └─pve-boot (dm-3) 253:3 0 512M 0 lvm /boot
    sdb 8:16 0 931.5G 0 disk
    ├─sdb1 8:17 0 1M 0 part
    └─sdb2 8:18 0 931.5G 0 part
    └─md0 9:0 0 1.8T 0 raid10
    ├─pve-root (dm-0) 253:0 0 96G 0 lvm /
    ├─pve-swap (dm-1) 253:1 0 15G 0 lvm [SWAP]
    ├─pve-data (dm-2) 253:2 0 32G 0 lvm /var/lib/vz
    └─pve-boot (dm-3) 253:3 0 512M 0 lvm /boot
    sdc 8:32 0 931.5G 0 disk
    ├─sdc1 8:33 0 1M 0 part
    └─sdc2 8:34 0 931.5G 0 part
    └─md0 9:0 0 1.8T 0 raid10
    ├─pve-root (dm-0) 253:0 0 96G 0 lvm /
    ├─pve-swap (dm-1) 253:1 0 15G 0 lvm [SWAP]
    ├─pve-data (dm-2) 253:2 0 32G 0 lvm /var/lib/vz
    └─pve-boot (dm-3) 253:3 0 512M 0 lvm /boot
    sdd 8:48 0 931.5G 0 disk
    ├─sdd1 8:49 0 1M 0 part
    └─sdd2 8:50 0 931.5G 0 part
    └─md0 9:0 0 1.8T 0 raid10
    ├─pve-root (dm-0) 253:0 0 96G 0 lvm /
    ├─pve-swap (dm-1) 253:1 0 15G 0 lvm [SWAP]
    ├─pve-data (dm-2) 253:2 0 32G 0 lvm /var/lib/vz
    └─pve-boot (dm-3) 253:3 0 512M 0 lvm /boot

    As you can see, I’ve moved /boot onto the LVM and I’ve shrunk pve-data temporarily while I move this around. I plan to now attach VMs over iSCSI.

    Anyone have experience with this arrangement? If so, what is the proper way to handle the 1MB GPT/bios_grub partition across a raid set like this?

  • raf

    Hello
    I installed Proxmox 3.3 and wanted to migrate to Linux RAID, but when I rebooted the system after editing /etc/fstab, I got an error: Couldn't find device with uuid … Unable to find LVM volume pve/root.
    What is wrong ?

  • Tony

    Just wanted to say thank you for this tutorial! We have been planning to migrate all of our VMs to Proxmox and their respective storage to Ceph through Proxmox. With this I had set up all of the systems with RAID 1 on the boot drives, and today (about 3 weeks into stability testing) one of the boot drives failed. The server chugged along happily and I was easily able to shut it down (one of the Ceph boxes), replace the drive, and add it to the array, and it is rebuilding itself. Without this guide I think that box would have been lost and I would be rebuilding it from scratch!

  • Scott

    I’d like to add that I followed this tutorial with Proxmox 3.4 and it worked. The only thing that threw me for a loop was some “‘null’ not found” error during ‘grub-install’ for both drives. I didn’t make any changes and continued with the guide and it all worked out. My array is currently rebuilding. Thank you for this.

  • Martin

    Can I add my thanks – a really clear tutorial and very helpful!

    Also just wanted to note for the benefit of others that I did see some warnings when installing grub to sda and sdb:

    root@proxmox:~# grub-install /dev/sda
    Installing for i386-pc platform.
    grub-install: warning: Couldn’t find physical volume `(null)’. Some modules may be missing from core image..
    grub-install: warning: Couldn’t find physical volume `(null)’. Some modules may be missing from core image..
    Installation finished. No error reported.

    I haven’t chased down the root cause, but it appears to be a similar problem to that discussed at http://serverfault.com/questions/617552/grub-some-modules-may-be-missing-from-core-image-error

    I decided to just risk it and reboot to see if it cleared the warning, as it did for various of the people posting on the page above. The server came back fine and installing grub works cleanly now.

  • So I tried to follow this guide for Proxmox 3.4. It was pretty similar, except /boot is part of the root partition, and /dev/sda2 is mounted at /boot/efi.
    So I did the changes accordingly, copied /boot/efi to /dev/sdb2 and set up /dev/md0 and changed /etc/fstab accordingly, rebooted and everything worked.
    However, when I tried to continue with the LVM partition, everything seems to work until I reboot, and I get thrown into the grub-rescue-prompt saying it can’t find lvmid/[long UUID].
    I’m dead stuck here, and have no idea what to do or how to fix this.
    I tried using a live Linux Mint, but it’s unable to work with the files inside the raid inside the lvm system.

    Any suggestions?

    • EB

      I am having problems getting this to work on 3.4-11
      I get to the point of editing /etc/fstab and the UUID line is not there (as it was in 3.4-3 which I had this working beautifully on), and if I add the line that used to work before – to boot off of /dev/md0 – after reboot I am told there is no /dev/md0 and am put into a recovery console. So I remove that line from fstab, reboot without any issue, and sure enough there is no md0.
      Have to put off upgrading all my nodes because this will break them all for sure.

  • After the second half of the mirrors are added, why is the second modification of mdadm.conf needed? On my system, mdadm.conf was the same before and after.
