If you've read my last couple of blog entries, you know that I had a hard drive approaching imminent failure.
On to the replacement:
Using mdadm, mark the failing disk as faulty so the Linux RAID stops using it:
sudo mdadm --fail /dev/md0 /dev/sda1
If you are like me and don't know which physical drive is which, use the Disk Manager tool and write down the serial number of the failing drive. It will match the serial number on the printed label of the physical drive. Note: it is handy to tape a piece of paper to the inside of your computer listing every drive's serial number and its associated partition for future reference. I had actually forgotten that I did this the last time a drive failed; I wrote down the serial number of my drive and then realized the paper was already in the computer.
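If you would rather stay on the command line, the same information is usually available from the symlinks under /dev/disk/by-id, whose names typically embed each drive's model and serial number. Here is a minimal sketch of that idea. The function name list_disk_ids is my own invention, and the exact link naming depends on your distribution's udev rules, so treat it as an illustration rather than a guaranteed recipe.

```shell
# list_disk_ids [DIR]
# Print "link-name -> device" for every whole-disk symlink in DIR
# (default: /dev/disk/by-id). Hypothetical helper, not part of mdadm.
list_disk_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue        # skip anything that is not a symlink
        name=$(basename "$link")
        case "$name" in
            *-part[0-9]*) continue ;;     # skip partition entries such as ...-part1
        esac
        target=$(readlink "$link")        # typically something like ../../sda
        printf '%s -> %s\n' "$name" "$(basename "$target")"
    done
}
```

Calling list_disk_ids with no arguments reads the real directory; match the serial in the link name against the label on the physical drive.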
Power down the machine, remove the faulty drive, and replace with the new one.
Once the drive is replaced, power on your computer. You should see a /dev/md0 fail event during startup. Mine said something to the effect of 3 out of 4 devices available, 1 removed.
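If the boot messages scroll by too quickly, the degraded state can also be read from /proc/mdstat after startup. The sketch below pulls out the two bracketed status fields for one array, for example [4/3] [_UUU], where the underscore marks the missing member. The function name md_health is hypothetical, and the parsing assumes the usual /proc/mdstat layout in which the "blocks" line directly follows the array's header line.

```shell
# md_health ARRAY [FILE]
# Print the "[total/active] [UU_U]" status fields for ARRAY, read from
# FILE (default: /proc/mdstat). Prints nothing if ARRAY is not listed.
md_health() {
    array="$1"
    file="${2:-/proc/mdstat}"
    awk -v a="$array" '
        $1 == a && $2 == ":" { in_array = 1; next }   # header: "md0 : active raid5 ..."
        in_array && /blocks/ {
            print $(NF-1), $NF                        # last two fields of the blocks line
            exit
        }
    ' "$file"
}
```

For my array this would be called as md_health md0. A result with fewer active devices than total devices means the array is running degraded.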
Next, partition the new drive with fdisk:
sudo fdisk /dev/sda
This will bring you into the fdisk program. Type m for the help menu and available input options. Perform these in order:
p - print the current configuration and verify that no partition exists yet. This is a quick idiot check to make sure you are configuring the correct drive.
n - new partition
p - make it a primary partition
<enter> - accept the default partition number (should be 1)
<enter> - accept the default start sector (should be 2048)
<enter> - accept the default end sector (should be the end of the hard drive)
t - change the type of the partition
fd - set the type to Linux RAID autodetect
p - verify all of your settings are correct. The new partition should be listed with type fd (Linux RAID autodetect).
w - write your changes to the disk
This will write the new partition table, exit fdisk, and return you to the command line. Run sudo partprobe so that the kernel re-reads the partition table and recognizes the new partition.
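To double-check that the kernel really sees the new partition before handing it to mdadm, one option is to look for it in /proc/partitions. This is only a sketch: the function name has_partition is made up for illustration, and it assumes the standard four-column /proc/partitions layout with the device name in the last column.

```shell
# has_partition NAME [FILE]
# Succeed (exit status 0) if NAME, e.g. sda1, appears in the name column
# of FILE (default: /proc/partitions); fail (exit status 1) otherwise.
has_partition() {
    name="$1"
    file="${2:-/proc/partitions}"
    awk -v n="$name" '$4 == n { found = 1 } END { exit !found }' "$file"
}
```

A quick way to use it is has_partition sda1 && echo "sda1 is visible". If it fails, running partprobe again or rebooting is the usual next step.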
Tell mdadm that the drive is now available:
sudo mdadm --add /dev/md0 /dev/sda1
The array will now rebuild onto the new sda1 partition, reconstructing its contents from the data and parity on the other 3 drives. This will take some time, but progress can be monitored:
watch cat /proc/mdstat
It is important to leave your machine on and uninterrupted until the rebuilding process is complete.
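If you want just the rebuild percentage, for a script or a notification, it can be extracted from the same /proc/mdstat file. The following is a rough sketch, not an official mdadm feature: rebuild_progress is a name I made up, and it assumes the progress line has the usual "recovery = 12.3% (...)" form.

```shell
# rebuild_progress ARRAY [FILE]
# Print the recovery/resync percentage (without the % sign) for ARRAY,
# read from FILE (default: /proc/mdstat). Prints nothing when the array
# is not rebuilding or is not listed.
rebuild_progress() {
    array="$1"
    file="${2:-/proc/mdstat}"
    awk -v a="$array" '
        $1 == a && $2 == ":" { in_array = 1; next }   # header line for our array
        in_array && NF == 0 { exit }                  # blank line ends the array block
        in_array && /(recovery|resync) =/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { sub(/%$/, "", $i); print $i; exit }
        }
    ' "$file"
}
```

An empty result from rebuild_progress md0 means no rebuild is in progress, which is also what you should see once recovery has finished. If you would rather block until the rebuild is done, sudo mdadm --wait /dev/md0 should do that.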
Aren't RAID 5s a beauty? I love having automatic protection against hardware failure, assuming no more than 1 drive fails at a time. I hope you found this useful. If you have any questions or comments, feel free to post them below.
Next up: creating a RAID 1 from my existing system drive and a spare unused drive I've had sitting around, without losing any data. Should be fun!