Windows — How to Swap NVMe Drives of a Dynamic Array "On the Fly"

10 October 2023

In this article, we show step by step how to replace the NVMe drives of a dynamic array on the fly.

We have a dynamic mirrored array on Windows Server 2016, built from two identical NVMe drives. The array has a capacity of about 3 TB and is used for QuickBooks. Naturally, the client began to run out of space, so two new 6.4 TB NVMe drives were purchased. We need to move the dynamic array to these new drives.

The problem is that the server can only accommodate two NVMe drives, and both slots are taken. We will have to replace the drives in the array one at a time and then expand it.

Another challenge is that the server cannot be shut down, nor can QuickBooks be stopped. We did manage to negotiate a brief service interruption of no more than 5 minutes, should one be needed.

In Disk Management, you can view our array.

Drive D: is located on two physical NVMe drives.

The array's status is Healthy.

Jumping ahead, we managed to replace the drives on two such servers without stopping services. However, there's a catch, which will be explained later. The chances of doing this without stopping services are 50/50. This is due to an annoying feature of dynamic arrays that I find puzzling, disturbing, and simply harmful.

Drive D: has 49% free space, but this is only because we were able to temporarily transfer some of the databases to a neighboring server.

The new drives are Intel 6.4 TB U.2 SSDs, model SSDPF2KE064T1.

Preparing to Replace the First Drive

Before ejecting the drive, we need to break our dynamic mirrored array.

Right-click either of the D: volumes (it doesn't matter which) and select Break Mirrored Volume.

You are warned that after this process, the data on the drives will no longer be identical. Click Yes.

Our dynamic mirrored array ceased to be an array and split into two separate dynamic disks. One of them remained as drive D:, while the other one is now E:.

This situation seems very odd to us. Drive letter assignments occur randomly and do not depend on which drive you right-clicked when disassembling the array. We tried several times to disassemble the mirror on the same array: each time the drive designation was random. This is such an inconvenient oversight that using dynamic mirrored arrays in critical environments becomes practically impossible.

In our case, it doesn't matter which drive kept the letter D:. It is currently Disk 1, which the database continues to use, so we won't touch it.

We don't need the E: drive.

We right-click on E: and select "Delete Volume...".

The data on the E: drive will be deleted, but we don't need it. Click Yes.

Disk 0 now shows unallocated space; this physical drive will be the first one we replace.
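
The break-and-delete steps above were done in Disk Management. As a rough sketch only, the same result might be scripted with diskpart. As we read Microsoft's documentation, `break disk=<n>` names the disk whose half does not keep the drive letter, and adding `nokeep` converts that half to free space, which may sidestep the random letter assignment described above. The Python helper below only builds the script text. Its function name is hypothetical, the volume letter and disk number come from this example, and the approach has not been verified on the servers described here.

```python
def break_mirror_script(volume_letter: str, discard_disk: int) -> str:
    """Build a diskpart script that breaks a mirror and frees one half."""
    letter = volume_letter.strip().upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("volume_letter must be a single drive letter")
    if discard_disk < 0:
        raise ValueError("discard_disk must be a non-negative disk number")
    lines = [
        # Put the focus on the mirrored volume.
        f"select volume {letter}",
        # Assumption (per the diskpart docs): the half on the named disk
        # loses the drive letter, and 'nokeep' turns it into free space.
        f"break disk={discard_disk} nokeep",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Disk 0 is the drive we intend to pull first in this article.
    print(break_mirror_script("D", 0), end="")
```

Saved to a file, the output could be run from an elevated prompt with `diskpart /s <file>`. Check the disk number with `list disk` first, since the script discards data on the named disk.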

Here lies another difficulty: we need the serial number of the drive we are about to pull, that is, Disk 0. Unfortunately, we couldn't find the serial number with the OS's standard tools and had to resort to third-party software.

The required drive is labeled as Disk 0, and we note down the serial number A07B8F5A.
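
We used a third-party tool for this. On systems where the PowerShell Storage module does report serials, a command such as `Get-PhysicalDisk | Select-Object DeviceId, FriendlyName, SerialNumber | ConvertTo-Json` may be enough. The sketch below parses that JSON in Python. It assumes that DeviceId lines up with the disk number in Disk Management, which is usual but not guaranteed. The sample input is invented apart from the serial noted above, and with some NVMe drivers the reported serial may be empty or formatted differently from the label on the drive.

```python
import json


def serials_by_device_id(json_text: str) -> dict:
    """Map each DeviceId to its reported serial number ('' if none)."""
    data = json.loads(json_text)
    # ConvertTo-Json emits a bare object, not a list, when there is one disk.
    if isinstance(data, dict):
        data = [data]
    return {
        str(disk["DeviceId"]): (disk.get("SerialNumber") or "").strip()
        for disk in data
    }


# Illustrative input only: the names and the second serial are made up.
SAMPLE = """
[
  {"DeviceId": "0", "FriendlyName": "NVMe drive A", "SerialNumber": "A07B8F5A"},
  {"DeviceId": "1", "FriendlyName": "NVMe drive B", "SerialNumber": " EXAMPLE1 "}
]
"""

if __name__ == "__main__":
    for device_id, serial in serials_by_device_id(SAMPLE).items():
        print(f"Disk {device_id}: {serial or '(no serial reported)'}")
```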


Windows Server 2016 supports hot-swapping of drives. In the system tray, we click the appropriate icon and eject the drive. Be careful not to confuse it with the drive that holds D:; the model is the same, but the label differs. Click Eject.

We are notified that the device can be safely removed.

And here we go again with another bug. The drive remains in Online mode, which isn't good. It's unclear why the drive isn't disconnecting, so manual intervention is required.

Right-click on Disk 0 and switch it to Offline.

Disk 0 is now in Offline status. The drive can now be safely removed.

Replacing the First Drive

Standing in the data center with the new drive in hand, the question is: which drive do we pull out?

Both drives look identical. They blink differently, of course, because their loads differ, but we don't want to make a mistake. If we could shut down the server, we would simply pull the drives and identify the right one by its serial number. However, shutting down the server is not an option. We need to illuminate the drive.

Almost all servers and storage arrays have a mechanism that allows you to "illuminate a drive". Some have this feature implemented in the web interface for server management (IMM, iLO, IPMI, and other BMCs). Some provide the option to activate the light via a CLI command.

In the latest Supermicro servers, the IPMI management web interface has a dedicated section for managing NVMe drives under Server Health → Storage Monitoring. Under the Physical View tab, you can see a list of available drives and their details: model, manufacturer, serial number, temperature, etc. Additionally, there's an option to perform various operations on these drives. We simply locate the desired drive by its serial number and highlight it.

In our case, the server model caused a problem: at first, the drive wouldn't illuminate.

We locate the necessary drive by its serial number.

Then we highlight it with Blink.

The drive started flashing; now we know its location.

In the drop-down list of Available Actions, select Eject, check the desired disk, and click Apply. However, there's a catch: the button is inactive and cannot be clicked. It's the same issue as with the drive light, and it has to be dealt with first.

Eject. Yes.

The drive indicator will turn green, and it's safe to remove the drive.

Drive Indicators:

blue solid on — drive is in place
blue blinking — I/O activity

red solid on — failure
red blinking at 1 Hz — rebuilding
red blinking pattern 2+1 at 1 Hz — hot spare
red blinking every 5 seconds — drive power on
red blinking at 4 Hz — identification

green solid on — safe to remove

yellow blinking at 1 Hz — warning, do not remove

Remove the drive and verify that we haven't made a mistake: the drive's serial number is the one we need, and the system hasn't crashed.

Disk 0 has disappeared from the system.

Wait 5 minutes, move the drive caddy onto the new drive, and insert it.

Ensure that the drive appears in the IPMI web interface. If the drive isn't there and the inserted drive's green light continues to shine, remove the drive and re-insert it a couple of minutes later. I've experienced this on one of the servers.

In the Disk Management snap-in, a new Disk 0 appears with a larger capacity.

Right-click on the drive and initialize it.

Since the drive is larger than 2 TB, we choose GPT. OK.
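
The 2 TB figure comes from MBR's 32-bit sector addressing. A quick check in Python, assuming the common 512-byte logical sector size (a drive formatted with 4096-byte logical sectors would have a higher limit):

```python
SECTOR_BYTES = 512           # assumption: 512-byte logical sectors
MBR_MAX_SECTORS = 2 ** 32    # MBR stores sector counts in 32-bit fields


def mbr_limit_bytes(sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest disk size that MBR can address, in bytes."""
    return sector_bytes * MBR_MAX_SECTORS


def needs_gpt(drive_bytes: int, sector_bytes: int = SECTOR_BYTES) -> bool:
    """True if the drive is too large to be fully addressed with MBR."""
    return drive_bytes > mbr_limit_bytes(sector_bytes)


if __name__ == "__main__":
    new_drive = 6_400_000_000_000   # 6.4 TB, nominal size of the new drives
    print(mbr_limit_bytes())        # 2199023255552 bytes, i.e. 2 TiB
    print(needs_gpt(new_drive))     # True
```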

The drive is initialized.

Now, we need to recreate a mirrored dynamic array. Right-click on drive D:, select Add Mirror.

Choose Disk 0. Add Mirror.

This operation will convert Disk 0 to dynamic. Yes.

A RAID 1 mirrored array is being created, but the data isn't synced yet. Synchronization starts, and we see its progress as a percentage. The process takes quite a long time. Disk 0 is marked with an exclamation mark because its data doesn't yet match the primary drive.

After the synchronization is complete, we have a software RAID 1 array with two drives.
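
For reference, the initialize-and-mirror steps above might look like this as a diskpart script. This is a sketch, not what we ran: it assumes the new disk is online and empty, and that diskpart's `add disk=<n>` needs the target disk to be dynamic already (Disk Management performs that conversion itself, as shown above). The function name is hypothetical.

```python
def add_mirror_script(volume_letter: str, new_disk: int) -> str:
    """Build a diskpart script that prepares a new disk and mirrors onto it."""
    letter = volume_letter.strip().upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("volume_letter must be a single drive letter")
    if new_disk < 0:
        raise ValueError("new_disk must be a non-negative disk number")
    lines = [
        f"select disk {new_disk}",
        "convert gpt",              # GPT layout for the empty disk
        "convert dynamic",          # mirrored volumes live on dynamic disks
        f"select volume {letter}",  # the existing simple volume
        f"add disk={new_disk}",     # mirror the selected volume onto the new disk
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(add_mirror_script("D", 0), end="")
```

Once such a script has run, the resynchronization proceeds in the background just as it does when started from the GUI.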

The first drive has been replaced; half the job is done.

Preparing to Replace the Second Drive

And once again, we need to break our dynamic array.

Right-click on any of the D: drives; it doesn't matter which one. Choose Break Mirrored Volume.

We're warned that after this process, the data on the drives will no longer be identical. Yes.

Our mirrored dynamic array is no longer an array and has split into two separate dynamic drives; one remains as the D: drive, and the other is now E:. Unfortunately, the D: drive is the one we intended to remove.

Replacing the Second Drive

Now we're back in the data center with the second drive.

In the IPMI web interface, eject the drive with Eject. We no longer need the serial number, since the old and new drives can't be mixed up.

Remove the drive from the server. The drive disappears from the system, and all services continue to run smoothly. Move the caddy onto the new drive and insert it into the slot.

The drive shows up in the system.

Initialize it.

GPT. OK.

Both physical drives have been replaced. From here on, it's straightforward:

  1. Extend the volume to cover the whole of Disk 0.
  2. Re-create the mirrored array by adding Disk 1.
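
The order of these two steps matters: as far as we know, Windows does not extend a mirrored dynamic volume, so the simple volume is grown first and the mirror is added afterwards. Below is a hedged sketch of the two steps as a diskpart script, built in Python. The disk numbers follow this example, the volume letter D is an assumption (use whichever letter the surviving volume carries), Disk 1 is assumed to be initialized as GPT already, and the function name is hypothetical.

```python
def extend_then_mirror_script(volume_letter: str, data_disk: int,
                              mirror_disk: int) -> str:
    """Build a diskpart script: extend a simple volume, then mirror it."""
    letter = volume_letter.strip().upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("volume_letter must be a single drive letter")
    if data_disk < 0 or mirror_disk < 0 or data_disk == mirror_disk:
        raise ValueError("disk numbers must be distinct and non-negative")
    lines = [
        f"select volume {letter}",
        # No size is given, so diskpart uses the free space on that disk.
        f"extend disk={data_disk}",
        f"select disk {mirror_disk}",
        "convert dynamic",            # the mirror target must be dynamic
        f"select volume {letter}",
        f"add disk={mirror_disk}",    # start mirroring onto the second disk
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(extend_then_mirror_script("D", 0, 1), end="")
```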

In this example, we demonstrated how to hot-swap two NVMe drives on a server and expand the array without downtime.
