ZFS - Remove a Disk from Pool
I currently have a ZFS pool with 16 disks. The pool is temporary and holds no data at the moment. An ongoing migration requires an extra disk and I do not have a spare with me right now.
Let's remove a disk from the current pool and run it degraded for the time being. I can bring a spare with me tomorrow and replace the removed disk.
1) Pool and Disk information
Let's start by listing our pools.
zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
rpool 278G 21.9G 256G - - 8% 7% 1.00x ONLINE -
temp1-pool 8.72T 3.12G 8.72T - - 0% 0% 1.00x ONLINE -
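For a quick per-vdev and per-disk capacity breakdown, zpool list also accepts a -v flag (not strictly needed here, the status check below covers it):
zpool list -v temp1-pool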
We also need to know which disks are part of the pool.
zpool status -P temp1-pool
pool: temp1-pool
state: ONLINE
scan: scrub repaired 0B in 0 days 00:00:01 with 0 errors on Sun Jul 10 00:24:03 2022
config:
NAME STATE READ WRITE CKSUM
temp1-pool ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
/dev/sdh1 ONLINE 0 0 0
/dev/sdi1 ONLINE 0 0 0
/dev/sdj1 ONLINE 0 0 0
/dev/sdk1 ONLINE 0 0 0
/dev/sdc1 ONLINE 0 0 0
/dev/sdb1 ONLINE 0 0 0
/dev/sdt1 ONLINE 0 0 0
/dev/sds1 ONLINE 0 0 0
raidz2-1 ONLINE 0 0 0
/dev/sdu1 ONLINE 0 0 0
/dev/sdr1 ONLINE 0 0 0
/dev/sdq1 ONLINE 0 0 0
/dev/sdv1 ONLINE 0 0 0
/dev/sdp1 ONLINE 0 0 0
/dev/sdw1 ONLINE 0 0 0
/dev/sdo1 ONLINE 0 0 0
/dev/sdn1 ONLINE 0 0 0
errors: No known data errors
I guess that /dev/sdn is the disk in slot 24 of my server. The physical drive information reported by the RAID controller for that slot is:
{
"slot-number": 23,
"enclosure-id": "17",
"enc-position": "1",
"device-id": "37",
"wwn": "5000C5007228FFB0",
"media-error-count": "0",
"other-error-count": "2",
"predict-fail-count": "0",
"pd-type": "SAS",
"raw-size": 600126116593.664,
"sector-size": "512",
"logical-size": "512",
"firmware-state": "JBOD",
"serial": "SEAGATEST600MM00060004S0M2BLYE",
"device-speed": "6.0Gb/s",
"link-speed": "6.0Gb/s",
"drive-temp": "29C",
"os-path": "/dev/sdn"
}
Perfect. The disk to be removed is /dev/sdn, currently in slot 24. Well, actually slot 23, because the controller counts slots from 0.
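Before pulling anything, the mapping between the controller's slot and the OS device can be cross-checked by comparing the WWN and serial the kernel reports with the values above. On a typical Linux box with lsblk and the usual /dev/disk/by-id symlinks, something like this should do:
lsblk -o NAME,SERIAL,WWN /dev/sdn
ls -l /dev/disk/by-id/ | grep -i 5000c5007228ffb0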
2) Removing the Disk from the Pool
Since we now know which disk to remove, let's begin. We will use the zpool offline command.
zpool offline - takes the specified physical device offline. While the device is offline, no attempt is made to read or write to the device.
zpool offline temp1-pool /dev/sdn
There won't be any output when the command runs. However, we can check that the disk is offline by inspecting the pool.
zpool status -P temp1-pool
pool: temp1-pool
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0B in 0 days 00:00:01 with 0 errors on Sun Jul 10 00:24:03 2022
config:
NAME STATE READ WRITE CKSUM
temp1-pool DEGRADED 0 0 0
raidz2-0 ONLINE 0 0 0
/dev/sdh1 ONLINE 0 0 0
/dev/sdi1 ONLINE 0 0 0
/dev/sdj1 ONLINE 0 0 0
/dev/sdk1 ONLINE 0 0 0
/dev/sdc1 ONLINE 0 0 0
/dev/sdb1 ONLINE 0 0 0
/dev/sdt1 ONLINE 0 0 0
/dev/sds1 ONLINE 0 0 0
raidz2-1 DEGRADED 0 0 0
/dev/sdu1 ONLINE 0 0 0
/dev/sdr1 ONLINE 0 0 0
/dev/sdq1 ONLINE 0 0 0
/dev/sdv1 ONLINE 0 0 0
/dev/sdp1 ONLINE 0 0 0
/dev/sdw1 ONLINE 0 0 0
/dev/sdo1 ONLINE 0 0 0
/dev/sdn1 OFFLINE 0 0 0
errors: No known data errors
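As a side note, once the spare arrives tomorrow, the offlined disk can be swapped out with zpool replace, or the original disk can simply be brought back with zpool online. A rough sketch, where /dev/sdx stands for whatever name the new disk gets when it is inserted:
zpool replace temp1-pool /dev/sdn /dev/sdx
Or, if the original disk goes back in:
zpool online temp1-pool /dev/sdn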
3) Marking the Disk for Removal
My disks are configured as JBOD and connected through a MegaRAID card. Because the disk is not part of a hardware RAID array, we do not need to mark it as offline or missing, or otherwise prepare it for removal on the controller side.
Let's make its LED blink and go to the server to physically remove it.
megacli -PdLocate -start -physdrv[17:23] -a0
Adapter: 0: Device at EnclId-17 SlotId-23 -- PD Locate Start Command was successfully sent to Firmware
Exit Code: 0x00
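Once the disk has been physically pulled, the locate LED can be switched off again with the -stop form of the same command:
megacli -PdLocate -stop -physdrv[17:23] -a0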
Conclusion
With the above steps we successfully removed a disk from a ZFS pool.