Hello forum,
Please, help to diagnose the reboot issue during disk replace.
Symptom:
System rebooted during disk replace operation. After reboot it happens that replace passed successfully. Pool is in healthy state and is resilvering now. No errors reported.
System:
x510/2GB RAM/E7400 CPU with 4x1TB disks configured as RaidZ1 (REDPOOL - ada4/ada5/ada6/ada7)
IcyDock JBOD connected to x510 on esata with 4x1TB disks configured as RaidZ1 (ICEPOOL - ada0/ada1/ada2/ada3)
FreeNAS 11.0 U3 + emby jail powered off + plex jail running
Periodic snapshot scheduled to run 09:00-18:00 everyday. Replication from x510 to IcyDock on all datasets with periodic snapshot.
What caused the reboot:
1) In ICEPOOL I've set disk ada3 to OFFLINE
2) In REDPOOL I've set disk ada6 to OFFLINE
3) removed disks ada3 and ada6
4) installed disk ada3 in ada6 bay
5) installed disk ada6 in ada3 bay
6) initiated REPLACE to member disk ada6 in REDPOOL with force option
7) waited for REPLACE to finish. WEB gui was not responsive. Samba was not responsive. PING failed.
It happens that system restarted. Because later all services loaded normally
9) After restarts initiated REPLACE to member disk ada3 for pool ICEPOOL. Passed successfully without restart and resilvering now.
zpool status on both pools reports that drives are ONLINE and resilvering.
No reboot/halt messages. Cannot find out at what point system actually rebooted. Reallocation event should not cause system restart.
Last records from messages log before restart:
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): CAM status: Uncorrectable parity/CRC error
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): Error 5, Retries exhausted
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): CAM status: Uncorrectable parity/CRC error
Sep 28 09:12:24 VAULT (aprobe3:siisch0:0:3:0): Error 5, Retries exhausted
Sep 28 09:13:48 VAULT smartd[73500]: Device: /dev/ada6, Failed SMART usage Attribute: 5 Reallocated_Sector_Ct.