Last week PRTG saved a lot of my data. It alerted me to a failing hard drive in my NAS server. While I was replacing the drive I found that the backup from the NAS to my Google Drive had not been working for months. That could have ended disastrously without PRTG…
A few days ago I received an email alert from the PRTG system that monitors our house. PRTG had discovered that one of the hard drives in my QNAP NAS (running as RAID 6) showed a “bad” status: “S.M.A.R.T Status: BAD”.

After the drive had run fine for a little over 700 days, a bad block was found on it during the monthly disk scan. Thanks to the redundancy of RAID 6 there was no immediate threat of data loss (a second drive would need to fail). Still, it was not a good feeling.

I immediately ordered a new WD Red 3 TB disk from Amazon.

Two days later I was ready to replace the drive. Before disconnecting the drive in the QNAP user interface I had to make sure that I had a proper backup of my data. I checked whether the syncing of my data to my Google Drive was still working, and found that it had failed months ago and had not recovered since.

What? (note to self: need to find out how I can monitor this syncing process of QNAP’s Cloud Drive Sync app!)

I restarted the syncing process and proceeded with the drive replacement afterwards.

About 10 hours later the RAID 6 system with 3.2 TB of data was successfully rebuilt. The QNAP system log, newest entry first:
2019/07/07 01:56:15 [Storage & Snapshots] Finished rebuilding RAID group “1”. Storage pool: 1.
2019/07/06 16:16:37 [Storage & Snapshots] Started rebuilding RAID group “1”. Storage pool: 1
2019/07/06 16:15:57 [Hardware Status] “Host: Disk 3”: Connected.
2019/07/06 16:13:54 [Hardware Status] “Host: Disk 3”: Disconnected.
2019/07/06 16:11:59 [Storage & Snapshots] Host: Disk 3 detached successfully.
2019/07/06 16:11:59 [Storage & Snapshots] RAID group “1” is degraded. Storage pool: 1.
2019/07/06 16:11:55 [Storage & Snapshots] Finished hot-removing disk “Host: Disk 3”.
2019/07/06 16:11:40 [Storage & Snapshots] Starting detach Host: Disk 3.
2019/07/01 11:37:53 [Storage & Snapshots] Finished scrubbing RAID group “1”. Storage pool: 1, Blocks repaired: 0.
2019/07/01 02:24:12 [Hardware Status] “Host: Disk 3”: Medium error. Run a bad block scan on the drive. Replace the drive if the error persists.
2019/07/01 02:15:04 [Storage & Snapshots] Started scrubbing RAID group “1”. Storage pool: 1, Priority: Default.
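The "about 10 hours" figure can be checked against the two rebuild entries in the log. The following is only a minimal Python sketch: the timestamp format is read off the log lines above, and the names `LOG_FORMAT` and `rebuild_duration` are illustrative, not part of any QNAP or PRTG tool.

```python
from datetime import datetime, timedelta

# Timestamp layout as it appears in the log excerpt (assumed from the lines above).
LOG_FORMAT = "%Y/%m/%d %H:%M:%S"


def rebuild_duration(started: str, finished: str) -> timedelta:
    """Return the time between two log timestamps."""
    return datetime.strptime(finished, LOG_FORMAT) - datetime.strptime(started, LOG_FORMAT)


# "Started rebuilding" and "Finished rebuilding" timestamps from the log.
duration = rebuild_duration("2019/07/06 16:16:37", "2019/07/07 01:56:15")
print(duration)  # 9:39:38
```

So the rebuild took a little under 9 hours and 40 minutes.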
Lessons learnt:
- It’s a good idea to monitor your NAS, even with redundancy (RAID 5, RAID 6, etc.)!
- I need to find a way to monitor the Google Drive syncing process.
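One possible approach to the second point, sketched here under loud assumptions: suppose some file is known to be refreshed whenever a sync succeeds (a hypothetical "heartbeat" file), and that a script can read its modification time. A check like the one below could then flag a sync that has silently stopped. The function name `is_stale`, the heartbeat file, and the 24-hour threshold in the usage note are all illustrative; nothing here is taken from QNAP's Cloud Drive Sync app or from PRTG.

```python
import os
import time


def is_stale(heartbeat_path, max_age_hours, now=None):
    """Return True if the heartbeat file is missing or older than max_age_hours.

    heartbeat_path is a hypothetical file assumed to be touched on every
    successful sync; how it gets refreshed is outside this sketch.
    """
    if now is None:
        now = time.time()
    try:
        last_update = os.path.getmtime(heartbeat_path)
    except OSError:
        # A missing or unreadable file is treated as a failed sync.
        return True
    return (now - last_update) > max_age_hours * 3600
```

A monitoring system could run such a check on a schedule and raise an alert whenever `is_stale(path, 24)` returns `True`, so that a sync failure shows up after a day instead of after months.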
I have a NAS from QNAP’s competitor, Synology. When a backup process fails, it sends me an email. (Now it sends an email every day, regardless of the result. So if I do not get a backup email, maybe the NAS is turned off, but I will notice that quicker 😉)
Besides that, I also monitor the NAS with the free version of PRTG 👍.