August 2, 2016
August 1, 2016 – 7:28 am
Sad day for me. After nearly 30 years of technology, saving files, programs, pictures, and so much more, I ended up losing everything in one large data loss incident. Over the years, my storage needs have grown and grown. I have moved from servers with physical disks shared via NFS/CIFS, to multi-terabyte NAS drives, and beyond. I always had another drive for backups. Early this year, I outgrew my 3TB multi-drive USB/eSATA NAS and needed more. That’s 3TB of storage plus 3TB of backups, so the need was pretty big. I found a nice 12TB server and figured I could do both on there: build it as FreeNAS, use it for all my VMs and my other storage needs, and just carve out part of it for my backups. Partly because I was doing things too quickly, and partly because I was being cheap, I went for the single ‘device’ approach.
It was a six-drive Dell 2950 server with RAID-5, providing 10TB usable, which was perfect for my needs. I built it out, and all was good for about 4-5 months, which was long enough for me to be confident in the approach. At that point, I redeployed both the old NAS and the old server for some container project work, and I ended up having to rebuild the workstation I had used for all the data transfer during the storage migration.
I guess you can see where this is going. I had a drive failure on the Dell, so I picked up a new 2TB drive, and it rebuilt OK. Phew. About a month later, that same drive slot was reporting an issue again, so I got another 2TB drive and swapped it out. Note that the first time, I brought the server down and performed the RAID rebuild completely offline through the server BIOS, and it took about 2 days. The second time, I went with what everyone else was saying, since I was more nervous. Everyone says *never* bring a server down with a bad drive like I did previously, so I ‘listened’. I swapped the drive with the server online. About half a day into the rebuild, another drive reported failure … and I was effed. That was it for my storage. It was hardware RAID-5, with FreeNAS ZFS on top of that, carved into multiple pools. Everything became confused, and I can no longer access the data. I’ve tried so many ZFS rebuild tricks, but I just can’t recover enough with only 4 of 6 drives.
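The numbers above can be sanity-checked with a small sketch. It assumes six equal 2TB drives (the 2TB figure comes from the replacement drives; that the other five match is an assumption), and the function names are made up for illustration. RAID-5 spends one drive’s worth of space on parity, so it can ride out exactly one missing drive and no more.

```python
def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID-5 set of equal drives: one drive's worth goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID-5 needs at least three drives")
    return (drive_count - 1) * drive_size_tb


def raid5_readable(drive_count: int, drives_available: int) -> bool:
    """RAID-5 data stays readable only while at most one drive is missing."""
    return drives_available >= drive_count - 1


if __name__ == "__main__":
    # Assumed layout: six drives of 2TB each (hypothetical values for illustration).
    print("raw TB:", 6 * 2.0)
    print("usable TB:", raid5_usable_tb(6, 2.0))
    print("readable with 5 of 6 drives:", raid5_readable(6, 5))
    print("readable with 4 of 6 drives:", raid5_readable(6, 4))
```

Under those assumptions the raw space is 12TB and the usable space is 10TB, and an array left with 4 of 6 drives is past what single parity can reconstruct.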
Moral of the story is that after 30 years, I should have been smarter and not put backups on the same device as the content … and I know better. I’m also not rebuilding this stuff in my own datacenter again. This time it’s all cloud. I lost my music, my pictures, my documents, all that stuff. But I also lost all my VMs, which is the real pain point. Of course I lost my backups too, but at this point it’s the same difference. So this time it’s VMs in AWS, micro-instances where possible, and I need to find a better solution for the personal storage (music, movies, pictures, etc.).