|
taqueso posted:Yeah, it's obviously nice to have but millions of people get away without somehow Millions of people aren't running ZFS. However, I don't think it's worth selling usable hardware to buy "server-grade" stuff unless you NEED IPMI, Xeon support, dual Xeons, etc. Or you want homelab cred, FWIW. Both my FreeNAS systems are running ECC, but that's because the motherboards require it. UDIMMs, too, which aren't cheap like registered ECC. I'd just run what you've got; at worst, you have to build a new box and transfer the data over. If you're using ZFS, that's stupid-simple. If not, then grab a couple of backup disks, back it all up, and copy it over. Also, RAID (including RAIDZx) isn't backup.
|
# ? Apr 15, 2020 20:56 |
|
|
|
D. Ebdrup posted:Say what?! https://blocksandfiles.com/2020/04/15/shingled-drives-have-non-shingled-zones-for-caching-writes/ Yep. They use a 30-100GB non-shingled "CMR area" as a write buffer and then, behind the scenes, rewrite the shingles. But if you write too much at once they hang until they're ready to accept more, which can be multiple minutes, and the controller naturally drops the drive. Most home users don't load the whole thing full at once, so you may not run into this at first, but when you hit it with a rebuild it's quite likely to drop. Oh, and to make this all worse, what you think is a read/write operation that the HDD can just scan right through without seeks may not be that at all. Like an SSD, it's got its own "page tables" internally, and the logical mappings don't follow the physical mappings. Kind of an obvious approach when you think about the CMR area: it obviously needs some mapping to know what's "in buffer" and what's been flushed out, but it's not really the behavior you expect from a HDD. Additionally, WD has a bug (or a differing interpretation of the "correct" behavior) where, when ZFS tries to read an area that the drive knows hasn't been written (because it's not in the page tables), the drive will throw a hardware error, and that probably also leads to the drive being dropped. Paul MaudDib fucked around with this message at 22:31 on Apr 15, 2020 |
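A back-of-envelope sketch of why the stall happens. All numbers here are illustrative assumptions, not from any datasheet: if the CMR staging area is around 50GB, host writes arrive at 150MB/s, and the drive destages to shingles at 30MB/s, the buffer fills in minutes:

```shell
# Illustrative math only: buffer size and rates are assumptions, not specs.
msg=$(awk 'BEGIN {
  buf_gb  = 50     # assumed CMR staging area, GB
  in_mbs  = 150    # assumed incoming write rate, MB/s
  out_mbs = 30     # assumed shingle-rewrite (destage) rate, MB/s
  net  = in_mbs - out_mbs              # buffer grows at this net rate
  secs = buf_gb * 1024 / net
  printf "buffer full after ~%.0f minutes of sustained writes", secs / 60
}')
echo "$msg"
```

After that point the drive can only accept writes at the destage rate, and if it blocks for minutes while catching up, the controller times it out and drops it.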
# ? Apr 15, 2020 22:18 |
|
Welp. Glad I decided to populate my NAS with 8TB drives. Seriously, though, that's bad juju all the way around. I could see (and sorta agree with) using SMR on a Green / Blue drive where you can reasonably assume that the user won't need to push more than 100GB at once, and doing so lets you get more storage at a given price point. But putting it in Reds that they're actively advertising for NAS usage is straight-up irresponsible.
|
# ? Apr 15, 2020 22:44 |
sharkytm posted:Millions of people aren't running ZFS. Paul MaudDib posted:https://blocksandfiles.com/2020/04/15/shingled-drives-have-non-shingled-zones-for-caching-writes/ Are there even vendors that sell non-poo poo 6TB drives?
|
|
# ? Apr 15, 2020 23:35 |
|
D. Ebdrup posted:Welp, not buying anymore WD or HGST drives, then. I can't afford 10TB or bigger. Your options are Toshiba and hoping that HGST is still operating independently enough to not be doing shady poo poo. Seagate has a mix of CMR and SMR drives in the 4-8TB range. I think their IronWolf drives are CMR, but they also tend to have lower reliability than WD Reds. WD's 8TB drives are (at least according to that article) proper CMR drives, so that's a little easier on the wallet. And considering what Toshiba drives cost, you may be able to get a WD Red 8TB for about the same price as a Toshiba 6TB.
|
# ? Apr 15, 2020 23:40 |
|
Paul MaudDib posted:[interesting SMR info] Seagate’s SMR drives even use a hierarchy of DRAM (few megabytes), NAND (few gigabytes), CMR (few hundred gigabytes) and SMR (few terabytes), and the drive firmware quietly shuffles data blocks around as it thinks is needed. I posted about this a few months ago. WD might do the same. It was quite fascinating, because the drive is relatively smart about keeping hot files in the cache, so editing photos and documents felt like working on an SSD (because everything was kept in the NAND), but multi-hundred-gigabyte transfers IIRC slowed to 20 megabytes per second once all buffers were full and the firmware was forced to write directly to SMR. This behavior kind of works because very few consumers do writes in excess of the CMR section (let’s say 200GB) in one sitting; most workloads would give the drive some idle time to flush to the SMR section. Reading isn’t as big of a problem; if I remember correctly SMR even has an advantage over CMR when it comes to sequential read throughput, but don’t quote me on that. ZFS RAID is a great example of a workload that would completely grind this whole contraption to a halt, because the drive's firmware doesn’t know what the filesystem is doing (I’m assuming it just looks at blocks and shuffles those around, so a scrub or resilver would leave it thoroughly confused) and the filesystem has no idea what it is dealing with. I think the concept itself has potential for many use cases, but I see a need for a standard that makes the OS/filesystem fully aware of the underlying hardware architecture.
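To put rough numbers on that behavior (the 200GB CMR figure and both throughputs are assumptions for illustration, not measured specs), once a big transfer blows past the CMR section the average crawls:

```shell
# Illustrative only: sizes and speeds are assumptions, not drive specs.
msg=$(awk 'BEGIN {
  total_gb = 500   # size of the big transfer
  cmr_gb   = 200   # assumed CMR staging capacity
  fast     = 180   # MB/s while writes land in the CMR area
  slow     = 20    # MB/s once forced to write SMR directly
  secs = cmr_gb * 1024 / fast + (total_gb - cmr_gb) * 1024 / slow
  printf "~%.1f hours to write %d GB", secs / 3600, total_gb
}')
echo "$msg"
```

The first chunk flies and then the remaining hundreds of gigabytes dominate the total time, which matches the "felt like an SSD until the buffers filled" experience.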
|
# ? Apr 16, 2020 00:14 |
|
My Trump check got deposited into my account today. Who else plans on buying a couple more 10/12/14TB EasyStores to shuck with it?
|
# ? Apr 16, 2020 00:39 |
|
I can grab 32GB of Samsung DDR4 PC-2400 RDIMMs off eBay for $100, easy. That's substantially cheaper than DIMMs off Newegg, which are of varying quality.
|
# ? Apr 16, 2020 01:36 |
|
D. Ebdrup posted:Millions of people probably are having their data stored on ZFS, without knowing it. And I'll bet those storage units are using ECC... But I digress.
|
# ? Apr 16, 2020 04:18 |
|
I have two zpools, Main-Volume and datastore. Main-Volume is 6x 4TB Reds in Z0; datastore is 10x 8TB Reds in Z0. All of Main-Volume is on an LSI controller built into the motherboard, and 8 of datastore's drives are in an external enclosure connected via eSATA to another LSI controller, while the two remaining drives are on the same LSI as Main-Volume. Both controllers are flashed to IT mode. I turned off all VMs and LXCs, shut down Prometheus collections, and ran some fio runs. code:
code:
Note: With this setup it's probably not a controller issue, as 20% of datastore's drives are on the same controller. code:
code:
Given how much free space datastore has, I'm now doing a syncoid transfer to move everything over, and if I can figure out what's going on, I'm prepared to rebuild the pool completely. The questions are: A) How do I get more data on what's going on? B) What is going on? C) How do I fix it?
|
# ? Apr 16, 2020 15:27 |
|
Can you do a pastebin or gist of `zpool get all` for both, and maybe a `zdb -C` of both as well? Offhand, maybe the ashift is really wrong for Main-Volume.
|
# ? Apr 16, 2020 15:38 |
|
We just had a discussion on the WD Reds above using SMR - I suspect your main volume is suffering from SMR issues. Your request latency being so insanely high is enough evidence of that pattern. I have a bunch of Toshiba drives to offset this problem in my 4TB drive based array but because I have several WD Reds in there I'm now looking to migrate the whole array completely to perhaps 12TB drives now. Just trying to figure out how many total drives I should try keeping online because it's already at 16 drives in a craptastic garbo setup I keep putting off properly housing...
|
# ? Apr 16, 2020 15:39 |
|
I was gonna say SMR too but the read speeds in SMR arrays or ones with some SMR drives are not nearly that bad.
|
# ? Apr 16, 2020 15:50 |
|
necrobobsledder posted:We just had a discussion on the WD Reds above using SMR - I suspect your main volume is suffering from SMR issues. Your request latency being so insanely high is enough evidence of that pattern. I have a bunch of Toshiba drives to offset this problem in my 4TB drive based array but because I have several WD Reds in there I'm now looking to migrate the whole array completely to perhaps 12TB drives now. Just trying to figure out how many total drives I should try keeping online because it's already at 16 drives in a craptastic garbo setup I keep putting off properly housing... The Red is from 2014. And they're EFRX which is not the line that has SMR. Less Fat Luke posted:Can you do a pastebin or gist of `zpool get all` for both and maybe a zdb -C as well of both? Offhand maybe the ashift is really wrong for the Main-Volume. 100% possible, I was looking into that yesterday. http://sprunge.us/KIhcOv --zpool get all and zdb (not -C). ashift is 12 for Main-Volume, zdb alone didn't dump datastore though.
|
# ? Apr 16, 2020 15:57 |
|
Fragmentation shouldn't be it - I just checked my main pool and it's at 14% and the performance is way better than that. Are any of the drives reporting errors in SMART or syslog? Possible that a drive is dying in a way that's just making it very, very slow to respond, but not actually chuck data errors. not-edit: I'm wondering if something is weird with how fio is testing this, because I just ran that same test on my datastore and it's claiming read BW of 1MB/sec, which seems physically impossible given the workloads this array supports. I didn't stop everything else but the server isn't doing that much at the moment: code:
code:
|
# ? Apr 16, 2020 16:16 |
|
IOwnCalculus posted:Fragmentation shouldn't be it - I just checked my main pool and it's at 14% and the performance is way better than that. I don't know enough about fio to know what a good command line is. This was just one I found online, and I got these two very different results. SMART status was checked; all drives are marked as PASSED, and none of the counters looked bad or in Prefail land.
|
# ? Apr 16, 2020 16:18 |
|
Yeah, my guess is that one drive is bad; RAIDZ1 is going to read everything simultaneously, so if something is making GBS threads the bed it'll stall everything. I'd run a copy or cat from the drive and maybe watch `iostat -x` and see if one drive's util% column is maxed out compared to the others. Alternatively, export the pool and cat the drives' raw devices one by one to /dev/null and watch the speed metrics.
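A sketch of the `iostat -x` approach (the heredoc stands in for live output; in practice run `iostat -x 5` during the slow workload and watch the last column):

```shell
# Find the drive whose %util is pegged. Sample data is made up for illustration.
iostat_sample() {
cat <<'EOF'
Device   r/s    rkB/s   %util
sda      120    61440    35.0
sdb      118    60416    34.2
sdc        4     2048    99.8
EOF
}
worst=$(iostat_sample | awk 'NR > 1 && $NF + 0 > max { max = $NF + 0; dev = $1 }
                             END { print dev, max }')
echo "busiest device: $worst"
```

One drive pegged at ~100% util with tiny throughput (like sdc here) is the classic signature of a disk dying slowly without throwing errors.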
|
# ? Apr 16, 2020 16:26 |
Hughlander posted:The Red is from 2014. And they're EFRX which is not the line that has SMR.
|
|
# ? Apr 16, 2020 16:41 |
|
Less Fat Luke posted:Yeah my guess is that one drive is bad, RAIDZ1 is going to read everything simultaneously so if something is making GBS threads the bed it'll stall everything. I'd run a copy or cat from the drive and maybe watch `iostat -x` and see if one drives util% column is maxed out compared to the others. Alternatively export the pool and cat the drive raw devices one by one to dev null and watch the speed metrics. code:
EDITED: Actual iostat -x code:
|
# ? Apr 16, 2020 16:53 |
|
Yeah, check the iostat during the fio run. That iostat is showing about 500MB/s being read, accounting for parity, so if that's Main-Volume then you're getting good speeds. Edit: Also, 4GB is a very small test for fio; it should be larger than your ARC or RAM in my opinion, but maybe it's smart enough with that direct flag to bypass even the ARC? Either way I'd do it with a much larger file. Additionally, random reads in ZFS with something like fio *should* be terrible; you're using spinning disks in RaidZ1. If anything, both of those tests are anomalous, because that 300+MB/s in random reads can't possibly be coming from real spinners. Less Fat Luke fucked around with this message at 17:02 on Apr 16, 2020 |
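A job file along those lines might look like this (everything here is an illustrative assumption, not from the thread; point `directory` at a path on the pool and scale `size` past your ARC/RAM):

```ini
; Hypothetical fio job: sequential read with O_DIRECT and a working set
; bigger than the ARC so cache can't flatter the numbers.
[seqread]
directory=/Main-Volume/fio-test
rw=read
bs=1M
size=64g
direct=1
ioengine=libaio
runtime=300
time_based
```

Run it with `fio seqread.fio`, then repeat with `rw=randread` to get a comparable random-read figure.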
# ? Apr 16, 2020 16:58 |
|
Less Fat Luke posted:Yeah check the iostat during the fio run. That iostat is showing about 500MB/s being read accounting for parity so if that's Main-Volume then you're getting good speeds. Thanks, I never used fio, so I'd appreciate any pointers. To avoid the X/Y problem I dug myself into... I started looking at a real-world issue I was having. Scanning files for a backup would take 40-60 minutes for 1 million files. All it was doing was grabbing the mtime for the files. Doing a similar scan of a similar size on the other zpool would take 2 minutes - 2:30. From there I looked to get reproducible measurements to show that yes, one pool is slower than the other. I'm taking a different tack now: I'm doing a zfs send/receive, moving the same 1M files from Main-Volume to datastore, and I'll run the same backup there. If it completes in 2 minutes, though, then I'll still want to understand what is going on between the two pools and how to improve the performance of the slow one.
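One way to time the metadata walk itself, separate from any backup tool (the temp tree here is a stand-in so the sketch is runnable; on the real dataset, point `find` at the pool and wrap it in `time`):

```shell
# Sketch: a pure mtime scan, the same metadata access pattern the backup does.
# GNU find assumed (for -printf).
tree=$(mktemp -d)
for i in 1 2 3; do touch "$tree/file$i"; done   # stand-in for the 1M files
count=$(find "$tree" -type f -printf '%T@\n' | wc -l)
echo "scanned $count files"
rm -rf "$tree"
```

If `time find <dataset> -type f -printf '%T@\n' | wc -l` alone takes 40+ minutes, the bottleneck is metadata reads from the pool, not the backup software.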
|
# ? Apr 16, 2020 17:05 |
|
Yeah that's interesting, I almost wonder if you're somehow priming the ARC in the "fast" pool and all the mtimes are readily available in memory. I bet clearing the ARC with a reboot (or dropping and re-increasing the size) first before each test would help narrowing the issue down.
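On Linux ZFS you can at least see how much is sitting in the ARC before and after a test (the heredoc stands in for `/proc/spl/kstat/zfs/arcstats` so the parsing is runnable here; the values are made up):

```shell
# Sketch: read the current ARC size. On a real system, replace arcstats_sample
# with `cat /proc/spl/kstat/zfs/arcstats`.
arcstats_sample() {
cat <<'EOF'
name                            type data
size                            4    17179869184
c_max                           4    34359738368
EOF
}
arc_bytes=$(arcstats_sample | awk '$1 == "size" { print $3 }')
echo "ARC size: $((arc_bytes / 1024 / 1024 / 1024)) GiB"
```

If "size" barely moves between the fast and slow scans, the fast pool's metadata probably wasn't coming from cache after all.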
|
# ? Apr 16, 2020 17:09 |
|
Less Fat Luke posted:Yeah that's interesting, I almost wonder if you're somehow priming the ARC in the "fast" pool and all the mtimes are readily available in memory. I bet clearing the ARC with a reboot (or dropping and re-increasing the size) first before each test would help narrowing the issue down. It was a separate set of files that also hadn't been accessed very recently. If this is the time that's expected, I may just need to find a different backup solution. I really miss CrashPlan and its file watcher so much.
|
# ? Apr 16, 2020 17:28 |
|
Do any of these homebrew distros have a backup client for Windows that will automatically back up over the internet (via VPN or similar)? Imagine you are trying to keep the systems of family members in their 70s backed up.
|
# ? Apr 16, 2020 19:28 |
|
Charles posted:Do any of these homebrew distros have a backup client for Windows that will automatically backup over the internet (via VPN or similar)? Imagine you are trying to keep family members in their 70s systems backed up. I know this isn't the answer to the question you asked as phrased, but I strongly encourage using an off-the-shelf "cloud" backup solution like Backblaze. $4.583/month/computer with a 2-year plan. Yes, it's cheaper to DIY in absolute dollars, but not in sanity-bux once you're trying to handle this over the internet safely.
|
# ? Apr 16, 2020 19:36 |
|
H110Hawk posted:I know this isn't the answer to the question you asked as phrased, but I strongly encourage using an off the shelf "cloud" backup solution like Backblaze. $4.583/month/computer with a 2 year plan. Yes it's cheaper to DIY in absolute dollars, but in sanity-bux trying to handle this over the internet safely. I've been looking for a mass backup solution for my NAS and have been avoiding it, as I used CrashPlan ages ago and it was miserable. Is Backblaze a good solution? Will I spend 2020 and 2021 uploading my data and have a lovely interface to pull things down?
|
# ? Apr 16, 2020 19:45 |
|
Well poo poo, I bought one EFAX to complete upgrading my zpool from 3TB to 4TB disks, and THEN I read this. It's replacing now; if it gets all the way through, am I clear? This is just a boring rear end file server with low churn, aside from a couple of 20GB VMs to run stuff that doesn't have FreeBSD ports. Is there a good list of known-good SKUs? I think the Seagate Exos line lists whether a drive is CMR or SMR in the datasheets, but a) Seagate, b) $$$.
|
# ? Apr 16, 2020 21:07 |
|
TraderStav posted:I've been looking for a mass backup solution for my NAS and have been avoiding it as I used Crashplan ages ago and it was miserable. Is Backblaze a good solution? Wiil I spend 2020 and 2021 uploading my data and have a lovely interface to pull things down? B2 works for me with the synology.
|
# ? Apr 17, 2020 06:07 |
|
xarph posted:Well poo poo, I bought one EFAX to complete upgrading my zpool from 3TB to 4TB disks and THEN I read this. I have 2 x 6TB EFAX drives. I have them mirrored. I replaced the existing 2TB drives this year and it took about 11 hours or so to remirror the data with no problems. If you get all the way through and there are no errors or strange latency problems then it should be fine. For my setup I keep adding data to the storage which shouldn't create an issue. The only change is that I now use it as a work server while on lockdown, but I haven't noticed any performance issues.
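A quick post-replace sanity check is `zpool status -x`, which prints a single line when nothing is wrong (the heredoc below stands in for the real command so the sketch is runnable):

```shell
# Sketch: replace status_sample with `zpool status -x` on the real box.
status_sample() {
cat <<'EOF'
all pools are healthy
EOF
}
if status_sample | grep -q '^all pools are healthy$'; then
  result=healthy
else
  result=needs-attention
fi
echo "$result"
```

Pair it with a `zpool scrub` afterwards so every block actually gets read back and checksummed at least once on the new drive.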
|
# ? Apr 17, 2020 09:18 |
|
SMR drive watch https://blocksandfiles.com/2020/04/16/toshiba-desktop-disk-drives-undocumented-shingle-magnetic-recording/ blocks & files posted:Western Digital, Seagate and Toshiba – have now confirmed to Blocks & Files the undocumented use of SMR technology in desktops HDDs and in WD’s case, WD Red consumer NAS drives. It just gets worse. Yay! (the new news being Toshiba, but Seagate hasn't been mentioned in this thread yet, so here's more on them: https://blocksandfiles.com/2020/04/15/seagate-2-4-and-8tb-barracuda-and-desktop-hdd-smr/) HalloKitty fucked around with this message at 15:57 on Apr 17, 2020 |
# ? Apr 17, 2020 10:46 |
So, the only models that're known to not use SMR as of this post are WD EFRX and Toshiba X300 drives? That's mighty slim pickings. Weirdly, Toshiba X300 6TB drives are a lot cheaper than 6TB EFRX drives here in Denmark.
|
|
# ? Apr 17, 2020 12:33 |
|
The WD white-label EMAZ doesn't. Those are what you usually find in the shuckable external drives. I can't believe that they'd gently caress the users that spend the money on legit Red drives, and not the external drives where performance is expected to be slower.
|
# ? Apr 17, 2020 15:47 |
|
sharkytm posted:The WD white label EMAZ doesn't. Those are what are usually find in the shuckable external drive. I can't believe that they'd gently caress the users that spend the money on legit Red drives, and not the external drives where performance is expected to be slower. Yup. It's totally rear end-backwards. Turns out, the real winners are the shuckers. Cheaper AND better drives.
|
# ? Apr 17, 2020 15:55 |
|
So, 8tb and higher WD Reds are confirmed to be not-SMR, correct?
|
# ? Apr 17, 2020 17:55 |
|
Devian666 posted:I have 2 x 6TB EFAX drives. I have them mirrored. I replaced the existing 2TB drives this year and it took about 11 hours or so to remirror the data with no problems. If you get all the way through and there are no errors or strange latency problems then it should be fine. My zpool replace operation completed successfully sometime overnight. Pool is fine. The story has hit Ars: https://arstechnica.com/gadgets/2020/04/caveat-emptor-smr-disks-are-being-submarined-into-unexpected-channels/
|
# ? Apr 17, 2020 19:16 |
|
xarph posted:My zpool replace operation completed successfully sometime overnight. Pool is fine. Cue the class-action lawsuit in 3... 2... 1...
|
# ? Apr 17, 2020 19:23 |
|
sharkytm posted:Cue the class-action lawsuit in 3... 2... 1... I wouldn't be surprised. You can gently caress over consumers like that all day and no one is gonna complain all that much, but you start changing "known quantity" SKUs that are meant for businesses and that's gonna get a lot more attention.
|
# ? Apr 17, 2020 19:43 |
|
Steakandchips posted:So, 8tb and higher WD Reds are confirmed to be not-SMR, correct? I'd like to see a script that does drive-fillup+whatever benchmarking to incur SMR-rewrite behavior, then we could test all the drives.
|
# ? Apr 17, 2020 20:09 |
ChiralCondensate posted:I'd like to see a script that does drive-fillup+whatever benchmarking to incur SMR-rewrite behavior, then we could test all the drives. pre:zpool create tank raidz3 /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3 /dev/ada4 \ && camdd -i /dev/random,bs=1M,depth=`sysctl -n hw.ncpu` -o file=/tank/random.bin -m 1024G \ && zpool scrub tank Then again, that seems like a problem for Linux. BlankSystemDaemon fucked around with this message at 20:53 on Apr 17, 2020 |
|
# ? Apr 17, 2020 20:47 |
|
|
|
D. Ebdrup posted:By the sounds of it it's as simple as this oneliner. Just use urandom instead. I also think they've improved random materially in the last decade to make exhaustion less likely.
|
# ? Apr 17, 2020 21:21 |