|
The more data there is, the more that has to be read and written to calculate the new parity bits and/or move the data over. It's just a lot of data to copy around. Even on home HDDs you see it. You have a 5TB disk and a 5TB backup. Your 5TB disk fails, good on you for having a backup. But eep, now you've gotta copy 5TB from one Seagate to another and if you're not nervous, you will be. Youuuu, will be. I devolved into a Yoda reference and I don't even particularly care for Star Wars, huh, don't know why that happened.
|
# ? Aug 29, 2015 00:55 |
|
|
|
RFC2324 posted:Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that? Nope. Say you have 10 drives in a big ol' RAID5. Drive 1 packs in, you are now no more protected than a RAID 0 made up of 9 drives. If any one of them has any kind of issues, you're going to lose everything. RAID 6 lets you have one fail out, another fail out halfway through the rebuild, and still keep your data alive.
|
# ? Aug 29, 2015 01:39 |
|
RFC2324 posted:Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that? A RAID5 of 20 drives is (roughly) 4 times as likely to sustain more failures than it can handle compared to a set of 5 drives. That may push it past what's an acceptable risk to the business. The actual calculation is (1-(1-P)^19) / (1-(1-P)^4), but they're drat close for small values of P
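That formula can be sanity-checked in a few lines (the per-drive failure probability P used here is an assumed, illustrative number, not a real drive spec):

```python
# Chance that a RAID5 loses a second drive before the rebuild finishes:
# after the first failure, any of the remaining (n - 1) drives dying
# kills the array.
def p_fatal_second_failure(n_drives, p):
    """p = assumed chance one drive fails during the rebuild window."""
    return 1 - (1 - p) ** (n_drives - 1)

P = 0.01  # made-up per-drive probability, just for illustration
big = p_fatal_second_failure(20, P)    # 20-drive RAID5, ~0.174
small = p_fatal_second_failure(5, P)   # 5-drive RAID5, ~0.039
print(big, small, big / small)         # ratio comes out around 4.4
```

For small P the ratio stays close to the "roughly 4 times" figure, since 1-(1-P)^n is approximately nP when nP is small.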
|
# ? Aug 29, 2015 01:42 |
|
RFC2324 posted:Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that? Assume it takes 1 hour to rebuild any array. If a disk has failed, you need to replace it, and rebuild, before your data is safe again. If another fails during rebuild, everything is lost. Imagine an array of 3 disks, and the chance any ONE disk will fail within a rebuild. Imagine an array of 1000 disks, and the chance any ONE disk will fail within a rebuild. Imagine an array of 1000000 disks.
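The scaling in that thought experiment is easy to see by plugging an assumed per-drive probability into 1 - (1 - p)^n (the 0.0001 here is invented purely for illustration):

```python
# Chance that at least ONE of n drives fails during a single
# rebuild window, given a fixed per-drive probability p.
p = 0.0001  # assumed per-drive failure chance within one rebuild window
for n in (3, 1000, 1_000_000):
    print(n, 1 - (1 - p) ** n)
# 3 drives: ~0.0003, 1000 drives: ~0.095, 1000000 drives: ~1.0
```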
|
# ? Aug 29, 2015 02:01 |
|
RFC2324 posted:Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that? Even thinking about a Raid 5 rebuild gives me an anxiety attack now... the more data/disks there are the longer and more taxing on the rest of the disks the rebuild will be. Biggest array I ever built was a RAID 6+0 of 45 4TB drives in a 45Drives Storinator. 3 Groups of 14 in Raid 6, all 3 groups striped together with 3 hotspares.
|
# ? Aug 29, 2015 02:37 |
|
Skandranon posted:Assume it takes 1 hour to rebuild any array. Depending on the drive size, 1 hour sounds pretty quick. I've heard some horror stories with low-powered RAID controllers and "large" RAID6 arrays (3TBx8) where it takes DAYS to rebuild. Large being the size of the single drives.

In most use cases you are best off using RAID6 or RAID10. RAID5 only makes sense right now for 3-4 drives, and even then I'm going to question the choice, but at least you can justify it. For RAID6 you'd better have 5+ drives. Last month I inherited a 4 drive RAID6. If you don't know, you get the same space, the same tolerance, and faster rebuilds from RAID10 at exactly 4 drives. The controller supports 10, so I don't get why this thing is configured this way, but it would take too much effort to rebuild it since it's a backup target only. It would cost money to get enough space to migrate stuff off, and the difference isn't worth spending money on unless I'm going to replace a brand new NAS (which might be under capacity).

I found out the art department has been using the USB "backup" drives as their primary storage because they ran out of space, and they need 4TB total of space. My backup NAS is only 2TB. I'm not even sure I can get the budget for another NAS. It's going to be hard to explain to the CFO that I need money, that the previous guy who bought a device 3 months before leaving didn't think about backing it up, and that it's going to cost more to get a viable solution. I think I'll need two, which is even better. (This came up because art said they ran out of space.) This is going to come from my server budget, isn't it? I really need to replace my servers; they were purchased in 2008 and are out of warranty. It's approved for Q2.
|
# ? Aug 29, 2015 02:42 |
|
pixaal posted:Depending on the drive size, 1 hour sounds pretty quick. I've heard of some horror stories with low powered raid controls and "large" RAID6 arrays (3TBx8) where it takes DAYS to rebuild. Large being the size of the single drives. In most use cases you are best using RAID6 or RAID10. RAID5 only makes sense right now for 3-4 drives and even then I'm going to question the choice, but at least you can justify it. For RAID6 you better have 5+ drives. The one hour was just to hold the time constant. As time gets longer, your odds of a failure during rebuild get worse.
|
# ? Aug 29, 2015 02:45 |
|
pixaal posted:you get the same speed, same tolerance, and faster from RAID10 at exactly 4 drives. Technically not. If you lose both drives of the same mirror you lose data, so there's a 1 in 3 chance that a double failure kills your data. You'd still be making the right choice in the majority of circumstances because of the massive speed difference, especially at rebuild (raw copy rather than parity calculation).
|
# ? Aug 29, 2015 02:53 |
|
Skandranon posted:The one hour was just to hold the time constant. As time gets longer, your odds of a failure during rebuild get worse. RAID is not a backup, it lets the users keep accessing data while you start switching servers to your mirrored backup (assuming you have budget for such a thing). Worst case you still have a compressed backup on your NAS or on tape right? RIGHT? Sure you may have a large chunk of downtime if you need to order another set of drives because 3/8 failed and gently caress it the other 5 are probably garbage anyway. But you can at least recover from that backup and are out of business. keseph posted:Technically not. If you lose the first drive of both mirrors you lose data, so a 1 in 3 that a double-failure kills your data. You'd still be making the right choice in the majority of circumstances because of the massive speed difference, especially at rebuild (raw copy rather than parity calculation). While correct, and its a 1/3 chance that the 2nd drive is the pair the recover speed is so much quicker that it should be the correct choice as you say. The larger the array the more "correct" 10 is over 6. A 10 drive array can lose 5 disks in RAID10 in a "perfect" case, but worst cast it can only lose 1, 2nd will kill the entire thing, but that is only 1/9. Raid 6 you can lose 2, 3rd kills you. Expand that to 100 disks and you really see why 10 is great. My previous place had everything in RAID50, I haven't seen anything recommend it, is it just a poo poo show and 10 is better in almost every case? pixaal fucked around with this message at 02:59 on Aug 29, 2015 |
# ? Aug 29, 2015 02:54 |
|
pixaal posted:RAID is not a backup, it lets the users keep accessing data while you start switching servers to your mirrored backup (assuming you have budget for such a thing). Worst case you still have a compressed backup on your NAS or on tape right? RIGHT? Sure you may have a large chunk of downtime if you need to order another set of drives because 3/8 failed and gently caress it the other 5 are probably garbage anyway. But you can at least recover from that backup and are out of business. I don't think I said it was backup, I was trying to demonstrate how the failure rate goes up with the number of disks.
|
# ? Aug 29, 2015 03:00 |
|
Skandranon posted:I don't think I said it was backup, I was trying to demonstrate how the failure rate goes up with the number of disks. I wasn't arguing, was agreeing.
|
# ? Aug 29, 2015 03:05 |
|
Follow up question: Think SSDs, with their increased reliability and speed, will bring back RAID 5 or extend the life of RAID 6? (one of the articles I ran across looking up raid 5 reliability was some guy saying he had predicted RAID 5 would be abandoned in 2009, claiming he expects RAID 6 to be abandoned in 2019)
|
# ? Aug 29, 2015 03:11 |
|
RFC2324 posted:Follow up question: Think SSDs, with their increased reliability and speed, will bring back RAID 5 or extend the life of RAID 6? (one of the articles I ran across looking up raid 5 reliability was some guy saying he had predicted RAID 5 would be abandoned in 2009, claiming he expects RAID 6 to be abandoned in 2019) The increased reliability doesn't change much, but the speed does. However, the number of disks is still the major factor. A 20 disk SSD array isn't any better an idea, and would be prohibitively expensive. RAID 6 won't 'die', it will just no longer provide the same relative redundancy as 3 disk RAID5 or 6-8 disk RAID6 does now. There will be 3 parity drive solutions, or possibly even more advanced redundancy architectures by then.
|
# ? Aug 29, 2015 03:20 |
|
The numbers on RAID5 have been run, and you basically have 100% odds of hitting an unrecoverable read error on a parity block during a large rebuild; 10 4TB drives give you ~96% odds of failure. SSDs may bring a resurgence, but I've always seen a better use for SSDs in distributed clustered filesystems (which end up striping/mirroring anyway, just on a different level) than traditional RAID, though that probably says more about the stuff I work with these days than anything.
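The usual back-of-envelope version of that math, assuming the consumer-drive spec of one URE per 10^14 bits read and round 4 TB = 4x10^12 byte drives (using TiB-sized drives instead pushes the result to roughly the 96% quoted above):

```python
# Rebuilding a degraded 10-drive RAID5 of 4 TB disks means reading the
# nine surviving drives end to end; one URE anywhere sinks the rebuild.
URE_PER_BIT = 1e-14        # spec-sheet rate for consumer drives
bits_read = 9 * 4e12 * 8   # nine 4 TB drives, in bits
p_ure = 1 - (1 - URE_PER_BIT) ** bits_read
print(p_ure)  # ~0.94 with these round numbers
```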
|
# ? Aug 29, 2015 03:30 |
|
evol262 posted:The numbers on RAID5 have been run, and you basically have 100% odds of hitting an unrecoverable read error on a parity block. 10 4tb drives have 96% odds of failure. Relatedly, once a chassis contains more than a couple of SSDs and you start pumping multiple GB/sec of writes, you start hitting CPU caps on the parity calculation speed of single controllers, and parallel controllers get extremely complex inside the firmware, driving the cost way up. Or you could just run R10 with multiple hot spares, which works just fine even with a much simpler and cheaper controller and doesn't need a RAID7 or RAID8 standard.
|
# ? Aug 29, 2015 04:04 |
|
Never run without hot spares with data you care about. Ideally you would also have pre-fail detection as well. What sounds better?

Having a drive fail in the middle of the night and not get swapped until the next day, with a several hour rebuild still to go before things are in a non-degraded state, your controller's CPU pegged for parity calculations the entire time.

or

Array detects that a drive is suspect in the middle of the night and starts to copy the contents of that drive to your hot spare, only hitting parity if it has an unrecoverable block. You wake up the next morning and see that your array properly removed the suspect drive, never even going into a degraded state, and did it a ton faster because it didn't need to calculate parity for the whole array. Then you calmly call your support and have the new drive delivered in your 4 hour window, restoring full hot spare functionality.
|
# ? Aug 29, 2015 05:48 |
|
pixaal posted:My previous place had everything in RAID50, I haven't seen anything recommend it, is it just a poo poo show and 10 is better in almost every case? RAID50 is typically sets of three drives (2 data, 1 parity) then grouped together in striped sets. It behaves slightly slower than R10 due to parity calculations, but with additional storage capacity. (Take 6 disks: with r10 you have 3 disks for data, with r50 you have four.) R50 is a middle of the road option for people who want some additional performance over r5 with some additional storage over r10. I've used it in smaller database volumes that were going to be tight on space.
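The capacity difference in that parenthetical, sketched out for six equal drives with three-drive RAID5 groups:

```python
# Usable drives out of six: RAID10 mirrors everything (half usable),
# RAID50 loses one drive per RAID5 group to parity.
n_drives, group_size = 6, 3
raid10_usable = n_drives // 2                      # 3 data drives
raid50_usable = n_drives - n_drives // group_size  # 4 data drives
print(raid10_usable, raid50_usable)
```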
|
# ? Aug 29, 2015 08:40 |
|
Apparently earlier in the week one of our customers got us to troubleshoot why their email wasn't getting sent to their copier. If that wasn't bad enough, here are the last two emails we sent to them:

Do you have it internally aliased to something else? I'm not seeing a konica@[companyname].net showing up in your user list.

I tried the number [telephone] and it was a law office. (they are not a law office)
|
# ? Aug 29, 2015 12:30 |
|
On a conference call on Saturday morning - it's my turn on call and I was ready for it to not be a great weekend. I do not care. Tracking down an executive - "he's at the hospital with his girlfriend, her father had a heart attack, he'll be dialing in in a second" Yep that about sums up this job.
|
# ? Aug 29, 2015 13:06 |
|
Did the dad live
|
# ? Aug 29, 2015 14:29 |
|
Luna Was Here posted:Did the dad live Are you the executive, trying to find out if you should buy flowers before coming back upstairs?
|
# ? Aug 29, 2015 15:54 |
|
So do we assume that the executive also has a wife? :-D
|
# ? Aug 30, 2015 01:40 |
|
ConfusedUs posted:That qualifies as "checking" in my book. Until:

a) People tell you to turn that poo poo off, systems should only bother you when you need to take action, we don't need to hear about successes. Some unknowable amount of time later, the emails stop getting sent and no one notices.

b) There's some thumbs.db file or some poo poo that can't be backed up, so everyone is used to seeing emails titled "WARNING: BACKUP FAILURE ON SNOOPY." and stops actually reading them.

c) People construct filters to auto-delete the email. (Or, in one of the most facepalmy incidents of my career, they started reporting them as spam. Causing their own server to get blacklisted.)
|
# ? Aug 30, 2015 02:21 |
|
Hey, hosting people? If I click the "upgrade package request" button, maybe warn me that clicking the button will immediately shut the server down and ask me if I really want to continue? An entire server's worth of people are now asking me why their teamspeak suddenly disconnected. Thanks guys.
|
# ? Aug 30, 2015 04:52 |
|
Not pissing me off: I got my whitebox home lab server up and running and it came with IPMI so I never had to connect a keyboard or monitor to the thing! Pissing me off: What the hell, VMware, I can't run write operations with PowerCLI (like, you know, creating a loving VM using New-VM) on the free ESXi version? Jeez, I know you had to gimp it to get people to actually buy the product but that's just ridiculous.
|
# ? Aug 30, 2015 08:47 |
|
Casull posted:Not pissing me off: I got my whitebox home lab server up and running and it came with IPMI so I never had to connect a keyboard or monitor to the thing! Which version of ESXi?
|
# ? Aug 30, 2015 09:13 |
|
anthonypants posted:Which version of ESXi? I'm trying out 6.0 at home.
|
# ? Aug 30, 2015 09:38 |
|
Casull posted:IPMI What board did you purchase?
|
# ? Aug 30, 2015 09:51 |
|
Agrikk posted:What board did you purchase? Asrock C2750D4I with an onboard Intel Avoton 8-core processor. It's got IPMI, two ethernet ports, and it's mini-ITX so I could follow this guide to build the thing up. I didn't use the case that guy used, though; I used a Cooler Master Elite 130. It's also sporting a 250GB Samsung 850 EVO SSD for storage. The cool thing is that the Avoton can get away with passive cooling, the entire thing is quiet enough to live in the living room, and it's got low power consumption. The total, including tax, was about $800USD or so.
|
# ? Aug 30, 2015 10:29 |
poo poo not pissing me off: Didn't get a page all weekend.
|
|
# ? Aug 31, 2015 07:25 |
|
it bothers me when people do stupid command line poo poo
code:
cat file | grep foo
ps -ef | grep foo | grep -v grep
|
# ? Aug 31, 2015 07:29 |
|
Gwaihir posted:Actually reading logs and error messages qualifies as entrapment.
|
# ? Aug 31, 2015 07:47 |
|
reddit liker posted:it bothers me when people do stupid command line poo poo The first one at least does something, did you ask what they're thinking with the second?
|
# ? Aug 31, 2015 08:56 |
|
reddit liker posted:it bothers me when people do stupid command line poo poo
|
# ? Aug 31, 2015 09:02 |
|
The Fool posted:1. gently caress Printers Backquoting a bit, but since I deal with phones, it's that you can't avoid going to them physically most of the time, and that they don't have 'errors', but symptoms that are a pain to decipher. Also they collect skin, makeup, and face particles.
|
# ? Aug 31, 2015 12:37 |
|
I'm going to have all future applicants write a few pages of English so I don't have to deal with this poo poo in the future: Yes, I’m working on it now. The part didn’t will correctly and they need the comments to be changed. I have the files in my test side to fixed the issues. Born and raised in the USA.
|
# ? Aug 31, 2015 14:14 |
|
poo poo somewhat pissing me off: schedule a 1hr virtual training session for a go live tomorrow. 20mins into our scheduled 1hr, no one has attended. apparently there was something more important that they got scheduled for and no one bothered to notify me. I guess they're going to get the short short version
|
# ? Aug 31, 2015 14:19 |
|
theperminator posted:The first one at least does something, did you ask what they're thinking with the second? The second is common for "ps -ef | grep foo | grep -v grep | ..." to pass it into awk+kill or something. grep [f]oo is cleaner, though: the pattern [f]oo still matches "foo" in the ps output but doesn't match its own grep command line, so you can drop the grep -v.
|
# ? Aug 31, 2015 14:54 |
|
reddit liker posted:it bothers me when people do stupid command line poo poo I was in a Red Hat Training class and the trainer kept using "cat file | grep words".
|
# ? Aug 31, 2015 15:01 |
|
|
|
Lightning Jim posted:I was in a Red Hat Training class and the trainer kept using "cat file | grep words". I had this habit for a long time. It wasn't until my sophomore year of college that someone asked "What the hell are you doing?" and I broke the habit.
|
# ? Aug 31, 2015 15:36 |