MC Fruit Stripe
Nov 26, 2002

around and around we go
The more data there is, the more that has to be read and written to calculate the new parity bits and/or move the data over. It's just a lot of data to copy around.

Even on home HDDs you see it. You have a 5TB disk and a 5TB backup. Your 5TB disk fails, good on you for having a backup. But eep, now you've gotta copy 5TB from one Seagate to another and if you're not nervous, you will be. Youuuu, will be.

I devolved into a Yoda reference and I don't even particularly care for Star Wars, huh, don't know why that happened.

Methylethylaldehyde
Oct 23, 2004

BAKA BAKA

RFC2324 posted:

Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that?

Nope. Say you have 10 drives in a big ol' RAID5. Drive 1 packs in, you are now no more protected than a RAID 0 made up of 9 drives. If any one of them has any kind of issues, you're going to lose everything.

RAID 6 lets you have one fail out, another fail out halfway through the rebuild, and still keep your data alive.

keseph
Oct 21, 2010

beep bawk boop bawk

RFC2324 posted:

Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that?

A RAID5 of 20 drives is (roughly) 4 times as likely to sustain more failures than it can handle compared to a set of 5 drives. That may push it past what's an acceptable risk to the business.

The actual calculation is (1-(1-P)^19) / (1-(1-P)^4), where P is the chance of any one drive failing during the rebuild, but they're drat close for small values of P
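
A quick sketch of that ratio, assuming P is the chance that any single surviving drive dies during the rebuild window and that drives fail independently of each other (both are assumptions, not measured numbers; the function names are made up for illustration):

```python
def second_failure_odds(drives: int, p: float) -> float:
    """Chance that at least one surviving drive fails, given one has already died."""
    survivors = drives - 1
    return 1 - (1 - p) ** survivors


def risk_ratio(big: int, small: int, p: float) -> float:
    """How much more likely the bigger RAID5 set is to lose a second drive."""
    return second_failure_odds(big, p) / second_failure_odds(small, p)


for p in (0.0001, 0.01, 0.1):
    print(f"P={p}: 20 drives vs 5 drives -> {risk_ratio(20, 5, p):.2f}x")
```

As P shrinks the ratio approaches 19/4 = 4.75, and it drops as P grows, because both arrays are increasingly likely to lose a second drive anyway.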

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

RFC2324 posted:

Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that?

Assume it takes 1 hour to rebuild any array.

If a disk has failed, you need to replace it, and rebuild, before your data is safe again. If another fails during rebuild, everything is lost.

Imagine an array of 3 disks, and the chance any ONE disk will fail within a rebuild.

Imagine an array of 1000 disks, and the chance any ONE disk will fail within a rebuild.

Imagine an array of 1000000 disks.
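
To put rough numbers on that thought experiment, here is a minimal sketch. It assumes every disk fails independently with the same hourly probability, and the 1e-6 figure is invented purely for illustration, not taken from any drive spec:

```python
import math


def p_another_failure(disks: int, p_per_disk_hour: float, rebuild_hours: float = 1.0) -> float:
    """Chance that at least one surviving disk fails before the rebuild finishes."""
    survivors = disks - 1
    # log of (1 - p) ** (survivors * hours); log space keeps tiny p accurate
    log_all_survive = survivors * rebuild_hours * math.log1p(-p_per_disk_hour)
    return -math.expm1(log_all_survive)


P_PER_HOUR = 1e-6  # made-up per-disk hourly failure chance

for disks in (3, 1000, 1_000_000):
    print(f"{disks:>9} disks: {p_another_failure(disks, P_PER_HOUR):.6f}")
```

Holding the rebuild time constant, the odds climb from about two in a million at 3 disks to well over half at a million disks. Raising `rebuild_hours` pushes every one of those numbers up.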

theperminator
Sep 16, 2009

by Smythe
Fun Shoe

RFC2324 posted:

Can someone explain what is wrong with large RAID 5 arrays? I know that it increases chance of failure, but shouldn't the fault tolerance of RAID 5 counter that?

Even thinking about a Raid 5 rebuild gives me an anxiety attack now... the more data/disks there are the longer and more taxing on the rest of the disks the rebuild will be.

Biggest array I ever built was a RAID 6+0 of 45 4TB drives in a 45Drives Storinator.
3 Groups of 14 in Raid 6, all 3 groups striped together with 3 hotspares.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Skandranon posted:

Assume it takes 1 hour to rebuild any array.

If a disk has failed, you need to replace it, and rebuild, before your data is safe again. If another fails during rebuild, everything is lost.

Imagine an array of 3 disks, and the chance any ONE disk will fail within a rebuild.

Imagine an array of 1000 disks, and the chance any ONE disk will fail within a rebuild.

Imagine an array of 1000000 disks.

Depending on the drive size, 1 hour sounds pretty quick. I've heard some horror stories about low-powered RAID controllers and "large" RAID6 arrays (3TBx8) where it takes DAYS to rebuild, "large" referring to the size of the individual drives. In most use cases you are best off using RAID6 or RAID10. RAID5 only makes sense right now for 3-4 drives, and even then I'm going to question the choice, but at least you can justify it. For RAID6 you'd better have 5+ drives.

Last month :yotj: I inherited a 4 drive RAID6. :ughh: If you don't know, you get the same capacity, same tolerance, and faster speeds from RAID10 at exactly 4 drives. The controller supports 10, so I don't get why this thing is configured this way, but it would take too much effort to rebuild it since it's a backup target only. It would cost money to get enough space to migrate stuff off, and the difference isn't worth spending money on unless I'm going to replace a brand new NAS (which might be under capacity). I found out the art department has been using the USB "backup" drives as their primary storage because they ran out of space, and they need 4TB total of space. My backup NAS is only 2TB.

I'm not even sure I can get the budget for another NAS. It's going to be hard to tell the CFO that I need money, that the previous guy who bought a device 3 months before leaving didn't think about backing this up, and that it's going to cost more to get a viable solution. I think I'll need two, which is even better. (This came up because art said they ran out of space.)

This is going to come from my server budget, isn't it? I really need to replace my servers; they were purchased in 2008 and are out of warranty, and it's approved for Q2.

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

pixaal posted:

Depending on the drive size, 1 hour sounds pretty quick. I've heard some horror stories about low-powered RAID controllers and "large" RAID6 arrays (3TBx8) where it takes DAYS to rebuild, "large" referring to the size of the individual drives. In most use cases you are best off using RAID6 or RAID10. RAID5 only makes sense right now for 3-4 drives, and even then I'm going to question the choice, but at least you can justify it. For RAID6 you'd better have 5+ drives.

Last month :yotj: I inherited a 4 drive RAID6. :ughh: If you don't know, you get the same capacity, same tolerance, and faster speeds from RAID10 at exactly 4 drives. The controller supports 10, so I don't get why this thing is configured this way, but it would take too much effort to rebuild it since it's a backup target only. It would cost money to get enough space to migrate stuff off, and the difference isn't worth spending money on unless I'm going to replace a brand new NAS (which might be under capacity). I found out the art department has been using the USB "backup" drives as their primary storage because they ran out of space, and they need 4TB total of space. My backup NAS is only 2TB.

I'm not even sure I can get the budget for another NAS. It's going to be hard to tell the CFO that I need money, that the previous guy who bought a device 3 months before leaving didn't think about backing this up, and that it's going to cost more to get a viable solution. I think I'll need two, which is even better. (This came up because art said they ran out of space.)

This is going to come from my server budget, isn't it? I really need to replace my servers; they were purchased in 2008 and are out of warranty, and it's approved for Q2.

The one hour was just to hold the time constant. As time gets longer, your odds of a failure during rebuild get worse.

keseph
Oct 21, 2010

beep bawk boop bawk

pixaal posted:

you get the same capacity, same tolerance, and faster speeds from RAID10 at exactly 4 drives.

Technically not. If you lose both drives of the same mirror you lose data, so there's a 1 in 3 chance that a double failure kills your data. You'd still be making the right choice in the majority of circumstances because of the massive speed difference, especially at rebuild (raw copy rather than parity calculation).

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Skandranon posted:

The one hour was just to hold the time constant. As time gets longer, your odds of a failure during rebuild get worse.

RAID is not a backup; it lets the users keep accessing data while you start switching servers to your mirrored backup (assuming you have budget for such a thing). Worst case you still have a compressed backup on your NAS or on tape, right? RIGHT? Sure, you may have a large chunk of downtime if you need to order another set of drives because 3/8 failed and gently caress it the other 5 are probably garbage anyway. But you can at least recover from that backup and aren't out of business.

keseph posted:

Technically not. If you lose both drives of the same mirror you lose data, so there's a 1 in 3 chance that a double failure kills your data. You'd still be making the right choice in the majority of circumstances because of the massive speed difference, especially at rebuild (raw copy rather than parity calculation).

That's correct, and it's a 1/3 chance that the 2nd drive is the other half of the pair, but the recovery speed is so much quicker that it should be the correct choice, as you say. The larger the array, the more "correct" 10 is over 6. A 10 drive array can lose 5 disks in RAID10 in a "perfect" case; worst case it can only lose 1 and the 2nd will kill the entire thing, but the odds of that are only 1/9. With RAID 6 you can lose 2, and the 3rd kills you. Expand that to 100 disks and you really see why 10 is great.
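
The 1/3 and 1/9 figures can be checked by brute force. A minimal sketch, assuming drives are mirrored in adjacent pairs and that every two-drive failure combination is equally likely (the function name is made up):

```python
from fractions import Fraction
from itertools import combinations


def raid10_fatal_fraction(drives: int) -> Fraction:
    """Fraction of two-drive failure combinations that wipe out a whole mirror pair."""
    # Assumed layout: drives (0, 1), (2, 3), ... are the mirror pairs
    pairs = {(d, d + 1) for d in range(0, drives, 2)}
    combos = list(combinations(range(drives), 2))
    fatal = sum(1 for combo in combos if combo in pairs)
    return Fraction(fatal, len(combos))


print(raid10_fatal_fraction(4))   # 1/3
print(raid10_fatal_fraction(10))  # 1/9
```

The same function gives 1/99 for 100 drives, which is the "expand that to 100 disks" case.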

My previous place had everything in RAID50, I haven't seen anything recommend it, is it just a poo poo show and 10 is better in almost every case?

pixaal fucked around with this message at 02:59 on Aug 29, 2015

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

pixaal posted:

RAID is not a backup; it lets the users keep accessing data while you start switching servers to your mirrored backup (assuming you have budget for such a thing). Worst case you still have a compressed backup on your NAS or on tape, right? RIGHT? Sure, you may have a large chunk of downtime if you need to order another set of drives because 3/8 failed and gently caress it the other 5 are probably garbage anyway. But you can at least recover from that backup and aren't out of business.

I don't think I said it was backup, I was trying to demonstrate how the failure rate goes up with the number of disks.

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


Skandranon posted:

I don't think I said it was backup, I was trying to demonstrate how the failure rate goes up with the number of disks.

I wasn't arguing, was agreeing.

RFC2324
Jun 7, 2012

http 418

Follow up question: Think SSDs, with their increased reliability and speed, will bring back RAID 5 or extend the life of RAID 6? (one of the articles I ran across looking up raid 5 reliability was some guy saying he had predicted RAID 5 would be abandoned in 2009, claiming he expects RAID 6 to be abandoned in 2019)

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

RFC2324 posted:

Follow up question: Think SSDs, with their increased reliability and speed, will bring back RAID 5 or extend the life of RAID 6? (one of the articles I ran across looking up raid 5 reliability was some guy saying he had predicted RAID 5 would be abandoned in 2009, claiming he expects RAID 6 to be abandoned in 2019)

The increased reliability doesn't change much, but the speed does. However, the number of disks is still the major factor. A 20 disk SSD array isn't any better an idea, and would be prohibitively expensive. RAID 6 won't 'die', it will just no longer provide the same relative redundancy as 3 disk RAID5 or 6-8 disk RAID6 does now. There will be 3 parity drive solutions, or possibly even more advanced redundancy architectures by then.

evol262
Nov 30, 2010
#!/usr/bin/perl
The numbers on RAID5 have been run, and you basically have 100% odds of hitting an unrecoverable read error on a parity block. Ten 4TB drives have 96% odds of failure.

SSDs may bring a resurgence, but I've always seen a better use for SSDs in distributed clustered filesystems (which end up striping/mirroring anyway, just on a different level) than traditional RAID, though that probably says more about the stuff I work with these days than anything.
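
For anyone wanting to see where a number like 96% can come from, here is a rough sketch. It assumes an unrecoverable read error rate of 1 per 1e14 bits read (a figure commonly quoted on consumer drive spec sheets, used here as an assumption, not something stated above) and treats every bit read as an independent trial:

```python
import math


def p_unrecoverable_read(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of hitting at least one unrecoverable read error while reading bytes_read bytes."""
    bits = bytes_read * 8
    # 1 - (1 - rate) ** bits, done in log space so the tiny rate is not rounded away
    return -math.expm1(bits * math.log1p(-ure_per_bit))


TB = 1e12  # decimal terabyte, the way drive vendors count

print(f"all ten 4TB drives read end to end: {p_unrecoverable_read(10 * 4 * TB):.0%}")
print(f"nine survivors read for a rebuild:  {p_unrecoverable_read(9 * 4 * TB):.0%}")
```

Under those assumptions, reading the full 40TB lands at about 96%, and reading only the nine surviving drives comes out slightly lower. Drives with a better rated error rate change the picture a lot, so treat this as an illustration of the arithmetic rather than a prediction.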

keseph
Oct 21, 2010

beep bawk boop bawk

evol262 posted:

The numbers on RAID5 have been run, and you basically have 100% odds of hitting an unrecoverable read error on a parity block. Ten 4TB drives have 96% odds of failure.

SSDs may bring a resurgence, but I've always seen a better use for SSDs in distributed clustered filesystems (which end up striping/mirroring anyway, just on a different level) than traditional RAID, though that probably says more about the stuff I work with these days than anything.

Relatedly, once a chassis contains more than a couple of SSDs and you start pumping multiple GB/sec of writes, you start hitting CPU caps on the parity calculation speed of single controllers, and parallel controllers get extremely complex inside the firmware, driving the cost way up. Or you could just run R10 with multiple hot spares, which works just fine even with a much simpler and cheaper controller and doesn't need a RAID7 or RAID8 standard.

bull3964
Nov 18, 2000

DO YOU HEAR THAT? THAT'S THE SOUND OF ME PATTING MYSELF ON THE BACK.


Never run without hot spares with data you care about. Ideally you would also have pre-fail detection as well.

What sounds better? Having a drive fail in the middle of the night and not get swapped until the next day, with a several hour rebuild still to go before things are in a non-degraded state. All the while, your controller's CPU is pegged for parity calculations.

or

Array detects a drive is suspect in the middle of the night and starts to copy the contents of that drive to your hot spare, only hitting parity if it has an unrecoverable block. You wake up the next morning and see that your array properly removed the suspect drive, never even going into a degraded state, and did it a ton faster because it didn't need to calculate parity for the whole array.

Then you calmly call your support and have the new drive delivered in your 4 hour window, restoring full hot spare functionality.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

pixaal posted:

My previous place had everything in RAID50, I haven't seen anything recommend it, is it just a poo poo show and 10 is better in almost every case?

RAID50 is typically sets of three drives (2 data, 1 parity) that are then grouped together in striped sets.

It behaves slightly slower than R10 due to parity calculations, but with additional storage capacity. (Take 6 disks: with r10 you have 3 disks for data, with r50 you have four).

R50 is a middle of the road option for people who want some additional performance over r5 with some additional storage over r10. I've used it in smaller database volumes that were going to be tight on space.
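
The capacity trade-off is easy to tabulate. A minimal sketch, assuming equal-sized drives and counting only whole drives' worth of data (the function and the level names it accepts are made up for illustration, and hot spares are left out of the disk count):

```python
def usable_disks(level: str, disks: int, group_size: int = 3) -> int:
    """Number of drives' worth of usable data space for a few RAID levels."""
    if level == "raid5":
        return disks - 1
    if level == "raid6":
        return disks - 2
    if level == "raid10":
        return disks // 2
    if level in ("raid50", "raid60"):
        # Each striped group gives up one (RAID5) or two (RAID6) drives to parity
        parity_per_group = 1 if level == "raid50" else 2
        groups = disks // group_size
        return groups * (group_size - parity_per_group)
    raise ValueError(f"unknown level: {level}")


print(usable_disks("raid10", 6))  # 3
print(usable_disks("raid50", 6))  # 4
```

By the same arithmetic, the 3 groups of 14 in RAID 6+0 mentioned earlier in the thread work out to `usable_disks("raid60", 42, group_size=14)`, which is 36 data drives.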

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum
Apparently earlier in the week one of our customers got us to troubleshoot why their email wasn't getting sent to their copier. If that wasn't bad enough, here are the last two emails we sent to them:

Do you have it internally aliased to something else? I'm not seeing a konica@[companyname].net showing up in your user list.

I tried the number [telephone] and it was a law office.

(they are not a law office)

MC Fruit Stripe
Nov 26, 2002

around and around we go
On a conference call on Saturday morning - it's my turn on call and I was ready for it to not be a great weekend. I do not care.

Tracking down an executive - "he's at the hospital with his girlfriend, her father had a heart attack, he'll be dialing in in a second"

Yep that about sums up this job.

Luna Was Here
Mar 21, 2013

Lipstick Apathy
Did the dad live :ohdear:

Ynglaur
Oct 9, 2013

The Malta Conference, anyone?

Luna Was Here posted:

Did the dad live :ohdear:

Are you the executive, trying to find out if you should buy flowers before coming back upstairs?

GOOCHY
Sep 17, 2003

In an interstellar burst I'm back to save the universe!
So do we assume that the executive also has a wife? :-D

sfwarlock
Aug 11, 2007

ConfusedUs posted:

That qualifies as "checking" in my book.

Every backup application worth two cents will email you success and/or failure for any operation performed, natively. And of course you can monitor in other ways too.

Until

a) People tell you to turn that poo poo off, systems should only bother you when you need to take action, we don't need to hear about successes. Some unknowable amount of time later, the emails stop getting sent and no one notices.

b) There's some thumbs.db file or some poo poo that can't be backed up, so everyone is used to seeing emails titled "WARNING: BACKUP FAILURE ON SNOOPY." and stops actually reading them.

c) People construct filters to auto delete the email. (Or, in one of the most facepalmy incidents of my career, they started reporting them as spam. Causing their own server to get blacklisted.)

Kazinsal
Dec 13, 2011



Hey, hosting people? If I click the "upgrade package request" button, maybe warn me that clicking the button will immediately shut the server down and ask me if I really want to continue?

An entire server's worth of people are now asking me why their teamspeak suddenly disconnected.

Thanks guys.

Casull
Aug 13, 2005

:catstare: :catstare: :catstare:
Not pissing me off: I got my whitebox home lab server up and running and it came with IPMI so I never had to connect a keyboard or monitor to the thing!
Pissing me off: What the hell, VMware, I can't run write operations with PowerCLI (like, you know, creating a loving VM using New-VM) on the free ESXi version? Jeez, I know you had to gimp it to get people to actually buy the product but that's just ridiculous.

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

Casull posted:

Not pissing me off: I got my whitebox home lab server up and running and it came with IPMI so I never had to connect a keyboard or monitor to the thing!
Pissing me off: What the hell, VMware, I can't run write operations with PowerCLI (like, you know, creating a loving VM using New-VM) on the free ESXi version? Jeez, I know you had to gimp it to get people to actually buy the product but that's just ridiculous.

Which version of ESXi?

Casull
Aug 13, 2005

:catstare: :catstare: :catstare:

anthonypants posted:

Which version of ESXi?

I'm trying out 6.0 at home.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

What board did you purchase?

Casull
Aug 13, 2005

:catstare: :catstare: :catstare:

Agrikk posted:

What board did you purchase?

Asrock C2750D4I with a pre-included Intel Avoton 8-core processor. It's got IPMI, two ethernet ports, and it's mini-ITX so I could follow this guide to build the thing up. I didn't use the case that guy used, though, but I used a Cooler Master Elite 130. It's also sporting a 250GB Samsung 850 EVO SSD for storage.

The cool thing is that the Avoton core can get away with passive cooling, the entire thing is quiet enough to live in the living room, and it's got low power consumption. The total, including tax, was about $800USD or so.

skooma512
Feb 8, 2012

You couldn't grok my race car, but you dug the roadside blur.
poo poo not pissing me off: Didn't get a page all weekend.

RISCy Business
Jun 17, 2015

bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork bork
Fun Shoe
it bothers me when people do stupid command line poo poo

code:
cat file | grep words
ps aux | grep -v grep

evobatman
Jul 30, 2006

it means nothing, but says everything!
Pillbug

Gwaihir posted:

Actually reading logs and error messages qualifies as entrapment.

theperminator
Sep 16, 2009

by Smythe
Fun Shoe

reddit liker posted:

it bothers me when people do stupid command line poo poo

code:
cat file | grep words
ps aux | grep -v grep

The first one at least does something, did you ask what they're thinking with the second?

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

reddit liker posted:

it bothers me when people do stupid command line poo poo

Then you'll love this: http://www.smallo.ruhr.de/award.html

Partycat
Oct 25, 2004

The Fool posted:

1. gently caress Printers
2. gently caress Phones
3. gently caress Backups
4. gently caress Wireless
5. gently caress Email

That's roughly how I feel about it right now. List is subject to change without notification.

Backquoting a bit, but since I deal with phones: it's that you can't avoid going to them physically most of the time, and that they don't have 'errors', but symptoms that are a pain to decipher. Also, they collect skin, makeup, and face particles on them.

Bob Morales
Aug 18, 2006


Just wear the fucking mask, Bob

I don't care how many people I probably infected with COVID-19 while refusing to wear a mask, my comfort is far more important than the health and safety of everyone around me!

I'm going to have all future applicants write a few pages of English so I don't have to deal with this poo poo in the future:

Yes, I’m working on it now. The part didn’t will correctly and they need the comments to be changed. I have the files in my test side to fixed the issues.

Born and raised in the USA.

Migishu
Oct 22, 2005

I'll eat your fucking eyeballs if you're not careful

Grimey Drawer
poo poo somewhat pissing me off: schedule a 1hr virtual training session for a go live tomorrow. 20mins into our scheduled 1hr, no one has attended.

apparently there was something more important that they got scheduled for and no one bothered to notify me.

I guess they're going to get the short short version

evol262
Nov 30, 2010
#!/usr/bin/perl

theperminator posted:

The first one at least does something, did you ask what they're thinking with the second?

The second is common for "ps -ef | grep foo | grep -v grep | ..." to pass it into awk+kill or something.

grep [f]oo is cleaner, though
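
Why the bracket trick works, sketched with Python's `re` module instead of grep itself (for a pattern this simple the two regex flavors behave the same). The process lines are made up, and the function just pretends that `ps` shows the real process plus the grep that was launched:

```python
import re


def surviving_lines(pattern: str) -> list[str]:
    """Lines of a pretend ps listing that the given grep pattern would keep."""
    # The listing holds the real process plus the grep command we just ran
    ps_lines = ["/usr/sbin/foo --daemon", f"grep {pattern}"]
    return [line for line in ps_lines if re.search(pattern, line)]


print(surviving_lines("foo"))    # both lines: grep finds itself
print(surviving_lines("[f]oo"))  # only the foo process
```

The pattern `[f]oo` still matches the text "foo", but the grep command line now contains the literal characters `[f]oo`, which has a `]` between the `f` and the `o`, so it no longer matches itself and the extra `grep -v grep` is not needed.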

Lightning Jim
Nov 18, 2006

Just a mad weather-ologist :science:

reddit liker posted:

it bothers me when people do stupid command line poo poo

code:
cat file | grep words
ps aux | grep -v grep

I was in a Red Hat Training class and the trainer kept using "cat file | grep words".

captkirk
Feb 5, 2010

Lightning Jim posted:

I was in a Red Hat Training class and the trainer kept using "cat file | grep words".

I had this habit for a long time. It wasn't until my sophomore year of college that someone asked "What the hell are you doing?" and I broke the habit.
