szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Syano posted:

Oh that's pretty killer. Cool, this is what I needed to know. Now to make a purchase!

Yeah, that's the cool part - downside is you can lose a LOT of storage space as well as bandwidth for mirroring things between boxes...

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

qutius posted:

MPIO protects against more than just a controller failing and the partner taking over disk ownership, which is more of a high availability thing.

MPIO will protect against any failure on the fabric - so target/initiator port, cable, switch, switch port, etc. Some of these failures would be protected against HA too, but MPIO is needed at the driver level too.

But maybe I'm splitting hairs...

Well, MPIO can be configured several ways... generally speaking it's typically not for redundancy but rather for better bandwidth utilization over multiple ports - if you are running on a dual or quad eth card then it cannot protect you from any failure anyway.
Also, the vendor's DSM typically provides LUN/volume location awareness as well - when you have a larger network it makes a difference.

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

InferiorWang posted:

I started to spec up a server to run the VSA software on. We're a Dell server shop, so I figured I'd stick with them. I've asked for a quote for an R710 with embedded ESXi 4.1, the free one. I'm looking at doing a simple RAID5 SATA 7.2k array. Since this would be DR and only the most critical services would go live, I'm guessing using SATA isn't a horrible choice in this case. No oracle or mssql except for a small financial package using mssql, which has 10 users at the most at any one time. GroupWise(no laughing) and all of our Novell file servers would be brought online too. 32GB of RAM. Anyone see anything completely wrong with that hardware setup?

take a look at those sweet-priced R715s - they come with 12-core Opterons, so for the same price you can get 24 cores in a node... and R815s are only 2-3G away, and up to 2.2GHz they come with the 3rd and 4th CPU included for free, making it 48 cores total per node. :)

quote:

Also, our Dell rep has a bad habit of ignoring what you write in an email. I gave him the equote# and asked to change the support options and to NOT give me the promotion pricing he mentioned. So, he gives me a quote with the wrong support option and with the promotion pricing.

Try a channel partner, seriously - Dell often acts weird until a reseller shows up.

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Xenomorph posted:

Pretty basic I guess, but I just ordered a Dell NX3000 NAS, loaded with 2TB drives.

If you can wait, new NX models, sporting the new Storage Server 2008 R2 etc., are coming in 2-3 weeks.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

szlevi posted:

Err,

1. FOM is exactly for that, right, not to manage anything....
2. ...maybe you're confusing it with P4000 CMC?
3. Wait, that's free too...
Can you stop spewing vendor acronyms and say something?

Syano
Jul 13, 2005
Looks to me like he said something 9 times in a row... though I didn't read any of it because it's annoying as hell...

Moey
Oct 22, 2010

I LIKE TO MOVE IT

InferiorWang posted:

I started to spec up a server to run the VSA software on. We're a Dell server shop, so I figured I'd stick with them. I've asked for a quote for an R710 with embedded ESXi 4.1, the free one. I'm looking at doing a simple RAID5 SATA 7.2k array. Since this would be DR and only the most critical services would go live, I'm guessing using SATA isn't a horrible choice in this case. No oracle or mssql except for a small financial package using mssql, which has 10 users at the most at any one time. GroupWise(no laughing) and all of our Novell file servers would be brought online too. 32GB of RAM. Anyone see anything completely wrong with that hardware setup?

We now have 3 R710's and they have been running great so far. Similar disk array to the one you are looking at, all running ESX 4.1.

Biggest thing we currently have virtualized on one is a crap program that runs on top of an Advantage DB, around 80+ users working on it daily (not too much I/O), and it's been running great.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

szlevi posted:

Well, MPIO can be configured several ways... generally speaking it's typically not for redundancy but rather for better bandwidth utilization over multiple ports - if you are running on a dual or quad eth card then it cannot protect you from any failure anyway.
Also, the vendor's DSM typically provides LUN/volume location awareness as well - when you have a larger network it makes a difference.

I picked this post because MPIO typically is for redundancy in conjunction with distributing load over a lot of front end ports on an array. Even if you're using iSCSI with a quad port gigabit card and you lose half your switches you'll still be able to get traffic to your storage. Every IP storage design we've ever done has involved a pair of physical switches just to provide redundancy.
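
To put the path-placement point in concrete terms, here is a small Python sketch (all port, switch and path names are made up, and 1 Gb/s per path is just an assumed figure): the same four iSCSI paths either survive a switch failure at half bandwidth or lose storage entirely, depending on whether they are spread across two switches.

code:

# Toy model of the MPIO point above: four 1 Gb iSCSI paths spread across
# two switches keep the host connected (at reduced bandwidth) when a whole
# switch dies; the same four paths through one switch do not. All names and
# numbers here are illustrative, not any vendor's actual topology.

GB_PER_PATH = 1.0  # Gb/s per iSCSI path, for illustration

def surviving_bandwidth(paths, failed_switch):
    """Return (paths_left, aggregate Gb/s) after losing one switch."""
    alive = [p for p in paths if p["switch"] != failed_switch]
    return len(alive), len(alive) * GB_PER_PATH

# Redundant design: paths split across two physical switches.
dual_switch = [
    {"hba": "vmnic2", "switch": "sw-A", "target": "ctrl0-eth0"},
    {"hba": "vmnic3", "switch": "sw-A", "target": "ctrl0-eth1"},
    {"hba": "vmnic4", "switch": "sw-B", "target": "ctrl1-eth0"},
    {"hba": "vmnic5", "switch": "sw-B", "target": "ctrl1-eth1"},
]

# Throughput-only design: same four paths, single switch.
single_switch = [dict(p, switch="sw-A") for p in dual_switch]

for name, fabric in [("dual switch", dual_switch), ("single switch", single_switch)]:
    left, gbps = surviving_bandwidth(fabric, failed_switch="sw-A")
    print(f"{name}: lose sw-A -> {left} paths left, {gbps:.0f} Gb/s "
          f"({'still up' if left else 'storage unreachable'})")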

quote:

Err,

1. FOM is exactly for that, right, not to manage anything....
2. ...maybe you're confusing it with P4000 CMC?
3. Wait, that's free too...

I have no idea what FOM is; but does it also send the correct instructions to VMware to attach the volumes? Will it also allow you to automatically re-IP your virtual machines? Does it handle orchestration with external dependencies as well as reporting/event notification? Does it also provide network fencing and set up things like a split off 3rd copy to avoid interrupting production processing during DR testing? Does it handle priorities and virtual machine sequencing? Does it integrate with DRS and DPM?
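
As a toy illustration of the priorities/sequencing question, here is a Python sketch of a DR bring-up plan - tiers first, dependencies checked before each power-on. The tier layout, VM names and dependencies are invented for the example; this is not how FOM, SRM or any vendor's tool actually works.

code:

# Toy illustration of "priorities and VM sequencing": at a DR site you
# usually can't power everything on at once, so an orchestration tool brings
# VMs up in tiers, dependencies first. The tiers, VM names and dependencies
# below are invented for the example; this is not any vendor's actual API.

from collections import OrderedDict

# Lower tier number = brought online earlier (infrastructure before apps).
dr_plan = OrderedDict([
    (1, ["dc01", "dns01"]),                 # directory / name resolution
    (2, ["sql01"]),                         # databases the apps depend on
    (3, ["exch01", "finance-app"]),         # core business applications
    (4, ["file01", "print01", "test-vms"]), # everything else, as capacity allows
])

depends_on = {
    "sql01": ["dc01"],
    "exch01": ["dc01", "dns01"],
    "finance-app": ["sql01"],
}

def boot_order(plan, deps):
    """Yield VMs tier by tier, refusing to start one before its dependencies."""
    started = set()
    for tier, vms in plan.items():
        for vm in vms:
            missing = [d for d in deps.get(vm, []) if d not in started]
            if missing:
                raise RuntimeError(f"{vm} scheduled before {missing} - fix the plan")
            started.add(vm)
            yield tier, vm

for tier, vm in boot_order(dr_plan, depends_on):
    print(f"tier {tier}: power on {vm}")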

quote:

Which is exactly HA, with a remote node, right.
DR would be if he would recover from it, as in Disaster Recovery.

HA is typically referred to for localized failures. i.e. one of my server's motherboards just died on me and I want to bring my applications up from that failure quickly.

When we talk in terms of DR, we typically speak in one of two things:

1. Someone just caused a major data loss
2. My data center is a smoking hole in the ground

HA does NOT protect you against #1 and you're still going to tape. That said, in the event of #1 you're not going to necessarily do a site failover (unless you're like one of my customers who had a major security breach.)

In the event of #2; we're talking about moving all of the business processes and applications to some other location which goes far above and beyond typical HA.

quote:


They exist but not for free.

Which features that are worth it are you talking about? I guess you could say a centralized management console (which I believe actually is still free with Xen, as well as live migration.)

Also, for the love of whatever's important to you, consolidate your posts into one. There's no reason to reply to 6 different people separately.

Mausi
Apr 11, 2006

Why Copy/Paste when there's a perfectly good quote button :downs:

Speaking of DR vs HA and explaining it to management level cluelessness, I have a meeting this week where several very important people want to tell me that implementing VMware SRM will break their business continuity protocols (which are based on tape restore at remote site) :downsgun:

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Misogynist posted:

Can you stop spewing vendor acronyms and say something?

One expects certain knowledge from the party who's engaged in a discussion/argument about the #2 most popular iSCSI SAN system in a topic called Enterprise Storage Megathread...


...but hey, ask and you shall receive:

FOM: Failover Manager

P4000 CMC: HP StorageWorks SAN/iQ 9.0 Centralized Management Console

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Syano posted:

Looks to me like he said something 9 times in a row... though I didn't read any of it because it's annoying as hell...

Too bad - you might even have learned something from them in the end...

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

1000101 posted:

I picked this post because MPIO typically is for redundancy in conjunction with distributing load over a lot of front end ports on an array. Even if you're using iSCSI with a quad port gigabit card and you lose half your switches you'll still be able to get traffic to your storage. Every IP storage design we've ever done has involved a pair of physical switches just to provide redundancy.

You're right - hey, even I designed our system the same way - but I think you're forgetting the fact these iSCSI-boxes are the low-end SAN systems, mainly bought by SMB; you can argue about it but almost every time I talked to someone about MPIO they all said the only reason they use it is the higher throughput and no, they didn't have a second switch...

quote:

I have no idea what FOM is; but does it also send the correct instructions to VMware to attach the volumes? Will it also allow you to automatically re-IP your virtual machines? Does it handle orchestration with external dependencies as well as reporting/event notification? Does it also provide network fencing and set up things like a split off 3rd copy to avoid interrupting production processing during DR testing? Does it handle priorities and virtual machine sequencing? Does it integrate with DRS and DPM?

Well, that's the whole point: some of it I'd think you do at the SAN level - eg detecting that you need to redirect everything to the remote synced box - and some you will do in your virtual environment (cleaning up after the fallback etc.) I don't run VMware on Lefthand so I'm not the best one to argue about details, but I've read enough about it in the past few months to know what it's supposed to do...

quote:

HA is typically referred to for localized failures. i.e. one of my server's motherboards just died on me and I want to bring my applications up from that failure quickly.

Well, that's kinda moot point to argue about when Lefthand's big selling point is the remote sync option (well, up to ~5ms latency between sites) - they do support HA over remote links (and so does almost every bigger SAN vendor if I remember correctly.)

quote:

When we talk in terms of DR, we typically speak in one of two things:

1. Someone just caused a major data loss
2. My data center is a smoking hole in the ground

HA does NOT protect you against #1 and you're still going to tape. That said, in the event of #1 you're not going to necessarily do a site failover (unless you're like one of my customers who had a major security breach.)

Correct, but HA has never been about protecting against data corruption - I have never argued that. Data corruption or loss is where your carefully designed snapshot rotation should come in: you recover almost immediately.
OTOH if I think about it, a lagged, async remote option might even be enough for corruption issues... ;)

quote:

In the event of #2; we're talking about moving all of the business processes and applications to some other location which goes far above and beyond typical HA.

Not anymore. Almost every SAN vendor offers some sort of remote sync with failover - I'd consider these HA but it's true that lines are blurring more and more.

quote:

Which features that are worth it are you talking about? I guess you could say a centralized management console (which I believe actually is still free with Xen, as well as live migration.)

The very feature we're talking about here: failover clustering. :)

quote:

Also, for the love of whatever's important to you, consolidate your posts into one. There's no reason to reply to 6 different people separately.

I think it's more polite especially when I wrote about different things to each person - they don't need to hunt down my reply to them...

three
Aug 9, 2007

i fantasize about ndamukong suh licking my doodoo hole

szlevi posted:

You're right - hey, even I designed our system the same way - but I think you're forgetting the fact these iSCSI-boxes are the low-end SAN systems, mainly bought by SMB; you can argue about it but almost every time I talked to someone about MPIO they all said the only reason they use it is the higher throughput and no, they didn't have a second switch...

:what: Plenty of large businesses use iSCSI.

I think you need to learn to stop when you're wrong.

Syano
Jul 13, 2005

szlevi posted:

Too bad - you might even have learned something from them in the end...

I have learned something. I've learned that you think you know a lot about SANs and that you are a TERRIBLE poster. iSCSI only used by SMBs? Seriously?

Mausi
Apr 11, 2006

I suspect he's only talking about HP Lefthand. Well I hope he is, because that huge a generalisation would be pretty fucking stupid. And he'd still be wrong, but whatever.

Boner Buffet
Feb 16, 2006

szlevi posted:

take a look at those sweet-priced R715s - they come with 12-core Opterons, so for the same price you can get 24 cores in a node... and R815s are only 2-3G away, and up to 2.2GHz they come with the 3rd and 4th CPU included for free, making it 48 cores total per node. :)

Unfortunately, the VMWare bundle we're going to go with only allows licenses for up to 6 cores per host.

Syano
Jul 13, 2005
It was a tough decision for us too. VMware really made the choice between the Advanced and Midsized acceleration kits a tough one. What kills you is that by stepping up to the midsize kit you lose the 3-host limit, which is awesome, but then you get saddled with the 6-core-per-socket limit.

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

three posted:

:what: Plenty of large businesses use iSCSI.

Um, sure... your point is? :allears:

quote:

I think you need to learn to stop when you're wrong.

I think you need to learn to read first before you post nonsensical replies...

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Mausi posted:

I suspect he's only talking about HP Lefthand.

Correct.

quote:

Well I hope he is, because that huge a generalisation would be pretty fucking stupid.

Well, talking about generalisation after someone cited empirical evidence IS pretty fuckin stupid, y'know.

quote:

And he'd still be wrong, but whatever.

Even if I put aside the fact that you're not making any argument - trolling? - I'm still all ears how someone's experience can be wrong... :allears:

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Syano posted:

I have learned something. I've learned that you think you know a lot about SANs and that you are a TERRIBLE poster. iSCSI only used by SMBs? Seriously?


See, I told you: read the posts before you make any more embarrassingly stupid posts - you didn't and now you really look like an idiot with this post...

...someone has issues, it seems. :raise:

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

InferiorWang posted:

Unfortunately, the VMWare bundle we're going to go with only allows licenses for up to 6 cores per host.

EMC will rip you off at every turn, that's for sure - I didn't even include them in my RFP list, and I got pretty nasty replies from their people in another forum when I said their pricing is absolutely ridiculous at around $100-200k, even if it is as complete as VMware when it comes to virtualization.
Of course he came back arguing the usual TCO mantra, but unless they give me a written statement that 3-5 years from now, when they push me for a forklift upgrade, I will get all the licenses transferred, I will never consider them at around $150k, that's for sure.

Mausi
Apr 11, 2006

InferiorWang posted:

Unfortunately, the VMWare bundle we're going to go with only allows licenses for up to 6 cores per host.
The core limitation is a bit of a red herring, as the vast majority of x86 virtualization work runs into memory ceilings long before CPU; the notable exception to this is stuff like 32-bit Terminal Services.

szlevi posted:

Even if I put aside the fact that you're not making any argument - trolling? - I'm still all ears how someone's experience can be wrong... :allears:
Well unless your definition of SMB scales up to state government then your assertion draws from limited experience. Lefthand kit is used, in my experience, in both state government of some countries as well as enterprises, albeit outside the core datacentre.

szlevi posted:

Of course he came back arguing the usual TCO-mantra but unless they give me a written statement that 3-5 years from now, when they will push me for a forklift upgrade, I will get all the licenses transferred I will never consider them around $150k, that's for sure.
I'm not certain about EMC, but it's basically a constitutional guarantee from VMware that, if you buy regular licensing and it is in support for a defined (and very generous) window around the release of a new version of ESX, you will be given the gratis upgrade. This happened with 2.x to 3, and happened again with 3.x to 4. There is absolutely no indication internally that this will change from 4.x to 5.
My experience with licensing from EMC is that they will drop their pants on the purchase price, then make it all back on support & maintenance.

Mausi fucked around with this message at 17:15 on Jan 18, 2011

Boner Buffet
Feb 16, 2006

Mausi posted:

The core limitation is a bit of a red herring, as the vast majority of x86 virtualization work runs into memory ceilings long before CPU; the notable exception to this is stuff like 32-bit Terminal Services.

I'm not entirely sure why I wrote "unfortunately" anyway, since I'm only going to have 2 processors with 2 cores each in my hosts...

Mausi
Apr 11, 2006

I don't want to turn this into the virtualisation love-in anyway, so here's the megathread for that if you haven't seen it yet.

http://forums.somethingawful.com/showthread.php?threadid=2930836

But 4 cores / 32 GB is fine for a consolidation footprint; depending on the workloads you're going after it's usually between 4 GB and 16 GB per core.
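
A quick sanity check of that rule of thumb against the boxes discussed in the thread - the 128 GB figure and the idea of putting only 32 GB in a 24-core R715 are hypothetical, just to show where each ceiling bites.

code:

# Quick check of the sizing rule of thumb above (4-16 GB of RAM per core for
# a consolidation host). The first row is the R710 spec being discussed; the
# other two rows are hypothetical RAM configurations for the 24-core R715.

RULE_MIN_GB, RULE_MAX_GB = 4, 16   # the 4-16 GB/core rule of thumb above

hosts = [
    ("4 cores / 32 GB (the R710 spec above)",      4,  32),
    ("24-core R715, same 32 GB (hypothetical)",   24,  32),
    ("24-core R715 with 128 GB (hypothetical)",   24, 128),
]

for name, cores, ram_gb in hosts:
    per_core = ram_gb / cores
    if per_core < RULE_MIN_GB:
        verdict = "light on RAM - likely memory-bound"
    elif per_core > RULE_MAX_GB:
        verdict = "RAM-heavy - CPU runs out first"
    else:
        verdict = "within the rule of thumb"
    print(f"{name}: {per_core:.1f} GB/core -> {verdict}")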

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
Is being a really shitty poster a bannable offense? Because I'm fucking tired of szlevi shitting up this thread with his inability to copy/paste.

ozmunkeh
Feb 28, 2008

hey guys what is happening in this thread
More than likely going to put in some HP P4300 (Lefthand) boxes. Any objections to using a couple Juniper EX2200 switches?

We have some ExtremeNetworks x450e switches and my first instinct is to check out the rest of their range - the x350 perhaps - but I was recommended the Junipers so figured I'd ask here.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
This is longer than I intended it to be.

szlevi posted:

You're right - hey, even I designed our system the same way - but I think you're forgetting the fact these iSCSI-boxes are the low-end SAN systems, mainly bought by SMB; you can argue about it but almost every time I talked to someone about MPIO they all said the only reason they use it is the higher throughput and no, they didn't have a second switch...

Maybe in the low end market where sales guys let customers make bad decisions this is true.

I've been hard pressed to find a lot of small businesses that actually need to have more than ~1Gb/s of bandwidth to the storage. I've been to plenty of shops running 1500+ user exchange databases over a single gigabit iSCSI link with a second link strictly for failover.

What you're seeing is not the norm in any sane organization.

I guess the exception is in dev/QA environments. I'll give you that!

quote:

Well, that's the whole point: some of it I'd think you do at the SAN level - eg detecting that you need to redirect everything to the remote synced box - and some you will do in your virtual environment (cleaning up after the fallback etc.) I don't run VMware on Lefthand so I'm not the best one to argue about details, but I've read enough about it in the past few months to know what it's supposed to do...

Redirecting data to a remotely synced up site doesn't provide you everything. If I move 2000 or 20 virtual machines from one physical location to another physical location then there are good odds I have a shitload more work to do than just moving the systems.

The parts you do at the SAN level would be getting the data offsite. Once the data is at the new site you have a host of questions to answer. Stuff like:

Am I using spanned VLANs? If not how am I changing my IP addresses?
How are my users going to access the servers now?
Since I only have so much disk performance at my DR site, how do I prioritize what applications I bring online first?
What about data that I'm not replicating that needs to be restored from tape/VTL?
Do I need to procure additional hardware?
...

What about testing all of this without impacting production OR replication?

quote:

Well, that's kinda moot point to argue about when Lefthand's big selling point is the remote sync option (well, up to ~5ms latency between sites) - they do support HA over remote links (and so does almost every bigger SAN vendor if I remember correctly.)

This is synchronous replication and can really only happen within about 60-75 miles. Hardly appropriate for a good portion of disaster recovery scenarios people typically plan for (hurricanes, earthquakes, extended power failures.)

Yes I can use SRDF/S or VPLEX and move my whole datacenter about an hour's drive away. Is this sufficient for disaster recovery planning and execution? Probably not if, say, Hurricane Katrina comes and blows through your town and knocks out power in a couple-hundred-mile radius.
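
A rough back-of-the-envelope of why distance kills synchronous replication - every acknowledged write waits a round trip to the remote array. The ~5 ms budget is the LeftHand figure mentioned earlier; the fibre propagation constant is the usual ~5 microseconds per km, and the 1 ms of equipment overhead is just an assumption.

code:

# Back-of-the-envelope for why sync replication is distance-limited: every
# acknowledged write waits for a round trip to the remote array. Light in
# fibre covers roughly 200 km per millisecond (about 5 microseconds per km
# one way); the fixed 1 ms of switch/array overhead below is an assumed
# figure, and the ~5 ms budget is the LeftHand number quoted earlier.

FIBRE_MS_PER_KM_ONE_WAY = 0.005   # ~5 us/km propagation in glass
EQUIPMENT_OVERHEAD_MS   = 1.0     # assumed fixed cost per round trip
LATENCY_BUDGET_MS       = 5.0     # per the LeftHand remote-sync discussion

def write_penalty_ms(distance_km):
    """Added latency per synchronous write over a link of this length."""
    return 2 * distance_km * FIBRE_MS_PER_KM_ONE_WAY + EQUIPMENT_OVERHEAD_MS

for miles in (10, 75, 300, 2000):
    km = miles * 1.609
    penalty = write_penalty_ms(km)
    verdict = "ok for sync" if penalty <= LATENCY_BUDGET_MS else "async territory"
    print(f"{miles:>5} mi (~{km:>6.0f} km): +{penalty:5.1f} ms per write -> {verdict}")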


quote:

Correct, but HA has never been about protecting against data corruption - I have never argued that. Data corruption or loss is where your carefully designed snapshot rotation should come in: you recover almost immediately.
OTOH if I think about it, a lagged, async remote option might even be enough for corruption issues... ;)

Are you a sales guy?

I said that there are a lot more components to DR than just "failing over" as a generic catch-all.

Depending on async replication to protect you against data corruption issues is insane. "Oh shit, I've got about 30 seconds to pull the plug before it goes to the remote site!" There's no "might even" about it. Depending on that is a good way to cause a major data loss.

No one does this.

Also snapshots shouldn't be the only component of your disaster recovery plan.


quote:

Not anymore. Almost every SAN vendor offers some sort of remote sync with failover - I'd consider these HA but it's true that lines are blurring more and more.

HA != DR, which was the point I was trying to make earlier. Let's take one of my customers as an example:

Site A and Site B are approximately 50 miles apart. Site C is approximately 2000 miles from Site B.

Site A and Site B are basically a giant geocluster. Same exact layer 2 networks spanned between datacenters and sync replication between the facilities. Site B is intended to provide HA to Site A.

Site B also does asynchronous replication to site C. This is intended to provide DR recovery in the event of a regional outage (say, an earthquake.) Coincidentally this site also houses all of the datadomain remote backups.

So site B provides HA to site A; but site C is strictly for disaster recovery purposes. You plan completely different for either event.

In the event of a major disaster a whole lot of things need to happen at site C. Examples include firewall reconfiguration, attaching volumes to servers and accessing the data, and the lengthy restore process from the datadomains. ESX for example won't just attach a snapshot copy of a LUN; you need to instruct it that it's okay to do so.


quote:

The very feature we're talking about here: failover clustering. :)

No, we're talking about "oh shit, my local site is gone and I need to move my business to a remote site"

Depending on a number of factors it may not be as simple as just clicking a button in an interface and saying "well there we go! We are up and running!"

Failover clustering won't handle ANY of the orchestration that's required for bringing a shitload of systems online in DR, nor does it actually provide much functionality for just testing without impact to production.

quote:

I think it's more polite especially when I wrote about different things to each person - they don't need to hunt down my reply to them...

It needlessly fills up the thread with a bunch of posts and makes it harder to follow what you're saying. In a way it's less polite to keep doing this.

Anyway the TL;DR portion was that not every product your company sells fills every niche/use case and you should really look at core customer requirements before you run off screaming gospel.


Mausi posted:

Well unless your definition of SMB scales up to state government then your assertion draws from limited experience. Lefthand kit is used, in my experience, in both state government of some countries as well as enterprises, albeit outside the core datacentre.

There's a company hosting around 6000 virtual machines for their production/public facing infrastructure that is entirely supported by scaled out LeftHand boxes and they have over 10,000 employees.

iSCSI is a very popular storage protocol because it's cheap and easy and completely appropriate for a lot of applications. So I'd say his definition of SMB should include the enterprise space as well.

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer

Mausi posted:

But 4 cores / 32 GB is fine for a consolidation footprint; depending on the workloads you're going after it's usually between 4 GB and 16 GB per core.
We're at 12 gigs per physical core (6 GB per logical core) currently, and I'll bet we get to double that before we need any more CPU, unless we deploy some kind of virtual desktop infrastructure or get serious about virtualizing citrix.

As far as iSCSI goes, we are running about 100 VMs from SQL to webservers to exchange for a 500 person company on iSCSI, and we push approximately 2 gigabits per second in total over iSCSI. That includes VMFS and iSCSI connections inside the guests. Any company that spends even $1 more for FC is dumb, spend the money on 10 gig ethernet and call it a day.

adorai fucked around with this message at 01:53 on Jan 19, 2011

Noghri_ViR
Oct 19, 2001

Your party has died.
Please press [ENTER] to continue to the
Las Vegas Bowl
So did anyone pay attention to the new EMC product announcements today? What are your thoughts on them? I was about to rule them out of our new purchase, but now I'm going to take a second look.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

szlevi posted:

One expects certain knowledge from the party who's engaged in a discussion/argument about the #2 most popular iSCSI SAN system in a topic called Enterprise Storage Megathread...
It looks like your grasp of statistics is as bad as your posting!

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

Mausi posted:

Well unless your definition of SMB scales up to state government then your assertion draws from limited experience. Lefthand kit is used, in my experience, in both state government of some countries as well as enterprises, albeit outside the core datacentre.

I'm not following you - I said SMB; what does my SMB experience have to do with gov...?

quote:

I'm not certain about EMC, but it's basically a constitutional guarantee from VMware that, if you buy regular licensing and it is in support for a defined (and very generous) window around the release of a new version of ESX, you will be given the gratis upgrade. This happened with 2.x to 3, and happened again with 3.x to 4. There is absolutely no indication internally that this will change from 4.x to 5.
My experience with licensing from EMC is that they will drop their pants on the purchase price, then make it all back on support & maintenance.

EMC will rip you naked with the "pay-as-you-go" nonsense vs all-inclusive and cheaper iSCSI licenses, that was my point - let alone not having to repurchase all of them when you bring in a new generation of boxes (well, at least EQL does allow mixing and matching all generations.)

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

FISHMANPET posted:

Is being a really shitty poster a bannable offense? Because I'm fucking tired of szlevi shitting up this thread with his inability to copy/paste.

It seems some think the way to show how "tough noobs" they are they have to attack someone - let me reply in the same manner...


...do you really think anyone gives a sh!t about your whining? Report me for properly using the quote button or not, but stop whining, kid.

(USER WAS PUT ON PROBATION FOR THIS POST)

szlevi
Sep 10, 2010

[[ POKE 65535,0 ]]

1000101 posted:

This is longer than I intended it to be.


Maybe in the low end market where sales guys let customers make bad decisions this is true.

I have nothing to do with sales. I've worked for the same company for years, in a very specialized market (high-end medical visualization/3D), but I do help SMBs from time to time (typically as a favor, not as a paid consultant, mind you.)
Almost every time, the first thing I suggest is to make switches and network connections redundant - because in almost every case they are not. Again, it's just empirical evidence, but it's a very common problem nowadays, I think.

quote:

I've been hard pressed to find a lot of small businesses that actually need to have more than ~1gbp/s of bandwidth to the storage. I've been to plenty of shops running 1500+ user exchange databases over a single gigabit iSCSI link with a second link strictly for failover.

No offense but that's the typical problem with all "regular" storage architects: they can only think about Exchange/OLTP/SQL/CRM/SAP etc.

Let's step back for a sec: you know what most SMBs say? The "network is slow" which in reality means they are not happy with file I/O speeds. Your 1Gb/s is 125MB/s theoretical - which is literally nothing when you have 10-20 working with files especially when they use some older CIFS server (eg Windows 2003 or R2).

quote:

What you're seeing is not the norm in any sane organization.

What you claim as "sane" organization has nothing to do with any average SMB, that's for sure.

quote:

I guess the exception is in dev/QA environments. I'll give you that!

Funny, our R&D dept is a very quirky environment, that's for sure - in the next couple of months I have to upgrade them to 10GbE to make sure they can develop/validate tools for a workflow requiring 600-700MB/s sustained speed (essentially all high-rez volumetric dataset jobs.)

quote:

Redirecting data to a remotely synced up site doesn't provide you everything. If I move 2000 or 20 virtual machines from one physical location to another physical location then there are good odds I have a shitload more work to do than just moving the systems.

True but that's what these vendors promise when they market their sync replication and failover.

quote:

The parts you do at the SAN level would be getting the data offsite. Once the data is at the new site you have a host of questions to answer. Stuff like:

Am I using spanned VLANs? If not how am I changing my IP addresses?
How are my users going to access the servers now?
Since I only have so much disk performance at my DR site, how do I prioritize what applications I bring online first?
What about data that I'm not replicating that needs to be restored from tape/VTL?
Do I need to procure additional hardware?
...

What about testing all of this without impacting production OR replication?

This is synchronous replication and can really only happen within about 60-75 miles. Hardly appropriate for a good portion of disaster recovery scenarios people typically plan for (hurricanes, earthquakes, extended power failures.)

Yes I can use SRDF/S or VPLEX and move my whole datacenter about an hour's drive away. Is this sufficient for disaster recover planning and execution? Probably not if say hurricane katrina comes and blows through your down and knocks out power in a couple hundred mile radius.

I'm not sure why you're asking me, but here's how I understand their claims: your site A goes down but your FOM redirects everything to your site B, which is sync'd all the time, without any of your VMs or file shares noticing it (besides being a bit slower.) This works in conjunction with your virtual HV and its capabilities, of course.
Heck, even Equallogic supports failover, though it won't be completely sync'd (they only do async box-to-box replication.)
Am I missing something?

quote:

Are you a sales guy?

Not at all.

quote:

I said that there are a lot more components to DR than just "failing over" as a generic catch-all.

Depending on async replication to protect you against data corruption issues is insane. "Oh poo poo I've got about 30 seconds to pull the plug before it goes to the remote site!" There's no "might even" about it. Depending on that is a good way to cause a major data loss.

No one does this.

I disagree.

quote:

Also snapshots shouldn't be the only component of your disaster recovery plan.

Never said that - I said it's good against a crazy sysadmin or idiotic user, that's it.

Will cont, have to pick up my daughter. :)

HorusTheAvenger
Nov 7, 2005

szlevi posted:

No offense but that's the typical problem with all "regular" storage architects: they can only think about Exchange/OLTP/SQL/CRM/SAP etc.

Let's step back for a sec: you know what most SMBs say? The "network is slow" which in reality means they are not happy with file I/O speeds. Your 1Gb/s is 125MB/s theoretical - which is literally nothing when you have 10-20 working with files especially when they use some older CIFS server (eg Windows 2003 or R2).

In my experience, the "network is slow" generally is a complaint about the internet, not the storage. From what I've seen in SMBs, users typically open an Excel/Word/whatever document then spend hours in that 5-50MB file, only saving when they finish. 125MB/s lets 2 users open their 50MB word document every second. Considering users aren't opening docs all at the same time, that 1Gb link should be able to support all users in a typical-sized SMB. If you're dealing with large filesets on CIFS like tons of video, you're probably not a typical SMB.
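
For anyone wanting to put numbers on this exchange, a quick sketch of how far 125 MB/s stretches when a link is shared - the 50 MB document and the 10-20 concurrent users are the figures from the posts above; the large-file case is illustrative.

code:

# Putting rough numbers on the gigabit argument: 1 Gb/s is about 125 MB/s
# theoretical. The 50 MB document and the 10-20 concurrent users come from
# the posts above; the large-file case is an illustrative assumption.

LINK_MBPS = 125.0  # 1 Gb/s expressed in MB/s, ignoring protocol overhead

def seconds_to_open(file_mb, concurrent_users):
    """Time to pull one file when N users share the link equally."""
    per_user = LINK_MBPS / concurrent_users
    return file_mb / per_user

cases = [
    ("50 MB Word doc, 2 users at once",    50,    2),
    ("50 MB Word doc, 20 users at once",   50,   20),
    ("2 GB imaging dataset, 1 user",     2048,    1),
    ("2 GB imaging dataset, 10 users",   2048,   10),
]

for label, mb, users in cases:
    print(f"{label}: ~{seconds_to_open(mb, users):.1f} s per file")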

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

szlevi posted:

No offense but that's the typical problem with all "regular" storage architects: they can only think about Exchange/OLTP/SQL/CRM/SAP etc.

I look at the core business processes and the applications that support them. I build from there.

If these core applications don't work/behave then the storage is a waste of money and I have done my customer a huge disservice.

Or are you proposing we do "cowboy IT" and just assume that more bandwidth and more disks is the solution to every problem?

quote:

Let's step back for a sec: you know what most SMBs say? The "network is slow" which in reality means they are not happy with file I/O speeds. Your 1Gb/s is 125MB/s theoretical - which is literally nothing when you have 10-20 working with files especially when they use some older CIFS server (eg Windows 2003 or R2).

I don't understand what you're saying? Are you saying that 100+MB/sec isn't sufficient or are you saying that the older CIFS server isn't sufficient? 10-20 people working on gigabit is ~6-12MB/sec each, which is generally pretty close to the performance of local disks, so it's not going to be a plumbing issue. This of course assumes that 100% of the users are trying to use 100% of the bandwidth at the same time.

It sounds more to me like there may not be enough disks to reach that peak performance number so maybe I'd look there instead of saying "dude you need to add more bandwidth via MPIO!"

The rare exception might be media production in which case yes, 125MB/sec isn't enough. Of course we're talking about core business applications (like for some it might be Exchange) and that's the primary driver for storage.

I'll say it again since you seem to have missed it the first time:

...you should really look at core customer requirements before you run off screaming gospel.

quote:

True but that's what these vendors promise when they market their sync replication and failover.


What vendors? If we look at your favorite, EMC, they promote SRDF, MirrorView, and RecoverPoint as a way to get data somewhere else. They speak nothing of the recovery of that data but they do provide enough of a framework to make it happen. It's still up to me to do things like replay transaction logs or start services or whatever has to happen.

I guess you could look at NetApp but they also offer some nice application integration tools to easily restore.


quote:

I'm not sure why you're asking me, but here's how I understand their claims: your site A goes down but your FOM redirects everything to your site B, which is sync'd all the time, without any of your VMs or file shares noticing it (besides being a bit slower.) This works in conjunction with your virtual HV and its capabilities, of course.
Heck, even Equallogic supports failover, though it won't be completely sync'd (they only do async box-to-box replication.)
Am I missing something?

For this to work you have to assume that the remote site has the same IP address space in it, routers, switches, other dependencies, firewalls, load balancers, etc. You also have to consider whether or not your end users can even access the servers in the new location and how that access happens.

You're basically missing the other 98% when we talk about moving shit into a DR site.

So yeah, my VMs might come up at the other side but what can they talk to? FOM sounds like it's great if you're scaling out in the same datacenter for availability. If I need to move all of my virtual machines to a remote site 600 miles away, it sounds like a completely inappropriate solution without layer 2 stretching.

quote:

I disagree.

If you actually do believe this then you're nuts and probably have no business managing, touching, or consulting on storage and especially disaster recovery.


quote:

Never said that - I said it's good against a crazy sysadmin or idiotic user, that's it.

Will cont, have to pick up my daughter. :)

It's not nuts. Let's assume the systems admin is a moron and he goes to replace a failed disk in the array - "oops, that whole raid group is gone!" - and there went his snapshots.

At that point you're going to the part of your DR plan which involves restoring from tape. Depending on your storage you may end up re-initializing replication after the restore (highly likely.)

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

1000101 posted:

At that point you're going to the part of your DR plan which involves restoring from tape. Depending on your storage you may end up re-initializing replication after the restore (highly likely.)
I don't get it. If you had replication in the first place, why would you need to go to tape?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Misogynist posted:

I don't get it. If you had replication in the first place, why would you need to go to tape?

If you're replicating to another node in the same datacenter then yeah, hit the failover button, because that's why you're doing that.

If it's offsite though, things might change.

Let's assume I've got 500GB at site A being replicated offsite to another datacenter and my link to that datacenter is 10Mb/s.

Some reasons to not fail over completely might be that it's only 1 of 3 or 4 raid groups, so only one major application is actually down. Do we move all of our operations over to the DR site for this one outage?

When you answer that question, consider the costs of re-syncing arrays if you do fail everything over. We had a customer do this to an 800GB-ish data volume by accident over a 6Mb/s MPLS circuit. Thankfully it was NetApp, so we snapmirrored everything to a tape, overnighted the tape and gave the remote array a good starting point. Instead of copying 800GB of data we only had to do about a day and a half worth of changes.
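
The arithmetic behind that anecdote, as a rough sketch - the 80% usable-link figure and the ~15 GB/day of changes are assumptions; the volume sizes and link speeds are the ones quoted above.

code:

# Rough math behind the anecdote above: why you overnight a tape instead of
# re-sending 800 GB over a 6 Mb/s MPLS circuit. The 80% usable-link figure
# and the 15 GB/day of changes are assumptions; the data sizes and link
# speeds are the ones in the post.

def transfer_days(gb, link_mbps, efficiency=0.8):
    """Days to push `gb` gigabytes over a link of `link_mbps` megabits/s."""
    bits = gb * 8e9
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400

full_resync = transfer_days(800, 6)   # re-seed the whole volume over the WAN
daily_delta = transfer_days(15, 6)    # assumed ~15 GB/day of changes, for scale

print(f"Full 800 GB re-sync over 6 Mb/s: ~{full_resync:.0f} days")
print(f"One day of deltas (assumed 15 GB) over the same link: ~{daily_delta:.1f} days")
print(f"500 GB over a 10 Mb/s link (the earlier example): ~{transfer_days(500, 10):.0f} days")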

Other vendors you might be shipping entire disk arrays back and forth.

If my whole business is run off that one raid group then yeah it may qualify as a disaster so lets go ahead and move everything offsite.

I see it happen a lot when people talk about DR planning. The focus is always on getting the data somewhere and maybe getting the data restored. I end up worrying about all the crap that comes afterwards though. Stuff like "okay, I moved my shit to the Houston datacenter but how do the users access it now?" My general assumption is that small to medium businesses aren't likely to have stretched VLANs or anything like that, so there are probably a lot of other things that need to happen at the DR site to bring your apps online.

Syano
Jul 13, 2005
I am super interested in EMC's new VNXe line. It's like babby's first SAN. But in all seriousness it is the first kit we have seen that can compete price-wise with the Dell PowerVault stuff.

EDIT: Just went through a customer presentation on the kits. Man these are nice. Dedupe, NFS/CIFS/iSCSI, WORM, all for under 10k? Yes please.

Syano fucked around with this message at 18:31 on Jan 20, 2011

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Syano posted:

I am super interested in EMC's new VNXe line. It's like babby's first SAN. But in all seriousness it is the first kit we have seen that can compete price-wise with the Dell PowerVault stuff.

EDIT: Just went through a customer presentation on the kits. Man these are nice. Dedupe, NFS/CIFS/iSCSI, WORM, all for under 10k? Yes please.
Just be aware that the lowest-end model has one controller and one power supply (this is still EMC we're talking about), so be careful which of your production workloads you opt to run from it.

I feel less like they're trying to take on the Dell PowerVault stuff and more like this was a preemptive strike against IBM's StorWize V7000, which hasn't yet made it into the low end.

Syano
Jul 13, 2005
Yeah I was more or less sort of salivating over the price point even though I knew it wasn't going to be for us. I wouldn't dare order without dual controllers and PSUs.

I was about a week from pulling the trigger on an HP P4000 starter SAN for a new project before news of the VNXe dropped on Tuesday. My VAR was quoting me somewhere in the 30s for the HP kit. I should be able to beat that in a comparable VNXe config plus get a ton more features. The only thing I think I might be missing out on is Network RAID. I guess it's really all going to depend on what software licenses I choose on top of the hardware.
