bigmandan
Sep 11, 2001

lol internet
College Slice
I'm fairly new to SANs and I've been charged with contacting vendors to implement a solution at the ISP I work for. We have an ageing infrastructure and no unified storage solution at all. Our mixed physical/virtual environment is mostly DNS, Web, MySQL, Mail and RADIUS on some old Dell 2950's and some generic 1U servers. The RADIUS, MySQL and Mail servers are fairly IO intensive (mostly writes), while everything else has pretty low requirements. We'll need about 12-24 TB of storage to start and we'll want to do offsite replication for DR. Network speed between our sites will mostly be a non-issue as we own the fibre network between them and we can easily support multiple 10 GbE links.

So far I've contacted EMC, NetApp and Dell to start the initial exploratory talks. When dealing with them, considering the above, what should I be expecting and is there anything I should watch out for? Also considering the above are there any other vendors I should investigate?


bigmandan
Sep 11, 2001


NevergirlsOFFICIAL posted:

do you need fibrechannel because if not look at nimble

We'll likely use iSCSI but FC is not out of the question at this point. I'll have to take a look at Nimble. How is their pricing compared to EMC, NetApp, Dell, etc.?

bigmandan
Sep 11, 2001

So I've had a few meetings with Dell, Nimble and VMware (still waiting on NetApp to get back to us), and some of my colleagues like the idea of VMware vSAN. Based on our workload I think this is a bad idea, but they don't seem to think so. Also, from what I understand, managing and scaling vSAN out/up is a pain in the rear end. How can I convince them that vSAN is a bad idea? I'm having a hard time articulating why.

bigmandan
Sep 11, 2001

I think the main problem is that going the vSAN route would fit our needs right now, but what some of my colleagues don't seem to realize is that we would very quickly outgrow what vSAN provides. Going with a "normal" SAN makes sense long term. Additionally, our read/write ratio is pretty drat close to 1:1. I'm fairly new to SANs in general but, based on my research and dealings with vendors, I think that alone would justify a SAN array. So far I think two Nimble CS220s (one for replication to satisfy DR) would fit our needs now and for the next few years based on our growth. EqualLogic arrays would work as well, but I like the flexibility Nimble provides (on paper, at least).

Am I on the right track here or am I way off base?

Just to make sure what I'm thinking is sane I'll provide a few details of our environment:
We're an ISP looking to consolidate the majority of our physical servers with virtualization. Currently we have no unified storage solution, and offsite replication is going to be a must-have. Current performance across all servers, both physical and virtual, is about 50 MB/s average, 100 peak, with a 1:1 read/write ratio, averaging 1k IOPS and peaking at around 2k. Performance is limited by directly attached storage, either mirrored or RAID 5, and a lot of our production hardware is older than 7 years. Most services are the usual things an ISP has: DNS, mail, web servers, RADIUS, etc. Mail accounts for half our IO. After we finish consolidation we'll end up with about 45-50 VMs: 6 DNS, 2 mail, 1 MySQL (20 schemas or so), 2 RADIUS, 4-6 virtual desktops and the rest being web servers serving various functions (customer vhosts, internal sites, etc.). Most servers are Debian, with a few Win2k8 servers that we needed for specific applications.
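As a quick sanity check on those figures, the implied average IO size and the network headroom can be worked out in a couple of lines. This is a rough sketch using the numbers above; the link-speed figure is an assumed raw 10 GbE payload rate, not a measurement:

```shell
#!/bin/sh
# Back-of-envelope math on the workload described above.
avg_mb_s=50       # average throughput, MB/s (from the post)
avg_iops=1000     # average IOPS (from the post)
link_mb_s=1250    # ~raw payload of a single 10 GbE link, MB/s (assumption)

# Implied average IO size in KB: throughput divided by IOPS
io_kb=$(( avg_mb_s * 1024 / avg_iops ))
# How many times over a single link covers the average throughput
headroom=$(( link_mb_s / avg_mb_s ))
echo "implied avg IO size: ${io_kb} KB, single-link headroom: ${headroom}x"
```

At ~51 KB per IO and 25x headroom on one link, the network is clearly not the constraint here; the array's write path is.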

bigmandan
Sep 11, 2001


NippleFloss posted:

Your IO requirements are really, really low. You could probably run that on just about anything. Even VSAN would work just fine, though if you're concerned about growth it might be more problematic long term. Things like replacing a failed drive will require putting the host in maintenance mode and evacuating all VMs, which is a lot of hassle for something that would be handled very easily by a dedicated storage array with hot spares and hot-swappable drives. Data also isn't guaranteed to be local to the node hosting the VM, which adds latency. And the requirement for write mirroring to SSD on another node adds still more latency, which can definitely be felt in VDI environments. VDI is fairly write intensive and very latency sensitive, so all things being equal I would choose the lowest latency solution possible, which is going to be an array that does not have to distribute IO over a backplane and which acknowledges writes when they hit NVRAM, rather than SSD (both are fast, but NVRAM will be an order of magnitude faster).

Like everyone has said, by the time you spec out hardware for a proper VSAN deployment you're in dedicated SAN territory anyway and you might as well get one and accrue the other benefits that come with it.

Thanks for the info. Our VDI is pretty minimal at the moment but it's good to know about the write intensity and latency.

bigmandan
Sep 11, 2001


Wicaeed posted:

So I got to sit down for an hour with Nimble and go through a webex presentation about their product.

If half of what they are claiming is true, this should be a pretty simple sell to Management, as long as it doesn't break the bank ($80k)

I sat through that same presentation a few days ago. It is pretty drat impressive. Ballpark figure for the CS220 was about 50-60k (Canadian monopoly money)

bigmandan
Sep 11, 2001


Wicaeed posted:

:stare: drat that's quite a bit more expensive than Moey assumed near the top of this page (comparing it to an EMC VNX5200 + DAE for $22k) putting it (probably) right back into the territory of poo poo-that-I-want-but-couldn't-ever-get-budgeting-for

Was that for a single unit?


Was that before or after blowjobs?

I'm going to talk to our VAR and see if he can throw a quick and dirty quote my way based on what we want (probably two CS220 shelves, one as primary, one as backup). At this point I'm not holding my breath.

It was for a single unit with 10 GbE and 3 year support, suggested retail no discounts.

bigmandan
Sep 11, 2001

After a few months of back and forth with a few vendors it looks like we'll be settling on two Dell Compellent SC4020 SANs. They'll be configured with two flash tiers and one platter tier for a total of ~25TB and around 17,000* sustained IOPS / 35,000 burst. Going with two as the owner wants to do replication (probably semi-sync). Since we're also getting some servers and switches we got some pretty drat good discounts.

I'm getting pretty excited. We had some disk failures not too long ago, so getting this up and running will give our team some peace of mind.

* ~1800 IOPS worst case if doing r/w from tier 3

bigmandan
Sep 11, 2001


NippleFloss posted:

You can't really talk about IOPs in a vacuum like that. An IOP isn't an independent unit of measure like a liter or joule; it's wholly dependent on the workload involved. In a SAN environment the workloads are generally heavily mixed due to the IO blender effect, which makes it especially hard to discuss IOPs thresholds in any meaningful way. Even a single application workload like SQL can have very different IO profiles depending on what type of activity is being directed at it, so saying "this array will give you x number of SQL IOPS" is an over-simplification.

Those numbers are based on our expected workloads with about 50/50 R/W ratio. Forgot to mention that.

bigmandan
Sep 11, 2001

I just got a notification that our two Compellent SC4020's (among other hardware) should be arriving tomorrow. Can't wait to get these suckers racked and running. It'll be a few days before Dell sends their rep for install though.

bigmandan
Sep 11, 2001


Amandyke posted:

Likely installation services were purchased along with the hardware. So the Dell CE will likely rack and stack, cable and power on the arrays. They will probably run some health checks on it as well before turning the keys over, so to speak.

We're pretty comfortable with racking and cabling the equipment, but the setup services include config of the storage units on 4 new hosts we are getting as well. This will be the first SAN in our environment, so having the setup and configuration done for us will be good.

On another note, has anyone here had experience with Storage Centre Live Volumes? I've read the documentation on it and watched the video Dell put out on it. Seems pretty interesting on paper, but I'd like to hear what it's like using it in a production environment.

bigmandan
Sep 11, 2001


devmd01 posted:

How much of each tier did you buy? Make sure you have them explain auto-tiering and storage profiles, it's pretty straightforward.

Dell Compellent SC4020
6x 400 GB SLC WI (one hot spare), 1 TB usable, RAID 10
6x 1.6 TB eMLC RI (one hot spare), 6.4 TB usable, RAID 5-5 (7.4 TB flash, 29.6% of capacity)
24x 1 TB 7.2K NL-SAS (two hot spares), 17.6 TB usable, RAID 5-9

The auto-tiering and storage profiles are pretty straightforward. The thing I was asking about was Live Volumes (replication); specifically I'm interested in HA Synchronous Live Volumes.
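For what it's worth, the usable figures in that listing line up with the spare and parity overheads. A quick sketch of the flash tiers (RAID group widths taken from the listing above; integer math, so the 1.6 TB drives are tracked in tenths of a TB):

```shell
#!/bin/sh
# Tier 1: 6x 400 GB SLC, one hot spare, RAID 10 (half of the data drives usable)
t1_gb=$(( (6 - 1) * 400 / 2 ))            # 1000 GB ~= 1 TB usable
# Tier 2: 6x 1.6 TB eMLC, one hot spare, RAID 5-5 (4 data + 1 parity)
t2_tenths=$(( (6 - 1 - 1) * 16 ))         # capacity in tenths of a TB
echo "tier1: ${t1_gb} GB, tier2: $(( t2_tenths / 10 )).$(( t2_tenths % 10 )) TB"
```

Both come out to the quoted 1 TB and 6.4 TB usable.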

bigmandan
Sep 11, 2001

The rest of the items for our SC4020s were received today! I decided to check everything out before we bring it over to our data centre and was quite surprised at how heavy the SSDs are compared to consumer drives. One thing I found interesting was that the platter drives in the disk shelf came pre-installed, but the SSDs for the controller head were not.

bigmandan
Sep 11, 2001

Got our storage arrays all racked and ready to go!



I've asked this before, but has anyone had any experience with the Dell compellent synchronous live volumes? I'd like to hear some experiences with using it in a production environment.

bigmandan
Sep 11, 2001


KS posted:

The SC8000 controllers are in Dell chassis and have been out for over two years. This looks like the new-ish 4020 that integrates the controllers and a disk shelf into one 2u unit.


Synchronous replication is a big-boy feature and you need to make sure your network is rock solid. Remember, the remote array has to acknowledge the write before it completes. Any kind of latency and you can kiss performance goodbye. A single storage switch plus a small SAN and talk of sync replication are usually not things that go together well.

There are very specific use cases for it, like split metro clusters. Async replication is good enough for DR and backup. What's your use case?

The picture I have only shows one cabinet. Duplicate everything there (minus one server) in another cabinet and that's our initial setup (3 hosts, 2 switches, 2 storage arrays). Eventually the second storage array will be offsite (with multiple 10 Gbps links), but we are waiting for the DR site to be built. Once that's done, one of the storage arrays will move over and then we'll add 3 more hosts in a new VM cluster. The idea is that we want to be able to fail over to the DR site if there is ever a communications outage at our main data centre (we are building a redundant ring within the city). Network latency in general should not be an issue as we can easily provision 10 or 40 Gbps links if needed (we prefer 10 right now because 40 Gbps optics are expensive as gently caress).

One of the reasons I was asking about synchronous live volumes was:

"Since the disk signatures of the datastores remain the same between sites, this means that during a recovery, volumes do not need to be resignatured, nor do virtual machines need to be removed and re-added to inventory." (Dell Compellent Best Practices with VMware vSphere 5.x)

I understand we can get by with async replication, but the above feature seems pretty enticing, as it would reduce administration headaches when dealing with a fail-over.
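To put rough numbers on the latency warning earlier in the thread: a synchronous write can't complete until the remote array acknowledges it, so every write pays the inter-site round trip on top of both commits. All figures below are illustrative assumptions, not measurements from any array:

```shell
#!/bin/sh
# Rough model: sync write latency = local commit + round trip + remote commit
local_us=200     # local controller commit, microseconds (assumed)
remote_us=200    # remote controller commit, microseconds (assumed)
rtt_us=500       # metro-distance round trip on a 10 GbE link (assumed)

sync_write_us=$(( local_us + rtt_us + remote_us ))
echo "sync write: ${sync_write_us} us vs ${local_us} us unreplicated"
```

Even with a healthy link, that multiplier on every write is why sync replication wants a rock-solid, low-latency network.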

Also I think I need to get out and exercise. Racking the SC220 disk trays gave me quite the workout.

bigmandan
Sep 11, 2001


KS posted:

For async, a product like SRM breaks the replication relationship and re-signatures the datastores automatically. It also has far more robust DR handling than a stretched cluster.

Here's the VMware whitepaper with metro cluster requirements. Check out page 12 for the "When to Use/When not to use" discussion.

There is also an entry in the VMware Storage HCL for "iscsi metro cluster storage." It appears the Compellent is not on it.

Thanks for this link!

bigmandan
Sep 11, 2001

Speaking of Dell storage, we have had our Dell Compellent arrays (SC4020) up and running for about a month now and drat these things are fast. Management so far seems pretty easy and the replication between the two is working quite well. The only issue I really have is that Enterprise Manager is really loving picky about which version of Java is installed (7u45).

bigmandan
Sep 11, 2001


devmd01 posted:

Make sure you set up tiering properly and educate anyone that touches it on how the tiering works. Backup volumes don't need to live in the ssd tier!

Myself and one other guy are the only ones touching it. And yeah, only a few things need the "flash optimized" profile. A lot of our use cases will be the "low with progression" profile (tier 3 -> 2) or just tier 3. The destination volume for replication also only gets tier 3.

bigmandan
Sep 11, 2001

That seems like a pretty decent deal. What's your use case going to be?

bigmandan
Sep 11, 2001

Anyone have some recommendations for networked backup storage? Does not have to be really fast, but needs to have 16TB+ and do CIFS, NFS and/or SFTP (NVSD, DDB or RDA are also an option) for use with vRanger. We're entertaining reusing some old servers and adding new controllers and disks, but I'd like to explore new options as well.

bigmandan
Sep 11, 2001


Moey posted:

Can you throw a VM in front of it? I thought vRanger ran off Windows Server?

For my on site backups, I went with some Dell MD3200i filled with 12x4tb disks. After setting up a Dynamic Disk Pool with all of them, I have just over 28tb usable. Others have mentioned getting EqualLogic for cheaper than the PowerVaults lately.

Here is a paper about Dynamic Disk Pools.
http://www.dell.com/learn/us/en/04/shared-content~data-sheets~en/documents~dynamic_disk_pooling_technical_report.pdf

Thanks for the link.

I have vRanger running in a VM already. Currently its backups are being stored on our storage array (by way of a Linux VM with NFS). Not ideal obviously, but hopefully that'll be fixed soon with whatever we decide on.

bigmandan
Sep 11, 2001


Moey posted:

I found that the purpose built backup appliances all want you to turn off any compression your backup software is using (so they can do their own compression/dedup). Since I had been happy with PHD Virtual/Unitrends, I ended up going with a big dumb array just for block storage, and let my backup software handle the rest.

I think we're leaning towards dumb arrays. Just need a huge chunk of storage to throw backups on. The backup software handles compression and such, and we're not doing that much data at the moment.

As far as a budget, I wish I knew... so far it's always been find several options and choose the best one that fits our needs. Pricing is usually a secondary concern...

bigmandan
Sep 11, 2001


Thanks Ants posted:

If you want to roll your own with loads of disk then a Dell R730xd with Windows Storage Server isn't a terrible option.

Would probably use Debian or some other distro. We're mostly a *nix shop and the owner wants to keep it that way as much as possible. Hated the fact that we needed Windows servers for vRanger and Dell Enterprise Manager (for our Compellent arrays).

bigmandan
Sep 11, 2001


MrMoo posted:

Just noticed this, BTRFS starting to appear in NAS software, http://rockstor.com

That looks pretty interesting. I'll have to give it a try on some spare hardware I have lying around.

bigmandan
Sep 11, 2001


SpaceRangerJoe posted:

What do you guys use for naming conventions on virtual disks and luns? I got a DAS SAS storage device from Dell the other day, but I don't have any great ideas for naming. I'm thinking some combination of raid level, storage pool and connected endpoint. That's probably overkill. The appliance is only connected to two hosts right now with no shared storage pool between the hosts.

There is a raid 10 on SSD and some 15k SAS that will be raid 5.

I'm not sure if it's the best naming convention, but we do something like LUN###-<role>-<storage_characteristic+> and it seems to be serving us well.

Some examples:

code:
LUN013-vm-storage			stores vmware images, tier 3
LUN013-vm-fast				stores vmware images, flash optimized storage tier
LUN013-vm-critical			stores vmware images, tier 3, frequent snapshots/replays
LUN013-files-fast-critical		file storage, flash optimized, frequent snapshots/replays
Adjust the storage characteristic identifiers to fit your environment.
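One nice side effect of a strict scheme like this is that it's mechanically checkable. A hypothetical pre-flight check before provisioning (the regex encodes the LUN###-<role>-<characteristic> pattern above; loosen it if your roles use digits or capitals):

```shell
#!/bin/sh
# Validate candidate LUN names against LUN###-<role>-<characteristic...>
pattern='^LUN[0-9]{3}(-[a-z]+)+$'

check() {
    if printf '%s\n' "$1" | grep -Eq "$pattern"; then
        echo "$1: ok"
    else
        echo "$1: does not match convention"
    fi
}

check LUN013-vm-storage           # ok
check LUN013-files-fast-critical  # ok
check lun13-bad                   # flagged: lowercase prefix, two digits
```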

bigmandan
Sep 11, 2001

So we had a controller cache battery "failure" on one of our Dell SC4020s. It's less than a year old. Apparently there is a known issue with the firmware on the cache controller. Reseating the battery did not work, so the suggestion from Copilot is to reseat the controller. Even though this is a redundant system, I'm still wary of reseating a controller during normal hours, so I get to do some maintenance tonight. I'm just hoping this is not indicative of a larger issue.

bigmandan
Sep 11, 2001


sanchez posted:

What does Dell say? If it's a known issue they should have something, I wouldn't pull a controller without their recommendation.

They said to pull the controller.

bigmandan
Sep 11, 2001

So the Compellent array that had the bad controller I posted about has now also had an SSD drive fail. I'm kinda surprised to see failures like this in something that hasn't even been running for a year. A Dell tech should be here in about 3 hours with the parts, at least.

bigmandan
Sep 11, 2001

I have learned that keeping on top of storage reclamation is probably a good idea.

Over the weekend we came pretty close to being completely full on our tier 3 storage. After cleaning up some old data I thought I had cleaned up mostly everything, but noticed usage on our Compellent arrays didn't change (after replay and Data Progression)... File deletion does not zero blocks out. This is something I already knew, but it didn't really click until I saw the space discrepancy. I ended up having to use a combination of `esxcli storage vmfs unmap` and dd within our Linux guests (thick disks) to free up the blocks on the array.

Here is the dd script I used:

code:
#!/bin/bash
# Fill free space with ~1 TB of zeroed files so the array can reclaim
# the zeroed blocks, then delete the files again.
for i in {1..1000}; do
	dd if=/dev/zero bs=1M count=1024 of=/home/reclaim/zero.$i.bin
done

sync  # make sure the zeroes actually reach the array before deleting
rm /home/reclaim/*.bin
I'm wondering if there is a better way of managing storage reclamation than this.
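On the "better way" question: if the guest's virtual disk and the host stack pass SCSI UNMAP through to the array (newer ESXi releases and VM hardware versions can; whether yours does is something to verify for your setup), letting the filesystem issue discards avoids the zero-fill dance entirely. A sketch, assuming a discard-capable stack and root access in the guest:

```shell
# Trim free space on one mount point; the filesystem reports unused
# blocks down the stack instead of zero-filling them.
fstrim -v /home

# Or trim every mounted filesystem that advertises discard support:
fstrim -av
```

On stacks that don't pass discards through, the zero-fill approach above is still the fallback.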

bigmandan
Sep 11, 2001


Bob Morales posted:

We have (I think) a Dell MD3200 that only has the external SAS connectors, and only 2 of those. We have 2 servers running VMware (just the lowest paid version). Basic Linux and Windows file servers, no big databases or anything. I think we have like 1.5TB worth of stuff.

It's coming up on 3 years old, what are my options? Ideally I would like an all-SSD based solution, is that possible for < $15,000? Is there a good solution that makes good use of SSD-caching?

I'd like to go to something iSCSI-based (there are no network ports in our current PowerVault), but are there really any advantages or disadvantages to doing that? I don't really see adding any more hosts in the near future. We're hopefully getting rid of about half the VM's we run by moving to a cloud-based ERP.

What's the main driver for upgrading? Whatever that is will dictate what solution works for you. Do you need more performance, capacity, features, etc.?

bigmandan
Sep 11, 2001


Bob Morales posted:

Would like more of everything but it's mainly "this is 3-4 years old and we should get a new one"

It's been a while since I looked at pricing from Nimble, but I think their entry solutions are near that price point. A quote I have from about a year ago was ~20k CDN for a step above entry.

bigmandan fucked around with this message at 16:37 on Sep 21, 2015

bigmandan
Sep 11, 2001


NippleFloss posted:

Well, it just happened. I'd guess Compellent goes away, some EMC product lines get trimmed, and there's a bigger push towards hyperconverged. Also guessing that ScaleIO becomes Dell only.

Any reason why you think Compellent will go away? We have a few units and I'm wondering if I missed the writing on the wall somewhere.

bigmandan
Sep 11, 2001


Mr-Spain posted:

We are looking into our first SAN/array. We will need about 30-40TB to start off with and scale from there. So far I have a couple of quotes: Nimble, Tegile, EMC and Dell/Compellent.

I've got some general pricing back already, most of the offerings seem pretty good, but so far the most bang for the buck has been the Compellent systems. Is there any real reason to stay away from them as a vendor? I can post up specs and pricing. I think a lot of it has to do with the space and who they are competing against. The disks in their solutions are mostly 1.8TB 10K offerings. Any thoughts?

Estimated data would be 10TB video (hardly ever accessed), 10TB call center recordings, which after being written would be randomly accessed. The last 10 or so would be VM and fileserving, with a few 16 GB or so SQL databases. Thanks!

We have two SC4020s (~25TB each) we got about a year and a half ago and they have been serving us very well. Our usage is mostly VM storage, mail storage (ISP, so lots of accounts) and various databases (some as small as only a few GBs and some in the 50 GB range). We've only had one disk failure so far, but it seems it was due to faulty firmware on the drive. We have 3 tiers set up and overall the performance has been pretty good. Our Dell rep was also very aggressive with discounts, but we were also buying servers and switches at the same time, so your mileage may vary there.

bigmandan
Sep 11, 2001


Looks like they didn't adhere to The Tao of Backup

bigmandan
Sep 11, 2001

Over the weekend I had to replace a controller for a Compellent SC4020. The physical swap was easy enough but the controller that was sent out had a REALLY old OS/firmware loaded. What should have been a 2-3 hour event ended up being 8 hours as I had to work with Dell support to manually upgrade the controller through the various stages...

bigmandan
Sep 11, 2001

Well, the original controller hadn't "failed" yet, but they suspected it might soon, so they suggested a replacement. Apparently it was a fluke that we got a controller with an older OS. Overall though, we have had very few issues with the SC4020 and I have found dealing with their support to generally be fine.

One thing I found out during the upgrade processes was that the controllers are running FreeBSD.

bigmandan
Sep 11, 2001


evil_bunnY posted:

LMAO, how is your reaction not "you're welcome tomorrow at 9AM with an updated controller"

It was the weekend, I had already swapped the controller out, and at that point I just wanted to get it over with and not have to open up another maintenance window.

bigmandan
Sep 11, 2001


Spring Heeled Jack posted:

Holy crap, SAN discussions are down to an HPE Nimble AF40 and a Dell Compellent SC5020. My coworker is leaning towards Compellent since it's a more 'mature' solution or something? :negative:

I know in my heart of hearts that this is a bad choice, but please give me a reason to sink Compellent. It just seems like an old as poo poo design that they slapped flash disks into.

Yeah, it's an older design but it's loving rock solid. We have a pair of SC4020s and they have given us very, very few issues. Performance is still pretty good for us, even when using 3 storage tiers (we have lots of at-rest data, so it made sense at the time to go tiered).

bigmandan
Sep 11, 2001


YOLOsubmarine posted:

Compellent may end up getting axed from the line card entirely fwiw. Dell/EMC has too many storage products in their combined portfolio and some of them will go away. I’d peg Compellent as a likely casualty since the tech is old as hell and not that...compelling.

The Nimble stuff is going to be easier to manage, lower touch, and consistently low latency. The information in Infosight is also great and Dell has nothing like that for Compellent. Your support experience with Nimble will also be better.

What's lacking in Dell Storage Manager compared to Infosight? I haven't used Nimble stuff before.


bigmandan
Sep 11, 2001


YOLOsubmarine posted:

Infosight is Nimble's cloud-based call-home telemetry system. Every array sends telemetry data every 5 minutes and then Nimble does a bunch of analytics on the back end to do predictive analysis and trending. So for instance, if they find a bug they can identify arrays that have the workload pattern or configuration that makes them a likely candidate to hit that bug and notify the owners, as well as make the updated code with the bug fix available to them first. Or they can identify when it seems likely that an upgrade will be required based on trends in cache utilization, CPU, memory, disk, etc., and notify the customer. If you call with a performance issue they will have a substantial amount of information available to assist you without requiring that you run special tools to gather data.

There’s basically a lot of analytic data they can pull since they’ve got a ton of samples from every array in their customer base to make smarter decisions from a support and engineering perspective. HP bought Nimble in large part due to the Infosight platform and not the actual array technology. The arrays are good, but the infrastructure they’ve built around Infosight is really neat and can be expanded in a lot of interesting ways.

Sounds like a better version of SupportAssist, then. We do get notifications and auto-generated tickets from Dell when there is a potential issue. For example, one of the cache batteries was showing warning signs a while back. A ticket was automatically generated and a new battery shipped out before it became an actual issue.

Not sure how much analysis they do on the data though.
