Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

bmoyles posted:

Say you need 100TB of storage, nothing super fast or expensive, but easily expandable and preferably managed as a single unit. Think archival-type storage, with content added frequently, but retrieved much less frequently, especially as time goes on. What do you go for these days?
Sun's Amber Road system looks pretty nifty:

http://www.sun.com/storage/disk_systems/unified_storage/

You can probably save some money if you go with their disk arrays instead. At my work I recently put together a J4400 array, which is 24TB raw (24x1TB); after setting up ZFS with RAIDZ2 + 2 hot spares, it is about 19TB usable. You can daisy chain up to eight J4400s together (192 disks), so you can expand to roughly 150TB. One J4400 was about $20k after getting two host cards, dual SAS HBAs, and gold support, plus you will need a server to hook it up to. I would find a decent box and load it up with a boatload of memory for the ZFS ARC cache.

http://www.sun.com/storage/disk_systems/expansion/4400/
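
If you go this route, keep an eye on the ARC while you're tuning. A rough sketch of checking and capping it (the 24GB cap is an arbitrary example, not a recommendation):

code:
# See how big the ARC currently is (bytes):
kstat -p zfs:0:arcstats:size

# If you need to cap it, e.g. at 24GB, add this to /etc/system and reboot:
set zfs:zfs_arc_max = 0x600000000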

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Misogynist posted:

You can also use an x4600 as an interface to a bunch of Thumpers (up to 6 at 48x1TB each). They tend not to advertise this functionality much. If you're going to do this, I recommend OpenSolaris/SXCE over Solaris 10 because of the substantial improvements in native ZFS kernel CIFS sharing.
Are you sure you don't mean hooking up a bunch of J4500 arrays to the X4600? The J4500 looks very much like a Thumper:




Though I don't see why you would need something like an X4600 when you can get an X4440 instead. With the J4500 array, you daisy chain each expansion tray together instead of having a dedicated external SAS port in the server for each tray. One cool thing is that the SAS HBAs support MPxIO, so you can be connected to both host cards in the tray.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

BonoMan posted:

A good rack-mountable NAS with 4 to 8 TB and under 2 grand?
You can do this if you roll your own with a 3U Supermicro case, 1.5TB drives, a decent Intel motherboard/processor, and OpenSolaris so you can use ZFS. Once you get OpenSolaris installed, it is pretty trivial to make a ZFS pool, and you can do something like a RAIDZ2, which is similar to RAID 6 in redundancy. You can then share out the ZFS pool you just created to Windows hosts with a CIFS share.
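
Something along these lines is all it takes (disk device names are made up, and I'm going from memory, so double-check the man pages):

code:
# Build a RAIDZ2 pool out of eight drives:
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0

# Turn on the kernel CIFS server and share a filesystem to Windows hosts:
svcadm enable -r smb/server
zfs create -o casesensitivity=mixed tank/share
zfs set sharesmb=on tank/share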

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Misogynist posted:

Just note that if you take this route and expect AD integration, you had better be very familiar with LDAP and Kerberos (or at least know enough to troubleshoot when the tutorial you're following misses a step), because it's not much more straightforward than it is in Samba. OpenSolaris is an amazing OS for Unix/Linuxy people, but Sun bet the storage farm on the 7000 series' secret sauce, not the OpenSolaris CLI.
Well, I suppose if you are getting stuck on that, you can easily install Sun's Samba package and use SWAT to create shares instead.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

quote:

So the real question is, can I cheap out and use the 7200.11/7200.12 drives for the X4540 without any issue? They're literally half the cost of the ES.2 disks. Also, I'm not worried about support since we've confirmed that issues not caused by third-party disks are still supported.

Why not just use the same model drives that Sun uses? According to the Sun System Handbook, the X4540 uses either 1TB Hitachi HUA721010KLA330 disks or 1TB Seagate ST31000340NS disks. That said, I can't find anything that says you can or cannot mix disks in the X4500/X4540. I don't see why you couldn't, though.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

lilbean posted:

Those are the ES.2 disks, which are over twice the cost of the 7200.12 disks. The mix and match is fine, I'm just more worried about the consumer firmware-based drives causing issues.
$159 for a 1TB enterprise-quality disk isn't that much money. Plus I'd be careful of using other disks unless you're sure that stuff like ZFS cache flushes work correctly.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

complex posted:


There are two here because we're using multiple paths.

What is this 1MB LUN, and do we need it? If not, what can I tell our SAN admin in order to stop this madness?
I'm not sure why you have 1MB LUNs present. I did see a Sunsolve article about something similar, but I'm not sure if it's the same thing:

quote:

For Sun hosts connected to EMC Symmetrix Serial DMX storage, Sun engineer can inform customer that EMC engineer can request a modification of DMX Storage BIN file. Once EMC engineer agrees to help on this and the BIN file is modified, request a reconfigure reboot of the Solaris host. Then the Gatekeeper/VCM database Volumes (VCMDB) LUN related errors would disappear.

For EMC CLARiiON CX700 Array's presented special LUN, "LUNZ", get help from EMC engineer to Disable "arraycommpath" setting on CX700 array for each Solaris Server which can be done via command "navicli". Once EMC engineer completes the settings, initiate reboot of Solaris Operating System, and then the "LUNZ" LUN would disappear as well.
First, I would call Sun and see if they can help. Second, I'm not sure you have multipathing working correctly. If you are using MPxIO, you should only see one device. Make sure that mpxio-disable is set to "no" in /kernel/drv/fp.conf; if it isn't, run "stmsboot -e" to enable multipathing. You may need to do a "devfsadm -Cv" after a reboot to clean up the old devices. More info here:

http://docs.sun.com/app/docs/doc/820-1931
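
Roughly the commands I'm talking about (run as root, and check them against the doc above before touching a production box):

code:
# Check whether MPxIO is enabled for FC ports:
grep mpxio-disable /kernel/drv/fp.conf

# Enable multipathing (this wants a reboot), then clean up stale device links:
stmsboot -e
devfsadm -Cv

# Afterwards each LUN should show up once, with multiple paths behind it:
mpathadm list lu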

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Misogynist posted:

I just moved part of my ESXi development environments off of local storage and onto a 48TB Sun x4500 I had lying around, shared via ZFS+NFS on OpenSolaris 2009.06 over a 10GbE link.

I was worried about performance because it's SATA disk, but holy poo poo this thing screams with all those disks. I have never seen a Linux distro install so fast ever in my life. The bottleneck seems to be the 10GbE interface, which apparently maxes out around 6 gig.

If I can find some sane way to replicate this to another Thumper, I will be a very, very happy man.
Are you using these 10GbE cards? They have been working extremely well for us, though on some servers we added a second card and turned on link aggregation. There is also a new feature in Solaris 10 Update 7 that may or may not be in OpenSolaris:

quote:

Large Segment Offload Support for Intel PCI Express 10Gb NIC Driver

This feature introduces Large Segment Offload (LSO) support for the ixgbe driver and some ixgbe driver bug fixes. LSO is an important feature for NIC, especially for 10-Gb NIC. LSO can offload the segmentation job on Layer 4 to the NIC driver. LSO improves transmit performance by decreasing CPU overhead. This feature is enabled by default.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

TobyObi posted:

What are your issues with the Sun kit?
Probably because he works for EMC? We had no issues with our 6140 controller, which we recently upgraded to a 6580. I did find that these controllers are actually made by LSI and that IBM uses the same hardware in some of its storage lines. I recently built a poor man's Sun Storage 7310 with a 24TB J4400 array and an existing 8-core X4100 M2 server with 32GB of memory, and it works well.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades
On the topic of ZFS, we had a flaky drive the other day in our J4400 storage array and we decided to offline the drive and assign a hot spare to it. It took 50 hours to resilver a 1TB volume. Granted, this is a giant raidz2 pool with 24 disks and two hot spares, so I kind of expected a long rebuild time. I'm thinking about going a step further and doing raidz3, because to me 50 hours is a pretty big window for Murphy's law to kick in and gently caress poo poo up.
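
For reference, this is roughly what the process looked like (pool and disk names are examples):

code:
# Take the flaky disk out of service and resilver onto the spare:
zpool offline tank c1t5d0
zpool replace tank c1t5d0 c1t24d0

# Watch the resilver progress (and the 50-hour ETA creep up):
zpool status -v tank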

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

poo poo, I think you might be so close to a low end thumper with that budget, and not that I know anything about enterprise storage, I think it would do most of what you need.

Bonus points for getting 24T for your budget.

Fake Edit: Oh poo poo, looks like they got rid of the 500GB model on their site, so the 1TB disk model is $50k US.
You can just do what I did and buy a J4400 array. It is 24TB raw and you just have to do the ZFS yourself. I basically have a poor man's open storage system. You can daisy chain several more arrays to it for expandability. It cost us about $20K for the array with two external SAS cards and gold support. The J4200 should be half the cost since it just has 12 drive bays.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

EoRaptor posted:

I have absolutely no problem with more space, as long as the feature set is met within budget, which the sun boxes all seem to do.

Which leads to two questions:

I know zfs does de-duplication, but can this de-duped data be backed up, or am I still working with the full set?

How is the management of the boxes? I'm okay with command line stuff, but other people will need to pick up slack from me if I'm not around, so a management interface that's not horrible is a must.
I would assume that a backup program like NetBackup would ignore the ZFS de-duplication and back up all the files at full size. With ZFS you can create a snapshot and then use the "zfs send" command to send that snapshot to another host (more info here). It looks like they added a dedup option to zfs send, though this is pretty new. In fact, I am pretty sure you need to be on the developer build of OpenSolaris to even get ZFS de-duplication, so if this is for your enterprise, I would err on the side of caution.
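
A bare-bones example of what I mean by sending a snapshot to another host (the names are made up):

code:
zfs snapshot datapool/projects@2010-01-15
zfs send datapool/projects@2010-01-15 | ssh backuphost zfs receive backuppool/projects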

For management, you are stuck with the command line for everything. If you want a pretty web interface with good analytics, then check out the Sun Storage 7000 systems. I think I read somewhere that they did add de-duplication.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Cyberdud posted:

It's funny how it's advertised as a VMWARE READY NAS. I don't get it.


Any explanation on why that QNAP couldn't run vmware?

Do you really think you are going to get good performance off an NFS server running embedded Linux on an Intel Atom? This might work okay for one or two hosts, but you're talking ten. The Sun storage system that adorai recommended will run circles around this, not to mention you will get ZFS, which is a far superior file system.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Misogynist posted:

OpenSolaris was definitely the best option, but with that project dead in the water, your best bets are probably Nexenta and FreeBSD in that order.
What's wrong with x86 Solaris 10? AFAIK, it's not going to stop working if you don't register with Sun after the 90-day trial. Since you need a Sunsolve account to get patch clusters, you can at least keep it sorta updated by downloading the latest release when it comes out and performing a Live Upgrade.
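
The Live Upgrade dance is basically this (the boot environment name and media path are just examples):

code:
# Clone the current boot environment:
lucreate -n s10u9

# Upgrade the clone from the new release media, activate it, and reboot into it:
luupgrade -u -n s10u9 -s /mnt/s10u9_iso
luactivate s10u9
init 6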

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Misogynist posted:

I was really pissed that the Unified Storage line didn't completely take off and dominate the industry in the low- to mid-end. If Sun was in a better position when that was released, and the IT world wasn't terrified of Sun being acquired and the vendor support stopping, they would have made a killing on it. The Fishworks analytics stuff is still the best in the industry.
I'm really pissed that loving Oracle is EOLing the X4540 in just a few days. They also silently killed their JBODs like the J4400/J4500 arrays. Unified Storage is nice and all, but it is way pricier to get 48TB raw than it is with an X4540. Also, now you can only buy Oracle "Premier" support, which is roughly three times the price of Sun Gold support. gently caress you Oracle. :arghfist:

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

lilbean posted:

Wait what? The 4540 was one of the best things Sun ever made. Goddamnit.
Well, you have until November 9th to place the order. :) Part of the issue is that the X4500/X4540 are AMD-based and Oracle wants to go all Intel now. I have no idea why they didn't come out with an Intel version of the Thumper/Thor, but it is probably because they can squeeze out fatter margins with the Unified Storage line. I used to be the biggest defender of Sun, but ever since Oracle took over they have been doing everything in their power to drive me away in disgust.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

They just came out with the 9/10 release, and no dedup. The previous release was 11 months ago. Solaris 11 is coming at us at lightning speed. Not sure how long you're going to be waiting for dedup in Solaris 10.
The funny thing is that ZFS v22 is in 9/10, but they are assholes and marked de-duplication (version 21) as reserved:

code:
$ zpool upgrade -v
This system is currently running ZFS pool version 22.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
 14  passthrough-x aclinherit
 15  user/group space accounting
 16  stmf property support
 17  Triple-parity RAID-Z
 18  Snapshot user holds
 19  Log device removal
 20  Compression using zle (zero-length encoding)
 21  Reserved
 22  Received properties

For more information on a particular version, including supported releases,
see the ZFS Administration Guide.

Some people seem to think this is due to the settlement with NetApp over the ZFS lawsuit rather than a technical limitation. Also, Solaris 11 Express is out and is supported by Oracle if you are brave enough to put it into production.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

wang souffle posted:

Is there a recommendation on mixing SAS and SATA drives in a ZFS build? I assume highly unrecommended, but how else can you use SAS hard drives for primary storage and SATA SSD drives for ZIL/L2ARC? The SATA-->SAS interposer solution sounds pretty hacky.
AFAIK, SAS controllers are backwards compatible with SATA disks, but not the other way around. At work I have a Franken-open-storage clone that consists of an X4100 server + an external J4400 SATA storage array. On the X4100 I have two SAS disks for the OS and two Intel SATA SSDs for the L2ARC cache. The Intel SSDs share the same SAS controller with the SAS disks and I have no problems with this configuration.

edit: I should also add that the X4100 uses two external SAS controllers with multipathing enabled to talk to the J4400 array.

Bluecobra fucked around with this message at 01:11 on Jan 13, 2011

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades
I just found out that Uncle Larry silently killed off the StorageTek 25xx line of storage. I guess $1,000 2TB disks weren't profitable enough for them. Now I have to go muck around on eBay for disks and brackets to upgrade a 2510 array that is less than three years old at one of our remote sites. :argh:

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades
I wonder if Oracle is going to ax their LSI-based StorageTek gear now:

quote:


On June 29, 2011, Oracle announced that it has agreed to acquire Pillar Data Systems, a leading provider of innovative SAN Block I/O storage systems. The proposed transaction is subject to customary closing conditions and is expected to close in July 2011. Until the transaction closes, each company will continue to operate independently, and it is business as usual.

Pillar's advanced SAN storage technology with leading Quality of Service provides customers with superior performance, an easy-to-use interface and a scalable architecture. Nearly 600 customers running 1,500 systems store their mission-critical data on Pillar. Pillar storage products are extremely efficient with 80% utilization at performance, approximately twice the industry average.

The combination of Oracle and Pillar's products are expected to help Oracle deliver to customers a complete line of storage products that runs Oracle software faster and more efficiently. Customers can optimize the value of their Oracle applications, database, middleware and operating system software by running on Oracle's storage solutions.

Oracle President, Mark Hurd and I will be hosting an Oracle Storage Strategy Update on June 30, 2011. Register at https://www.oracle.com/storage for the live event and Webcast to learn more about how Oracle is redefining storage. More information about the proposed combination of Pillar and Oracle can be found at https://www.oracle.com/pillardata.

I have a StorageTek 6140 and a 6580 that I can't wait to get off of. These bastards want over $50K for just one 16-disk 2TB 7,200 RPM tray. Even NetApp is half the price, and they have 24-disk trays. Oracle can just die in a fire. Just wait until you Pillar guys have to deal with Oracle. :)

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Drighton posted:

Edit: After some troubleshooting with the switches, narrowed it down to a configuration problem on the controller, and the support rep recommended an additional VLAN, so looks like I have a weekend project coming up.
Are both your switches handling only iSCSI traffic? If so, it would be better to remove the LAG between them and create a separate iSCSI subnet for each switch so you have a truly redundant switch fabric. This would require dual 10GbE NICs in each server. I made a crappy little diagram to illustrate what I mean. Also, you should be using jumbo frames (MTU=9000), and every server/controller/switch port in that VLAN needs to be configured for that MTU size.
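
If the hosts happen to be running a newer Solaris/OpenSolaris, setting the MTU is a one-liner with dladm (the link name is an example; the switch ports and array need matching settings):

code:
dladm set-linkprop -p mtu=9000 ixgbe0
dladm show-linkprop -p mtu ixgbe0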


Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

The justification from my boss (not the one who set it up) is that since we're going to mirror the machines (with a nightly ZFS send/receive) that it doesn't matter if a machine goes down because of hard drive death. Nevermind that the act of syncing back 30TB of data is sure to kick of a couple dead disks in your backup array, but gently caress, what do I know.
It sounds like your boss doesn't value your time. RAIDZ1 will protect you from one disk dying, but in my experience cheap SATA disks take a long time to re-silver in big pools. I used to have a 24x1TB enclosure with one RAIDZ2 + 2 hot spares and it would take 3-4 days to re-silver. In your case, if a second drive fails during the re-silver, you will have to rebuild the array from scratch and re-sync all that data, which will take some time. I should also mention that with my giant 24-disk raidz2, performance wasn't stellar even with 32GB of system memory and two SSD L2ARC drives. Part of the problem is that any time there was a read or a write, a given file would be spread across 22 disks, which adds to the slowness. What I would do is benchmark the pool as it is right now and try to simulate a disk re-silver. I would then consider re-creating the pool with two raidz1 vdevs + two hot spares (see below) and compare the results.

code:
  pool: horse_porn
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        horse_porn   ONLINE       0     0     0
          raidz1-0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
          raidz1-1   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t13d0  ONLINE       0     0     0
            c1t14d0  ONLINE       0     0     0
        spares
          c1t15d0    AVAIL
          c1t16d0    AVAIL
If you haven't already, take a look at the ZFS entries on the Solaris Internals wiki:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Configuration_Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Serfer posted:

Does anyone have any experiences that would break the magic spell Compellent has put on me?
In my experience, their support on Solaris is lovely. The last time I called Copilot about a Solaris 10 iSCSI problem, they told me to call "Solaris" for help. Also, when we bought our first Compellent, the on-site engineers couldn't figure out how to get iSCSI working on SUSE and resorted to Googling for the solution. I ended up figuring it out on my own. Based on my anecdotal evidence, it seems like this product works best for Windows shops.

I should also mention that we had one of their lovely Supermicro controllers die in our London office and it took them eight days to get a replacement controller onsite. This was with 24x7 priority onsite support. That being said, I don't think the product is that bad, but it is probably not very well suited for Unix/Linux shops. We just had a bunch of unfortunate problems with it, so now it is called the "Crapellent" around the office.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

So maybe some ZFS masters can answer this for me.

I'm trying to replicate System A to System B.

I've got a pool, datapool, with filesystems in it. I want to replicate the same heirachy to System B, which also has a pool, datapool. If I try and do a zfs send piped into recieve through SSH, it sends the snapshot of the main file system, datapool, but then says this:
code:
cannot receive new filesystem stream: destination has snapshots (eg. [email]datapool@2011.11.16.AM[/email])
I could do a send for each file system, but I'd rather not have to modify my cron job each time a new file system is made.

If I had to, I could write a script that checks what filesystems exist under datapool on both sides, and creates any of System B if they don't exist there, but it doesn't seem like that should be necessary.

Am I missing something here?

Instead of having a pool named datapool on both systems, call one datapool1 and the other datapool2.

http://download.oracle.com/docs/cd/E19082-01/817-2271/gbchx/index.html posted:

You can use the zfs send command to send a copy of a snapshot and receive the snapshot in another pool on the same system or in another pool on a different system that is used to store backup data.

Bluecobra fucked around with this message at 14:46 on Nov 17, 2011

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

That doesn't fix it. I think the problem is that I'm doing
code:
zfs send datapool | zfs receive backup
and that doesn't work because it tries to put datapool into backup, I have to do
code:
zfs send datapool/depta | zfs receive backup
and do that for each filesystem.
Have you tried using "zfs send -R" with a recursive snapshot of datapool? That should send all the filesystems inside the pool. I would take a look at "Sending and Receiving Complex ZFS Snapshot Streams" in the ZFS admin guide.
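
Something like this, assuming you take a recursive snapshot first (names are examples):

code:
# Snapshot everything under datapool, then send the whole tree to the backup pool:
zfs snapshot -r datapool@backup-2011.11.17
zfs send -R datapool@backup-2011.11.17 | ssh systemb zfs receive -d backup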

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades
It sounds like you either have to destroy the pool on the remote server before you run zfs send, or just send incremental data instead. Another thing you can do is use rsync: use zfs send for your initial copy, and then use rsync to copy over what has changed. You can then set up ZFS snapshots on each system so they both have independent snapshots of the data.
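
For example (snapshot names and paths are made up):

code:
# Incremental send after the initial full copy; -F rolls the target back to the last common snapshot:
zfs send -R -i datapool@monday datapool@tuesday | ssh systemb zfs receive -d -F backup

# Or just keep the remote copy current with rsync and snapshot independently on each side:
rsync -aH --delete /datapool/ systemb:/backup/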

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Serfer posted:

Compellent is pretty big, and it's owned by Dell now. It's also not "white box" equipment, you're just showing that you have absolutely no idea what you're talking about. It's like saying EMC Avamar is a white box just because it's just a Dell 2950.
Here is what a Compellent SC40 controller is:

http://www.supermicro.nl/products/system/3U/6035/SYS-6035B-8R_.cfm

All you need to make it into a Compellent controller is to put in their PCI-E cards, their fugly bezel, a CPU, and their software. I guess one good point about them is that you can mix and match cards in the same controller since it's just a Supermicro server.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Serfer posted:

Here's what an EMC Avamar is, http://www.dell.com/us/dfb/p/poweredge-2950/pd by your logic all you need to do to make it into one is their software, and their weird bezel.

I think newer ones use http://www.dell.com/us/enterprise/p/poweredge-r810/pd but my point stands.

It's not that using off the shelf parts makes a whitebox, it's the support behind it (and to some extent, the name) that makes it not a whitebox. If I were to build my own SAN from the same parts, it would definitely be a whitebox, but purchasing it from Compellent, with their software, their support, and their name, makes it more than the sum of its parts.
I disagree. My definition of a whitebox is using your own case or a barebones kit from a place like Newegg to make a desktop or server. If Compellent chose to use a standard Dell/HP/IBM server instead, it would be miles ahead in build quality. Have you ever had the pleasure of racking Compellent gear? Theirs have to be some of the worst rail kits ever, thanks to Supermicro; the disk enclosure rail kits don't even come assembled. It took me about 30 minutes to get just one controller rack-mounted. With every other major vendor, racking shouldn't take more than a couple of minutes thanks to rail kits that snap into the square holes.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

Someone earlier said that Compellent iSCSI didn't play nicely with Solaris, can anyone coroborate that or provide some more information? That would be a serious deal breaker for us, and I'd hate to dump all our money into a vison that is worse than what we have now.
I had a lot of goofy issues with iSCSI and ZFS in Solaris 10, but I wouldn't blame them on Compellent. At the time we were using Solaris 10 Update 9 with the latest patch cluster. Every single time we rebooted the server, ZFS would mark the Compellent iSCSI LUN as degraded and I would have to manually clear the error to bring the filesystem online. The problem is that when the server boots, ZFS tries to mount all the devices before the network is up, so it can't mount the iSCSI LUN and therefore thinks it is faulty/offline. I had Oracle create an IDR (which took them months to come up with) but the issue would still happen occasionally. I think the problem was fixed in OpenSolaris, so it might be fixed in Solaris 11. I also had another server with abysmal disk performance over iSCSI when multipathing was turned on. Compellent and Oracle couldn't help me and the problem is still unresolved; as a workaround, we had to disable multipathing on the server. Funnily enough, we had an identical Compellent SAN/switch at another site that didn't have this problem, and I couldn't figure it out for the life of me. Of course, if you are going to use Fibre Channel (which I recommend) for your Compellent, then all of this would be a non-issue.
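
The post-reboot cleanup was basically this (pool name is an example):

code:
# Shows the iSCSI-backed pool sitting in a degraded/faulted state after boot:
zpool status -x

# Once the iSCSI session is actually up, clear the errors to bring it back online:
zpool clear tank
zpool status tank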

Also be aware that with ZFS you can't mount replays on the same server if the same filesystem is already mounted. For instance, say that you have a pool called rpool mounted on your server, and on the Compellent you want to mount a replay of rpool from last week. You would think you could do a zpool import, call it rpool_old, and be done, right? Well, you can't, because replays have the same device IDs, and ZFS won't let you have two devices with duplicate IDs on the same system. In order to actually use the replay of rpool, you will need to either destroy the first rpool or mount the replay on a different system.

One thing I would mention is that in my experience, Compellent's Copilot support isn't that great when it comes to answering questions on Solaris. Oracle's support isn't all that great either since all the good engineers left Sun a long time ago. I would recommend having good Solaris admins on staff to deal with any funky Solaris issues because you will certainly pull your hair out trying to get support to fix them.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Beelzebubba9 posted:

Homemade SAN stuff
If I were in your shoes, I would definitely implement something that uses ZFS as the back-end storage. You can use Solaris 10/11, OpenSolaris and the various forks out there, or FreeBSD. If you want something point-and-click, it looks like FreeNAS is based on FreeBSD and has ZFS, but I have never used it. From a command-line perspective, it is stupidly easy to administer ZFS considering all the commands start with either zpool or zfs. I am most comfortable doing this on Solaris, but thanks to Uncle Larry you would need to pay $1,000 per CPU per year for the privilege of running Solaris 10/11. Given that OpenSolaris hasn't been updated in years now, you are left with the OpenSolaris forks and FreeBSD. If I were to do this, I would pick FreeBSD since ZFS has been on it for some time now and the community support is pretty good. The only thing I don't know is how well iSCSI exports and NFS are handled on FreeBSD, which may make me want to veer back to one of the OpenSolaris forks.

When building the server, I would put in as much memory as I could. I would also get two SSD drives for a mirrored ZFS intent log to improve synchronous write performance. For disks, I would go with enterprise-grade SAS drives and stay far away from the WD Green drives. Once all of that is out of the way, make sure you have a good plan for how you want to lay out the vdevs. If I were to use this controller, I would be able to hook up eight disks per card. With two of these cards and 16 2TB drives, I would probably set up my vdevs like this:

code:
#zpool status dpool
  pool: dpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        dpool       ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c1t6d0  ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  ONLINE       0     0     0
            c2t6d0  ONLINE       0     0     0

        spares
          c1t7d0    AVAIL   
          c2t7d0    AVAIL   
 
        cache
          c3t0d0    ONLINE
          c3t1d0    ONLINE
The above config would result in roughly 17-18TB of usable space. If you go with RAID-Z1 instead, you would get 21-22TB. RAID-Z1 is similar to RAID 5 in terms of redundancy, and RAID-Z2 is similar to RAID 6. The above config also includes two hot spares.
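
For what it's worth, creating that layout is a single command, something along these lines (device names are examples; the mirrored log for the ZIL SSDs gets added the same way):

code:
zpool create dpool \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 \
    spare c1t7d0 c2t7d0 \
    cache c3t0d0 c3t1d0

# Hypothetical SSD pair for the mirrored intent log:
zpool add dpool log mirror c4t0d0 c4t1d0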

Beelzebubba9 posted:

I was going to use BackBlaze’s parts list as a loose guide (save the case). I’d like to use a Sandy Bridge based CPU for AES-NI support, and SuperMicro doesn’t make any server class Socket 1155 motherboards. Does anyone have a suggestion for a S1155 motherboard that would be suitable for use in a SAN?
Supermicro actually does, but for single processors only. The reason for this is that all the Ivy Bridge stuff isn't out yet, so you are going to have to wait a few months for the dual-processor boards to come out. If you want dual processors now, I suggest getting a Socket 1366-based motherboard instead.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

szlevi posted:

Have you got any of your gear yet? I'm in the process of getting my quotes in and I'm curious what do you think... I'm concerned that Compellent is still built on Server 2008 R2 file servers.
What controller are you talking about? I saw our SC40 boot up from the serial console before and it looked like it was running some flavor of BSD.

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

Spamtron7000 posted:

I got a call at 2:43am one night last month from EMC support in India to ask me to troubleshoot a Centera call home issue. That company has seriously poo poo the bed.
Hey at least you don't get emails every other day about a failed Sun Storagetek ASR activation that was attempted four years ago. :sigh:

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

FISHMANPET posted:

I need to attach a pile of disks (goal is around 20TB) to a Solaris server so ZFS can manage them. Where can I get either a server that will hold ~12 drives + a system drive with a dumb pass through controller, or a drive enclosure that I can pass through the disks into an 1068E controller or something equivalent?
If money is no object, you can buy a Sun Fire X4270, which can use either 12 3.5" disks or 24 2.5" disks in 2U. There is an internal USB port that you can stick a flash drive in and boot Solaris off of. One thing you should take into account is licensing costs: you get the right to use Solaris with Sun hardware without having to buy software maintenance, but if you plan to use Solaris 10/11 and go the non-Sun hardware route, you have to pay Uncle Larry $1K per year per CPU socket (unless you are using the server for development purposes).

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades

the spyder posted:

Storage nightmare!

Today I poked at our main NAS. Bad idea. We have two mostly identical, replicated NAS for our main company storage. Each is a Dual Quad with 96x1.5TB drives. They run NexentaStor and Napp-It. For some time my users have been complaining about terrible performance. Turns out the primary box has 24GB of RAM for 90TB of space.... 71.5 of that being used.... With twice daily snapshots turned on... Well that makes sense. I ordered 96GB of RAM for each one today and ZIL+log SSDs for when I have a chance to rebuild them.
How is your ZFS storage pool configured? If you have all 96 drives in one raidz2 group, performance will be terrible since you are writing every file across all 96 disks at once. You should have multiple raidz1/raidz2 groups. Per the ZFS Best Practices Guide:

quote:

The recommended number of disks per group is between 3 and 9. If you have more disks, use multiple groups.

Bluecobra fucked around with this message at 15:04 on Mar 21, 2012

Bluecobra
Sep 11, 2001

The Future's So Bright I Gotta Wear Shades
I had a fun day yesterday. We hit some bug on our FAS3210 where the controller battery discharged to the point that it decided to shut down both controllers in the middle of the day. :v:
