madsushi
Apr 19, 2009

Baller.

Corvettefisher posted:

Going off this http://www.vmware.com/files/pdf/sto...otocol_perf.pdf (they only used GIG and some of the charts all cap at 125MB/s)

You're quoting an ESX 3.5 PDF as proof that iSCSI > NFS? NFS performance was greatly improved in 4.x, and in fact, that was around the time VMware started recommending NFS for new deployments.


madsushi
Apr 19, 2009

Baller.

1000101 posted:

True; though if you're running gigabit you're more likely to get consistently better performance out of iSCSI over NFS. This is mostly due to how ESX 4.X+ handles MPIO to iSCSI LUNs.

That said, if you're not really hitting the limits of gigabit it's a wash anyway.

Here's the way I see it (in today's data center):

If you're not that good at storage, NFS wins. It's just easy.

If you're good enough to configure MPIO the right way and you actually have redundant links and everything, you're probably in a big enough environment that you can just spring for a 10Gb NIC per server and then you can completely forget that your storage network even exists. No single host can touch 10Gb under 99.9% of circumstances, and it's not like most SANs can even serve up data that fast.
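The headroom argument is easy to sanity-check: network gear is rated in bits per second, storage throughput in bytes per second. A quick back-of-the-envelope sketch:

```python
# Network speeds are quoted in bits/sec, storage throughput in bytes/sec,
# so divide by 8 to compare them.
def link_mb_per_sec(gigabits: float) -> float:
    """Theoretical max MB/s for a link of the given speed in Gb/s."""
    return gigabits * 1000 / 8

print(link_mb_per_sec(1))   # 125.0 -> the ~125 MB/s ceiling you see in gigabit charts
print(link_mb_per_sec(10))  # 1250.0 -> why a single host rarely saturates 10Gb
```

That 125 MB/s figure is also why the charts in the old ESX 3.5 protocol-performance PDF all cap out where they do.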

madsushi
Apr 19, 2009

Baller.

If on a SAN (NetApp), we use NetApp's VSC.

If not on a SAN, we use PHDVirtual's esXpress, because it simply works.

madsushi
Apr 19, 2009

Baller.

Most people running home labs are using the free version of ESXi, so there's no 60-day limit. The timer only applies to the paid features (vCenter and everything it enables, etc).

madsushi
Apr 19, 2009

Baller.

Misogynist posted:

add a horizontal scrollbar to my browser window

Trying to flush out all of the posters with small monitors.

madsushi
Apr 19, 2009

Baller.

Pantology posted:

This applies to vSphere and XenServer as well. Datacenter licenses are licensed to the socket, you can run whatever hypervisor you want on top.

Except that you're buying a VMware license -and- a Windows Datacenter license at that point.

madsushi
Apr 19, 2009

Baller.

Noghri_ViR posted:

So I've got a small little VMware install of 3 hosts and 1 vCenter. I'm doing the upgrade from 4.1 to 5.0 today and I ran into a little hiccup and got the following error message:



Now after googling I found the following VMware KB article:

http://kb.vmware.com/selfservice/mi...ernalId=2006664

Which told me my datastores were not mounted identically. So I ran the SQL query at the bottom of the article and got this:


So am I right in thinking that the trailing slash on the 18 host is the one causing the issue because they are all mounted via IP address?

Yes. You can also check the volumes by browsing to /vmfs/ via an SSH session and looking at the volume UUID. That's definitely your problem. I had the same issue where one was mounted with uppercase letters and one was mounted with lowercase.
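The check the KB article implies can be sketched out: two hosts "see" the same datastore only if the server and export path match exactly, so you normalize away the differences that commonly bite (trailing slash, case). A rough sketch with hypothetical host and path names:

```python
# Rough sketch: hosts mount "the same" NFS datastore only if server +
# export path match exactly, so normalize trailing slashes and case.
def normalize_mount(server: str, path: str) -> tuple[str, str]:
    return server.strip().lower(), path.rstrip("/").lower()

mounts = {
    "esx17": ("10.0.0.5", "/vol/vmds"),
    "esx18": ("10.0.0.5", "/vol/vmds/"),   # trailing slash: the culprit
}
normalized = {h: normalize_mount(*m) for h, m in mounts.items()}
print(len(set(normalized.values())) == 1)  # True once normalized
```

If the raw mount strings differ in any way (slash, case, IP vs hostname), vCenter treats them as different datastores and the upgrade check fails.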

madsushi
Apr 19, 2009

Baller.

.crt is probably PEM format. Try asking for it from GoDaddy for Apache, then tell Citrix it's PEM.
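You can tell PEM from DER by just looking at the first bytes of the file; PEM is base64 text with BEGIN/END markers, DER is raw binary. A minimal sketch (the sample bytes are made up):

```python
# PEM certs are base64 text wrapped in BEGIN/END markers; DER certs are
# raw binary starting with an ASN.1 SEQUENCE tag (0x30).
def cert_format(data: bytes) -> str:
    return "PEM" if data.lstrip().startswith(b"-----BEGIN") else "DER"

pem_sample = b"-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"
der_sample = b"\x30\x82\x01\x0a"  # hypothetical DER header bytes
print(cert_format(pem_sample))  # PEM
print(cert_format(der_sample))  # DER
```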

madsushi
Apr 19, 2009

Baller.

Here's how you negotiate with vendors: YOU figure out what you need (space, IOPS, HA, future capacity, etc) and YOU figure out your budget, then you take that to the vendor and see if they can meet the need you defined for the price you specified. Don't make this like a Diablo II trade game, where everyone stands around for 30 minutes just repeating "wug" and "i dunno, wuw".

madsushi
Apr 19, 2009

Baller.

Bitch Stewie posted:

I have to ask, why would you give any vendor your budget upfront?

My personal opinion is that if you have $20k to spend on widgets, and you tell a vendor that you have $20k to spend on widgets, they'll come up with a solution that costs $20k.

What you have just described is how every business interaction should work.

You tell them "I have $20k and I need 20 lbs of widgets" and they say "OK, here's 20 lbs of widgets" and everyone is happy. Believe it or not, even if a vendor creates a solution that satisfies your needs and fits within your budget, their actual cost is lower than what they charge you. That difference is called "profit", and it's how business works.

While we can argue the fine details over how much profit it's "ethical" to collect, the fact is that they satisfied your need for the price you asked for.

madsushi
Apr 19, 2009

Baller.

Here are my Google Reader Tech RSS entries:

http://chasechristian.com/i/techsubs.xml

There's some Exchange, some VMware, some storage, some networking.

madsushi
Apr 19, 2009

Baller.


Would this be a headless server?

madsushi
Apr 19, 2009

Baller.

When people talk about NFS and dedupe vs iSCSI and dedupe, the key thing is what VMware sees.

If you have a volume and a LUN via iSCSI and you get great dedupe, that extra space can only be utilized by the SAN for things like snapshots. The space isn't actually made available to the LUN and the underlying OS (VMware).

If you have an NFS volume and you get great dedupe, VMware sees that free space, so it can utilize that space to oversubscribe your storage. Now you're actually using all that free space!

Plus, NFS was the only way to exceed 2 TB datastores prior to the recent ESXi 5 release.
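The dedupe-visibility point above in toy numbers (the 2 TB volume and 50% ratio are hypothetical):

```python
# Toy numbers: a 2 TB volume that dedupes 50% frees 1 TB, but only NFS
# lets VMware see and oversubscribe against that freed space.
volume_tb = 2.0
dedupe_ratio = 0.5
freed_tb = volume_tb * dedupe_ratio

nfs_usable_by_vmware = freed_tb      # datastore free space grows
iscsi_usable_by_vmware = 0.0         # LUN size unchanged; the SAN keeps the savings
print(nfs_usable_by_vmware, iscsi_usable_by_vmware)  # 1.0 0.0
```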

madsushi
Apr 19, 2009

Baller.

Mierdaan posted:

This seems like a myopic and virtualization-admin-centric way of viewing things. If the SAN is deduping and aware of that ratio, that allows you to create more LUNs and more datastores; there's no reason to look at this from a one-LUN standpoint, is there?

It's definitely the virtualization-admin way of thinking, see thread title.

So: you just deduped your 2TB volume down to 1TB. With NFS, the story ends here, as VMware sees 1TB of free space. Yay!

So: you just deduped your 2TB volume with a 2TB LUN in it down to 1TB. You have 1TB of free space in the volume. With iSCSI, you're now left with one of two options:

Option 1:
*Shrink the volume
*Make a new volume with the shrunk space
*Make a new LUN in the new volume
*Move VMs to that new LUN (which sucks w/o sMotion)
*Enjoy your lower dedupe ratio since everything isn't in the same volume, and increased management since now you have 2 volumes and 2 LUNs to manage

Option 2:
*Make a new, smaller LUN in that volume
*Move your VMs to that new LUN (which sucks w/o sMotion)
*Dedupe again
*Repeat several times until your volume is actually "full"
*Enjoy your several miscellaneous-sized LUNs that will cause you all sorts of fun


For VMware, there are very clear advantages to just using a huge NFS datastore to host all of your VMs: better dedupe ratios, easier management, fewer constraints.

madsushi
Apr 19, 2009

Baller.

Ran into an issue at a client and I wanted to see if anyone else had experience with this. It's pretty esoteric.

VMFS Datastore (call it VMFSDS) that is spanned across 4x 500GB LUNs (call these LUNs 1-4) via extents. LUNs 1 and 2 are iSCSI, LUNs 3 and 4 are FC. This is all on ESXi 4.1u1 I believe.

LUN 2 was mistakenly presented to a Windows server (running DPM) via iSCSI. I think since VMware was using it, Windows either didn't see it or couldn't touch it. Fast-forward to a power outage where Windows seized the LUN first and then changed the partition type from VMFS to SFS.

Windows changing the partition type of a VMFS volume is fairly common and well-documented by both VMware and 3rd parties (Yellow Bricks has a great guide). However, even after recreating the partition as VMFS, we still had issues with the datastore VMFSDS saying that there was missing data. The files all appeared to be intact (because the file table is stored on LUN 1, right?) but when trying to access them, we had problems.

Rebooting the hosts did not fix the issue. Ended up on the phone with VMware all weekend working on the problem, sent them the first 1.2GB of each LUN, etc. At the end of the day, it looks like the metadata table on LUN 2 was too badly damaged to repair.

Has anyone run into an issue like this? I see lots of issues with single VMFS volumes getting swapped to SFS, but none in the situation where the LUN is just an extent of an existing datastore. We have already assumed all data is lost, but I wanted to see if anyone else had ever heard of this.


Some notes:
Using extents on different LUNs is like putting your data on a RAID 0 -- if either LUN fails, all your data is gone
VMware snapshots are NOT BACKUPS
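The RAID 0 comparison can be put in numbers: a spanned datastore survives only if every extent survives, so the loss probability compounds with each LUN. A sketch with a hypothetical per-LUN failure rate:

```python
# RAID-0 analogy for extents: the datastore is lost if ANY extent is lost,
# so risk compounds with the number of LUNs spanned.
def p_datastore_loss(p_lun_fail: float, n_extents: int) -> float:
    return 1 - (1 - p_lun_fail) ** n_extents

# Hypothetical 2% annual failure chance per LUN:
print(round(p_datastore_loss(0.02, 1), 4))  # 0.02
print(round(p_datastore_loss(0.02, 4), 4))  # 0.0776 -> nearly 4x the risk with 4 extents
```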

madsushi
Apr 19, 2009

Baller.

three posted:

It seems what happened to you is a result of poor administration (improperly presenting a LUN to the wrong system), although it doesn't sound like it's your fault.

That is absolutely the cause, and it wasn't my fault because I was brought in to consult after the issue had already occurred.

madsushi
Apr 19, 2009

Baller.

The way the licensing works is that you have to own the hardware your licenses run on. Either the MSP (or whoever) is using their SPLA licenses and they own the hardware, or you are using your regular licenses and you own the hardware. A lot of companies will do some shady leasing stuff to change "ownership" (like the MSP leases the hardware to you for $1 but retains control) but that gets into muddy waters.

madsushi
Apr 19, 2009

Baller.

Corvettefisher posted:

thumb rule 1 vCPU per gig of ram, I do 1 vcpu:2GB ram.

Every time I think you've made your dumbest post on this thread, you surprise me.

The right way to do this is: 1 vCPU. Always.

If you hit 100% on 1 vCPU regularly, bump it up to 2 vCPU.

If you hit 100% on 2 vCPU regularly, bump it up to 3 vCPU. Repeat as necessary.

If I gave every 4GB RAM VM of mine 2 vCPUs, I would probably double my vCPU count.
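The bump-it-up-only-when-pegged rule above can be sketched as a loop (the utilization map and cap are made up for illustration):

```python
# The sizing rule as a loop: start at 1 vCPU and add one only while the
# VM keeps pegging the vCPUs it already has.
def right_size_vcpus(peak_util_at: dict[int, float], cap: int = 8) -> int:
    """peak_util_at maps a vCPU count to observed sustained utilization (0.0-1.0)."""
    vcpus = 1
    while vcpus < cap and peak_util_at.get(vcpus, 0.0) >= 1.0:
        vcpus += 1
    return vcpus

# Hypothetical VM that pegs 1 and 2 vCPUs but settles down at 3:
print(right_size_vcpus({1: 1.0, 2: 1.0, 3: 0.7}))  # 3
```

The point of starting low is scheduling overhead: every extra vCPU a VM holds makes it harder for the hypervisor to find free cores to co-schedule, so idle vCPUs hurt everyone.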

madsushi
Apr 19, 2009

Baller.

Shaocaholica posted:

They are desktop machines so lets imagine 1-2 cores 45sec out of every min and full on as-many-cores-as-I-can-get the other 15sec for an 8 hour workday.



e: the workflow I'm envisioning is for rendering CG where the apps have become extremely parallel, so the faster the artists can work interactively the better. The catch is that the heavy work comes very intermittently. Move some sliders, check render, move sliders, etc. I'm guessing the downtime when adjustments are made accounts for 80% of the total time and 20% is spent waiting for results. That could wildly vary depending on how many CPUs the VM has.

VMware is pretty bad at this type of thing. Honestly, one of its biggest weaknesses is handling bursty CPU load effectively. Until we can change core count on the fly, there's not an easy solution for this outside of just doing really low VM density on your physical hosts so that you can safely assign a lot of cores without worrying about contention.

madsushi
Apr 19, 2009

Baller.

Here's the list of cool things I just sent out to the team:

1) Licensing - No more vRAM, back to the old license-per-socket model. Yay!

2) Licensing - Storage vMotion and share-nothing vMotion are both included in STANDARD. Yay!

3) Share-nothing vMotion - Speaking of, you can vMotion between two hosts WITHOUT shared storage. The catch: you still need vCenter, and it takes a LONG time (has to copy the whole VMDK).

4) VMware Tools - No longer requires a reboot to update (after you install the 5.1 tools, heh).

5) vSphere Web Client - The Web Client is back.

6) Backups - Based on EMC Avamar tech, VMware can now back VMs up directly to disk, with dedupe included and no agents required.

7) Networking - vSphere will now support LACP. Yay!

8) Networking - You can now do port-mirroring (for monitoring devices) and IPFIX/Netflow to remote hosts for monitoring.

9) Graphics - The ability to leverage specific NVIDIA GPUs for 3D performance.

10) Storage - Parallel storage vMotions. Previously all storage vMotions were serial (one at a time).

madsushi
Apr 19, 2009

Baller.

FISHMANPET posted:

So the reason I ask, is a coworker who is a programmer but knows just enough to be dangerous did some research on my plans to use FT between the R710 and the R520, and found a vSphere 4 FT FAQ that included the following:


So I understand what EVC and what it does, but EVC has been around for a long time yet the FAQ makes no mention of it and basically says the CPUs have to be in the same family. So either that's still true, in which case it would be nice to find a vSphere 5 FT FAQ that mentions that, or it's not true and EVC can keep CPU properly masked, in which case it would be nice to find a document that says "Hey, we removed this limitation!"

Just so we're all clear here, you know that FT is not usually a good idea? Unless you happen to have a 1-core VM that for some reason needs an insane amount of uptime, the overhead and configuration you'll have to deal with for FT is usually wasted.

madsushi
Apr 19, 2009

Baller.

FISHMANPET posted:

So I know that it requires the VM to have only a single vCPU, and that since the VM is always running on two machines it uses twice the resources. It's also not a replacement for redundant services (clustered file servers, multiple domain controllers, etc etc). Is there anything else I'm missing?

Twice the resources is a massive understatement. It's costing you host CPU cycles and a massive amount of network traffic between the hosts to keep the VM in lockstep. In addition, you're also looking at a massive amount of complexity. You can't vMotion a FT VM.

madsushi
Apr 19, 2009

Baller.

sanchez posted:

In place upgrades of esxi 4.0 to 5 on hosts with local storage (gently caress local storage), has anyone experienced any kind of disaster? Coworkers have done a few but I'm nervous all the same since the client has no vm level backups and rebuilding all of their servers in a weekend would suck if the datastore gets eaten. I'd be using the iso.

I've done it about a dozen times and never had to restore from backup. Note how I used the word "backup" there, because I took one before every single upgrade. Don't chance it, just bring in a big USB drive and use the CLI to copy the VMDKs to it. You could also use WinSCP from a desktop or something too.

madsushi
Apr 19, 2009

Baller.

Nev posted:

Does anyone know how long it usually takes Dell to make customized ESXi isos available? I have a Poweredge R710 that I want to put 5.1 on.

Dell and HP usually trail by 3-4 weeks, although they had 5.0U1 out just a week after it was released.

madsushi
Apr 19, 2009

Baller.

Moey posted:

Are the Dell customized ISOs really worth using?

For my installs I just used the normal ISO from VMware. Only time I had to add in additional drivers manually was adding in a dual 10GbE Emulex card.

Edit: Looks like they package updated drivers into them. I should probably switch over to that once the Dell 5.1 ISOs come out.

Less about the drivers for me and more about the CIM providers so you can monitor things like hardware RAID status (individual drives).

madsushi
Apr 19, 2009

Baller.

Erwin posted:

I can see the individual drive statuses on all of my Dell hosts, and they're all the standard VMware image. Unless I'm not looking for the right thing?

Here's the HP ESXi 5.1 download - https://my.vmware.com/web/vmware/de...2&productId=285

Looks like VMware is hosting it now, which is new!

If you can see the individual drive statuses, then either you installed the Dell driver bundle separately or you're not using a RAID array, or something.

madsushi
Apr 19, 2009

Baller.

Pantology posted:

Had an interesting question from one of our recruiters today. She's wondering if there are any key words or phrases that you'd expect seasoned VMware people to use that less-experienced people might not. Some kind of shibboleth she can look for in resumes and cover letters, and during the initial phone screen.

Two that came to mind were spelling it "VMware," and referring to the vSphere client as the "VIC." I'm not wild about either--the former only catches the truly clueless, and the latter just means you remember 3.x. Anyone have anything better?

You could ask them what 'ESX' stands for (Elastic Sky X) or ask them if VMware snapshots are a good idea (lol) or about VMFS block sizing (if they say "thank god VMFS-5 fixed that poo poo") or what the best way to do VMware bandwidth aggregation is (if they say "thank god vSphere 5.1 has LACP").

madsushi
Apr 19, 2009

Baller.

Goon Matchmaker posted:

Anyone have any good recommendations for a home ESXi 5.1 server? I looked at savemyserver in the OP but I'd rather avoid the ancient, watt sucking, monsters they have for sale. I was thinking build my own, but I don't know what parts would be good for such a thing.

The HP Microserver (N40L) is a very popular home ESXi server, very cheap, low power, and there's a lot of great documentation/blogs with info on setting one up.

madsushi
Apr 19, 2009

Baller.

FISHMANPET posted:

See, this strikes me as a really bad idea. Those snapshots are there for some reason, perhaps the guy that runs the VMware setup at the company knows?

See, you got these two statements backwards. It should read:

Those snapshots strike me as a really bad idea. Why is the guy that runs VMware at the company still employed?

madsushi
Apr 19, 2009

Baller.

BelDin posted:

I'm sitting in the VMWare ICM class right now and was wondering: What are general thoughts on reliability of vDS? I know with v4, it was a little too new to put in production for my tastes due to the constant tweaking in updates. We're looking at a deployment of View with about 1200 machines and a few (12 to start) blade servers, which makes vDS very attractive from a management standpoint.

vDS has pretty good reliability now, but I also like to maintain a non-vDS management network just in case.

madsushi
Apr 19, 2009

Baller.

Roving Reporter posted:

A question on (over) provisioning CPUs and RAM here. Currently using a dedicated server with Xeon E3-1230, and 8GB ECC.

I'm looking to run Server 2012 or 2008 R2 on a steady basis, with the hope of experimenting with other OSes on the side, such as Ubuntu Server, FreeBSD and whatnot. Perhaps a Windows 7 or XP guest as well, but I could do those on Workstation locally if need be.

Any recommendations or literature on this? ESXi manuals are somewhat unhelpful unless I've been looking at the wrong ones.

Edit: The Memory Management Guide seemed like a pretty good start.

Linux VMs: 1 core, 512 MB RAM, 20 GB disk - that's a basic Linode and will be sufficient for 99% of your lab use-cases.

Windows VMs: 1 core, 1 GB RAM, 40 GB disk - this is about as low as you want to go

Try to stack as many roles together as you can. The more vCPUs you provision, the worse your performance is going to be.

Today I am running:

Server 2012 DC - 1 vCPU, 1 GB RAM
Server 2012 Exchange 2013 - 2 vCPU, 2 GB RAM (Exchange 2013 just sucks with 1 vCPU)
Windows 8 VM - 1 vCPU, 1 GB RAM (just poking around with settings/Metro)
Windows 2003 VM - 2 vCPU, 4 GB RAM (Minecraft server)
Ubuntu - 1 vCPU, 512 MB RAM
OnTAP-E - 1 vCPU, 512 MB RAM
CentOS - 1 vCPU, 512 MB RAM

I'm running all of that on a 4-core AMD proc (Phenom X-2 Black?) and 16 GB of RAM. Performance is pretty good.

madsushi
Apr 19, 2009

Baller.

adorai posted:

I have two netapp HA pairs at separate sites. I have two vCenters each with a datacenter, and the vCenters are linked. I have SRM installed. I have created protection groups, have my SRA configured, etc. When I test my SRM failover, my datastores (which are NFS) are cloned properly but are only mounted on a single host at my DR site. This is less than optimal. Does anyone have any idea where I should start looking to determine why my other hosts are not mounting the datastores as well?

I assume you've tried manually mounting the NFS datastores on your DR hosts? My guess would be NFS permissions, which always seem to get misconfigured. Could also be DNS, etc. I would try manually mounting the datastores at your DR site just to confirm that's all in order.

madsushi
Apr 19, 2009

Baller.

Typically you have CIM drivers installed that will pass the detailed hardware information to ESXi. HP and Dell have ESXi ISOs that come with these drivers preinstalled. But, that's what you're looking for: CIM drivers for the LSI raid controller, which may or may not exist.

madsushi
Apr 19, 2009

Baller.

BangersInMyKnickers posted:

We're setting up storage replication between our primary and secondary NetApp units for a DR plan. Everything is in NFS volumes, so the plan is to replicate the changes nightly when activity is low. If the building burns, we mount up the volumes on the backup hosts, import the VMs, and get back online in a couple hours.

The question I have is should I be concerned with trying to quiesce traffic before replication kicks off? The NetApp units generates a volume delta while the replication is happening so you're moving stable data. My assumption is that the state of the VMDKs as I bring them up (in the hopefully non-existent occasion that I actually have to do this) is that they will just think they had a hard crash at the time of replication, and everything we run including databases seems pretty resilient to hard crashes these days. Sure, in the case of databases there is going to be a little data loss because the log marker hasn't incremented after a little bit of data was written out. But that's maybe a few seconds worth of data and we're going to be doing 12 hour replication schedules which means we're going to be losing on average 6 hours worth of stateful data anyhow.

Is my gut right on this or do I have my head up my own rear end and really need to get the traffic quiesced with the NetApp VMware plugins?

Don't quiesce the data. You're just going to have awful performance during the VMware snapshot creation/deletion and it doesn't buy you anything at all.

Let NetApp take the snapshots at will (via VSC or a snapvault schedule) and then replicate it like that. You get a 'crash-consistent' backup that is going to work. Can you remember the last time that a VM failed to come up after you did a hard power/reset on it? The answer is 'never'.

madsushi
Apr 19, 2009

Baller.

Misogynist posted:

You've clearly never worked with either Oracle or XFS.

If your Oracle DBs are located on your VMware NFS datastores, then you have been doing something very wrong. All of your databases should be stored on separate iSCSI/FC LUNs mapped directly to the VMs. The data on the NFS volume should only be OS drives or non-DB application data.

XFS, I have not worked with.

If you're relying on VMware HA for your high availability, then you're already relying on non-quiesced disks being available and working properly. Which, if designed properly, they always will.

madsushi
Apr 19, 2009

Baller.

For me, it's not a performance issue, it's the ability of my backup software (usually NetApp's SnapManager for SQL/Oracle/Exchange) to back up those NetApp volumes independently, so that it can quiesce the data.

madsushi
Apr 19, 2009

Baller.

Misogynist posted:

Yes, we are literally running Windows 2000 servers in production in the year 2012 because migrating them is a very simple proposition

I had to actually do some VM inception to get the last 2000 box virtualized since we had already upgraded to ESXi 5.0.

-Make ESXi 4.0 guest inside of a VM on an ESXi 5.0 host
-Mount my NFS datastore to the ESXi 4.0 guest
-Attach the old version of VMware converter to the 4.0 guest
-P2V using the old converter/ESXi 4.0
-Attach the VM to the ESXi 5.0 host

madsushi
Apr 19, 2009

Baller.

Corvettefisher posted:

Yeah that would be great, I was just seeing if there were any Dell/HP people out there wanting to chip in.

I know HP servers/blades pretty well. I also know enough to stay away from Dell.

madsushi
Apr 19, 2009

Baller.

BangersInMyKnickers posted:

We started running de-dupes on the OS partition VM volume on our NetApp about a month ago. The initial pass of 1.8TB took about 11 days and brought it down to about 650GB actual usage, so a pretty good dedupe ratio. Since then, he's been trying to run dedupes off and on as the change delta percentage starts creeping up but when they fire off they take another 10-11 days to complete which seems way too long. Other volumes containing CIFS shares and upwards of a TB take about 30 minutes or so (but I suspect the block fingerprinting has very little matching, requiring less of the block by block inspection pass, so a very different beast). Both the vswaps and pagefile (inside the boot volumes) reside there as well and he is under the impression that this would be destroying performance. I'm not that convinced since the vswap should be full of zeros since they've never been used and the pagefiles aren't being encrypted or dumped at reboot so that data should be relatively stable. Ideally I would like to move all the pagefiles from SATA to FC and possibly the vswap while I am at it, but we don't have the capacity to handle it right now until some more budget frees up and frankly I'm not convinced this is the source of our problem.

Any thoughts?

I have no idea why your dedupes would be taking that long. I've done a 10TB dedupe job in 24 hours before on SATA disk.

What version of ONTAP are you running? The 8+ builds have a new dedupe version that's better/faster. What type of disk/aggregates is the OS partition VM volume on? Maybe it's on like 4 disks by itself? It doesn't matter if it's CIFS or VMs or swaps etc, it's done at a block-level.


madsushi
Apr 19, 2009

Baller.

skipdogg posted:

Anyone have any experience with running ESXi from a SD Card on HP DL3xx G8 servers? I'm assuming it works well as long as you stick to supported cards. Any gotchas? We have a remote office in another country where hard drives are stupid expensive and could save quite a bit of money by not having the ESXi hosts with local drives. Boot from SAN is another option, but I'd like to explore the SD card first.

It was broke as hell in ESXi 4.1 / G6 servers, but works much better now in 5+ / G7+ environments. I like it.
