Bob Morales
Aug 18, 2006

This post is good to go


Two years ago you would have gone with 6.5 as it had been out for a bit. Now I'm not so sure it would be a big deal.

It'd be like going to 6.7 instead of 7.0 right now.

What else are you using with it? I know my manager hates 6.7 because it doesn't integrate with our Nimble (which is older) like it used to.

Internet Explorer
Jun 1, 2005





Oven Wrangler

It's been a while, but isn't that just the HTML5 interface that doesn't have plug-ins? I thought the Nimble stuff should still work in the Flash client.

Bob Morales
Aug 18, 2006

This post is good to go


We upgraded our Nimble software, so the newest HTML5 plugin doesn't have the same functionality.

The Flash client is deprecated now, so that plugin doesn't get updated.

I guess it's more bumbles' fault but whatever. You just have to pop into the Nimble to create datastores now instead of being able to do it in vSphere.

Moey
Oct 22, 2010

I LIKE TO MOVE IT


Yeah, I played with the Nimble vCenter plug in years ago, I just use the web interface on the arrays to do that. It's not difficult at all.

SlowBloke
Aug 14, 2017


Wicaeed posted:

I have somewhat stupidly volunteered myself for a VMware upgrade Project of our aged vCenter 6.0 installation.

The advisor recommendations are saying we should install the 6.5.0 GA version of vCenter, but I don't see any mention of vCenter 6.7.

We do have some older hosts that can only go to 6.0.0 U2 version of VMware, however these should be compatible with vCenter 6.7 according to the VMware docs.

Am I missing anything super obvious as to why 6.7 wouldn't be showing as a recommended upgrade for us?

I do have a VMW support ticket created as well, just figured SA may have a quicker turnaround than VMware support nowadays...

There is no major issue going 6.0 to 6.7 (unless you went external PSC or you are running the vCenter install on Windows). There are issues going directly from 5.5 to 6.7 (you need an intermediate 6.5 step to keep host compatibility). The current VMware upgrade path can be found at https://www.vmware.com/resources/co...rade&solution=2 (insert vcenter in the text field).
Also, never do an upgrade with a GA build; always go at least Update 1. The current VCSA 6.7 build is Update 3g.
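The path rules above can be sketched as a small lookup table. To be clear, this is an illustration of the logic in this post, not VMware's authoritative interoperability matrix — always confirm against the official upgrade-path tool before planning anything:

```python
# Illustrative upgrade-path table based on the discussion above.
# Version pairs are a simplification, not an authoritative matrix.
SUPPORTED_PATHS = {
    "5.5": {"6.0", "6.5"},   # 5.5 cannot jump straight to 6.7
    "6.0": {"6.5", "6.7"},   # 6.0 -> 6.7 is a supported direct upgrade
    "6.5": {"6.7", "7.0"},
}

def direct_upgrade_ok(src: str, dst: str) -> bool:
    """Return True if a direct vCenter upgrade src -> dst is allowed."""
    return dst in SUPPORTED_PATHS.get(src, set())

def upgrade_route(src: str, dst: str) -> list:
    """Return a version hop list, inserting an intermediate step if needed."""
    if direct_upgrade_ok(src, dst):
        return [src, dst]
    # Try a single intermediate hop, preferring the newest candidate
    # (e.g. 5.5 -> 6.5 -> 6.7 rather than bouncing through 6.0).
    for mid in sorted(SUPPORTED_PATHS.get(src, set()), reverse=True):
        if direct_upgrade_ok(mid, dst):
            return [src, mid, dst]
    raise ValueError(f"no supported route from {src} to {dst}")
```

So `upgrade_route("5.5", "6.7")` would come back with the intermediate 6.5 hop, while `upgrade_route("6.0", "6.7")` is a single step.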

TheFace
Oct 4, 2004

Fuck anyone that doesn't wanna be this beautiful


Wicaeed posted:

I have somewhat stupidly volunteered myself for a VMware upgrade Project of our aged vCenter 6.0 installation.

The advisor recommendations are saying we should install the 6.5.0 GA version of vCenter, but I don't see any mention of vCenter 6.7.

We do have some older hosts that can only go to 6.0.0 U2 version of VMware, however these should be compatible with vCenter 6.7 according to the VMware docs.

Am I missing anything super obvious as to why 6.7 wouldn't be showing as a recommended upgrade for us?

I do have a VMW support ticket created as well, just figured SA may have a quicker turnaround than VMware support nowadays...

Go to 6.7, and even if you go 6.5 you sure as poo poo shouldn't go GA! LATEST UPDATES ALWAYS!!!

greatapoc
Apr 4, 2005


We've just recently bought a bunch of new Dell R640 servers to replace our aging HP c7000 blade chassis hosting our Hyper-V infrastructure. We're experiencing poor network performance on the new servers though, and I'm trying to track down the source of it. Our original environment was 2012 R2 hosts. We had to in-place upgrade these to 2016 to raise the functional level before adding the new 2019 Dell hosts to the cluster and then live migrating all the guests over to the new hardware. We've still got some VMs stuck on one of the old hosts, but we have a plan around that. Anyway, this gives us something to work with and compare against for the poor network issues we're seeing.

Differences I can see between the old host and the new: jumbo frames disabled on the new, receive and transmit buffers set to 512 on the new and auto on the old. Old host has three 1Gb NICs in a LACP team to two Cisco switches, new host has two 10Gb NICs in a LACP team to the same switches. iPerf test from a VM on the old host to my PC saturates the 1Gbps link into my PC, but the new host only pushes about 500Mbps. Transfer between VMs on the same host is around 2Gbps, transfer between VMs on different hosts is around 600Mbps. VMQ is enabled, NICs are Intel X710 on the Dells.

Not sure what else to mention. I was going to try changing the NIC team from LACP to "Switch Embedded Teaming" tonight in an outage window as well as enabling jumbo frames to see if it makes a difference. Does anyone have any ideas of things to look at?
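When chasing numbers like these, it helps to scan the iperf interval output for dips rather than eyeballing it. A quick sketch that parses the standard iperf3 human-readable interval lines (the bracketed format) and flags intervals below a threshold — the threshold value is arbitrary:

```python
import re

# Parse iperf interval lines like:
#   [  4]   2.00-3.00  sec  54.5 MBytes   456 Mbits/sec
# and flag intervals whose rate dipped below a floor.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+([\d.]+)-([\d.]+)\s+sec\s+"
    r"[\d.]+\s+\w?Bytes\s+([\d.]+)\s+(\w?)bits/sec"
)

def parse_intervals(text):
    """Yield (start, end, rate_in_Mbps) for each interval line."""
    for m in INTERVAL_RE.finditer(text):
        start, end, rate, unit = m.groups()
        mbps = float(rate)
        if unit == "G":
            mbps *= 1000.0
        elif unit == "K":
            mbps /= 1000.0
        elif unit == "":      # bare bits/sec
            mbps /= 1e6
        yield float(start), float(end), mbps

def dips(text, floor_mbps=100.0):
    """Return the intervals whose rate fell below floor_mbps."""
    return [(s, e, r) for s, e, r in parse_intervals(text) if r < floor_mbps]
```

Feeding it a capture makes stalls (link flaps, live-migration blackouts) stand out immediately instead of hiding in a wall of numbers.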

BangersInMyKnickers
Nov 3, 2004

I have a thing for courageous dongles



The X710s are unstable trashfires, especially on bonded links. I spent almost a year chasing my tail on them, trying every combination of driver and firmware imaginable and going through every level of Dell and Intel support they could throw me at, until they finally relented and replaced the NDCs with QLogics that worked flawlessly. I will say that the X710 silicon is fast as hell, can pull off full line speed on 64k frames, but that doesn't mean diddly when the loving thing is getting reinitialized five times a second because they're panicking and flapping the link. You can improve the situation by getting jumbo frames back on and cranking up your ring buffers to max (I think it was 4096, maybe 8192? 512 is way too small for a virtual host)

The qlogics were slower (could only handle 3-4gbps on 1500 frames, but still hit line speed on jumbo. Newer models are better), but they didn't panic and my lacp interface uptime was being measured in days instead of seconds which was an acceptable tradeoff.

e: I can't believe they're still selling them to be honest. The senior engineer at Dell I finally got through to said they were having no end of problems with them and were about ready to drop Intel NICs as an option. This was years ago; I figured they would have finally sorted out the issues or made a new version of a quad-port 10GbE interface. If you need to stick with Intel, try the X722 instead, hopefully they got their poo poo together. X710s are old, I was dealing with this poo poo back in 2014/2015

BangersInMyKnickers fucked around with this message at 13:56 on May 6, 2020

greatapoc
Apr 4, 2005


BangersInMyKnickers posted:

The x710's are unstable trashfires, especially on bonded links.

Well that's just bloody great, we just took delivery of the things two weeks ago. Looks like we might have to go back to Dell and ask for something else. The thing is we've never had any problem with it dropping the connection, the port-channels are rock solid and haven't missed a beat. The performance is just terrible.

Thanks Ants
May 21, 2004

Bless You Ants, Blants



Fun Shoe

Tell Dell that they need to ship you new mezzanine cards with different NICs on or you won't pay the invoice

Methanar
Sep 26, 2013



X710 is great because it puts ESXi closer to its natural state: PSOD


Also I once discovered that my non-lacp bonded interfaces had been flapping several times a second for like a year on one of our databases.

Also sometimes the cards would just not be recognized as plugged in at all, even by the BMC until you restarted 10 times.

gently caress datacenters

Methanar fucked around with this message at 23:53 on May 6, 2020

greatapoc
Apr 4, 2005


Touch wood it looks like I may have fixed it but I'm not sure exactly which part did it.

Removed the team and recreated it (still using LACP)
Enabled jumbo frames on both NICs
Increased receive and transmit buffers to 4096
Added reg key HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters\TenGigVmqEnabled=1 (VMQ was already enabled on the VMs)
Rebooted host

iperf and file transfers are now flying like they should, but failover cluster manager is throwing up its hands so I need to do more with that.

Edit: Here's a capture from where I have it running on one of the Dells then live migrate it to the one I've just (hopefully) fixed.

[ 4] 2.00-3.00 sec 54.5 MBytes 456 Mbits/sec
[ 4] 3.00-4.00 sec 25.9 MBytes 217 Mbits/sec
[ 4] 4.00-5.00 sec 49.8 MBytes 419 Mbits/sec
[ 4] 5.00-6.00 sec 43.4 MBytes 364 Mbits/sec
[ 4] 6.00-7.00 sec 48.2 MBytes 405 Mbits/sec
[ 4] 7.00-8.00 sec 49.4 MBytes 414 Mbits/sec
[ 4] 8.00-9.00 sec 39.8 MBytes 334 Mbits/sec
[ 4] 9.00-12.49 sec 7.50 MBytes 18.1 Mbits/sec
[ 4] 12.49-12.49 sec 0.00 Bytes 0.00 bits/sec
[ 4] 12.49-12.49 sec 0.00 Bytes 0.00 bits/sec
[ 4] 12.49-13.00 sec 2.62 MBytes 43.1 Mbits/sec
[ 4] 13.00-14.00 sec 111 MBytes 929 Mbits/sec
[ 4] 14.00-15.00 sec 112 MBytes 936 Mbits/sec
[ 4] 15.00-16.00 sec 110 MBytes 919 Mbits/sec
[ 4] 16.00-17.00 sec 110 MBytes 922 Mbits/sec
[ 4] 17.00-18.00 sec 105 MBytes 880 Mbits/sec
[ 4] 18.00-18.58 sec 56.5 MBytes 813 Mbits/sec
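For anyone repeating this on a fleet of hosts, the fix list above could be captured as a dry-run script that just emits the commands for review. The PowerShell cmdlet is the standard `Set-NetAdapterAdvancedProperty`, but the adapter names and the "Jumbo Packet"/"Receive Buffers"/"Transmit Buffers" display names are driver-dependent assumptions — treat every line as a template to verify, not something to run blindly:

```python
# Dry-run sketch of the remediation steps above, expressed as the Windows
# commands one might run per Hyper-V host. Adapter names and advanced-
# property display names vary by NIC driver, so verify before running.
REG_PATH = r"HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters"

def remediation_commands(adapters=("NIC1", "NIC2"), buffers=4096):
    cmds = []
    for nic in adapters:
        # Enable jumbo frames (display value is driver-specific, often 9014)
        cmds.append(f'Set-NetAdapterAdvancedProperty -Name "{nic}" '
                    f'-DisplayName "Jumbo Packet" -DisplayValue "9014"')
        # Raise the receive/transmit ring buffers
        for prop in ("Receive Buffers", "Transmit Buffers"):
            cmds.append(f'Set-NetAdapterAdvancedProperty -Name "{nic}" '
                        f'-DisplayName "{prop}" -DisplayValue "{buffers}"')
    # The registry key from the post, enabling VMQ for 10GbE vmswitch NICs
    cmds.append(f'reg add "{REG_PATH}" /v TenGigVmqEnabled '
                '/t REG_DWORD /d 1 /f')
    # Host reboot to pick everything up
    cmds.append("Restart-Computer")
    return cmds
```

Printing the list and diffing it against what each host actually has configured is a cheap way to catch the one box where a step got skipped.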

greatapoc fucked around with this message at 00:28 on May 7, 2020

Potato Salad
Oct 23, 2014

Nobody Cares




Tortured By Flan

loving nobody gets LACP right

Orchestrate static aggregation channels instead of counting on LACP to do it for you

Potato Salad
Oct 23, 2014

Nobody Cares




Tortured By Flan

LACP exists to cause you more pain and suffering and production losses than the time it takes to set channels/aggregation up by hand, every single time

It is worth taking a moment to just write some logic to set up static aggregation orchestration with whatever tools you use to manage your switches and your compute

Potato Salad fucked around with this message at 10:11 on May 7, 2020

BangersInMyKnickers
Nov 3, 2004

I have a thing for courageous dongles



greatapoc posted:

Touch wood it looks like I may have fixed it but I'm not sure exactly which part did it.

Removed the team and recreated it (still using LACP)
Enabled jumbo frames on both NICs
Increased receive and transmit buffers to 4096
Added reg key HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters\TenGigVmqEnabled=1 (VMQ was already enabled on the VMs)
Rebooted host

iperf and file transfers are now flying like they should, but failover cluster manager is throwing up its hands so I need to do more with that.


Make sure you kick the tires on live migrations between hosts on the new 10gig interfaces once you have them all up. The problems I was seeing didn't manifest until I was regularly moving traffic at multi-gbps rates. Being choked down by the old server interfaces could be masking issues.


My guess is that jumbo frames were the thing that did the most good. The buffer size won't be an issue at those speeds, especially on an iperf test. Likely the old server NICs can't handle full gigabit rate at 1500 MTU and that's where your bottleneck was. For all their faults, the X710 can handle 1 Gbps at 1500 MTU.
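The arithmetic behind why jumbo frames relieve a per-packet bottleneck is simple enough to sketch: at a fixed line rate, the frame rate the NIC and driver must service scales inversely with frame size (Ethernet adds roughly 38 bytes of on-wire overhead per frame: preamble, header, FCS, and interframe gap):

```python
# Back-of-envelope frame rates at a given line speed and MTU -- the
# reason jumbo frames take pressure off per-packet processing.
ETH_OVERHEAD = 38  # preamble 8 + header 14 + FCS 4 + interframe gap 12

def frames_per_second(line_rate_bps, mtu):
    """Max full-size frames per second at a given line rate and MTU."""
    frame_bits = (mtu + ETH_OVERHEAD) * 8
    return line_rate_bps / frame_bits

# At 10 Gbps line rate:
std = frames_per_second(10e9, 1500)    # roughly 813k frames/sec at MTU 1500
jumbo = frames_per_second(10e9, 9000)  # roughly 138k frames/sec at MTU 9000
```

Going to MTU 9000 cuts the frame rate by nearly 6x, which is exactly the kind of relief a struggling NIC or an undersized ring buffer notices first.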

BangersInMyKnickers fucked around with this message at 12:52 on May 7, 2020

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Potato Salad posted:

LACP exists to cause you more pain and suffering and production losses than the time it takes to set channels/aggregation up by hand, every single time

It is worth taking a moment to just write some logic to set up static aggregation orchestration with whatever tools you use to manage your switches and your compute

What are people actually getting wrong? There's not a whole lot to set up on LACP beyond timers (for which most platforms only have one option) and whether the interfaces are going to actively send LACP PDUs or not. I've probably seen more people get static link aggregation wrong, where maybe one side has the wrong load distribution algorithm set. I think the dvswitch itself supports something like 26 different options, of which not all exist on all switching platforms.

That said I almost never bother with link aggregation to hypervisors anymore. 10 gig is cheap and source based load distribution doesn't require upstream switch configuration.

Potato Salad
Oct 23, 2014

Nobody Cares




Tortured By Flan

everyone's lacp implementation is awful, from "why do i have 1/n packet loss for n links" to literally unusable
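The "1/n packet loss" failure mode is worth spelling out, because it applies to static aggregation too: flows are hashed across member links, so if one of n members silently blackholes (and nothing like LACP PDUs exists to detect it and pull it from the bundle), roughly 1/n of flows die while the rest look perfectly healthy. A sketch with a CRC-based flow hash, which is only a stand-in for whatever hash a real switch uses:

```python
import zlib

def pick_link(src, dst, n_links):
    """Hash a flow (src, dst) onto one of n member links.

    CRC32 here is a stand-in for a real switch's src/dst hash."""
    return zlib.crc32(f"{src}->{dst}".encode()) % n_links

def loss_fraction(flows, n_links, dead_link):
    """Fraction of flows that land on a silently-dead member link."""
    dead = sum(1 for s, d in flows if pick_link(s, d, n_links) == dead_link)
    return dead / len(flows)

# 2500 synthetic flows across a 4-member bundle with one dead link:
flows = [(f"10.0.0.{i}", f"10.0.1.{j}") for i in range(50) for j in range(50)]
frac = loss_fraction(flows, 4, dead_link=0)  # roughly a quarter of flows lost
```

That "a quarter of flows are dead, the rest are fine" signature is why these faults are so miserable to troubleshoot: ping works from one machine and times out from its neighbor.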

BangersInMyKnickers
Nov 3, 2004

I have a thing for courageous dongles



1000101 posted:

What are people actually getting wrong? There's not a whole lot to set up on LACP beyond timers (for which most platforms only have one option) and whether the interfaces are going to actively send LACP PDUs or not. I've probably seen more people get static link aggregation wrong, where maybe one side has the wrong load distribution algorithm set. I think the dvswitch itself supports something like 26 different options, of which not all exist on all switching platforms.

That said I almost never bother with link aggregation to hypervisors anymore. 10 gig is cheap and source based load distribution doesn't require upstream switch configuration.

Last I fought with LACP on ESXi (years ago when 6.5 was new), the host would only do slow PDUs which makes for an unacceptable time to detect a fault and down the bad link. If you ran the host in passive mode and the switch in active with fast pdus, the host would ignore the switch's parameters and continue to use slow pdus/long timeouts and that mismatch would cause the upstream switch to flap the link because PDU timeouts are being exceeded. The only solution was to run an esxcli script every time the host got rebooted to manually force the vdswitch on to fast PDUs using unsupported commands. I come to find that this issue has been present since 4.x when vdswitches first came out and they just didn't give a poo poo about fixing it.

With that said, once you manually forced it on to fast PDUs everything functioned as expected and failovers were snappy and well within tolerances, even for storage fabric with some minimal stalling on the VMs

Pile Of Garbage
May 28, 2007





Potato Salad posted:

everyone's lacp implementation is awful, from "why do i have 1/n packet loss for n links" to literally unusable

Care to elaborate? I ask because I've never once had issues configuring LACP between devices of any vendor from 2x1Gb up to 4x10Gb.

greatapoc
Apr 4, 2005


greatapoc posted:

Touch wood it looks like I may have fixed it but I'm not sure exactly which part did it.

Removed the team and recreated it (still using LACP)
Enabled jumbo frames on both NICs
Increased receive and transfer buffers to 4096
Added reg key HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters\TenGigVmqEnabled=1 (VMQ was already enabled on the VMs)
Rebooted host

iperf and file transfers are now flying like they should, but failover cluster manager is throwing up its hands so I need to do more with that.

So it looks like I spoke too soon on this one. Although iperf and file transfers were a lot better, once we moved SQL over to it some applications couldn't connect to it and others were showing very slow queries. It appears we've fixed it by disabling RSC on the virtual switch.

bad boys for life
Jun 6, 2003

Nothing happens to anybody which he is not fitted by nature to bear.

What are people doing for capacity planning for large VMware deployments? vROps doesn't seem to scale to our infrastructure size. Getting recommendations to look at Turbonomic and Veeam ONE, but not sure if any of you have experience at SP-level VMware deployments and have something better.

TheFace
Oct 4, 2004

Fuck anyone that doesn't wanna be this beautiful


bad boys for life posted:

What are people doing for capacity planning for large VMware deployments? vROps doesn't seem to scale to our infrastructure size. Getting recommendations to look at Turbonomic and Veeam ONE, but not sure if any of you have experience at SP-level VMware deployments and have something better.

How big is your deployment that vROps can't scale that large? If vROps can't do it, Veeam ONE is definitely going to fall on its face; can't speak for Turbonomic.

Maneki Neko
Oct 27, 2000



bad boys for life posted:

What are people doing for capacity planning for large VMware deployments? vROps doesn't seem to scale to our infrastructure size. Getting recommendations to look at Turbonomic and Veeam ONE, but not sure if any of you have experience at SP-level VMware deployments and have something better.

This seems like a fine question for an account manager/partner resources?

Potato Salad
Oct 23, 2014

Nobody Cares




Tortured By Flan

Not gonna lie, you're the first I've bumped into this year deploying SP infrastructure on VMware at "vROps isn't enough" scale

TheFace
Oct 4, 2004

Fuck anyone that doesn't wanna be this beautiful


Between extra-large nodes and clustering, vROps scales to some 180,000-200,000 objects I think. That is assuming you can actually deploy XL nodes (24 vCPUs is a bit of an ask, especially when VMware guidelines suggest being able to fit it in a single socket).

I've never seen Veeam ONE be able to handle an environment that large without being horrifically slow.
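The sizing exercise behind that kind of answer is just division with a cluster cap. A sketch with placeholder per-node capacities — the figures below are illustrative, not the real sizing guideline numbers for any particular vROps version:

```python
import math

# Rough capacity-planning arithmetic for a monitoring cluster.
# Per-node object counts and the node cap are illustrative placeholders;
# check the sizing guidelines for your exact version before planning.
OBJECTS_PER_NODE = {"large": 10_000, "xl": 35_000}  # hypothetical figures
MAX_DATA_NODES = 8                                  # hypothetical cap

def nodes_needed(total_objects, size="xl"):
    """Nodes of a given size required to monitor total_objects."""
    n = math.ceil(total_objects / OBJECTS_PER_NODE[size])
    if n > MAX_DATA_NODES:
        raise ValueError(f"{total_objects} objects won't fit in one "
                         f"{size} cluster; consider splitting the estate")
    return n
```

The useful part is the exception path: once the object count pushes past what one cluster can hold, the conversation stops being "which tool" and becomes "how do we shard the estate," which is the wall the poster above seems to have hit.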

abelwingnut
Dec 23, 2002



probably a pretty basic problem, but i'm having issues with virtualbox.

specifically, i'm trying to run an os x vm on a windows 10 machine. no matter what i do, i cannot escape the mouse or keyboard once i boot up the vm. i press the host key combination to decapture myself from the window, but nothing. i'm wondering if it has to do with the fact i'm using a usb wireless keyboard and a usb wireless mouse? i say that because whenever i type on the vm it is a bit...choppy and stuttered. so i'm wondering if there's some connection issue happening?

in any case, i've tried changing the host key combination from right control to right shift, and it just does nothing else. like, once i start the vm and it loads, i can only access that vm. ctrl+alt+del can't get me out, nothing can.

any ideas? i also loaded both usb input devices in the vm's settings. really not sure what's going on.

wolrah
May 8, 2006
what?


abelwingnut posted:

probably a pretty basic problem, but i'm having issues with virtualbox.

specifically, i'm trying to run an os x vm on a windows 10 machine. no matter what i do, i cannot escape the mouse or keyboard once i boot up the vm. i press the host key combination to decapture myself from the window, but nothing. i'm wondering if it has to do with the fact i'm using a usb wireless keyboard and a usb wireless mouse? i say that because whenever i type on the vm it is a bit...choppy and stuttered. so i'm wondering if there's some connection issue happening?

in any case, i've tried changing the host key combination from right control to right shift, and it just does nothing else. like, once i start the vm and it loads, i can only access that vm. ctrl+alt+del can't get me out, nothing can.

any ideas? i also loaded both usb input devices in the vm's settings. really not sure what's going on.

If you have attached the USB devices to the guest, that's why you can't escape: they're literally being disconnected from the host and passed through to the guest while it's running. Don't do that unless it's what you actually want (it only really makes sense if you're trying to do a "two workstations, one PC" style setup). USB passthrough is for other devices that you need to appear as directly connected to the guest.


abelwingnut
Dec 23, 2002



ohhhhhh, got it. yea, that's worked--thanks.

now to try and figure out why this drat thing won't connect to icloud.
