|
It's not really raw performance I'm looking for so much as being able to run more VMs. I'm only at half the maximum config for memory, so I have room to grow there, and we're not projected to have rapid VM growth now that the P2V migration is completely done. ~$1000 per host for a 55% increase in FLOPS now, plus some new DIMMs down the road, is a much easier pill for management to swallow than replacing hosts at $8,000 a pop. At the rate we're going I think these R610s could run for another 4 years easily.
|
# ? Apr 22, 2014 21:11 |
|
derp nvm
|
# ? Apr 22, 2014 21:29 |
|
What are everyone's thoughts on Pernix FVP and other software solutions that use local flash/SSD storage to cache reads/writes to improve VM I/O? Right now I'm looking at a very small deployment of vSphere (3 hosts) for our first production environment. The only SAN storage we really have available to us is an EqualLogic PS4110 w/ 7.2kRPM NL-SAS drives. If I wanted to ensure a reasonable expectation of performance for our VMs, would such a solution be worth pursuing?
|
# ? Apr 22, 2014 23:52 |
|
Where's the best place to start troubleshooting high DAVG latency against a datastore? We have identical hosts with identical FC connectivity to a Hitachi-based SAN, one living on Hitachi VSP and another on Hitachi HUS, and while we get fantastic <1ms guest disk service times against VSP, we get 5-10ms guest service times that spike into 100+ms latency under very little IO load. esxtop shows the KAVG as nearly 0 and it's the DAVG that spikes into the hundred-millisecond range at random, which VMware says is on the driver side and on down, potentially being a misconfig with your HBAs or at the SAN layer. Is it worth trying to tune on the VMware side (LUN queue depth, etc.) first, or do I need to blackmail my storage guy to open a case with Hitachi? I don't have visibility into the SAN side of things, so I want to have my ducks in a row before I try to blame everything on Hitachi. Cidrick fucked around with this message at 23:57 on Apr 22, 2014 |
# ? Apr 22, 2014 23:52 |
|
Cidrick posted:Where's the best place to start troubleshooting high DAVG latency against a datastore? We have identical hosts with identical FC connectivity to a Hitachi-based SAN, one living on Hitachi VSP and another on Hitachi HUS, and while we get fantastic <1ms guest disk service times against VSP, we get 5-10ms guest service times that spike into 100+ms latency under very little IO load. esxtop shows the KAVG as nearly 0 and it's the DAVG that spikes into the hundred-millisecond range at random, which VMware says is on the driver side and on down, potentially being a misconfig with your HBAs or at the SAN layer. If the KAVG is showing near zero, I'd do this:
1) Check the ESXi image you are using and the driver version for your HBA. If you don't want to pack drivers yourself, just download the latest supported vendor-customized ESXi image for your host.
2) Ensure the firmware is up to date on the card and the BIOS
3) Ensure you are not seeing any queuing on the SAN
4) Ensure no switch issues (if possible)
5) Change the queue depths
If you need help determining what to change to what, let me know. Wicaeed posted:What are everyone's thoughts on Pernix FVP and other software solutions that use local flash/SSD storage to cache reads/writes to improve VM I/O? A Dell MD3200i/MD3220i will do SSD caching at the controller level so you don't even have to bother with the hosts. Preferably I'd do it at the storage controller level; that way, if you do need to scale out, you only need to examine the cache usage of the NAS/SAN and not each host. Sure it will be faster at the host level, but how fast does fast need to be for you? Dilbert As FUCK fucked around with this message at 01:30 on Apr 23, 2014 |
# ? Apr 23, 2014 01:27 |
|
Dilbert As gently caress posted:
What do you mean 'SSD caching on the storage controller level'? I'm familiar with the idea of storage controller caching, but the SSDs I'm looking at would be running on the VMware hosts themselves, not part of any array we have. As far as performance, I'm not really sure at this point. I'm shooting for "hopefully these guys don't notice that they are running on virtualized servers". This project is kind of our "Hey guys, virtualization isn't really that bad, see!" testbed. Right now we have two (!) Apache webservers slated for this hardware, with hopefully more to come. If people like what they see, we can use that to get some leverage for us to get some real virtualization hardware in place, instead of re-using old servers. Right now my bet is that we criminally underutilize our existing infrastructure, but that's a story for another day. Wicaeed fucked around with this message at 05:51 on Apr 23, 2014 |
# ? Apr 23, 2014 05:48 |
|
Wicaeed posted:Not really sure at this point. I'm shooting for "hopefully these guys don't notice that they are running on virtualized servers" Just go for storage-based caching on SSDs; your main drawback will be HBAs to storage and non-cached items. Other than that I am not sure what you are asking/saying. However, I'd be happy to explain things.
|
# ? Apr 23, 2014 05:51 |
|
If your KAVG is low then it's not going to be a queue depth issue, at least not on the host side. You could hit QFULL conditions on the array side, but those would still show up as device latency, not kernel latency. To properly tune queue depth you'll need to talk to your SAN admins anyway to determine the fan-in ratio and how LUNs are distributed across ports. I would do as DAF suggested and make sure you have the latest HBA firmware, but it's pretty likely that the problem is on the fabric or array, and not the host.
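For anyone else triaging this kind of thing: esxtop's guest-observed latency roughly decomposes as GAVG = KAVG + DAVG, which is what makes the "where does the latency live" call possible. A toy helper sketching that triage logic (the thresholds are common rules of thumb, not official VMware guidance):

```python
# Toy triage helper for esxtop disk latency counters.
# Thresholds below are rule-of-thumb values, not official guidance.
def triage(kavg_ms, davg_ms):
    """GAVG (guest-observed latency) = KAVG (kernel) + DAVG (device)."""
    gavg_ms = kavg_ms + davg_ms
    if kavg_ms > 2:
        where = "host side (kernel queuing -- look at queue depths)"
    elif davg_ms > 20:
        where = "below the HBA driver (fabric or array)"
    else:
        where = "nothing obviously wrong"
    return gavg_ms, where

# Cidrick's symptoms: KAVG near zero, DAVG spiking past 100ms
print(triage(0.1, 120.0))
```

With near-zero KAVG and triple-digit DAVG, this points below the driver, which matches the "fabric or array" conclusion above.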
|
# ? Apr 23, 2014 06:15 |
|
PernixData FVP is pretty awesome. Keeping data closer to the server is always going to be lower latency than storage-based caching. It's also able to hide performance problems on lower tier SANs or oversubscribed SANs.
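The "keeping data closer to the server" point is easy to see with a back-of-envelope model. All the numbers below are invented for illustration; real hit rates depend entirely on the workload:

```python
# Toy read-latency model: host-local flash cache in front of a slow SAN.
# Latencies and hit rates are made-up illustration values.
FLASH_MS = 0.2   # assumed local SSD read latency
SAN_MS = 10.0    # assumed 7.2k NL-SAS array read latency

def avg_read_latency(hit_rate, flash_ms=FLASH_MS, san_ms=SAN_MS):
    # cache hits are served locally; misses still cross the wire
    return hit_rate * flash_ms + (1 - hit_rate) * san_ms

for rate in (0.0, 0.5, 0.9):
    print(f"{rate:.0%} hit rate -> {avg_read_latency(rate):.2f} ms average read")
```

Even a mediocre hit rate drags the average way down, which is also why it can paper over an oversubscribed array.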
|
# ? Apr 23, 2014 15:06 |
|
Cidrick posted:Where's the best place to start troubleshooting high DAVG latency against a datastore? We have identical hosts with identical FC connectivity to a Hitachi-based SAN, one living on Hitachi VSP and another on Hitachi HUS, and while we get fantastic <1ms guest disk service times against VSP, we get 5-10ms guest service times that spike into 100+ms latency under very little IO load. esxtop shows the KAVG as nearly 0 and it's the DAVG that spikes into the hundred-millisecond range at random, which VMware says is on the driver side and on down, potentially being a misconfig with your HBAs or at the SAN layer. I don't know enough to give too useful a suggestion, but we're looking at HDS HUS right now and they do a product called "Tuning Manager" which seems to give reports on absolutely anything/everything imaginable to help ascertain what's happening at an array level.
|
# ? Apr 23, 2014 18:03 |
|
Bitch Stewie posted:I don't know enough to give too useful of a suggestion, but we're looking at HDS HUS right now and they do a product called "Tuning Manager" which seems to give reports on absolutely anything/everything imaginable to help ascertain what's happening at an array level. If you want that, make sure you get the license for it. Seems like everything Hitachi sells is separately licensed...
|
# ? Apr 23, 2014 21:07 |
Sometimes when I boot up my Mint 16 VM with VirtualBox the 2nd monitor kinda overlaps the first one and windows don't correctly maximize (leaves a small portion of the top screen). I can always fix it by going to Monitor Preferences and disabling the 2nd monitor and then re-enabling it. Just annoying to have to do this every time I boot up the VM though. Any suggestions for how to fix it? Can I just automate disabling & re-enabling the 2nd screen on boot?
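You can probably script the same toggle you're doing by hand. A sketch of the idea (the output name "VGA-2" is a placeholder; run "xrandr -q" inside the guest to find yours, then hook this into Startup Applications so it fires at login):

```python
# Script the manual "disable, then re-enable the 2nd monitor" fix.
# "VGA-2" is a placeholder output name -- check `xrandr -q` for yours.
import subprocess

OUTPUT = "VGA-2"

def toggle_cmds(output):
    # the same two steps done by hand in Monitor Preferences:
    # turn the output off, then bring it back with its preferred mode
    return [
        ["xrandr", "--output", output, "--off"],
        ["xrandr", "--output", output, "--auto"],
    ]

def fix_second_monitor(output=OUTPUT):
    for cmd in toggle_cmds(output):
        subprocess.run(cmd, check=True)

print(toggle_cmds(OUTPUT))
```

A plain two-line xrandr shell script in the session's autostart would do the same job; this is just the scripted version of the workaround, not a fix for the underlying Guest Additions bug.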
|
|
# ? Apr 23, 2014 23:01 |
|
If I want to create an OVA from a folder with an OVF and VMDK, is that just a zip file renamed to OVA or is there some other magic that has to happen?
|
# ? Apr 24, 2014 02:57 |
|
Oh look, a third intermittent IvyBridge PSOD scenario on multiple fully patched hosts that the hardware vendor can't diagnose. So now I have hosts that randomly have memory parity errors, some that have NMIs from unidentified 3rd party drivers, and some with recursive CPU0 panics. At least the memory parity issue seems to go away with their latest BIOS firmware; that doesn't fix the other two yet though.
|
# ? Apr 24, 2014 16:47 |
|
Martytoof posted:If I want to create an OVA from a folder with an OVF and VMDK, is that just a zip file renamed to OVA or is there some other magic that has to happen? Just deploy the ovf then export to ova.
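Slight correction to the "zip file" guess upthread: an OVA isn't a zip, it's a plain tar archive with the .ovf descriptor as the first entry, so you can also roll one by hand. A sketch (strict importers also want a ustar-format tar, and the .mf manifest digests have to still match if one is present):

```python
# Pack an OVF directory into an OVA: plain tar, .ovf entry first.
import pathlib
import tarfile

def make_ova(folder, ova_path):
    folder = pathlib.Path(folder)
    ovf = next(folder.glob("*.ovf"))          # descriptor must come first
    rest = sorted(p for p in folder.iterdir() if p != ovf)
    with tarfile.open(ova_path, "w", format=tarfile.USTAR_FORMAT) as tar:
        tar.add(ovf, arcname=ovf.name)
        for p in rest:                        # .mf, .vmdk, etc. after it
            tar.add(p, arcname=p.name)
```

The deploy-then-export route is the safe one, but this saves the 30 minutes if the folder's descriptor and manifest are already consistent.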
|
# ? Apr 24, 2014 16:54 |
|
NullPtr4Lunch posted:If you want that, make sure you get the license for it. Seems like everything Hitachi sells is separately licensed... True, but it's around a grand so if someone has a VSP but doesn't have it I'd be.. surprised
|
# ? Apr 24, 2014 17:40 |
|
Mausi posted:Oh look, a third intermittent IvyBridge PSOD scenario on multiple fully patched hosts that the hardware vendor can't diagnose. Dumb question, but are you running any power settings on these hosts? IIRC E1 stepping in IvyBridge had some really weird issues. Maybe just set it to High Performance so the CPU won't step? I know it's a shot in the dark but it's the only thing that comes to mind
|
# ? Apr 24, 2014 17:57 |
|
Dilbert As gently caress posted:Dumb question, but are you running any power settings on these hosts? IIRC E1 stepping in IvyBridge had some really weird issues. Maybe just set it to High Performance so the CPU won't step?
|
# ? Apr 24, 2014 18:39 |
|
Setting HP power management to OS control can for sure produce PSODs as well. Doesn't sound like that's your issue though.
|
# ? Apr 24, 2014 23:02 |
|
Is it a bad design decision to separate NFS and iSCSI? I mean, I know you can use NIOC and some things with VMkernels, but I thought it was always best to separate storage protocols whenever possible.
|
# ? Apr 24, 2014 23:36 |
|
Dilbert As gently caress posted:Is it a bad design decision to separate NFS and ISCSI? Might not physically isolate them (on 10GbE anyway) but it's helpful to toss them in their own VLAN. Also depending on network topology may be helpful to set CoS on your storage traffic so you can manage it better through the physical network infrastructure. For example a Nexus will take CoS 2 marked frames and put them in a "no drop" class to guarantee delivery. Provide more details about the network topology/configuration and host configuration.
|
# ? Apr 24, 2014 23:58 |
|
Mausi posted:Oh look, a third intermittent IvyBridge PSOD scenario on multiple fully patched hosts that the hardware vendor can't diagnose.
|
# ? Apr 25, 2014 00:09 |
|
Dilbert As gently caress posted:Just deploy the ovf then export to ova. Yeah, that's what I ended up doing, I was just hoping to save myself 30 minutes.
|
# ? Apr 25, 2014 00:41 |
|
I'm trying to make a host out of an HP Compaq 8300 Elite but I'm hitting a dead end because the network adapter doesn't seem to be recognized. I've seen some similar problems from other people on the internet but I don't exactly follow what they're doing to fix it. Edit: Is this something worth trying to fix or should I just go to Fry's and grab a supported NIC? Dr. Arbitrary fucked around with this message at 05:03 on Apr 25, 2014 |
# ? Apr 25, 2014 04:57 |
|
Dr. Arbitrary posted:I'm trying to make a host out of a HP Compaq 8300 elite but I'm hitting a dead end because the network adapter doesn't seem to be recognized. Is it a broadcom or intel adapter?
|
# ? Apr 25, 2014 14:22 |
|
Dilbert As gently caress posted:Is it a bad design decision to separate NFS and ISCSI? The reasons for separating them are generally for ease of management or flexibility rather than performance. Different VLANs will allow you to apply different jumbo frames policies depending on the type of storage traffic, and the CoS example mentioned by 1000101 is another example. Sometimes it also makes sense to segregate not just by protocol but more granularly, by purpose or management domain. For instance we have different VLANs for client NFS, client iSCSI, VMWare NFS and VMWare iSCSI because people assigning IPs on the client side generally aren't the same as people assigning them on the ESX side and they're much more likely to use an IP that is already on the network and cause issues (IP conflicts that take datastores offline suck a lot). I consider it a best practice because VLANs are free and private IP spaces are free and it makes things a lot cleaner, so why not do it? But it's not a requirement and it won't cause performance problems or anything if you don't do it.
|
# ? Apr 25, 2014 15:13 |
|
It comes down to operational preferences, imo.
|
# ? Apr 25, 2014 15:49 |
|
Something I've been pondering recently along those lines is what to do with 10GbE. Traditionally with gigabit I've always used separate physical ports for iSCSI and had those switch ports on their own VLAN to keep the storage traffic isolated from everything else. With 10x the bandwidth available though, is it wise to run LAN/storage traffic together on the same interface? Still separated with VLANs, but sharing physical ports. The idea makes me uneasy, but I can't think of a good reason why it wouldn't work as long as the link wasn't saturated.
|
# ? Apr 25, 2014 16:03 |
|
sanchez posted:Something I've been pondering recently along those lines is what to do with 10GbE. Traditionally with gigabit I've always used separate physical ports for iscsi and had those switch ports on their own vlan to keep the storage traffic isolated from everything else. With 10x the bandwidth available though, is it wise to run LAN/Storage traffic together on the same interface? Still separated with VLANS, but sharing physical ports. This is pretty common practice. On each of my hosts I am running 2x10GbE and 2x1GbE. The 10GbE connections are doing VM, iSCSI and Management traffic. The 1GbE connections are doing vMotion.
|
# ? Apr 25, 2014 16:48 |
|
sanchez posted:Something I've been pondering recently along those lines is what to do with 10GbE. Traditionally with gigabit I've always used separate physical ports for iscsi and had those switch ports on their own vlan to keep the storage traffic isolated from everything else. With 10x the bandwidth available though, is it wise to run LAN/Storage traffic together on the same interface? Still separated with VLANS, but sharing physical ports. The likelihood that you have a single host pushing the 1.25 GB/s of throughput required to saturate a single 10GbE link is basically non-existent. With bundled links and iSCSI load balancing your SAN will likely tap out before your network links. Consolidate as much as possible (lights out will still require 1g, but everything else can be consolidated, for the most part) so you can spend less on ports, on rack space to house switches, on cabling, on management overhead from maintaining a billion different connections, and on unplanned downtime due to unnecessary complexity.
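The arithmetic behind that 1.25 GB/s figure, for anyone following along (this ignores protocol overhead, which shaves several percent off in practice):

```python
# Back-of-envelope link saturation math: bits per second / 8 = bytes.
def link_gb_per_sec(gbits):
    return gbits / 8  # 8 bits per byte

print(link_gb_per_sec(10))  # single 10GbE link -> 1.25
print(link_gb_per_sec(4))   # a 4 x 1GbE iSCSI bundle tops out at 0.5
```

Which is why a single host rarely stresses the link: the spindles behind a mid-range array tend to give out well before 1.25 GB/s of sustained reads.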
|
# ? Apr 25, 2014 17:15 |
|
Dilbert As gently caress posted:Is it a broadcom or intel adapter? I think it's Intel, it's built right onto the motherboard.
|
# ? Apr 25, 2014 17:17 |
|
1000101 posted:Might not physically isolate them (on 10GbE anyway) but it's helpful to toss them in their own VLAN. Also depending on network topology may be helpful to set CoS on your storage traffic so you can manage it better through the physical network infrastructure. For example a Nexus will take CoS 2 marked frames and put them in a "no drop" class to guarantee delivery. Running about 6 hosts, 3 with 12 and 3 with 8 1Gb/s NICs in the servers, two 3750s, and a SAN of NetApp, Dell, and soon EMC VNX 5400. SAN topology is:
- NetApp 2x1Gb/s -> Nexenta caching server (so NetApp 2Gb -> Nexenta [planned] 4x1Gb -> servers), NFS/iSCSI. Nexenta L2ARC: 384GB RAM cache, 800GB SSD cache
- Dell, 4x1Gb/s iSCSI
- EMC, 8x1Gb/s iSCSI/NFS (coming soon)
My plan was to use 4 1Gb NICs on the hosts with 12 NICs, 2 ports per QP card, 2 for NFS and 2 for iSCSI, then have a full mesh to the switches from the servers. That way if one card/NIC/switch fails, I still have a potential 2Gb/s for NFS and 2Gb/s for iSCSI if and when I lose a card/switch. We have very narrow windows in which maintenance can be performed, and when it is performed it is hard due to the rack placements, so installing and setting up the cards now allows us to not have this issue down the road. Really, the environment hasn't had proper maintenance in 4+ years, to give you an idea of what I am planning for. The servers are going to be fairly loaded with lab environments for students, with about 50-60 students on it Tues-Thurs; most of the labs are very similar, such as Sec+/ICM/VCAP/Backup and Recovery/EMC/RH/etc., mostly spawning from pre-created vApps. I'm game for using vDS and divvying, however given some of the people who like to tinker with it, I'd like to have at least 2 uplinks, 1 for iSCSI and the other for NFS, if the vDS becomes corrupt (which given what has happened to it isn't out of the question).
The kickback I got was that I'm probably going to run into some bottleneck on the NetApp, currently the only thing in production at this time, for which I know I may have to adjust the queueing. But given the fact we are implementing a Nexenta accelerator into the environment, plus the VNX and Dell, the 4x1Gb NICs seem like a reasonable design decision to make. Especially since the Nexenta will be handling a large amount of the reads (some writes), I'd like to minimize slowdowns to the 6 hosts as well as not having to revisit an issue when it becomes a problem. Dr. Arbitrary posted:I think it's Intel, it's built right onto the motherboard. If it's an Intel L825xx check the lab thread, I can't find the link but I think there is a driver floating around for it in there. NippleFloss posted:The likelihood that you have a single host pushing the 1.25 GB/s of throughput required to saturate a single 10GbE link is basically non-existent. With bundled links and iSCSI load balancing your SAN will likely tap out before your network links. Ehh, vMotion can chew up some poo poo when you throw a host into maintenance mode, especially if it's running View. Moey posted:This is pretty common practice. I actually like doing that and recommend it if you are running 10G; even most 1G environments I feel can benefit from it. Not to say I won't use a vDS for some of my traffic, but given the history with this particular environment I'd like some backup non-vDS'd storage and management just for worst case. Dilbert As FUCK fucked around with this message at 17:26 on Apr 25, 2014 |
# ? Apr 25, 2014 17:20 |
|
Dilbert As gently caress posted:If it's an intel L825xx check the lab thread I can't find the link but I think there is a driver floating around for it in there. I've done a little research and I think there is. The problem I'm having is that I'm not really comprehending what I'd be doing to fix this even at a very zoomed out level. My experience installing ESXi onto a machine is pretty much: Turn it on, pop in the disk, press OK a bunch of times, set an IP address, connect via vSphere. (I'm recalling the steps that lead to failure here from memory because I'm not at work presently) Because of the network card problem, ESXi won't even finish the install. There's a point early in the process where I can change the install options, I think by pressing ctrl-O or shift-O. This pops up a command line. From here, I'm assuming there's some way for me to install the driver via command line and then I'll be ok. If not, then I don't even have a high level comprehension of what I'm even doing, like I'm approaching this problem from a completely wrong and weird direction due to expectations I've developed while working almost exclusively with Windows based Operating Systems.
|
# ? Apr 25, 2014 17:51 |
|
Dr. Arbitrary posted:I've done a little research and I think there is. The problem I'm having is that I'm not really comprehending what I'd be doing to fix this even at a very zoomed out level. You spin a new image which includes the VIB that has the driver for your card and use that for install, generally.
|
# ? Apr 25, 2014 18:04 |
|
evol262 posted:You spin a new image which includes the VIB that has the driver for your card and use that for install, generally. Aha! I would have wasted a lot of time. Thanks. This'll be something new.
|
# ? Apr 25, 2014 18:48 |
|
Anything I can check to make sure memory performance is the best it can be on our VMware hosts? Finance has some new servers running a memory-intensive OLAP server (TM1) and performance isn't where they want it to be. Now starts the fight between infrastructure (my team), applications, and the vendor on why performance is not up to snuff. The application team and vendor don't understand VMware so they blame us. I'm 95% sure our VMware environment is set up optimally. ESXi 5.1u2 running on DL360p G8 w/ dual E5-2690s and 384GB RAM. Their main VM is running 4 vCPUs with 64GB RAM and none of the performance counters anywhere come close to showing a heavy load.
|
# ? Apr 25, 2014 18:49 |
|
skipdogg posted:Anything I can check to make sure memory performance is the best it can be on our VMware hosts? Finance has some new servers running a memory-intensive OLAP server (TM1) and performance isn't where they want it to be. Now starts the fight between infrastructure (my team), applications, and the vendor on why performance is not up to snuff. The application team and vendor don't understand VMware so they blame us. I'm 95% sure our VMware environment is set up optimally. ESXi 5.1u2 running on DL360p G8 w/ dual E5-2690s and 384GB RAM. Their main VM is running 4 vCPUs with 64GB RAM and none of the performance counters anywhere come close to showing a heavy load. Memory performance? Unless the guest is maxing out, I'd check:
1) Ensure VMware Tools is the latest, then the virtual hardware version
2) Check swapping at the guest, or how the guest is handling the process in RAM
3) Ensure EPT is enabled in the BIOS
4) Set a reservation for that VM equal to its RAM, or disable TPS (last option, please don't do this)
What is the speed of the RAM? 1333 or 1600MHz? Also check to ensure Intel VT-d is enabled. Dilbert As FUCK fucked around with this message at 19:19 on Apr 25, 2014 |
# ? Apr 25, 2014 19:05 |
|
sanchez posted:Something I've been pondering recently along those lines is what to do with 10GbE.
|
# ? Apr 25, 2014 19:28 |
|
Dilbert As gently caress posted:Memory performance? Unless the guest is maxing out, I'd check The guest seems fine. I'm really just throwing poo poo at the wall and seeing what sticks to be honest; maybe someone has run into some deep setting that might help. Tools are latest. Hardware is v7 though, not sure if moving to 9 would help. RAM speed is 1333MHz, operating at 1067MHz due to the config (24 x 16GB Low Voltage RDIMMs). I thought about setting a memory reservation for the machine, but the host it's on right now is only using 90/384GB of RAM, so it shouldn't be a problem I would think. Personally we're convinced that the application needs to be tuned properly, but you know... politics and poo poo.
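For what it's worth, that 1333 -> 1067 downclock (presumably from populating three DIMMs per channel at 24 DIMMs across two sockets) costs roughly 20% of peak per-channel bandwidth. Rough DDR3 math only; real-world application impact is usually much smaller:

```python
# Rough DDR3 peak bandwidth per channel: megatransfers/sec x 8 bytes wide.
def channel_gb_per_sec(mt_per_sec):
    return mt_per_sec * 8 / 1000

full = channel_gb_per_sec(1333)   # rated speed
down = channel_gb_per_sec(1067)   # speed after the downclock
print(f"{full:.1f} vs {down:.1f} GB/s, {1 - down / full:.0%} less peak bandwidth")
```

Worth ruling out, but it wouldn't explain a large gap on its own, so the application-tuning theory still looks more likely.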
|
# ? Apr 25, 2014 20:50 |
|
skipdogg posted:The guest seems fine. I'm really just throwing poo poo at the wall and seeing what sticks to be honest, maybe someone has ran into some deep setting that might help. Virtual hardware definitely can help improve guest operations, but yeah I'd place my money on application tuning.
|
# ? Apr 25, 2014 21:01 |