theperminator
Sep 16, 2009

by Smythe
Fun Shoe
If the task has to be compromised to the point where it's a ticking timebomb due to money/staff skill then the responsible thing to do is tell management you can't do it and why.

Zero VGS
Aug 16, 2002
ASK ME ABOUT HOW HUMAN LIVES THAT MADE VIDEO GAME CONTROLLERS ARE WORTH MORE
Lipstick Apathy

NippleFloss posted:

Honestly the whole thing is weird. He originally stated that his requirements were 100% uptime and when told it was basically impossible with his budget he said "well obviously I don't really mean 100%, you guys just read that all wrong". Then he said he can't use VMware because his company is a competitor, but what competitor to VMware can't scrape together more than like 5k for infrastructure hardware to support a 600 person call center that is presumably pretty important, otherwise why pay 600 people to man it? And then you go back to his previous job post where he basically jumped into a job that he was very under-qualified for because he wants to stack that paper and retire early or something, and the whole thing paints this portrait of a really dysfunctional environment that he's treating as a playground to try whatever random idea pops up.

I think a single hyper-v host running on hardware raid with frequent backups to a cheap nas would end up being more resilient because it would have a support contract backing it and would be more in line with his technical capabilities. And if management truly doesn't care about uptime then the lack of storage redundancy wouldn't be a show stopper.

To address all that:

- I was thinking I was being theoretical when I first said 100% uptime, that was my bad for not realizing that is an actual term and can be taken seriously. The two VMs I have slated so far on this are for voicemails (which management confirmed almost no one actually uses in the call center, to the point that we agreed to buy standalone user licenses without voice mailbox licenses going forward), and recording PoE security camera footage (which was mission-critical at my last job, not for a call center). They can even be taken down for maintenance for several days if need be. But I want to learn to build for high availability because if this winds up being rock-solid, I can start virtualizing other nice-to-have quality of life servers.

- Support is the key word here, I'm planning this system to support the call center staff but the only true line-of-business stuff we run is O365 and Salesforce. Plenty of people can work just as well from home without any VPN.

- My first IT job had me running the whole network on a $2 billion warship, after that I was the only IT guy for three psychiatric hospital facilities. I wasn't under-qualified for either, I was given embarrassingly small budgets and I learned that bitching to management doesn't solve anything when the scrimping is institutional. I decided to embrace it, be really meticulous with backups, deliberate with changes, and document the hell out of all my ghetto hacks. My old bosses still tell me they appreciate it. And I just insisted on some haughty job title because it was free to negotiate it and while it might not help my career it certainly doesn't hurt.

- A single hyper-v host may indeed be more appropriate, but I'm prepared to put in the work to avoid nag calls about needing to reboot the thing, like I do now with the moron IT guy who left this Dell Optiplex running the voicemail.

Can I coin the phrase "developroductionment"?

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


Docjowles posted:

I'm with evol on this one (and MC Fruit Stripe talked about this behavior among tech people recently too). Dude has given us his situation and requirements, and they aren't going to change. Even if you think they are dumb--which for the record, I do. Screaming "UR DOIN IT RONG" isn't going to help anything. Might as well give him the best path to "success" given his constraints.

I don't know what Fruit Stripe is referring to or the background but it's clear the company is asking for the impossible.

Instead of trying to figure out the best way to jump on a grenade, I'd refocus and look for greener pastures. When it goes off, not only will you be unemployed, but your future career will become progressively more difficult.

Pile Of Garbage
May 28, 2007



Zero VGS posted:

To address all that:

- I was thinking I was being theoretical when I first said 100% uptime, that was my bad for not realizing that is an actual term and can be taken seriously. The two VMs I have slated so far on this are for voicemails (which management confirmed almost no one actually uses in the call center, to the point that we agreed to buy standalone user licenses without voice mailbox licenses going forward), and recording PoE security camera footage (which was mission-critical at my last job, not for a call center). They can even be taken down for maintenance for several days if need be. But I want to learn to build for high availability because if this winds up being rock-solid, I can start virtualizing other nice-to-have quality of life servers.

- Support is the key word here, I'm planning this system to support the call center staff but the only true line-of-business stuff we run is O365 and Salesforce. Plenty of people can work just as well from home without any VPN.

- My first IT job had me running the whole network on a $2 billion warship, after that I was the only IT guy for three psychiatric hospital facilities. I wasn't under-qualified for either, I was given embarrassingly small budgets and I learned that bitching to management doesn't solve anything when the scrimping is institutional. I decided to embrace it, be really meticulous with backups, deliberate with changes, and document the hell out of all my ghetto hacks. My old bosses still tell me they appreciate it. And I just insisted on some haughty job title because it was free to negotiate it and while it might not help my career it certainly doesn't hurt.

- A single hyper-v host may indeed be more appropriate, but I'm prepared to put in the work to avoid nag calls about needing to reboot the thing, like I do now with the moron IT guy who left this Dell Optiplex running the voicemail.

Can I coin the phrase "developroductionment"?

I've only been half-heartedly skimming the discussion going on but this post has made me look back through and re-read things. Honestly you sound super-motivated but somewhat clueless about your place in the business, which is solely to provide value. Building a production system on a platform of which you have limited knowledge and no reference architecture available doesn't add value and only adds risk. Building a proof-of-concept system and building a production system are two entirely separate tasks, yet you want to merge them into one, which is a universally bad idea. Sure, maybe your system will work flawlessly, but have you accounted for the amount of time which will be required to write documentation to support the system? Because if not, then you still aren't adding value, because a working but undocumented system is useless.

I think you need to take a step back and rethink things.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
Do whatever you think gets you the best career development, because with this budget it's clear the company doesn't and won't give a poo poo about the outcome anyway. At least when the time comes to swim away from this sinking ship of an IT operation you'll be closer to shore.

Internet Explorer
Jun 1, 2005





Local storage on two hosts with backups and the ability to restore to the other host is all that is needed. The current plan is super reckless.

evol262
Nov 30, 2010
#!/usr/bin/perl
ITT: goonsplaining about how software stacks they've never used and know nothing about are reckless

In what world is an active+cold standby less reckless than just backing up the brick (or its contents) on the off chance that something goes catastrophically wrong with gluster's distributed replicate on a system he won't need to update or manage for months?

Tab8715 posted:

I don't know what Fruit Stripe is referring to or the background but it's clear the company is asking for the impossible.

Instead of trying to figure out the best way to jump on a grenade, I'd refocus and look for greener pastures. When it goes off, not only will you be unemployed, but your future career will become progressively more difficult.

It sounds like the company isn't asking for this at all, but they told him he can do something with it if he wants. What the blowback will be if something goes wrong is another question (virt-p2v is pretty easy anyway), but it sounds like most of their business critical stuff isn't on premises anyway.

The killer for career prospects is the insistence on not taking a salary/title cut to go to a position where his skill level is appropriate, which will probably continue when/if this goes down.

Misogynist posted:

Do whatever you think gets you the best career development, because with this budget it's clear the company doesn't and won't give a poo poo about the outcome anyway. At least when the time comes to swim away from this sinking ship of an IT operation you'll be closer to shore.

It sounds like pretty normal "small shop with one IT guy gives him free rein, has no idea what a good environment looks like" stuff to me. I have a friend back home with a director title who's also the sole IT guy for a reasonably-sized business (which is the home office for a bunch of franchises to which they provide email addresses and a basic wiki+forum or something), and everything he talks about makes me cringe. Same for my wife's uncle's agriculture business. Runs fine (or makes low 10s of millions a year), but they have one IT guy who has no idea what he's doing and are literally still using Coppermine P3s running Win2k for some of their applications. Nobody inside the business sees this as a time bomb.

It could keep going like this for a long time...

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


evol262 posted:

ITT: goonsplaining about how software stacks they've never used and know nothing about are reckless

vs. actually using one you know nothing about, in production?

evol262 posted:

The killer for career prospects is the insistence on not taking a salary/title cut to go to a position where his skill level is appropriate, which will probably continue when/if this goes down.

If you didn't know that ssh was used to interact with Linux, your job title ought not to include senior.

evol262
Nov 30, 2010
#!/usr/bin/perl

Tab8715 posted:

vs. actually using one you know nothing about, in production?
I'm not sure what that's supposed to mean. The whole "he's already doing this even if you think it's a bad idea" has gone round and round on the last two pages, and bringing it up again isn't advancing the discussion in any way.

If you're implying that I don't know anything about VMware or hyper-v and haven't used them in production, you're wildly off base.

If you mean he's using a software stack he knows nothing about, that's true of all virt stacks, and there's no way to fulfill what he wants to do without one. If by "production" you mean "this app could probably be down for a few days, because everything is on O365 and Salesforce", we have really different ideas of production.

Tab8715 posted:

If you didn't know that ssh was used to interact with Linux, your job title ought not to include senior.

There are potentially large numbers of "senior" people (.net devs, windows admins, etc) who don't know anything about Linux and don't need to.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


evol262 posted:

There are potentially large numbers of "senior" people (.net devs, windows admins, etc) who don't know anything about Linux and don't need to.

There are senior Windows admins and developers who've probably never heard of ssh, but that isn't the case here. You're trying to implement a solution that's command-line and Linux heavy for a user who isn't experienced, when that's something you should really have some prior background in.

I missed the earlier discussion but we're essentially arguing what's the safest way to dive on a grenade.

evol262
Nov 30, 2010
#!/usr/bin/perl

Tab8715 posted:

There are senior Windows admins and developers who've probably never heard of ssh, but that isn't the case here. You're trying to implement a solution that's command-line and Linux heavy for a user who isn't experienced, when that's something you should really have some prior background in.
That's the thing, though. It isn't. As I've said repeatedly. adorai may be the only person in this thread who's used it.

It takes, as noted, 3 commands to bring up the engine, which lets you configure everything. We also publish an image which requires zero commands and hides the command line. There's a menu-driven configuration for a number of fields, but you only need hostname and network.
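
For reference, on a stock CentOS 7 box the whole thing is roughly this (the repo RPM name is from memory, so check it against the current release):

code:
yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release35.rpm
yum install ovirt-hosted-engine-setup
hosted-engine --deploy    # interactive; asks for hostname, network, and storage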

It isn't command line heavy. At all. Neither is adding networks to hosts, or configuring them (the screenshots that went here showed each of those in the web admin UI).

It's a virtualization solution based on Linux. Not a Linux solution for virtualization. You can do whatever Linux thing you want, but you don't need to know Linux any more than you need to for XenServer (Linux based) or needed to for ESX (management console was Linux based).

The only common use case you still need to use the console for is adding isos to boot from, which is obnoxious.

This thread treats open source like a pejorative, and assumes that an open source, upstream solution is going to be some hacky thing that breaks frequently and requires a bunch of Linux knowledge. Again, it's basically RHEV, which does extremely well on ease of use comparisons.

RHEV has low marketshare. I'm not surprised people haven't used it. But don't talk authoritatively about what it is and what it's like if you don't have any idea.

Tab8715 posted:

I missed the earlier discussion but we're essentially arguing what's the safest way to dive on a grenade.
Read the earlier discussion or read between the lines of the responses where it's a side project he wants to do that the business doesn't care about or see as critical.

So we're saying "in a two host environment with no storage array, no budget, and no business pressure, how do you make things as resilient as possible?"

Not how to dive on a grenade. And it's a better task for a lab at home. But the incessant harping and need for a pecking order on SH/SC is out of hand. For the nth time, everyone already told him it was a bad idea. Everyone reading the thread gets that. It's not your rear end on the line if anything happens. It's not your responsibility if he lied to this thread and it is business critical. It's past that time.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


Fair defense, I stand corrected.

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer
I'll confirm that it is pretty easy to get ovirt working on centos7. Hosted engine and gluster do make it more complicated.

Cidrick
Jun 10, 2001

Praise the siamese
Are there any good design docs for setting up a distributed virtualization (oVirt + KVM or otherwise) cluster using all local disk, with shared storage running on GlusterFS bricks? I'd kind of like to try it out in our lab environment as a POC since we have a bunch of old Hadoop machines lying around with fat local SATA drives that I would love to start stacking VMs on. I have zero experience with Gluster but I'd like to start playing with it.

I'm not too concerned with a step by step guide on what to do, but rather a "here's how you should lay things out and here's how you should scale it" type of write-up.

evol262
Nov 30, 2010
#!/usr/bin/perl

Cidrick posted:

Are there any good design docs for setting up a distributed virtualization (oVirt + KVM or otherwise) cluster using all local disk, with shared storage running on GlusterFS bricks? I'd kind of like to try it out in our lab environment as a POC since we have a bunch of old Hadoop machines lying around with fat local SATA drives that I would love to start stacking VMs on. I have zero experience with Gluster but I'd like to start playing with it.

I'm not too concerned with a step by step guide on what to do, but rather a "here's how you should lay things out and here's how you should scale it" type of write-up.

oVirt is always KVM. For better or worse, it's not something you can just stick on top of an existing environment, though. It expects to be on a dedicated virt setup and to own all the relevant components. vdsm (essentially the glue between libvirt/network/storage and the web ui) in particular doesn't play well. Migrating from a plain libvirt environment to ovirt involves standing up one host and virt-v2v'ing machines. If you're starting from scratch, though...

What kind of use case are you going for? Ceph RBD isn't supported by oVirt, but it's significantly better at some workloads as long as you're ok doing a little extra work. It's especially good at tiering storage and letting you configure fast/slow pools, or splitting up pools of disks on a single chassis, which gluster is frankly poo poo at.

If you've got a bunch of identical machines with no other constraints, though, gluster's pretty great. You basically want to set up the disks for optimal local performance (hardware raid or mdraid or whatever), mount them somewhere, and use that as a volume. Change the volume ownership to the kvm or qemu user.

No real reason to ever use more than replicas=3, which just ends up with a ton of storage traffic and lost capacity for not a lot of gain.

Gluster really starts to shine at 4+ hosts (and it actually gets outperformed by local filesystems with less than 3 hosts, last time I saw). lookup-unhashed should almost always be on (otherwise it multicasts to all nodes and is a serious killer when dealing with a lot of small files, which libvirt isn't, but I don't know your use case).

Other important tuneables are:
  • performance.write-behind-window-size – Default: 1MB.
  • performance.cache-refresh-timeout – Default: 1 second.
  • performance.cache-size – Default: 32MB.
  • cluster.stripe-block-size – Default: 128KB.
  • performance.io-thread-count – Default: 16.
What these do is mostly obvious by their names. You probably want to crank up the block size for qcows, but I'll take it for granted that a hadoop shop knows how to tune disks.
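
Rough sketch of the whole dance, with made-up hostnames and brick paths, and values you'd obviously tune yourself:

code:
# bricks already RAIDed and mounted at /bricks/vmstore on each host
gluster peer probe host2
gluster peer probe host3
gluster volume create vmstore replica 3 host1:/bricks/vmstore/brick host2:/bricks/vmstore/brick host3:/bricks/vmstore/brick
gluster volume set vmstore performance.cache-size 256MB
gluster volume set vmstore performance.io-thread-count 32
gluster volume set vmstore cluster.lookup-unhashed on
gluster volume start vmstore
chown -R qemu:qemu /bricks/vmstore    # or 36:36 (vdsm:kvm) if oVirt owns it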

I'd also say that if you have the budget, I'd probably just use VMware+vSAN. The gluster support is ok, but there are a ton of integration pieces slated for oVirt 3.6 which will make it a lot nicer, though that caveat goes everywhere.

Cidrick
Jun 10, 2001

Praise the siamese

evol262 posted:

oVirt is always KVM. For better or worse, it's not something you can just stick on top of an existing environment, though. It expects to be on a dedicated virt setup and to own all the relevant components. vdsm (essentially the glue between libvirt/network/storage and the web ui) in particular doesn't play well. Migrating from a plain libvirt environment to ovirt involves standing up one host and virt-v2v'ing machines. If you're starting from scratch, though...

I don't really have any plans to migrate from anything, so this would be from scratch, and the oVirt piece is just going to be for my POC environment. My long term goal is to design a compute and storage platform to move our internal cloud platform onto (it's currently running on Cloudstack + KVM, all on local disk on 1U pizza boxes): something that runs against a shared storage pool sitting on top of Gluster, so that we get the flexibility of shared storage and can juggle machines around in the environment without throwing tons of money at Hitachi for another array.

Don't get me wrong - all of our Hitachi arrays have been rock solid, but they're very expensive to get going and to manage, and they're difficult to scale. I'd much rather buy more cheap servers and shove them into the cluster, since we can get a new server in a couple of days, whereas adding shelves to a Hitachi array (or Nimble, or NetApp, or whatever you end up using) takes weeks or months depending on the vendor and how much of a pain in the rear end the procurement department is feeling like being that week. Yes, I realize that adding commodity hardware with a bunch of local disk is not going to be as rock solid as a dedicated storage array, but my hope is that Gluster is robust enough nowadays to gracefully handle hardware failures in the environment with all its self-healing features.

evol262 posted:

What kind of use case are you going for? Ceph RBD isn't supported by oVirt, but it's significantly better at some workloads as long as you're ok doing a little extra work. It's especially good at tiering storage and letting you configure fast/slow pools, or splitting up pools of disks on a single chassis, which gluster is frankly poo poo at.

If you've got a bunch of identical machines with no other constraints, though, gluster's pretty great. You basically want to set up the disks for optimal local performance (hardware raid or mdraid or whatever), mount them somewhere, and use that as a volume. Change the volume ownership to the kvm or qemu user.

This is pretty much the use case, yes. Our local disk footprint for our app tier is both minimal and ephemeral, which is how we get away with running everything on local disk without any shared storage. The only thing our apps really do is log to disk, which we're shipping off to a logstash environment anyway, so performance isn't a huge concern. Or at least, not at the forefront of our requirements. The VMs are all qcow2s based off CentOS images that I maintain, so the image deduplication is already handled for us in that regard.

I hadn't looked at Ceph though. This is still just a twinkle in my eye at the moment so I'm trying to figure out what's out there.

evol262 posted:

I'd also say that if you have the budget, I'd probably just use VMware+vSAN. The gluster support is ok, but there are a ton of integration pieces slated for oVirt 3.6 which will make it a lot nicer, though that caveat goes everywhere.

VMware is off the table completely for us. As much as I have loved working with VMware over the years, my company's relationship with them is basically irreparably poisoned so I'm looking at open source alternatives. We had about a hundred blades all running ESXi in a nicely partitioned farm all backed by a couple of NetApp heads that worked fairly well, but I'm faced with moving everything onto Cloudstack, so I'm trying to come up with a nicely scaling platform that will handle it all without the limitation of a single pair of NetApps being the single choke point for all storage in the environment.

evol262
Nov 30, 2010
#!/usr/bin/perl

Cidrick posted:

I don't really have any plans to migrate from anything, so this would be from scratch, and the oVirt piece is just going to be for my POC environment. My long term goal is to design a compute and storage platform to move our internal cloud platform onto (it's currently running on Cloudstack + KVM, all on local disk on 1U pizza boxes): something that runs against a shared storage pool sitting on top of Gluster, so that we get the flexibility of shared storage and can juggle machines around in the environment without throwing tons of money at Hitachi for another array.

Don't get me wrong - all of our Hitachi arrays have been rock solid, but they're very expensive to get going and to manage, and they're difficult to scale. I'd much rather buy more cheap servers and shove them into the cluster, since we can get a new server in a couple of days, whereas adding shelves to a Hitachi array (or Nimble, or NetApp, or whatever you end up using) takes weeks or months depending on the vendor and how much of a pain in the rear end the procurement department is feeling like being that week. Yes, I realize that adding commodity hardware with a bunch of local disk is not going to be as rock solid as a dedicated storage array, but my hope is that Gluster is robust enough nowadays to gracefully handle hardware failures in the environment with all its self-healing features.
Honestly, I'd say that you should move to Openstack if you wanna move away from cloudstack. We recently (Icehouse-ish) added support for HA to Nova (the compute agent). Ceph is kind of the default, and it natively does block storage nicer than gluster (ceph -> block, gluster -> file, swift -> object), but it also works with gluster's block storage mode, supposedly. Which I've never used.

oVirt is great and all, and we're converging with parts of the openstack ecosystem. Glance (image), Neutron (network), and upcoming Cinder (block) support are there. We support cloud-init. But it's booting from templates. You can create VMs from templates and configure them and boot them up with cloud-init data from java/json/python really easily, but it's more like "here's this cloud-y stuff if you want a traditional virt system that can do some of this stuff." If you're full-bore into cloud (like cloudstack), openstack is just a more natural fit. And you can still make that HA and juggle stuff around with Ceph backed volumes.

Cidrick posted:

This is pretty much the use case, yes. Our local disk footprint for our app tier is both minimal and ephemeral, which is how we get away with running everything on local disk without any shared storage. The only thing our apps really do is log to disk, which we're shipping off to a logstash environment anyway, so performance isn't a huge concern. Or at least, not at the forefront of our requirements. The VMs are all qcow2s based off CentOS images that I maintain, so the image deduplication is already handled for us in that regard.
Openstack.

Cidrick posted:

I hadn't looked at Ceph though. This is still just a twinkle in my eye at the moment so I'm trying to figure out what's out there.
Openstack!

Seriously. I work on both products, and I'd recommend oVirt if it fit what it sounds like you want (and if you want to use traditional virt, feel free to contradict me), but...

Cidrick posted:

VMware is off the table completely for us. As much as I have loved working with VMware over the years, my company's relationship with them is basically irreparably poisoned so I'm looking at open source alternatives. We had about a hundred blades all running ESXi in a nicely partitioned farm all backed by a couple of NetApp heads that worked fairly well, but I'm faced with moving everything onto Cloudstack, so I'm trying to come up with a nicely scaling platform that will handle it all without the limitation of a single pair of NetApps being the single choke point for all storage in the environment.

Most of the companies I've been at are moving off of VMware, but I still recommend it for some things that there just aren't good alternatives to. If what people are looking for is a basic traditional virt setup comparable to vSphere, there are loads of options. Once you start adding vShield and GRID and all the rest, it gets complex really fast, and it's better to just recommend VMware. But I get what you mean. Especially as a shop with a significant cloud infrastructure, they don't have a lot to offer.

bull3964
Nov 18, 2000

DO YOU HEAR THAT? THAT'S THE SOUND OF ME PATTING MYSELF ON THE BACK.


We're going to be changing up the network adapters in our VMWare hosts soon and I'm trying to clarify a few things relating to MTU.

We have jumbo frames enabled on our VMotion network and iSCSI network, but not for general virtual machine networks or management network.

Up until now, everything has had their own vSwitch since we were using 12 1gb adapters per host. All functions were segregated. We will be moving to 4x 10gb adapters so things will be sharing.

So, if I wanted Management, VMotion, and Virtual Machine traffic to be on the same vSwitch (sharing two of the 10gb adapters), then I would set the vSwitch to 9000 MTU, VMKernel Ports for VMotion to 9000 MTU, Management VMKernel ports to 1500 MTU, and the Virtual Machine Port groups will use whatever MTU the guest adapters in the virtual machines are configured with?

That last bit is what tripped me up since there is no MTU configuration option for the Virtual Machine Port Groups in VMWare, but I guess it makes sense that it's up to the guest's setting.

bull3964 fucked around with this message at 23:06 on Apr 13, 2015

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

evol262 posted:

Read the earlier discussion or read between the lines of the responses where it's a side project he wants to do that the business doesn't care about or see as critical.

So we're saying "in a two host environment with no storage array, no budget, and no business pressure, how do you make things as resilient as possible?"

Not how to dive on a grenade. And it's a better task for a lab at home. But the incessant harping and need for a pecking order on SH/SC is out of hand. For the nth time, everyone already told him it was a bad idea. Everyone reading the thread gets that. It's not your rear end on the line if anything happens. It's not your responsibility if he lied to this thread and it is business critical. It's past that time.
Nobody in this thread, or anywhere, would have any career whatsoever if working in IT wasn't a balancing act between what's best for the business and what's best for the career development and growth of the people on the team. If you spend 0% of your day trying to take on challenges that are too big for you, you never escape the call center. In dead-end environments with little to no growth potential, sometimes you just have to make those projects happen for yourself.

If there's one thing that drives me up the fuckin' wall, it's these people who insist you need to spend five years watching a cook before you can even touch the sushi rice.

Vulture Culture fucked around with this message at 00:02 on Apr 14, 2015

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer

Misogynist posted:

If there's one thing that drives me up the fuckin' wall, it's these people who insist you need to spend five years watching a cook before you can even touch the sushi rice.
I think they are more concerned that he should learn proper food prep and storage techniques before the customers get food poisoning and die from eating un(der)cooked seafood without informed knowledge of the risks. Personally, I am a believer in the IT cowboy in smaller businesses, it's fun and most of the time it works out.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

adorai posted:

I think they are more concerned that he should learn proper food prep and storage techniques before the customers get food poisoning and die from eating un(der)cooked seafood without informed knowledge of the risks. Personally, I am a believer in the IT cowboy in smaller businesses, it's fun and most of the time it works out.

Cidrick
Jun 10, 2001

Praise the siamese

evol262 posted:

Openstack!

Seriously. I work on both products, and I'd recommend oVirt if it fit what it sounds like you want (and if you want to use traditional virt, feel free to contradict me), but...

Heh, fair enough. Honestly I'm less concerned about migrating from Cloudstack to Openstack because most everyone I work with in operations at my company is on board with doing that, but my concern is coming up with a scaled storage model that will work with both Cloudstack AND Openstack while we transition. I mostly wanted to dick with oVirt + Gluster just so I could learn the ropes of a distributed storage environment without having to go whole-hog with a full infrastructure stack, but I suppose I had better just dive in, because you're absolutely right, there's not really a reason to play with oVirt at this point.

How stable and mature is Ceph? Does it do all the replication and self-healing of distributed storage that Gluster does? I admittedly know very little about it.

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer

Cidrick posted:

How stable and mature is Ceph? Does it do all the replication and self-healing of distributed storage that Gluster does? I admittedly know very little about it.
Ceph is pretty great. I haven't had an opportunity to play with it in a real environment, but everything I have read and what I have done in the lab is good stuff. CERN uses it extensively IIRC. Like, thousands of nodes.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

adorai posted:

Ceph is pretty great. I haven't had an opportunity to play with it in a real environment, but everything I have read and what I have done in the lab is good stuff. CERN uses it extensively IIRC. Like, thousands of nodes.
We're also using it, at significantly smaller scale of course, but we've put it through the wringer with lots of interesting failure cases and it does exceedingly well at self-healing everything besides bitrot. (No distributed filesystem copes well with this yet, but you can mitigate it by using something like ZFS under the hood on all your OSD nodes if it concerns you.)

One thing to note is that if you're planning on using it for general-purpose file storage using CephFS, the MDS (metadata server) can't scale out as of the Firefly release, and it's a single point of failure. (You can run multiple MDSes in active/standby, but failover isn't a zero-downtime operation.) My view is that GlusterFS is unequivocally and almost without exception a better option as a POSIX filesystem. The RBD and object storage components of Ceph work great, though.

Vulture Culture fucked around with this message at 03:00 on Apr 14, 2015

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer

Misogynist posted:

My view is that GlusterFS is unequivocally and almost without exception a better option as a POSIX filesystem. The RBD and object storage components of Ceph work great, though.
It's certainly easier. It was obvious to me that you needed someone who was an expert in Ceph to really get the benefit from it.

I wish I could justify the manpower to play with these things at my employer. We're just not big enough to stray from the tried and true VMware + HA SAN, except for very small scale projects. Sad :(

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

adorai posted:

It's certainly easier. It was obvious to me that you needed someone who was an expert in Ceph to really get the benefit from it.

I wish I could justify the manpower to play with these things at my employer. We're just not big enough to stray from the tried and true VMware + HA SAN, except for very small scale projects. Sad :(
Ceph really isn't difficult, in the sense that it doesn't really do all that much. I picked it up enough to get OpenStack running on it in a day or so. Some of the concepts are a little bit tricky to grasp, like the specific implementation of the CRUSH map, but having the server distribute the hashing algorithm to clients isn't a new approach for any distributed filesystem.

I referenced this comparison of distributed filesystems when putting together my slides, and it summarizes a lot of what you need to know about Ceph (cache coherence, repair, etc.) into a handful of paragraphs in Chapter 2: https://hal.inria.fr/hal-00789086/document

I'd still be wary of distributed filesystems for serious heterogeneous workloads like VMware clusters right now if only because the performance monitoring and analysis tools available on them are really poo poo-poor. For fairly homogeneous app workloads, they do just fine. VMware also doesn't support any parallel filesystems like pNFS, though, making their utility questionable unless you go through some really crazy poo poo with re-exporting your RBDs as iSCSI targets and multipathing them over different hosts.

Vulture Culture fucked around with this message at 03:56 on Apr 14, 2015

Nitr0
Aug 17, 2005

IT'S FREE REAL ESTATE

bull3964 posted:

We're going to be changing up the network adapters in our VMWare hosts soon and I'm trying to clarify a few things relating to MTU.

We have jumbo frames enabled on our VMotion network and iSCSI network, but not for general virtual machine networks or management network.

Up until now, everything has had their own vSwitch since we were using 12 1gb adapters per host. All functions were segregated. We will be moving to 4x 10gb adapters so things will be sharing.

So, if I wanted Management, VMotion, and Virtual Machine traffic to be on the same vSwitch (sharing two of the 10gb adapters), then I would set the vSwitch to 9000 MTU, VMKernel Ports for VMotion to 9000 MTU, Management VMKernel ports to 1500 MTU, and the Virtual Machine Port groups will use whatever MTU the guest adapters in the virtual machines are configured with?

That last bit is what tripped me up since there is no MTU configuration option for the Virtual Machine Port Groups in VMWare, but I guess it makes sense that it's up to the guest's setting.

The virtual machines will use whatever you set them to in the guest OS. If you have your port group set to 1500 MTU and your guest OS to 9000 MTU, poo poo will not work.

Everything else you said looks good
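
Inside a Linux guest, setting and checking it is just something like this (interface name assumed):

code:
ip link set dev eth0 mtu 9000
ping -M do -s 8972 <target-on-the-jumbo-network>    # don't-fragment test; 8972 = 9000 minus IP/ICMP headers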

bull3964
Nov 18, 2000

DO YOU HEAR THAT? THAT'S THE SOUND OF ME PATTING MYSELF ON THE BACK.


Well, you can't set the MTU on Virtual Machine Port Groups. I just needed to confirm that if they were in a vSwitch that was 9000 MTU, the guests would still be 1500 MTU if that's what the guest was set at.

So, I will have this:

code:
vSwitch (9000 MTU)
    Management VKernel (1500 MTU)
    VMotion VKernel (9000 MTU)
    VM Port Group (determined by guest OS)
Then the other two 10gb adapters would be dedicated to iSCSI. The only other philosophical question I'm batting around is in-guest iSCSI. A handful of VMs will be using in-guest iSCSI and I'm trying to decide where to put that. Access to the storage network could be given from the VM Port Group on the first two adapters, or I could create a VM Port Group alongside the iSCSI VMkernel ports on the second two adapters and have all iSCSI traffic (VMware and in-guest iSCSI) there on that second vSwitch. I'm leaning towards that second option as cleaner, so storage traffic wouldn't have the possibility of choking VM network traffic.
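
For my own notes, the host-side commands for that layout should be roughly the following (vSwitch/vmk names assumed, and the esxcli syntax is from memory so I'll double-check it against the 5.x docs):

code:
esxcli network vswitch standard set --vswitch-name=vSwitch0 --mtu=9000
esxcli network ip interface set --interface-name=vmk1 --mtu=9000    # VMotion VMkernel port
esxcli network ip interface list                                    # sanity check the MTUs
vmkping -d -s 8972 <vmotion-peer-ip>                                # don't-fragment jumbo test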

goobernoodles
May 28, 2011

Wayne Leonard Kirby.

Orioles Magician.
One of my two offices has only one host on local storage running a DC and some file, print, and super low-end application servers. It's a small office with about 20-30 people. The long term plan is to replace the core server and storage infrastructure in our main office, then potentially bring the SAN and servers to the smaller office to improve their capacity as well as have enough resources to act as a DR site. Until then, though, I was planning on loading up a spare host with 2.5" SAS or SATA drives in order to get some semblance of redundancy down there, as well as being able to spin up new servers to migrate the old 2003 servers to 2012. Right now, there's ~50GB of free space on the local datastore. I'm looking for at least 1.2TB of space on the server I take down. I'm trying to decide on what makes the most sense from a cost, performance, resiliency and future usability standpoint. I'm trying to keep everything under a grand.

The spare x3650 I have has 8 total 2.5" bays (I have 3x 73GB 10K SAS drives handy) but the downside is that 2.5" SAS drives are pretty spendy from what I've found so far. At least IBM drives, anyway.

I've been considering grabbing another IBM x3650 with 3.5" trays for about $130 a few blocks away, since, for some reason, I have four 500GB IBM 7.2K SATA drives laying around. No idea why. We don't have any IBM servers with 3.5" bays. :iiam: At that point though, if I chose to go SATA, I might as well load the thing up with much larger drives since they're so cheap.

I was thinking of installing either ESXi or FreeNAS, though I'm open to trying something else to present the storage. I also have a spare SAS controller as well as plenty of memory and a couple of HBAs. I've never actually tried it - you can mix SAS and SATA drives on the same controller, right, assuming different RAID arrays?

Thanks Ants
May 21, 2004

#essereFerrari


I've been presented with vSphere 5 (the free hypervisor) running on a shitbox HP tower with only one hard disk in :downs:. I'm convinced that the disk is dying a slow death - one of the Windows guests will run chkdsk on boot and take ~2 hours to finish, backup jobs will fail within 15 minutes of starting, and trying to use vmkfstools to copy the vmdk to another datastore fails at the same place each time. Moving the files using the vSphere Client also fails at the same place each time with file read errors.
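
(The copy I keep attempting is basically this, with made-up datastore and VM names:)

code:
vmkfstools -i /vmfs/volumes/datastore1/guest/guest.vmdk /vmfs/volumes/datastore2/guest/guest.vmdk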

The VM itself actually boots and runs fine, but I can't seem to move it away from this disk. Even booting something like Clonezilla and trying to copy the volume that way didn't work, failing at around the same point as every other job.

Is there any way to un-gently caress this situation?

GobiasIndustries
Dec 14, 2007

Lipstick Apathy

Thanks Ants posted:

I've been presented with vSphere 5 (the free hypervisor) running on a shitbox HP tower with only one hard disk in :downs:. I'm convinced that the disk is dying a slow death - one of the Windows guests will run chkdsk on boot and take ~2 hours to finish, backup jobs will fail within 15 minutes of starting, and trying to use vmkfstools to copy the vmdk to another datastore fails at the same place each time. Moving the files using the vSphere Client also fails at the same place each time with file read errors.

The VM itself actually boots and runs fine, but I can't seem to move it away from this disk. Even booting something like Clonezilla and trying to copy the volume that way didn't work, failing at around the same point as every other job.

Is there any way to un-gently caress this situation?

Could you set up a new Windows guest on a non-hosed disk and try migrating the features needed from the VM in question to the newly set up guest?

Thanks Ants
May 21, 2004

#essereFerrari


GobiasIndustries posted:

Could you set up a new Windows guest on a non-hosed disk and try migrating the features needed from the VM in question to the newly set up guest?

That is the plan now, it's just a domain controller and print server and that can all be rebuilt pretty easily. I guess a part of me wanted there to be an easy way out of this, but I think it's beyond that point.

Moey
Oct 22, 2010

I LIKE TO MOVE IT
I had a similar situation. Standalone host running with local storage. Brought the server into vCenter, SvMotion would fail. Veeam and PHD backups would fail. Either the datastore or the VMDK were corrupt (only VM on that host).

In the long run, had to backup the virtual appliance from within the guest, deploy a new virtual appliance in our production environment, then restore from backup.

First time I have ever come across it.

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer
Two openstack questions:

1) does anyone have a link to a great Enterprise VMware -> openstack primer? I want to get started playing with openstack, and have lots of generic enterprise VMware experience, but none with openstack, cloudstack, or any cloud provider poo poo like amazon or rackspace.
2) is two quad cores with 8GB of ram each enough to play around with it at home?

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
Look at the RedHat OpenStack docs, they are miles better than the actual OpenStack docs. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/

Be prepared to spend a lot of time reading docs and scratching your head; OpenStack's documentation is poor.

Use packstack to install it, it is MUCH easier. Even better, you can find all the OpenStack components in Docker containers at index.docker.io, which is useful because you may need to re-install many times in a sandbox.
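
A minimal packstack run looks something like this (the RDO repo RPM URL is from memory, so check rdoproject.org for the current one):

code:
yum install -y https://rdoproject.org/repos/rdo-release.rpm    # RDO repo; URL approximate
yum install -y openstack-packstack
packstack --allinone                        # single-node sandbox install
packstack --gen-answer-file=answers.txt     # or generate an answer file to tweak, then run with --answer-file=answers.txt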

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

adorai posted:

Two openstack questions:

1) does anyone have a link to a great Enterprise VMware -> openstack primer? I want to get started playing with openstack, and have lots of generic enterprise VMware experience, but none with openstack, cloudstack, or any cloud provider poo poo like amazon or rackspace.
2) is two quad cores with 8GB of ram each enough to play around with it at home?
instead of playing with openstack you can just blow your loving brains out

evol262
Nov 30, 2010
#!/usr/bin/perl
The docker containers are totally unnecessary if you're using packstack, which uses puppet to install/configure everything anyway, so you just need to worry about the answer file.

If you're a chef guy, use rackspace's installers.

If you want an installer that's actually good, use mirantis'.

But there's no good traditional virt to cloud primer. I have an effort post floating around somewhere about what all the components do, which goes part way, but to get your head around ephemeral disks, cloud-init, booting from generic images and configuring them at runtime, building horizontally scalable apps, and all that, you should read an AWS primer.
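
The short version of that workflow, with made-up image/flavor names:

code:
cat > user-data.txt <<'EOF'            # standard cloud-config user-data
#cloud-config
hostname: web01
packages: [httpd]
EOF
nova boot --image centos-7-cloud --flavor m1.small --user-data user-data.txt web01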

nitrogen
May 21, 2004

Oh, what's a 217°C difference between friends?
What the heck does this mean, exactly?



VMware Tools is actually running on the guest where this is appearing. I've even reinstalled it.

I'm seeing this on about 15 guests actually: Linux, Red Hat 5.11, vSphere 5.1, virtual machine version 8.

Pile Of Garbage
May 28, 2007



nitrogen posted:

What the heck does this mean, exactly?



VMware Tools is actually running on the guest where this is appearing. I've even reinstalled it.

I'm seeing this on about 15 guests actually: Linux, Red Hat 5.11, vSphere 5.1, virtual machine version 8.

Probably a dumb question but the guests have been rebooted right?

nitrogen
May 21, 2004

Oh, what's a 217°C difference between friends?
yes, rebooted, had vmware tools reinstalled, restarted, swore at, kicked, bitten and prayed to.
