some kinda jackal
Feb 25, 2003

 
 

Dr. Arbitrary posted:

Apparently VMWare is bringing in some money these days because they paid The Onion to make some commercials for them.


http://www.theonion.com/sponsored/guardian-angel,98/
http://www.theonion.com/sponsored/support-group,100/

This thread is my Guardian angel.

Maybe if they paid them more money they'd make the commercials funny.


Dr. Arbitrary
Mar 15, 2006

Bleak Gremlin

Martytoof posted:

Maybe if they paid them more money they'd make the commercials funny.

I don't know what happened with these ones; they knocked it out of the park with the DSW commercial where the guy killed all his pets.

I think the group therapy joke has been done somewhere else, better.

complex
Sep 16, 2003

CtrlMagicDel posted:

When you put an ESXi host into maintenance mode and reboot the host via Auto Deploy, is it supposed to come back into Virtual Center in maintenance mode or not? Our hosts always came back in maintenance mode previously, but after upgrading to 5.1U2 they seem to be booting back up active and immediately having VMs moved onto them via DRS. I've had different VMware support people tell me that one or the other is the expected behavior, including one support guy who both linked me to and quoted some documentation basically verbatim, except for the portion stating it was supposed to come up in maintenance mode; he had literally changed two words in his quote to indicate it was NOT supposed to come up in maintenance mode :wtc: It would almost be funny if it weren't so infuriating.

Edit: We apply a host profile with an answer file that joins the host to an AD domain if that makes a difference.

Shows you how well understood Auto Deploy & Host Profiles are, even inside VMware. Good luck getting proper support for these features. (We've tried.)

The host comes out of maintenance mode if and only if the host profile is successfully and completely applied. Watch the F11 console while the host boots to see what the host profile is doing, e.g. whether it is just spinning on enumerating every disk. If that takes longer than the timeout, your host profile is considered "failed". The number of disks it takes for this to happen is surprisingly low.
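If you want to sanity-check (or force) the state from the host itself rather than trusting vCenter, standard esxcli on 5.x should handle it over SSH:

esxcli system maintenanceMode get                 # prints Enabled or Disabled
esxcli system maintenanceMode set --enable true   # shove the host back into maintenance mode before DRS piles VMs onto it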

Stealthgerbil
Dec 16, 2004


I have a bunch of servers that I want to set up as a cluster with high availability VMs and have fault tolerance. Is vSphere essentials plus the cheapest way to do that? The standard edition is that price per cpu socket, right? Also is it possible to get discounts on the software?

Daylen Drazzi
Mar 10, 2007

Why do I root for Notre Dame? Because I like pain, and disappointment, and anguish. Notre Dame Football has destroyed more dreams than the Irish Potato Famine, and that is the kind of suffering I can get behind.

Stealthgerbil posted:

I have a bunch of servers that I want to set up as a cluster with high availability VMs and have fault tolerance. Is vSphere essentials plus the cheapest way to do that? The standard edition is that price per cpu socket, right? Also is it possible to get discounts on the software?

Essentials Plus is definitely geared towards the SMB market, but its HA is geared only towards the VMs themselves - if you're looking to do App HA you would actually need to buy Enterprise Plus. You also need to know in advance how much RAM you intend to make available to each VM, since there are maximums on vCPU counts and vRAM entitlements (though I think you should be okay, since I'm betting you won't be doing something like 32 vCPUs and 1TB of vRAM). It's also important to note that with FT you are running a primary VM that's mirrored to a secondary VM on a separate host, and that in 5.1 you're limited to 1 vCPU and 16GB of vRAM or less to avoid problems.

VMware does offer discounts to students and educational institutions - unless you're buying millions of dollars worth of licenses I doubt they give a drat otherwise. It never hurts to ask, but you should probably plan on paying full price when you start putting together an estimate.

Also, check to make sure the servers are on the VMware HCL - when 5.5 came out VMware removed a bunch of hardware from the list, and it pretty much sucks to put all that effort into the planning only to find out that the hardware isn't supported. You're also going to need to explain a little more about your environment - what kind of storage are you going to use; how many servers are you planning on standing up; is everything going to be in one datacenter, or scattered across a wide geographic area?

This, of course, is just poo poo I'm throwing out off the top of my head, but you get the idea. When you start looking at designing your environment you really need to consider all the options so you aren't left scrambling near the end when you realize you left a small, but important, detail out of the planning.

DevNull
Apr 4, 2007

And sometimes is seen a strange spot in the sky
A human being that was given to fly

Martytoof posted:

Maybe if they paid them more money

You sound like a VMware vet.

dox
Mar 4, 2006
Crossposting from the build thread in the hopes that some of you have seen this with your whitebox/homelab setups:

I am running into some issues with my new virtualization build and was looking for some advice on two issues. Parts are: i5-4590S, ASUS H97M-E, and Fractal Define Mini. Previous parts from older build include 3 drives (1 platter, 2 SSD) and an Intel Pro/1000 PT Quad Port NIC. I previously had ESXi installed on a P8P67 + i5-2500k combo with the same NIC without having to slipstream any drivers.

Issue one: I can't get the NIC to be detected by the ESX installer even though link lights are on and even after I slipstreamed updated drivers-- so I have included pictures of the relevant PCI settings.

Issue two: After slipstreaming the onboard Realtek drivers, I was able to install ESX, but when I try to boot off the USB I am greeted with this screen: http://i.imgur.com/nPphRB2l.jpg

If I go into the advanced boot setting to change the OS and/or delete the secure boot keys, I'm unable to boot into ESX off the USB at all-- it just skips it and immediately loads the BIOS.

Has anyone seen anything like this before? The rest of the relevant photos of BIOS settings are here.

dox fucked around with this message at 03:19 on Jun 10, 2014

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug
1. I have to hunt down the VIB, but I can get you the onboard driver I believe

2. You need to put the BIOS into "legacy" boot mode

CtrlMagicDel
Nov 11, 2011

complex posted:

Shows you how well understood Auto Deploy & Host Profiles are, even inside VMware. Good luck getting proper support for these features. (We've tried.)

The host comes out of maintenance mode if and only if the host profile is successfully and completely applied. Watch the F11 console while the host boots to see what the host profile is doing, e.g. whether it is just spinning on enumerating every disk. If that takes longer than the timeout, your host profile is considered "failed". The number of disks it takes for this to happen is surprisingly low.

I feel your pain. I've opened several cases about Auto Deploy/Host Profiles and gotten pretty poor responses from support. I guess that makes me not feel so bad about having my sales people escalate this support case I opened because my bug was fixed :ughh:

CtrlMagicDel
Nov 11, 2011
Aaaaaand now support just called me back saying it should stay in maintenance mode again. I'm certainly glad everyone at VMware support is on the same page.

dox
Mar 4, 2006

Dilbert As gently caress posted:

1. I have to hunt down the VIB, but I can get you the onboard driver I believe

2. You need to put the BIOS into "legacy" boot mode

Thanks Dilbert.

In the end I needed to use formatwithmbr during the install options phase (SHIFT+O) to avoid using GPT with the UEFI motherboard, which I found here (thanks to namol in IRC).
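For anyone else who hits this: at the installer boot screen you press SHIFT+O and append the option to the boot line already shown, so it ends up looking roughly like this (the exact default line varies by ESXi build):

runweasel formatwithmbr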

I also needed to slipstream (using ESXiCustomizer) the onboard Realtek NIC drivers and the H97 SATA drivers. For a while there I was regretting my specific purchases, but in the end it has worked out.

I still can't seem to get the E1000 vib to work with my Intel Quad Port NIC and I'm not quite sure why-- but for now I'm pretty happy. Also, gently caress you VMware for making this so difficult.

dox fucked around with this message at 01:20 on Jun 10, 2014

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug
1000 PT should work natively....

If you need to do a join.me just let me know. I'm pretty stacked this week but should be free Friday night or Saturday night (before 7pm EST).

Alternatively, you can upload it to VUM and apply it to the host, or what might be quicker:

Download the offline package
Enable SSH on the host
Download WinSCP
Copy the VIB to a datastore
SSH to the host
Run 'esxcli software vib install -d /path/to/datastore/driver.zip -n net-e1000e'
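End to end that's something like this (datastore path and bundle name are examples; match them to whatever driver package you actually grabbed, and scp here is just the command-line equivalent of the WinSCP step):

scp net-e1000e-offline-bundle.zip root@esxi-host:/vmfs/volumes/datastore1/
esxcli software vib install -d /vmfs/volumes/datastore1/net-e1000e-offline-bundle.zip -n net-e1000e
reboot    # only if the install output says a reboot is required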

Dilbert As FUCK fucked around with this message at 01:38 on Jun 10, 2014

Daylen Drazzi
Mar 10, 2007

Why do I root for Notre Dame? Because I like pain, and disappointment, and anguish. Notre Dame Football has destroyed more dreams than the Irish Potato Famine, and that is the kind of suffering I can get behind.
Correct me if I'm wrong, but can't I nest an ESXi 5.5 VM inside ESXi 5.5? I want to set up two ESXi servers, carve out half a dozen or so virtual datastores (80-100GB at most) and then lab it up. I'm trying to do this as cheaply as possible (i.e. free) but I'm having problems finding some resources that aren't 1-2 years old.

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug

Daylen Drazzi posted:

Correct me if I'm wrong, but can't I nest an ESXi 5.5 VM inside ESXi 5.5? I want to set up two ESXi servers, carve out half a dozen or so virtual datastores (80-100GB at most) and then lab it up. I'm trying to do this as cheaply as possible (i.e. free) but I'm having problems finding some resources that aren't 1-2 years old.

You can.

You'll need to do the following, IIRC:

Build a VM with 2 vCPUs, 4GB RAM, guest OS Linux 64-bit (Other), HW version 9
After building it, set it to virtualize AMD-V/Intel VT and set the OS version to Other -> ESXi 5.x
You'll then need to SSH into the physical host and open /path/to/datastorevm/%name%.vmx in vi
Add the line vhv.enable = "TRUE"
:wq to save, and you should be able to rock 64-bit VMs on those nested ESXi builds
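e.g., with made-up datastore/VM names, the edit on the physical host looks like:

vi /vmfs/volumes/datastore1/nested-esxi-01/nested-esxi-01.vmx
(append the line below anywhere in the file, then :wq)
vhv.enable = "TRUE"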

Moey
Oct 22, 2010

I LIKE TO MOVE IT

Dilbert As gently caress posted:

You can.

You'll need to do the following, IIRC:

Build a VM with 2 vCPUs, 4GB RAM, guest OS Linux 64-bit (Other), HW version 9
After building it, set it to virtualize AMD-V/Intel VT and set the OS version to Other -> ESXi 5.x
You'll then need to SSH into the physical host and open /path/to/datastorevm/%name%.vmx in vi
Add the line vhv.enable = "TRUE"
:wq to save, and you should be able to rock 64-bit VMs on those nested ESXi builds

Yup. Just set up a nested lab at work the other day; what he speaks is true.

Daylen Drazzi
Mar 10, 2007

Why do I root for Notre Dame? Because I like pain, and disappointment, and anguish. Notre Dame Football has destroyed more dreams than the Irish Potato Famine, and that is the kind of suffering I can get behind.
Is it vhv.enable="TRUE" or vhv.allow="TRUE"? I've already got vhv.allow="TRUE" in the config file, but when I fired up the first ESXi VM I got an error message about unsupported hardware. Also, for some reason I only have version 8 of the virtual machine hardware available and not 9.

Moey
Oct 22, 2010

I LIKE TO MOVE IT
What version of ESXi is your physical host running?

Apparently 5.0 is vhv.allow and 5.1 is vhv.enable

Daylen Drazzi
Mar 10, 2007

Why do I root for Notre Dame? Because I like pain, and disappointment, and anguish. Notre Dame Football has destroyed more dreams than the Irish Potato Famine, and that is the kind of suffering I can get behind.

Moey posted:

What version of ESXi is your physical host running?

Apparently 5.0 is vhv.allow and 5.1 is vhv.enable

I'm actually running 5.5 on my physical host.

And it looks like changing it from vhv.allow to vhv.enable did the trick. Thanks!

Daylen Drazzi fucked around with this message at 13:21 on Jun 10, 2014

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug
http://www.vhersey.com/2014/05/hrvmug-meeting-june-10-2014-be-there/

Any Hampton Roads goons feel free to come on out for our VMUG.

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug
Unless people are just used to what they learned in college and only think what they're told, HOW THE gently caress CAN ANYONE DEFEND CITRIX?!??!?!

three
Aug 9, 2007

i fantasize about ndamukong suh licking my doodoo hole

Dilbert As gently caress posted:

Unless people are just used to what they learned in college and only think what they're told, HOW THE gently caress CAN ANYONE DEFEND CITRIX?!??!?!

Because it's a good system when installed correctly and you know what you're doing.

The problem is that it does so much it gives you lots of ways to screw yourself.

Alereon
Feb 6, 2004

Dehumanize yourself and face to Trumpshed
College Slice
:siren:It's time for the V3 thread! Or would that be the ownCloud edition?:siren:

Let's start a discussion of what should be changed/improved in the new OP, and if anyone wants to volunteer to work on it that would be great. I'd post a draft OP in the thread, take comments, then post the new thread and report this one with reason "close old thread."

soy
Jul 7, 2003

by Jeffrey of YOSPOS

Alereon posted:

:siren:It's time for the V3 thread! Or would that be the ownCloud edition?:siren:

Let's start a discussion of what should be changed/improved in the new OP, and if anyone wants to volunteer to work on it that would be great. I'd post a draft OP in the thread, take comments, then post the new thread and report this one with reason "close old thread."

more openstack stuff, vmware is for wankers.

evol262
Nov 30, 2010
#!/usr/bin/perl

soy posted:

more openstack stuff, vmware is for wankers.

To update my :effort: post:

What OpenStack is:
IaaS. It's backed by AMD, Brocade, Canonical, Cisco, Dell, EMC, HP, IBM, Intel, NEC, Rackspace, Red Hat, Novell, VMware, and more. Essentially every player in the industry. OpenStack's essential aim is to provide a toolset for implementing private and public EC2-alike cloud, so to take a step back:

What cloud computing is:
Rapidly scalable anonymous services. The essential basis of IaaS, PaaS, NaaS, SaaS, and anything else where you don't have a vested interest in the configuration, statefulness, and nitty-gritty details.

If you just made the frontpage of Slashdot and you need 100 more servers right now and you have a configuration management deployment environment ready to provision them, add them to load balancers and clusters, IP them, and all the other parts of configuration, you may be a good candidate. If you have basically identical development/QA servers, and you want to be able to create them at-will (and don't care what happens to them when they disappear), you may be a good candidate. If you want a VDI environment and you have images ready to go for users to log into with roaming profiles and no local... you get the point. There are volume services available for running persistent cloud images, and this is definitely a thing you can do, but you should go through this CERN presentation which explains it pretty succinctly. You want more cattle than pets.

What cloud computing isn't:
A generic term for things that may be represented by a cloud on a Visio diagram. In the sense that your Dropbox documents are in the "cloud" and you don't need to worry about the implementation of it and your GMail is in the "cloud" and you don't need to worry about the implementation of it, that's true. But cloud computing is not a generic synonym for "on the internet".

It is also not a generic term for by-the-hour grid compute services (SQL Azure or otherwise) to which you can submit data. These may be PaaS or SaaS, but "cloud computing" (especially in the OpenStack sense) is IaaS.

So, what is OpenStack?
OpenStack is a collection of services which work together to provide the essentials of a "cloud" environment. There are others (Eucalyptus is the big competitor), but OpenStack is far and away the leader in this space, and aims for EC2/S3 API compatibility, where Eucalyptus is hoping for API compatibility and to be architecturally similar. The difference to you (as an end user) between OpenStack, Eucalyptus, Rackspace, EC2, and any other cloud provider which isn't Azure or overtly different should be zero.

  • Compute (Nova)
    Generic compute services. By this, I mean that OpenStack should be able to consume any generic virtualization provider. Or raw hardware. There is support for ESXi, KVM, Xen, XenServer, Hyper-V, LXC, and potentially more. You could (in theory) write a nova-compute provider which drives HP's iLO through SSH to load an image. It doesn't matter. Nova targets anything which can run an operating system on that architecture. Xen PV is a bad choice (limited), as is LXC (limited), but you can build an OpenStack environment which runs only Linux on LXC inside a virtual machine. It's agnostic.

  • Object Storage (Swift)
    OpenStack's VMFS analogue. Except it isn't, really. Swift's closest comparison is Gluster. Or maybe VMware's new vSAN stuff (which I know virtually nothing about). It aims for horizontal scaling by making images available across as much of the environment as possible. Swift has a fair amount of logic for figuring out how many copies it needs, and where. It automatically replicates. S3 buckets.

  • Image Service (Glance)
    How AMIs are presented to clients.

  • Block Storage (Cinder)
    The layer behind Swift, which handles interactions between Swift and whatever it's running on (loopback filesystem, NetApp, EMC, Gluster, raw partitions, whatever). Ephemeral disk.

    The previous three essentially replicate S3 and friends, as well as providing raw (ephemeral) storage, AMI storage, and snapshots. You can run Swift, Cinder, and Glance without Nova or other services for a generic, distributed filestore which compares to S3 or MongoFS. Various parts of this stack can run on Ceph, iSCSI, FC, NFS, Gluster, local storage, and more! It doesn't really matter which for the purposes of this post.

  • Networking (Neutron)
    Generic service which ensures networking is the same across all compute nodes. OpenVSwitch is the big target here, though (like everything else openstack related) it's pretty modular, and there's support for Nexus switches and other platforms. openvswitch is beyond the scope of this, but the Neutron stack is broadly equivalent to dvSwitches. Since the last :effort: post, various vendors are jumping on the bandwagon with opencontrail (juniper), opendaylight (everyone else), and assorted vendor plugins, though they're mostly not production ready. Neutron now also implements loadbalancing-aaS, firewall-aaS, and VPN-aaS in the same way it does everything else: use generic services (dnsmasq for dhcp and dns, haproxy for load balancing, etc) running in different namespaces with generic configs. Vendors can add plugins if they want to with more features. Probably the part of Openstack you'll hate the most if you do a deployment. Software-defined networking is hard. Try to use vxlans or gre if you can.

    This long paragraph still doesn't convey how much you'll want to slit your wrists after typing "ip netns exec ${some_uuid} ip addr" trying to figure out why your guests can't get outside their private network through the software-defined router and why they can't reach the metadata server.

  • Dashboard (Horizon)
    Web UI and accounting services. Enough said.

  • Identity Service (Keystone)
    AAA and API keys. API keys are how everything Openstack talks to everything else Openstack, and Keystone organizes it. This is the only part of Openstack that's required.

  • Heat
    It's CloudFormations, basically. Now elastically scales! If you give it a service definition that's smart enough.

  • Ceilometer
    Billing just like the real cloud! Or accounting for how many cycles your internal departments are wasting.

  • Upcoming stuff, maybe
    DNS as a service if we can ever agree on what to pursue as a standard (currently, DNS for guests is sort of ugly, hence the AWS-ish "you get a random hostname, front your load balancers with a floating IP and give that a sane name" strategy)
    Baremetal, which Ubuntu is sort of pursuing early while it's vaguely production ready. Ironically (in the Alanis sense) named "Project Ironic"
    Triple-O or "Openstack on Openstack". Because what you really need is a cloud that tries to manage itself. Hopefully renamed to "Project Skynet"

The key thing is that all of these services can run anywhere. You can run 100 nova instances on 100 PCs. Or 1 on 1. You can run Horizon, Glance, Swift, Keystone, and Nova on 5 different PCs. Or VMs. It doesn't matter. The only thing you need to run is Keystone. All other services are optional, and you can use as many or as few as you think you can handle. Keystone keeps track of service endpoints in a registry, and you can query their APIs to your heart's content :allears:
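As a tiny example of what "query their APIs" looks like in practice, once Keystone is up you point the CLI at it and ask for the catalog (the credentials and URL below are placeholders):

export OS_USERNAME=admin OS_PASSWORD=secret OS_TENANT_NAME=admin
export OS_AUTH_URL=http://controller:5000/v2.0
keystone service-list    # every registered service (nova, glance, neutron, ...)
keystone endpoint-list   # and the public/internal/admin URLs for each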

RHOS/RDO or whatever we're calling it now is probably the fastest way to get up and running with OpenStack unless you have a lot of time and in-house expertise to get it all talking. Mirantis is also good. Ubuntu is OK. Rackspace ships ISOs with their own spin which basically uses Chef instead of Puppet for provisioning. And the Openstack Foreman Installer is a thing, plus a load of stuff in staypuft to automatically deploy high-availability, clustered database openstack that (theoretically) never goes down.

Build your images with Packer, or Oz, or ImageFactory, or something. But they better fulfill these requirements. Use this for Windows. Even though you technically can dump ISOs into Glance, boot VMs from them, take snapshots, and use those as "images", this isn't VMware with golden images, and you're a bad person if you do this. Build images with a tool. Do boot-time customization with the configuration-management software of your choice.
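For the boot-time customization bit, a minimal cloud-init user-data file handed over at instance creation is usually enough to bootstrap the real config management (the puppet bits below are just an example):

#cloud-config
package_upgrade: true
packages:
  - puppet
runcmd:
  - puppet apply /etc/puppet/manifests/site.pp   # or kick off salt/chef/whatever instead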

Scale horizontally. If your application is a special snowflake which isn't stateless and needs to scale vertically, it should not be on Openstack. You should be able to boot a new instance from the API, add the new IP to a load balancer through the API, and have increased capacity. Welcome to the cloud :frogsiren:
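A rough sketch of that boot-and-add-to-the-balancer loop with the stock CLIs (flavor, image, and pool names are made up, and this assumes the Neutron LBaaS extension is enabled):

nova boot --flavor m1.small --image webapp-2014-06 --key-name deploy web-07
nova show web-07 | grep network                    # grab the new instance's IP
neutron lb-member-create --address 10.0.0.57 --protocol-port 80 web-pool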

I'm not going to effort post about non-Openstack VMware alternatives, but I also work on oVirt/RHEV and am broadly familiar with XCP if anybody cares.

Somebody do a writeup on Hyper-V, because that actually has market share.

evol262 fucked around with this message at 06:55 on Jun 13, 2014

Wicaeed
Feb 8, 2005

soy posted:

more openstack stuff, vmware is for wankers people with Enterprise-level budgets.

Fixed

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
Is this thread appropriate to talk about Docker, or does it deserve its own thread? Technically it's not virtualization, but practically it is. I've noticed that most of the discussion in this thread is about the hardware and configuration aspects of VM tech, and figure Docker discussion would probably focus more on the software and orchestration side of things.

evol262
Nov 30, 2010
#!/usr/bin/perl

minato posted:

Is this thread appropriate to talk about Docker, or does it deserve its own thread? Technically it's not virtualization, but practically it is. I've noticed that most of the discussion in this thread is about the hardware and configuration aspects of VM tech, and figure Docker discussion would probably focus more on the software and orchestration side of things.

The Linux thread may get more responses, but container-based virt could go here.

Docker is explicitly not virtual machines in its intended use case (no init, binaries as entrypoints, gimped networking) and not really orchestration either (everything docker does can be done, arguably better, with some configuration management system+api driven spinup of real VMs on openstack or ec2 or vsphere or whatever, especially when paired with cloud-init, heat/cloudformations, and maybe a software load balancer), but it has a ton of buzz if you want to discuss it.

Docjowles
Apr 9, 2009

Also, very specific threads like that in SH/SC tend to generate 5 replies and then die. Might as well ask in a semi-related megathread. Docker is very high up on my "cool poo poo to play with next" list but I haven't quite gotten there, so I'm interested in any discussion :)

evol262
Nov 30, 2010
#!/usr/bin/perl
Also, despite my general confusion about Docker (except as an application+libraries packaging solution for desktop apps with all their dependencies like OSX's Something.app bundles, which nobody is doing with Docker yet), I wrote the Docker support bits for oVirt and Docker integration for our build system, and have experience with it. Potentially more in the future as we look at rebuilding oVirt on top of Atomic.

You should talk about it here or the Linux thread if you're interested. There's not a lot of non-VMware uptake here other than Docjowles and I, but I dunno what the lurker population is.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
My team is already running Docker in production on a large OpenStack cluster. Everything from the tools to the processes to the orchestration is very "Version 0.1" and a pattern of best practices has yet to emerge, which is why I'm keen to discuss it. It sounds like I should start a new thread about it, but since it's more about packaging + SOA tools like Apache Mesos or Kubernetes, it's not clear whether it'd go in this forum or Cavern of Cobol.

luminalflux
May 27, 2005



I've just started playing around with openstack and here's a few notes:

  • trystack.org seems hosed, like instances in "scheduling" for 20+ hours
  • RDO Quickstart is great to get a small stack up with
  • Mirantis packaging of Fuel is pretty nice, especially their virtualbox scripts
  • Devstack doesn't seem to work at all in CentOS 6.5, FC20 or Ubuntu.
  • Trying to run any size of cluster on my macbook with 8G memory wasn't really possible, so I rented a server off OVH (SoYouStart) for this.
  • You can make a floating IP range work with routed IPs on OVH with Packstack by following these instructions
  • Packer seems broken working with openstack and rackspace. I'm currently trying to learn Go to fix this.

evol262
Nov 30, 2010
#!/usr/bin/perl

minato posted:

My team is already running Docker in production on a large OpenStack cluster. Everything from the tools to the processes to the orchestration is very "Version 0.1" and a pattern of best practices has yet to emerge, which is why I'm keen to discuss it. It sounds like I should start a new thread about it, but since it's more about packaging + SOA tools like Apache Mesos or Kubernetes, it's not clear whether it'd go in this forum or Cavern of Cobol.

Genuinely curious: what differences/advantages did you see in your workflow from adding Docker? Development prototyping is easier, though not necessarily easier than masterless puppet or salt with nova-client and cloud-init. I've seen problems with Docker's lack of init and reaping children with forking processes, but I don't think you'd be doing this without process improvement. I expect Docker alongside vagrant, Docker when you can't create real VMs, Docker to test software on Dev laptops, Docker for version controlling images, but openstack already does many of the same things as Docker. Why Docker on openstack?

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
With our previous setup, there was massive friction in the development/deployment process:
- Couldn't easily/quickly set up a Test/Dev environment that resembled Prod without significant Ops intervention
- Dev couldn't easily experiment with new tech, change dependencies themselves, or add new services that needed to run on their own server. And when they did, they still had to provision through Ops and synchronize code deployments with them, since Ops were in charge of the Puppet manifests that deployed such dependencies.
- Puppet/Mcollective were inadequate & unreliable but difficult to move away from due to the large manifest size.
- Deployment process consisted of a mess of scripts that had started out small and simple, but clearly evolved over time into spaghetti, written and maintained by Ops people who weren't really coders so didn't know how to keep it clean. Making changes to the deployment process was like playing Jenga with boxing gloves.
- Our old IaaS was not being used dynamically to spin up environments, but instead was full of "pets". It was only being used to avoid having to provision physical servers when we needed them.

With Docker + OpenStack, all these problems fall away:
- Setting up an environment (even Prod) is push-button. We use the IaaS's APIs to set up an empty cluster, and Docker to add the app services to it. All our environments and services are now properly "cattle", we can destroy and recreate them easily.
- Dev can now provision an environment that's identical to Prod for debugging a Production issue, or scaled down if they're focusing on a smaller area of the app. They can do this on their local laptop with Virtualbox or in any other IaaS. They couldn't do this before.
- Docker allows Dev complete control over their app's dependencies and (to some extent) config.
- Ops no longer has to use Puppet for its Prod config management (although we could). We can just use Docker containers which are more flexible and allow re-use of the SOA deployment mechanism.

Vagrant would have solved some of these problems and we were looking into using it, but then Docker started to get big, and it was a much better fit for us. We use Vagrant to provision local Dev clusters, but not for anything else.


The missing piece that I've seen no-one mention is CoreOS. It's a minimal Docker-focused Linux made for easy clustering. It doesn't even have a package manager. You add functionality to it via Docker containers. This means all your hosts or VMs can run the same OS, and having a homogenous set of VMs where the apps are containers instead of installed directly onto the box makes life much easier for Ops.
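"Add functionality via Docker containers" in practice means shipping little systemd units, something like this (image name is made up):

[Unit]
Description=web frontend container
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=-/usr/bin/docker rm -f web
ExecStart=/usr/bin/docker run --name web -p 80:8080 ourregistry/web:latest
ExecStop=/usr/bin/docker stop web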


All this makes life a lot easier for our Dev, DevOps, and Ops teams. But the business driver behind all this is that it means our Dev teams can deploy features faster. Currently there's a lead time of at least 1 week required if they want to deploy any change that involves a schema upgrade. We want to get to the point where we can make hundreds of deployments per day if necessary. (I recently saw a statistic showing that the mean time between Amazon service deployments is 11.6 seconds - that's ~7500 deployments / day).

luminalflux
May 27, 2005



minato posted:

The missing piece that I've seen no-one mention is CoreOS. It's a minimal Docker-focused Linux made for easy clustering. It doesn't even have a package manager. You add functionality to it via Docker containers. This means all your hosts or VMs can run the same OS, and having a homogenous set of VMs where the apps are containers instead of installed directly onto the box makes life much easier for Ops.

This sounds like Solaris Zones.

Docjowles
Apr 9, 2009

evol262 posted:

I've seen problems with Docker's lack of init and reaping children with forking processes, but I don't think you'd be doing this without process improvement.

Have you looked at Baseimage? It's, well, a base image you can build on that includes a few quality of life things like properly reaping processes, cronjobs and SSH. There's debate as to whether a container actually should do all of those things (except reaping procs which seems pretty important), but it's out there if you want it. It's maintained by Phusion, who make the popular "Passenger" Ruby on Rails server.
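A bare-bones Dockerfile on top of it is about this much (the runit service name is just an example):

FROM phusion/baseimage
# my_init is baseimage's tiny init: it reaps zombies and starts the runit services
CMD ["/sbin/my_init"]
# drop your own service in as a runit run script
ADD myapp.sh /etc/service/myapp/run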


luminalflux posted:

This sounds like Solaris Zones.

Linux containers (which Docker leverages) are pretty similar to zones, as I understand it.

evol262
Nov 30, 2010
#!/usr/bin/perl

minato posted:

The missing piece that I've seen no-one mention is CoreOS. It's a minimal Docker-focused Linux made for easy clustering. It doesn't even have a package manager. You add functionality to it via Docker containers. This means all your hosts or VMs can run the same OS, and having a homogenous set of VMs where the apps are containers instead of installed directly onto the box makes life much easier for Ops.
In general, coreos uptake is hard. Much as it's minimal, it's in flux with a very small Dev team, and I know from experience what a pain the readonly model can be. Moreover, most businesses running Linux have a significant investment in ancillary pieces -- Symantec endpoint, oracle, VAS, or some other service which must run on the hardware. Some of the parts of CoreOS are interesting (especially fleet and skydns), but CoreOS is a step backwards for a lot of shops, and it's not much more minimal than minimal Debian/centos/whatever installs which leverage existing expertise and architecture.

I guess this was my takeaway from your post. Your old model (which I wouldn't have even called IaaS) was broken. So you threw away 100% of it. But nearly all of the positive changes could have happened with a move to openstack alone, soa, stateless apps, and masterless puppet. I still see what Docker adds for devs reproducing production on their laptops, and maybe for containerized deployment even if I think salt or puppet manifests in git updated by dev for their applications may work better (especially if checked out and applied inside the dockerfiles).

To rephrase my question more directly:

Why docker on openstack when openstack already does 99% of what you wanted and the other 1% overturns a decade of best practice and returns to the "unpack this tarball from the developer to deploy" model? What does docker do for you that openstack doesn't?

I'm not trying to be aggressive, just trying to get to the heart of it, so please don't be offended.

Docjowles posted:

Have you looked at Baseimage? It's, well, a base image you can build on that includes a few quality of life things like properly reaping processes, cronjobs and SSH. There's debate as to whether a container actually should do all of those things (except reaping procs which seems pretty important)
Sure, but this is the crux of it. "Should a container actually do all of those things"? Maybe. LXC does. LMCTFY does. Docker best practice is "no", and you start to lose some of the perceived advantages of containers once you turn them into real machines. A very minimal systemd setup would make sense, but that also breaks the "your application is your configuration" model. It'll be interesting to see how it evolves.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender

evol262 posted:

[CoreOS stuff]
The other key aspect of CoreOS is that it's autoupdating. It takes its inspiration from the Chrome browser, which updates in the background instead of having to be explicitly updated by Ops. Auto-updating the OS probably seems like anathema to an Ops person, but there's little risk because there's next-to-nothing to update, and they've thought through an automatic rollback process in case it doesn't work.

(That said, we've disabled auto-updating in Production because CoreOS is still in beta)

evol262 posted:

Your old model (which I wouldn't have even called IaaS) was broken. So you threw away 100% of it. But nearly all of the positive changes could have happened with a move to openstack alone, soa, stateless apps, and masterless puppet.
You're right, and that's exactly what we did (except the masterless puppet bit, but the point is that our service packaging system has improved from what it was).

The question I hear seems to be "But you can do all this with VMs, and there's lots of existing tooling and experience with those, so what does Docker get you?". The answer lies in the fact that Docker containers are more lightweight than VMs:
- Faster or comparable build times
- Better resource utilization
- Fast launch times (like, instant in some cases)
- Simpler configuration at instantiation

This agility means we can do things like spin up our entire SOA network of about 10 services and have them ready to serve in a couple of seconds, on a Dev box with only 4 GB of RAM. I don't believe the same would be anywhere near as performant if each service was in its own VM. The fast spin-up time means faster dev iterations, faster deployment, and makes things possible like re-provisioning the entire SOA network between individual tests instead of between test-runs.

It also means that you can treat containerized services as if they were CLI tools instead of full-blown VMs. For example, we have a "log report" process with a complex set of dependencies that needs to be run periodically over a log directory to generate a report. We containerized it and launch it with "docker run", mounting the input/output directory in as a "parameter" to this "CLI tool". It reads the mounted directory, generates its output files, then halts, after which Docker cleans it up just like a regular process (because that's what it is). So unlike a VM that contains the log reporter, there's less startup/teardown overhead and it's not consuming CPU/memory/network resources when it's idle.
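Concretely, the invocation is just (image and paths made up for illustration):

docker run --rm -v /var/log/myapp:/input:ro -v /srv/reports:/output log-reporter:latest
# the container reads /input, writes the report to /output, exits, and --rm cleans it up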


An additional advantage of Docker (and this was big for us): Dev gets control of the dependencies, using tools they're already familiar with (i.e. yum, apt-get, pacman).


I saw a talk [slides here] recently by a Twitter guy who also was the author of the Apache Mesos cluster manager. He was anti-VMs for his Linux SOA apps, because he sees them as "neither sufficient nor necessary".

From his POV: Apps have outgrown a single server. Managing/scheduling SOA services across a cluster has a lot in common with what a Linux kernel deals with when managing its own processes. I.e. in the same way that a Linux kernel is dispatched processes to run and has to find, allocate, and manage the resources those processes require, Apache Mesos + Marathon are dispatched SOA services to schedule on a cluster and need to find, allocate, and manage the resources within the cluster that those services will consume. Therefore Mesos + Marathon are like a "Datacenter OS" (his words).

Ultimately, a service wants resources, not a machine. Mesos is responsible for providing those resources, rather than a machine which provides those resources. It cuts out the middleman for faster spin-up times and better resource utilization. To this end, he wants a datacenter with possibly heterogeneous hardware but a homogenous and simple OS on each host, the ability to easily dispatch processes to any of the managed hosts, and a centralized service manager for the cluster. They use a minimal base OS as the host platform, Docker provides lightweight process isolation, and Mesos/Marathon provide the cluster resource management.

(Google does something very similar with a base OS they rolled themselves + Docker, and the scheduling/resource management is done with a tool they've just announced called Kubernetes)

Note that it's not a strategy that works for everybody. But it works pretty well for his particular set of services (all Linux, mostly 12-factor services).

evol262 posted:

Sure, but this is the crux of it. "Should a container actually do all of those things"? Maybe. LXC does. LMCTFY does. Docker best practice is "no", and you start to lose some of the perceived advantages of containers once you turn on them into real machines. A very minimal systemd setup would make sense, but that also breaks the "your application is your configuration model".
Yes, the Docker ideal is to avoid having to run your own init inside the container (although it's easy with lightweight process managers like supervisord). If your app is a web service, then it should just run Apache/Nginx/Node. If it's a cron service, then just run cron. It's important to think of containers as highly-constrained processes, not VMs. The isolation level is effectively the same, but the construction and utilization is different.
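i.e. the whole image for a web service can be as small as this (Ubuntu base picked arbitrarily):

FROM ubuntu:14.04
RUN apt-get update && apt-get install -y nginx
# nginx runs in the foreground as PID 1 - no init, no sshd, nothing else
CMD ["nginx", "-g", "daemon off;"]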

It's tempting to add things like SSHD into the container for debugging, but often it's unnecessary. It's possible to inspect a container's filesystem, process space, environment and stdio etc from outside the container, so why bother adding more to go inside? (and all the issues that entails, like managing SSH keys and futzing with cloud-init)
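The usual from-the-outside toolkit, for reference (container name is a placeholder):

docker logs mycontainer      # stdout/stderr of the main process
docker top mycontainer       # its process list, as seen from the host
docker inspect mycontainer   # full config: env, mounts, network, state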

evol262
Nov 30, 2010
#!/usr/bin/perl

I appreciate the discussion, and I agree with much of what you're saying; I'm just trying to grasp the stack, really. Docker and openstack is sort of :psyduck:, especially on CoreOS.

Census, Mesos, and similar tools are extremely useful for service auto discovery and scaling, and the container "process as a service" model works really well (I'm pretty sure that Google is still using their own container stack for production and only looking at docker experimentally so far, but don't quote me on that).

E: confirmed. Google is still using omega+lmctfy for prod, though potentially moving eventually.

Let me flip this around: even though I develop VM products for Redhat, we're (and I'm) looking for use cases for Docker. And there are potentially a lot. I know openstack can do 99% of what you need, but I also get that VMs are heavyweight, and your log process could probably execute 5 times as quickly as even a CirrOS image could boot. So why use VMs at all?

I'm wondering what openstack adds. Tenant separation on software defined networks is nice, and I could see that use case, maybe. But I'm struggling to see how it fits with docker. I'm guessing you guys have a local image repository. So you could say "docker run whatever" on CoreOS on bare metal, or Ubuntu, or even rhel at this point. Using mesos, census, or maybe even etcd would let you scale rapidly. You think VMs are too full-blown, and maybe they are, but I'm just confused about your stack. Either Docker or openstack could do what you're talking about without the other. Why both? This is really my question. I focused on the "why docker", but now it seems too aggressive against docker judging from your responses. So I could ask "why openstack?" But why both?

Developer-driven workflow is maybe nice. Historically, this died because it turns out that making developers document how their application actually works, and how to get it running clearly enough for totally unrelated people to deploy it, reduces their inclination towards complexity, new shiny things, and hacky crap, plus it spreads knowledge of essential business logic around to other teams and management. But ideas get new wrapping paper every so often in IT and thin clients/VDI come back from the dead, etc, so forget that argument for now. This workflow obviously works for you. But both openstack and docker facilitate it, one through devs shipping images themselves with tools they're familiar with (docker), and one through the whole devops CI + config management + API-driven spinup + anonymous VMs bit.

The "cloud" (openstack or other traditional iaas) is tough because it requires either full-stack developers (who are white whales) or close cooperation between Dev, Ops, and infra in order to avoid pets and get reasonable puppet/chef/salt/whatever configs that everybody understands. I can see why people may shy away from this. Docker is a hard sell for every team other than Dev (and maybe management, assuming they're OK with Dev ruling the roost) for the reasons in the previous paragraph.

I can see how shops might come down on one side or the other, but how both? The only way I can figure is that most of your docker stuff is basically processes and even micro instances are wasteful, so you have multiple Docker instances per VM. That makes sense. But then why not skip VMs altogether and do CoreOS on bare metal? Or Ubuntu+shipyard? Or whatever?

evol262 fucked around with this message at 22:27 on Jun 14, 2014

evol262
Nov 30, 2010
#!/usr/bin/perl
I'm also really curious how you're cleaning up the docker images. Does CoreOS come with a cronjob or systemd service for this? It's sort of a non-issue on openstack (except that the root disks tend to be pretty small), but last I checked, docker leaves garbage all over unless you run "docker rm(i) ..." every so often (or really often if you're using docker a lot).
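For what it's worth, the crude manual version of that cleanup is roughly what people seem to cron:

docker rm $(docker ps -a -q)                               # removes stopped containers; it just errors on running ones
docker rmi $(docker images | awk '/^<none>/ {print $3}')   # removes untagged/dangling image layers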


minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
It's a good question to ask what OpenStack does for us. OpenStack mainly gives us the tenancy separation, and the ability to provision/destroy QA/Dev environments at will since our QA/Dev resource usage is fairly elastic. To be clear, we didn't actually choose OpenStack and we could have gotten away with physical servers. It's just a convenient platform for us. Having an IaaS API layer in our deployment process gives us good flexibility; much like Vagrant, our environment provisioning tools can be pointed at any IaaS layer (including VirtualBox for Devs or AWS for QA who need ephemeral environments for testing).


Aside: the IaaS/PaaS line gets blurry with OpenStack because someone's written a Docker driver for it (to live alongside kvm, qemu, etc). So it's possible to use OpenStack APIs to spin up an instance of a Docker container, and what's really happening is that it's running the Docker container directly on the Nova node. :psyduck: This is not something we want to pursue, but it's an interesting proof-of-concept.


evol262 posted:

E: confirmed. Google is still using omega+lmctfy for prod, though potentially moving eventually.
Yes, I saw a talk from the Google guys and essentially since 2009 they've been using their own containerization tech (lmctfy) to do what the Twitter guys were talking about, in the typical Google fashion where it could never be practically used outside their ecosystem. Now that the Linux kernel has proper namespacing for everything and mature CoW filesystems like btrfs, Google are end-of-lifeing lmctfy, rolling the good bits of it into libcontainer (which is the guts of the Docker app) and moving to Docker*.

*(technically libcontainer because Google needs to bypass Docker to do some Weird Things)

evol262 posted:

Docker is a hard sell for every other team other than Dev (and maybe management assuming they're OK with Dev ruling the roost) for the reasons in the previous paragraph.
For us, it was a win for Ops as well as Dev. The burden of being in charge of dependency management was very large for them, and often a blocker for Dev. In addition, managing racks of hosts with a homogenous and simple OS is much easier compared with the setup we had before.

But it's not without its challenges. Giving Dev control over dependency management means they need to have more awareness of how that low-level stuff actually works. If logging/monitoring are just services in a SOA network, then does that mean that Dev is responsible for them? If Dev can deploy without Ops intervention, then who should be on pager duty when the web site falls over at 3am?
