Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Locked thread
chutwig
May 28, 2001

BURLAP SATCHEL OF CRACKERJACKS

jaegerx posted:

Coreos has openstack running in containers where it should be considering how often poo poo fails. You should look into that. Or just run k8s on top of coreos.

I don't actually have a problem with OpenStack as far as stability goes, on either Kilo or Liberty. What I have a problem with is the huge split between the people who write OpenStack and the people who run OpenStack clusters. The developers like to crank out releases every 6 months but most operators are running releases that are 2-3 versions behind, so many clusters you find in production are considered EOL by the developers. I started building Kilo clusters at Bloomberg about 18 months ago. I just finished upgrading them all to Liberty last month, and soon I will start upgrading them all to Mitaka. Liberty is considered EOL by the developers. I find this stupid and wish that they'd move to a yearly release cadence and use the extra time for testing.

OpenStack also has a lot of weird and stupid scaling bugs that don't show up in testing because developers write stuff that works on devstack, passes the gates, but then falls over in a cluster of any real size. Got a cluster with many hundreds of Ceph OSDs? Watch as your nova-compute processes mysteriously crash! Watch as Nova and Cinder suffer from constant race conditions! Watch as RabbitMQ gets bogged down in millions of bullshit messages from Neutron agents!

Adbot
ADBOT LOVES YOU

Docjowles
Apr 9, 2009

As you and VC have repeatedly posted, whenever there is an OpenStack perf problem or general "this should never, ever, ever happen WTF?" issue, look at RabbitMQ and Galera. One of those two is almost certainly ruining your life, despite you having deployed the recommended config.

Docjowles
Apr 9, 2009

Also I totally agree on Ops being a pitiful afterthought. I went to the Atlanta summit (thanks, Cisco!) a few years ago. There was one loving session out of the whole 5 days dedicated to operators. It was in a room packed nuts-to-butts. The poor host asked "ok who has a comment on the operability of OpenStack?" and then a scene from the Walking Dead ensued and suddenly it was an hour later and everyone was dead. The other 99.5% of the conference was marketing boondogles and/or companies trying to hire anyone who could spell OpenStack.

jaegerx
Sep 10, 2012

Maybe this post will get me on your ignore list!


I've still got lots of friends at Rackspace running private openstack instances. I feel their pain. Most of the customers do like it though.

Ashex
Jun 25, 2007

These pipes are cleeeean!!!
If people are on AWS take a serious look at using their managed Elasticsearch for building a ELK stack as there's a logstash agent plugin for pushing there.

And there's things I could say about the service to make it more attractive but I can't yet.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Ashex posted:

If people are on AWS take a serious look at using their managed Elasticsearch for building a ELK stack as there's a logstash agent plugin for pushing there.

And there's things I could say about the service to make it more attractive but I can't yet.
Assuming you're related to this product and not just evangelizing a thing you like: if I'm looking to pay someone to manage 1/3 of my logging stack, why would I do that instead of going all-in for a hosted offering like Loggly?

Ashex
Jun 25, 2007

These pipes are cleeeean!!!

Vulture Culture posted:

Assuming you're related to this product and not just evangelizing a thing you like: if I'm looking to pay someone to manage 1/3 of my logging stack, why would I do that instead of going all-in for a hosted offering like Loggly?

I do a lot of work on AWS so I just tend to know a bit about the services and have the advantage of being able to talk to Amazon people directly.

The biggest perks are it handles cluster management and scaling so you can just use their service then build a logging solution around it that makes you happy. Honestly if you're seriously considered Loggly just go for it as you'll still be building your own ELK stack that has it's own management requirements, the main benefit is the core bits (Kibana/Elasticsearch) are handled.

There's drawbacks to it like the inability to add custom plugins and being restricted to the metrics exposed through Cloudwatch.

Pile Of Garbage
May 28, 2007



Has anyone here used or is currently using SkyFormation for pulling in events from Azure AD or O365 Security & Compliance to pass to on-premise SIEM? Just looking for some opinions, we're currently running McAfee ESM SIEM on-premise and need to pull in audit events from Azure AD and O365.

chutwig
May 28, 2001

BURLAP SATCHEL OF CRACKERJACKS

jaegerx posted:

I've still got lots of friends at Rackspace running private openstack instances. I feel their pain. Most of the customers do like it though.

Know if any of them are in the NYC area and are looking for a new job? We are aggressively hiring OpenStack SREs and developer types. :getin:

netcat
Apr 29, 2008
Where should I look if my instances don't get a reply to dhcp? We've had this problem before but it started working again when we moved the controller node to another server but now after a few weeks everything has broken down again. I've looked in the dnsmasq ip -> mac mapping files and they seem fine (I had an issue before where that file wasn't updated properly). If I do a tcpdump on the all the interfaces I can see that the request appears on both the controller and compute nodes but not on the tap interface for the network.

I haven't really touched the network conf so I'm unsure why it would just suddenly break.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Have you confirmed that the DHCP server is seeing the request on its interfaces? Is the reply making it through firewalls and gateways? What are your port mirroring / broadcast / promiscuous interface settings in your stack?

netcat
Apr 29, 2008
Actually I lied earlier, it looks like the packets won't even arrive on the bridge. tcpdump on the tap interface for the VM on the compute host shows the dchp requests but I see nothing on any other interface on the path. "ovs-ofctl show br-int" shows br-int as down but I think that's how it's supposed to be after googling for a bit (?). Doing ifup on br-int didn't change anything anyway, as far as I could tell.

If I understand everything correctly packets travel something like:
code:
VM -> tapxyz -> br-int -> br-ex -> physical (compute) -> physical (controller/network) -> br-ex -> br-int -> tap-xyz -> dnsmasq
Now it stops in the second step. Earlier when I had a similar problem it stopped perhaps at br-ex on the controller.

We use vxlan, but I'm not sure about all the interface settings or where to find them since our install is managed by Fuel so I've done very little manual tinkering related to that.

The weird thing for me with this is that it's been kind of a gradual degradation. At first it worked OK, sometimes VM's failed to get ip from dhcp but it would work after restarting the instance. Then it started to fail more frequent, like 50% chance that the VM got connectivity and now it doesn't work at all.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
Jesus Christ, the Azure CLI 2.0 is one of the most user-hostile pieces of software I've ever experienced. It's like if every command in the AWS CLI only took ARNs as arguments. This thing is clearly designed for Jenkins or whatever and not to be ever used by a human to do work.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


I don't understand why they wouldn't just use the same syntax of Powershell?

underlig
Sep 13, 2007
The people who used to work here set up one of our backup-servers to use Azure for long-time protection.

We're using Microsoft DPM 2012r2 and backing up to local disk and something that in our Azure-portal is called "recovery services vault".


My problem is that i cannot see what is actually in this vault, in the portal i can only see something called "Cloud - LRS" using ~6tb of data. I've made multiple attempts to find the right link inside the portal to get access to it but i always end up running in circles.
"Backup items" are listed as 0 in the portal, but my servers still gets backed up to Azure because there are recovery points and i can download them.


What am i missing? Am i looking in the wrong place? Isn't portal the place i should be? My CSP said "oh just download azure storage explorer" but that doesn't even show me the recovery services vault, then they said is should find everything on portal but like ive said i can't navigate to the correct place there.

Could the old technicians have configured this for some old part of Azure that has since been forgotten, or lost in updates?

Thanks Ants
May 21, 2004

#essereFerrari


Is it in "Backup vaults (classic)"?

https://portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.Backup%2FbackupVault

Adbot
ADBOT LOVES YOU

underlig
Sep 13, 2007
Unfortunately not there either.

I did find an option from our CSP to get "statistics" though all it shows is that the LRS has increased slowly, the vault however only shows small variations in size for the last five months.

I've asked the csp-support once more to help me.

  • Locked thread