distortion park
Apr 25, 2011


we've been migrating services from ecs to k8s for a while now and about 50% result in some unplanned downtime. the end result is sometimes a bit better, sometimes a bit worse, but definitely not worth all the investment by the infra team and all the new poo poo that the rest of the team has had to learn about and debug.

if we were starting from scratch then it might make sense to start with k8s, but as a migration target for some http apis from a perfectly functional ecs/fargate setup it was completely unjustified.


dads friend steve
Dec 24, 2004

distortion park posted:

i want to say that the problem is that "self serve" devops systems are being chosen by the people who dedicate their jobs to infrastructure, not by the people focusing on application and feature development, but i don't have much confidence in that statement.

it’s an interesting point. on the flip side, right now in my org we have the dev team trying to push through an IAC standardization, but they’re also of the mindset that they don’t want to and don’t have time to learn poo poo that should be handled by an infra / platform team. which is fine and valid, but i don’t believe it’s a recipe for success to have the people who want to minimize their own long-term responsibility and involvement in a system designing that system

which I guess was the original industry motivation behind devops as a proper role, but no one in my group, dev or ops, is interested in becoming devops lol

distortion park
Apr 25, 2011


dads friend steve posted:

it’s an interesting point. on the flip side, right now in my org we have the dev team trying to push through an IAC standardization, but they’re also of the mindset that they don’t want to and don’t have time to learn poo poo that should be handled by an infra / platform team. which is fine and valid, but i don’t believe it’s a recipe for success to have the people who want to minimize their own long-term responsibility and involvement in a system designing that system

which I guess was the original industry motivation behind devops as a proper role, but no one in my group, dev or ops, is interested in becoming devops lol

agreed, i think outside some specific setups (some Vercel stuff, maybe simple fly.io type things, some cloud provider services) a full devops role is too hard right now to be broadly achievable, even if it remains a good goal.

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

distortion park posted:

maybe using a service mesh is the root of our problems (we've certainly spent a lot of time messing with config values after the infrastructure team added it and we started getting random networking errors). But it's also hard to say no to something described like this

It almost sounds compulsory for a microservices architecture

my only-for-yospos opinion is that “microservices architecture” is an extremely stupid idea and it is honestly embarrassing that it ever gained traction in the industry. I’ve unfortunately seen a lot of places where things should be shifted to a “service-oriented” architecture (because the programs are communicating via files or things like POSIX shared memory constructs lol), but when I see specifically things called “microservices” in the wild deep down I can’t help but feel like the developers just wanted to try a new language or framework or just make a new thing and that was the way to justify doing so

I suppose if you do have 100s of different processes fuckin n suckin each other though, maybe service mesh could be good, I actually have no idea. but mostly I think ops teams generally have a handful of workloads they kinda just want to run and migrate and keep running if a node dies and you definitely don’t need an additional layer of yaml-configured iptables magic to do that with kube

CommieGIR
Aug 22, 2006

The blue glow is a feature, not a bug


Pillbug

my homie dhall posted:

my only-for-yospos opinion is that “microservices architecture” is an extremely stupid idea and it is honestly embarrassing that it ever gained traction in the industry. I’ve unfortunately seen a lot of places where things should be shifted to a “service-oriented” architecture (because the programs are communicating via files or things like POSIX shared memory constructs lol), but when I see specifically things called “microservices” in the wild deep down I can’t help but feel like the developers just wanted to try a new language or framework or just make a new thing and that was the way to justify doing so

I suppose if you do have 100s of different processes fuckin n suckin each other though, maybe service mesh could be good, I actually have no idea. but mostly I think ops teams generally have a handful of workloads they kinda just want to run and migrate and keep running if a node dies and you definitely don’t need an additional layer of yaml-configured iptables magic to do that with kube

A good 75% of microservices in use are likely there because a dev or a manager heard about it and wanted to buy into a buzzword. The problem is none of these people seem to understand that everything has a use case and it's not just a lift and shift to go from a monolithic architecture to a microservices one.

Progressive JPEG
Feb 19, 2003

i'm running a $6 digitalocean vps with 1gb ram, originally planned to put a k3s server on it but that consumed 400mb when empty/idle with flannel/servicelb/traefik already turned off

decided to just run everything as plain "restart=always" docker containers orchestrated via terraform and the combined system memory usage for 17 containers spanning a bunch of different stuff is around that same 400mb
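
for the curious, a minimal sketch of that pattern with the kreuzwerker/docker provider (container name and image here are made up, not what's actually running on the box):

code:
terraform {
  required_providers {
    docker = {
      source = "kreuzwerker/docker"
    }
  }
}

provider "docker" {}

# pull/track the image locally (image_id attribute is the v3.x provider style)
resource "docker_image" "whoami" {
  name = "traefik/whoami:latest"
}

# plain container restarted by the docker daemon itself, no orchestrator involved
resource "docker_container" "whoami" {
  name    = "whoami"
  image   = docker_image.whoami.image_id
  restart = "always"

  ports {
    internal = 80
    external = 8080
  }
}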

sorta amazing how much overhead even a "minimal" k8s install creates

Progressive JPEG
Feb 19, 2003

i recently replaced a bunch of stuff that was deploying things via helm templates with terraform's k8s support and it's been overall a good move

tfstate management is of course an exercise left to the reader as usual but everything else (templating, secrets management, one-off passwords for things, not leaving random old poo poo lying around as the deployment evolves, structure in general) is waaay nicer

can also deploy public/3rdparty helm charts directly from tf and that seems to work fine as well. was previously opposed to that when using helm directly, preferring to keep copies of the full yamls in source control, but now tf just handles it transparently
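
a 3rd party chart from tf ends up looking something like this (sketch with the hashicorp/helm provider, 2.x block syntax; ingress-nginx and the version pin are just examples):

code:
# assumes the helm provider is already configured against the cluster
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  version          = "4.10.0"          # pin it so plans stay reproducible
  namespace        = "ingress-nginx"
  create_namespace = true

  # chart values live in tf instead of a values.yaml lying around
  set {
    name  = "controller.replicaCount"
    value = "2"
  }
}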

one catch is tf is pretty bad at crd management because it tries to look up the cr even when the crd might not exist yet. "official" solution is a separate preceding stage for just adding the crds. was able to avoid that in the one case where it was an issue by using the helm chart version of the thing
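
for anyone who hasn't hit the crd thing: kubernetes_manifest validates the cr against the live api server at plan time, so on a fresh cluster the plan dies before the crd ever gets applied. a sketch of the failure mode (cert-manager used purely as an example, names invented):

code:
# the gotcha: kubernetes_manifest looks up the CR's schema from the api server
# during *plan*, so if the CRD isn't installed yet the plan fails before apply runs
resource "kubernetes_manifest" "certificate" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "Certificate"
    metadata = {
      name      = "example-cert"
      namespace = "default"
    }
    spec = {
      secretName = "example-cert-tls"
      issuerRef  = { name = "letsencrypt", kind = "ClusterIssuer" }
      dnsNames   = ["example.com"]
    }
  }
  # depends_on doesn't help because the lookup happens at plan time, not apply,
  # hence the "official" separate preceding stage for crds, or leaning on the
  # helm chart that ships them (the dodge mentioned above)
}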

MononcQc
May 29, 2007

we use helm charts, but literally just call the 'helm upgrade --install' command with a bunch of files for each service and static build artifacts templated in from a manifest in S3, stick that in a CircleCI scheduled task, and use alerting on pre-prod environments' SLOs to know whether to auto-deploy to prod, with ways to do manual validation if the checks fail (eg. pre-prod being down shouldn't preclude us from force-deploying a fix to prod to restore service there first). Update the manifest to a previous version and re-run the job and your rollback is out in minutes.

it has been suiting our needs so far, and I'm glad we managed to keep full automation rather than having to do some annoying gitops.

MononcQc
May 29, 2007

upgrading k8s itself and the cluster and auditing all the poo poo, now that's garbage.

Progressive JPEG
Feb 19, 2003

the home shoestring trino cluster was running ubuntu 20.04 with k3s. this was sort of by accident because the cluster started as a few 4gb rpi4s a couple years ago, at a time when ubuntu was producing prebuilt aarch64 rpi images that worked with k3s. the cluster grew organically from there but it was all still managed by a couple ansible yamls for turning off ubuntu's endless poo poo and for installing k3s onto there, respectively

given the situation it made sense to just have a clean slate with upgrading from 20.04 to 22.04. i got things mostly working after a couple hours but it was very unstable with all pods randomly crashing with no logs, even in a stock/empty k3s cluster on a single machine. this instability turned into a weekend-consuming pita with nothing explaining or solving the problem and i really didn't want to be dealing with it. there wasn't anything to lose at this point so i ended up just trying talos os. after a couple hours i had an empty talos cluster up with the instability fixed, and with my janitorial workload significantly reduced. being able to delete the aforementioned ansible yamls was very satisfying

one catch with talos is it wants a dedicated storage device to itself on each machine, while any persistent storage should go on separate devices. can put stuff directly in the "ephemeral" partition within /var but that is easy to wipe in e.g. a talos upgrade if you aren't careful. probably a good idea to have separate devices anyway

but overall if you're wanting to do some on-prem k8s without regular effort to maintain the underlying os then talos seems real good so far. at least until their funding dries up or whatever their situation is

Progressive JPEG
Feb 19, 2003

to clarify talos is basically an "appliance" linux distro whose sole purpose is to provide a k8s environment, managed via cli/api. i think it uses kubeadm underneath

carry on then
Jul 10, 2010

by VideoGames

(and can't post for 10 years!)

ok so does rook-ceph get any production usage or is it pretty much just used for dev clusters like mine

Nomnom Cookie
Aug 30, 2009



my homie dhall posted:

i think kubernetes is only complex if you need persistent storage or if you do something stupid like install a service mesh

I sincerely wish for everyone who believes that k8s isn’t complex to not run into one of the many, many “edge cases” that make k8s complex. edge cases in scare quotes because they weren’t until k8s showed up

Perplx
Jun 26, 2004


Best viewed on Orgasma Plasma
Lipstick Apathy

Progressive JPEG posted:

i'm running a $6 digitalocean vps with 1gb ram, originally planned to put a k3s server on it but that consumed 400mb when empty/idle with flannel/servicelb/traefik already turned off

decided to just run everything as plain "restart=always" docker containers orchestrated via terraform and the combined system memory usage for 17 containers spanning a bunch of different stuff is around that same 400mb

sorta amazing how much overhead even a "minimal" k8s install creates

oracle is an evil company and i don't normally recommend them, but they easily have the best free tier

for free you get 4 arm cores and 24 GB of ram, plus 2 x (1 x86-64 core and 1 GB of ram)

Progressive JPEG
Feb 19, 2003

tbh i'd rather pay someone $6/mo than risk interacting with oracle for free

Share Bear
Apr 27, 2004

Progressive JPEG posted:

i recently replaced a bunch of stuff that was deploying things via helm templates with terraform's k8s support and it's been overall a good move

tfstate management is of course an exercise left to the reader as usual but everything else (templating, secrets management, one-off passwords for things, not leaving random old poo poo lying around as the deployment evolves, structure in general) is waaay nicer

can also deploy public/3rdparty helm charts directly from tf and that seems to work fine as well. was previously opposed to that when using helm directly, preferring to keep copies of the full yamls in source control, but now tf just handles it transparently

one catch is tf is pretty bad at crd management because it tries to look up the cr even when the crd might not exist yet. "official" solution is a separate preceding stage for just adding the crds. was able to avoid that in the one case where it was an issue by using the helm chart version of the thing

i REALLY like terraform compared to basically everything else mentioned in this thread, i haven't really encountered weird edge cases besides the "immutable" state of the deploy turning out to be mutable and not lining up with the terraform config, which can be rolled back or overwritten

outhole surfer
Mar 18, 2003

Progressive JPEG posted:

tfstate management is of course an exercise left to the reader as usual but everything else (templating, secrets management, one-off passwords for things, not leaving random old poo poo lying around as the deployment evolves, structure in general) is waaay nicer

i've seen this used as an argument against tf in other places, but honestly tfstate management seems trivial if you have any cloud provider or hosted database service. i've been trying out a pattern of storing it in git with git-crypt, and for one-admin personal infrastructure it's pretty slick.
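
the cloud provider version really is a one-time copy paste, e.g. the s3 backend (bucket and table names made up):

code:
terraform {
  backend "s3" {
    bucket         = "my-tfstate-bucket"            # made-up names, obviously
    key            = "personal-infra/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"              # optional, gets you state locking
    encrypt        = true
  }
}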

12 rats tied together
Sep 7, 2006

it's more that you shouldn't need an entire infrastructure devoted to copy pasting "reality serialized to json" into a data store, especially if your only need is "put some yamls into k8s"

it's better than helm because helm is, and was, absolute clown poo poo for operations teams who run off of blog posts and hn articles

Tom Collins
Aug 25, 2000

Share Bear posted:

i REALLY like terraform compared to basically everything else mentioned in this thread, i haven't really encountered weird edge cases besides the "immutable" state of the deploy turning out to be mutable and not lining up with the terraform config, which can be rolled back or overwritten

terraform is basically the only thing i truly do trust, as it provides sanity in a way nothing else does.

it just needs a few small tweaks, like grouping and tagging resources so i can taint them en masse, and --untarget to exclude things from the plan

Tom Collins
Aug 25, 2000

"but targeting is an anti-pattern!!!"

shut up zoomer, the grown ups are deploying

Nomnom Cookie
Aug 30, 2009



terraform 0.15 is bearable but it's still poo poo

distortion park
Apr 25, 2011


Nomnom Cookie posted:

I sincerely wish for everyone who believes that k8s isn’t complex to not run into one of the many, many “edge cases” that make k8s complex. edge cases in scare quotes because they weren’t until k8s showed up

e.g. reliably serving traffic during deployments https://scribe.rip/kubernetes-dirty-endpoint-secret-and-ingress-1abcf752e4dd
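
(the usual band-aid for that class of problem, fwiw, is a preStop sleep plus a readiness probe so endpoint removal has time to propagate before the container gets SIGTERM; roughly this shape if you manage the deployment through terraform's kubernetes provider, with the image, port and sleep numbers invented:)

code:
resource "kubernetes_deployment" "api" {
  metadata {
    name = "api"
  }
  spec {
    replicas = 3
    selector {
      match_labels = { app = "api" }
    }
    template {
      metadata {
        labels = { app = "api" }
      }
      spec {
        # grace period has to cover the preStop sleep plus actual shutdown
        termination_grace_period_seconds = 40

        container {
          name  = "api"
          image = "example/api:latest"   # invented image

          readiness_probe {
            http_get {
              path = "/healthz"
              port = 8080
            }
          }

          lifecycle {
            pre_stop {
              exec {
                # keep serving while endpoint removal propagates to kube-proxy/ingress
                command = ["sh", "-c", "sleep 15"]
              }
            }
          }
        }
      }
    }
  }
}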

Qtotonibudinibudet
Nov 7, 2011



Omsk half-fucker, tell me, are you a junkie? I just live somewhere around there too, we could do drugs together
i love how the industry is so excited about kubernetes being so hot yet never understands poo poo about it. our product people have decided that we need to sell our operator somehow. attempts to explain that this is basically like trying to sell one of those windows installer exes separate from the program it's installing are falling on deaf ears

it makes even less sense given that we deal entirely with big enterprise contracts where the actual listed SKUs are basically window dressing to justify whatever price sales was going to charge anyway

carry on then
Jul 10, 2010

by VideoGames

(and can't post for 10 years!)

VSOKUL girl posted:

i love how the industry is so excited about kubernetes being so hot yet never understands poo poo about it. our product people have decided that we need to sell our operator somehow. attempts to explain that this is basically like trying to sell one of those windows installer exes separate from the program it's installing are falling on deaf ears

it makes even less sense given that we deal entirely with big enterprise contracts where the actual listed SKUs are basically window dressing to justify whatever price sales was going to charge anyway

lmao even ibm gives their operators away for free

Perplx
Jun 26, 2004


Best viewed on Orgasma Plasma
Lipstick Apathy
we are setting up kubernetes at work and plan to use off the shelf helm charts, which from the sounds of things is the wrong way to do this
anyway not my problem, the whole point was to make things self serve for business units

12 rats tied together
Sep 7, 2006

having the plan be off the shelf public helm charts/ansible galaxy roles/chef marketplace cookbooks/puppet forge modules/terraform registry modules/cfengine build packages has been a bad idea for about a decade, yea

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

quote:


Here’s how everyone would love to think that removing a pod from a service or a load balancer works in Kubernetes.

1. The replication controller decides to remove a pod.
2. The pod’s endpoint is removed from the service or load-balancer. New traffic no longer flows to the pod.
3. The pod’s pre-stop hook is invoked, or the pod receives a SIGTERM.
4. The pod ‘gracefully shuts down’. It stops listening for new connections.
5. The graceful shutdown completes, and the pod exits, when all its existing connections eventually become idle or terminate.

Unfortunately this just isn’t how it works.

there’s no way for the control plane to be HA and operate in this manner because it means removing pods would require synchronizing the entire cluster. the way this person apparently wants kube to work is literally not possible lol

Nomnom Cookie
Aug 30, 2009



my homie dhall posted:

there’s no way for the control plane to be HA and operate in this manner because it means removing pods would require synchronizing the entire cluster. the way this person apparently wants kube to work is literally not possible lol

for removing a pod to be successful the entire cluster must synchronize. this is implied by how services work in kubernetes. all that is required for pod deletion to be safe is a mechanism for nodes to indicate how far behind their kube-proxy is. this would place an incredible write load on etcd because kubernetes was designed by muppets, but it would still be HA

Nomnom Cookie
Aug 30, 2009



btw if you really want to see some poo poo, overload your apiserver. cluster dns flapping, pods getting requests for 30 seconds after they terminate, it’s real fun

12 rats tied together
Sep 7, 2006

we have like 4 decades of HA load balancers, software and hardware, that are able to agree on when it's time for a node to stop receiving traffic

it doesn't work in the kube because it was designed by an advertising company and adtech only cares about dumb and bad poo poo happening when it's happening 10 or 20k times per second

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Nomnom Cookie posted:

for removing a pod to be successful the entire cluster must synchronize. this is implied by how services work in kubernetes. all that is required for pod deletion to be safe is a mechanism for nodes to indicate how far behind their kube-proxy is. this would place an incredible write load on etcd because kubernetes was designed by muppets, but it would still be HA

what is supposed to happen if a node is temporarily unavailable or slow to update?

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine
you realize synchronization means strong consistency, right?

Nomnom Cookie
Aug 30, 2009



my homie dhall posted:

what is supposed to happen if a node is temporarily unavailable or slow to update?

if a node fucks off into hyperspace then we mark it down. if it’s slow to update then we wait until it updates

Nomnom Cookie
Aug 30, 2009



my homie dhall posted:

you realize synchronization means strong consistency, right?

hmm does it. I feel like in this case we can use the fact that the pod is dying to make things work without getting too fancy. look at this:

1. at resourceVersion m a pod is deleted
2. at resourceVersion n we observe that every ready node’s kube-proxy is synced up to at least m
3. with some hand waving about kubelet and kube-proxy we can now say that it’s no longer possible for a ready node to send requests to the pod, and it won’t be possible at any future resourceVersion
4. kubelet kills the pod

I think it’s worth noting that this only works if there’s some way to keep unready nodes from receiving incoming traffic, and usually that way is “ask a properly built load balancer to handle that” which tickles me

dads friend steve
Dec 24, 2004

I’m not understanding what you guys are talking about where it’s impossible to remove pods gracefully without full cluster synchronization (not sure what this means to be honest) or hammering etcd. but I do know AWS ELB has been able to do connection draining for the better part of a decade, so I’m going to have to agree with this

12 rats tied together posted:

we have like 4 decades of HA load balancers, software and hardware, that are able to agree on when it's time for a node to stop receiving traffic

it doesn't work in the kube because it was designed by an advertising company and adtech only cares about dumb and bad poo poo happening when it's happening 10 or 20k times per second

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Nomnom Cookie posted:

hmm does it. I feel like in this case we can use the fact that the pod is dying to make things work without getting too fancy. look at this:

1. at resourceVersion m a pod is deleted
2. at resourceVersion n we observe that every ready node’s kube-proxy is synced up to at least m
3. with some hand waving about kubelet and kube-proxy we can now say that it’s no longer possible for a ready node to send requests to the pod, and it won’t be possible at any future resourceVersion
4. kubelet kills the pod

I think it’s worth noting that this only works if there’s some way to keep unready nodes from receiving incoming traffic, and usually that way is “ask a properly built load balancer to handle that” which tickles me

note that this is equivalent to “send a message to every node, commit once all of them reply to confirm” which I think is obviously not HA. you essentially have a distributed state machine that you need kept consistent (here meaning past a certain revision) across all nodes and that can only move forward if all nodes are available

I do think something like this could be a reasonable implementation of a different project for small-ish clusters and maybe even what a majority of cluster owners would want (most clusters presumably being on the smaller side), but it’s choosing a different set of trade offs that wouldn’t work for larger clusters that actually do need HA

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

dads friend steve posted:

I’m not understanding what you guys are talking about where it’s impossible to remove pods gracefully without full cluster synchronization (not sure what this means to be honest) or hammering etcd. but I do know AWS ELB has been able to do connection draining for the better part of a decade, so I’m going to have to agree with this

connection draining in traditional setups is fine because the number of load balancer nodes is small, so the price for keeping them in sync is fairly small. with kubernetes services, every node in your cluster becomes a load balancer lol

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine
also i recognize i am completely talking out of my own rear end here, but this is the problem as I see it

Nomnom Cookie
Aug 30, 2009



my homie dhall posted:

note that this is equivalent to “send a message to every node, commit once all of them reply to confirm” which I think is obviously not HA. you essentially have a distributed state machine that you need kept consistent (here meaning past a certain revision) across all nodes and that can only move forward if all nodes are available

I do think something like this could be a reasonable implementation of a different project for small-ish clusters and maybe even what a majority of cluster owners would want (most clusters presumably being on the smaller side), but it’s choosing a different set of trade offs that wouldn’t work for larger clusters that actually do need HA

nah fam if a node isn’t available we can assume it’s also not forwarding requests and so doesn’t need to block deletion. the actual problem is nodes that are available but not making progress. I contend that getting stuck is correct in this case and if your infra team doesn’t like getting pages about it then they shouldn’t have pushed so hard for kubernetes


distortion park
Apr 25, 2011


I should point out that idk if the problem I originally posted is impossible to solve in general, but it definitely didn't occur using ECS Fargate and definitely did running the same system on eks. This was a pretty small system with light but consistent load
