|
my fav bit of the kafka architecture is whenever someone notices that oh yeah, topic compaction is not enough to guarantee reliable long-term storage (i.e. re-partitioning fucks with all the keys and therefore the linear history of entries), so you need another canonical data source to act as a kind of backup, and so what you do is put a consumer that materializes the views in a DB. That's nice because you can use the DB for some direct querying.

Except then there's some stateful component doing stream analysis over historical data; every time that component restarts, you need to sync the whole state to build the thing afresh, but doing this from a DB is not super simple, so you do it from Kafka. But since Kafka can't necessarily tell you it has all the data, and the DB is the one that's canonically right, you end up building ad-hoc diffs between a DB and a live stream for every restart.

And there's like no good solution, you just cover your ears and hope you never make it to that point, because you know you'll be hosed janitoring and reconciling two data sources that don't necessarily have a good way to talk to each other aside from some small component/microservice written in a language only 1 person knew, and they left 3 months ago
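The compaction half of this is easy to see in a toy model (pure Python, no broker; `compact` is an illustrative stand-in for what the log cleaner eventually does, not Kafka code): only the latest record per key survives, so the linear history of updates is unrecoverable from the compacted topic alone.

```python
def compact(log):
    """Toy log compaction: keep only the last (key, value) record per
    key, preserving the relative order of the surviving records."""
    last_index = {}
    for i, (key, _value) in enumerate(log):
        last_index[key] = i
    return [rec for i, rec in enumerate(log) if last_index[rec[0]] == i]

log = [("user:1", "v1"), ("user:2", "v1"), ("user:1", "v2"), ("user:1", "v3")]
compact(log)
# -> [("user:2", "v1"), ("user:1", "v3")]
# The v1 -> v2 transition for user:1 is gone, which is why a separate
# "canonical" store ends up bolted on for anything history-shaped.
```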
|
# ? Oct 29, 2018 14:16 |
|
sounds like the solution is to design your db properly and then not use kafka
|
# ? Oct 29, 2018 14:19 |
|
scans the whole table sequentially on every instance boot. this is what this db was meant for
|
# ? Oct 29, 2018 14:21 |
|
luckily we don't give a poo poo about our data soooooo like if any telemetry packets from devices get lost we don't care - the device resends often. if any commands from servers get lost we don't care - the controller will still see that the device isn't in the desired state and keep resending until its telemetry changes
|
# ? Oct 29, 2018 14:27 |
|
MononcQc posted:And there's like no good solution, you just cover your ears and hope you never make it to that point, because you know you'll be hosed janitoring and reconciling two data sources that don't necessarily have a good way to talk to each other aside from some small component/microservice written in a language only 1 person knew, and they left 3 months ago as one of the last scala devs here, driven out by the golang crowd but still writing critical architecture,
|
# ? Oct 29, 2018 14:30 |
|
hell if I don't know the feeling there
|
# ? Oct 29, 2018 14:31 |
|
terrible programming thread: Powerful Two-Hander posted:so yeah, that was a waste of my life.
|
# ? Oct 29, 2018 14:53 |
|
Powerful Two-Hander posted:so yeah, that was a waste of my life. ha, there should be chick tracts but for programmers
|
# ? Oct 29, 2018 14:58 |
|
cinci zoo sniper posted:he smugly proclaimed that he will take care of document storage question and we will be given a special sql interface to work with xml documents without storing them in rdbms I think postgres' XML data type does exactly this. But then you have to store your xml inside an icky rdbms
|
# ? Oct 29, 2018 15:31 |
|
prisoner of waffles posted:ha, there should be chick tracts but for programmers so which one would be analogous to the one where the guy rapes his daughter but everyone sweeps it under the rug because he learns to love jesus? i'm guessing it would have to do with javascript.
|
# ? Oct 29, 2018 15:43 |
|
CRIP EATIN BREAD posted:so which one would be analogous to the one where the guy rapes his daughter but everyone sweeps it under the rug because he learns to love jesus? No that's the last dev on a critical legacy project
|
# ? Oct 29, 2018 15:45 |
|
bus factor: kiddly diddler
|
# ? Oct 29, 2018 15:47 |
|
prisoner of waffles posted:ha, there should be chick tracts but for programmers there's a hans reiser joke here I'm sure of it...
|
# ? Oct 29, 2018 15:57 |
Finster Dexter posted:I think postgres' XML data type does exactly this. But then you have to store your xml inside an icky rdbms i mean that’s every rdbms these days that offers xml functionality; the question is whether xml exists at all in the sql environment, which is what that guy alluded to: that it doesn’t have to
|
|
# ? Oct 29, 2018 16:00 |
|
Powerful Two-Hander posted:there's a hans reiser joke here I'm sure of it... found the art:
|
# ? Oct 29, 2018 16:00 |
|
MononcQc posted:my fav bit of the kafka architecture is whenever someone notices that oh yeah, topic compaction is not enough to guarantee reliable long-term storage (i.e. re-partitioning fucks with all the keys and therefore the linear history of entries), so you need another canonical data source to act as a kind of backup, and so what you do is put a consumer that materializes the views in a DB. counterpoint: just don't do this. use kafka for streaming. there are a few use cases where you'd want to store data in a compacted topic and imo it's only for things that are directly related to supporting your kafka cluster (like schemas, if you're using avro). also the idea of repartitioning a compacted topic sounds like another nightmare that i simply would never do. i mean what's the real argument for having a large number of partitions (or increasing the number of partitions) for compacted data anyways? just have a few (<=5) and replicate it a few times.
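For what it's worth, the usual argument for a high partition count is consumer parallelism: each partition is consumed by at most one member of a consumer group, so partitions cap how many consumers can do useful work. A toy sketch of that cap (simple round-robin, illustrative only, not Kafka's actual assignor):

```python
def assign(partitions, consumers):
    """Round-robin a list of partition ids over consumer names; extra
    consumers beyond the partition count end up with nothing to do."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

assign([0, 1, 2], ["c1", "c2", "c3", "c4"])
# -> {"c1": [0], "c2": [1], "c3": [2], "c4": []}
# c4 idles: 3 partitions can feed at most 3 consumers in a group.
```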
|
# ? Oct 29, 2018 17:07 |
|
i dont know what kafka and avro are and im pretty sure i am not really missing out on anything as a result
|
# ? Oct 29, 2018 17:46 |
|
avro is a cool serialization format that is binary and has a schema, and allows forward and backward compatibility. its use case is generally limited to situations where you control both ends, but it works great. kafka is just plain cool, but basically it's just a stream processing platform that owns.
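A rough sketch of the compatibility idea (pure Python, not the real Avro resolution algorithm; the schema and field names here are made up): the reader's schema carries defaults, so records written before a field existed can still be decoded by newer code.

```python
# Reader's schema; "email" was added after old records were written,
# with a default so old data stays readable (backward compatibility).
reader_schema = {
    "fields": [
        {"name": "id", "default": None},
        {"name": "email", "default": ""},
    ]
}

def resolve(record, schema):
    """Project a decoded record onto the reader's schema, filling any
    missing fields from their declared defaults."""
    return {
        f["name"]: record.get(f["name"], f["default"])
        for f in schema["fields"]
    }

old_record = {"id": 42}  # written before "email" existed
resolve(old_record, reader_schema)
# -> {"id": 42, "email": ""}
```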
|
# ? Oct 29, 2018 17:49 |
|
how is that different from protobufs
|
# ? Oct 29, 2018 18:06 |
|
what kind of streams is it good for processing
|
# ? Oct 29, 2018 18:07 |
|
big streams
|
# ? Oct 29, 2018 18:12 |
|
Bloody posted:how is that different from protobufs Avro is much more json-oriented (but in a binary encoding) whereas pb is much more byte-oriented.
|
# ? Oct 29, 2018 18:15 |
|
simble posted:counterpoint: just don't do this. use kafka for streaming. there are a few use cases where you'd want to store data in a compacted topic and imo it's only for things that are directly related to supporting your kafka cluster (like schemas, if you're using avro).

yeah basically that's the fine usage: you don't repartition, and you don't treat kafka as a canonical data store of any kind. You treat it as a kind of special queue that gets a couple of weeks of persistence for multiple readers and then you're good.

It's just that sooner or later, if you read distsys literature, you'll see someone saying you could use kafka for atomic broadcast, which means data replication central, but that is only true as long as you never have to repartition anything ever. Repartitions basically require you to manually stop all writing, read all entries in existing partitions, and republish each value in the new partitions with higher "stamps" before syncing all clients to start publishing in the new partitions as well; my understanding is that there is no standard tooling around it, and the general advice seems to be "oh yeah you shouldn't have built your cluster that way."

So the best path forward with Kafka is to not treat it as a thing where you care enough about its data in the long term to land you in that situation.
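That manual dance, as a toy model (no broker; `hash_key` and the stamp scheme are illustrative stand-ins, not standard Kafka tooling): drain the old partitions in order and republish into the new layout with stamps above everything already written, so replayed records win over anything previously materialized.

```python
def hash_key(key):
    # Stand-in for the producer's partitioner; must stay stable
    # across the migration so each key lands in one new partition.
    return sum(key.encode())

def repartition(old_partitions, new_count, stamp_floor):
    """Drain old partitions in order and republish each (key, value)
    into new_count partitions by key hash, stamping every record
    strictly above stamp_floor."""
    new_partitions = [[] for _ in range(new_count)]
    stamp = stamp_floor
    for partition in old_partitions:
        for key, value, _old_stamp in partition:
            stamp += 1
            new_partitions[hash_key(key) % new_count].append((key, value, stamp))
    return new_partitions

old = [[("a", 1, 10), ("b", 2, 11)], [("c", 3, 12)]]
repartition(old, new_count=3, stamp_floor=100)
# -> [[("c", 3, 103)], [("a", 1, 101)], [("b", 2, 102)]]
# Writers must stay stopped until clients cut over to the new layout.
```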
|
# ? Oct 29, 2018 19:20 |
|
can you not back up to s3 and does kafka not have read-through from s3
|
# ? Oct 29, 2018 19:24 |
|
FamDav posted:can you not back up to s3 and does kafka not have read-through from s3 if you use insertion order to define a "happens-before" relationship with data points, as is often recommended for atomic broadcast implementations, then changing the partitions means you change the time relationships between each "key" in an overall stream. If you read all of partition 1 before partition 2, or if you read most of 1 before 2, and the canonical value for "key" is in 1, then you may crush newer state of "key" with older state in your materialized view. If you're using timestamps, you already had no clearly defined "happens-before" relationship, so who cares (for high-frequency events, that is)
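A minimal illustration of that crush, assuming a consumer that rebuilds its view by replaying partitions one after another (toy model, not real client code):

```python
def materialize(partitions):
    """Fold partitions into a key -> value view, last write wins in
    replay order (the only order a freshly restarted consumer has)."""
    view = {}
    for partition in partitions:
        for key, value in partition:
            view[key] = value
    return view

# "key" was updated to v2 after v1, but the repartition put v2 in the
# partition that happens to be replayed first.
partitions = [[("key", "v2")], [("key", "v1")]]
materialize(partitions)
# -> {"key": "v1"}  (the stale value wins in the rebuilt view)
```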
|
# ? Oct 29, 2018 19:35 |
|
MononcQc posted:Repartitions basically require you to manually stop all writing, read all entries in existing partitions, and republish each value in the new partitions with higher "stamps" before syncing all clients to start publishing in the new partitions as well; my understanding is that there is no standard tooling around it, and the general advice seems to be "oh yeah you shouldn't have built your cluster that way."

if I was ever put into this situation, I would likely create a new topic with a new partition count, and write a simple consumer/producer (or use kafka connect) to shovel the messages into the new topic. this way order would be preserved. then when they're ready, consumers and producers can switch to the new topic and, with monitoring, you can tell when the old topic is effectively not used and then eol/delete it.

the reality is that the only time you should need to repartition data in kafka is if you need to increase parallelism for a particular consumer group. it could happen, for sure, but it should be a relatively rare event.
|
# ? Oct 29, 2018 20:17 |
|
Powerful Two-Hander posted:so yeah, that was a waste of my life. mods, new thread title please
|
# ? Oct 29, 2018 21:15 |
|
simble posted:if I was ever put into this situation, I would likely create a new topic with a new partition count, and write a simple consumer/producer (or use kafka connect) to shovel the messages into the new topic. this way order would be preserved.

Right. That's the reasonable way to do it. You do need some coordination around the transfer as well; it's just kind of funny to imagine it being a thing everyone has to reinvent every time they gotta scale up.

Incidentally, you should probably have to scale up way less often if you don't have the expectation that kafka acts as your persistent data store; you can just lose older data and not bother with scaling up as long as the throughput is there. However, if you treat it as a persistent store, you may have to scale according to storage space and/or throughput, so your data cardinality + the storage may impact it. Really, not assuming it stores your data forever is doing yourself a favor operationally speaking.
|
# ? Oct 29, 2018 21:22 |
|
you can do a lot of things with Kafka but only a few of them are a good idea. I blame confluent pushing it for every use case possible
|
# ? Oct 29, 2018 21:29 |
|
Kevin Mitnick P.E. posted:you can do a lot of things with Kafka but only a few of them are a good idea. I blame confluent pushing it for every use case possible i wholeheartedly agree with both of these points. if i hear my boss say ksql one more time....
|
# ? Oct 29, 2018 21:36 |
|
Terrible programming you say? Here's a static analysis of a Java application that my team is inheriting from India.
|
# ? Oct 29, 2018 21:46 |
|
|
# ? Oct 29, 2018 21:50 |
|
ratbert90 posted:Terrible programming you say? Lmao, what analysis is that? Does it work with scala? Not that we'd be anywhere near that horror show. Also sorry for your loss
|
# ? Oct 29, 2018 22:26 |
|
ratbert90 posted:Terrible programming you say? weird, I JUST (like 15 minutes ago) finished setting up a new project here at work to push results and block builds through Sonar and... gonadic io posted:Lmao, what analysis is that? Does it work with scala? Not that we'd be anywhere near that horror show. it's sonar, and yes it does
|
# ? Oct 29, 2018 22:27 |
|
gonadic io posted:Lmao, what analysis is that? Does it work with scala? Not that we'd be anywhere near that horror show. SonarQube, and oh yes it does. Luckily, most of the 1k vulnerabilities are "Use a goddamned logger". However, there are about 10 plaintext passwords in the java files. Edit* I am firm in the camp of "Let's re-architect and rewrite from scratch." Management didn't want to hear it though (My boss is with me though.) Now we have some solid actual analytics as to "this poo poo sucks and here is why."
|
# ? Oct 29, 2018 22:32 |
|
Props to that one dev that wrote 15 unit tests even though it was obvious that no one else gave a poo poo.
|
# ? Oct 29, 2018 23:22 |
|
Jabor posted:Props to that one dev that wrote 15 unit tests even though it was obvious that no one else gave a poo poo. And they all pass!
|
# ? Oct 29, 2018 23:25 |
|
ratbert90 posted:I am firm in the camp of "Let's re-architect and rewrite from scratch." Management didn't want to hear it though (My boss is with me though.) Did management at least have the common courtesy to emptily promise that they'd give you the time and resources to refactor away the tech debt?
|
# ? Oct 29, 2018 23:38 |
|
toiletbrush posted:Did management at least have the common courtesy to emptily promise that they'd give you the time and resources to refactor away the tech debt?

Fun fact about my current boss: he not only promised that on the current project we would get time to re-architect, but he delivered on that promise. We had 6 full months to rewrite and re-architect an (admittedly far better) application, and in the end, it's the nicest software with full test coverage and end-to-end tests that scales well. He will fight tooth and nail for us, and our team has earned a rep of being fixers now.

This is why India was so incredibly scared of us taking over even a tiny bit of their code, because they knew what I am doing would happen. They had been ignoring my requests for Jenkins access for months, and then they got a new DevOps guy in America. A bottle of scotch and 1 day later, I had the code, how they built, and how they deployed, and I am now starting to ask questions that they don't want to ask.

Questions such as: why do we have a 92MB sql file in an application you swear is microservices based?

FlapYoJacks fucked around with this message at 23:43 on Oct 29, 2018 |
# ? Oct 29, 2018 23:41 |
|
ratbert90 posted:Fun fact about my current boss: meanwhile the company you left is still millions of dollars in the hole right? happy endings are so heartwarming
|
# ? Oct 29, 2018 23:43 |