MononcQc
May 29, 2007

Dont Touch ME posted:

Reversing unicode is trivial. Allocate the size of the string, iterate through the data, blast it into the new buffer in reverse order. When you hit a codepoint, parse it, inc/dec your markers as appropriate.

you probably want to do it on a grapheme cluster basis rather than codepoints to avoid messing up combining marks, but you’ll also need to deal with issues of defining what it means to reverse text when it contains subgroups with LTR and RTL sequences in them.
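
(a quick sketch of the difference in Python, assuming the third-party regex module since the stdlib re can't match \X; this still punts entirely on the LTR/RTL question)

code:

import regex  # third-party: pip install regex

s = "ne\u0301e"  # "née" written with a combining acute accent

# reversing code points detaches the combining mark from its base letter
print(s[::-1])                      # the accent lands on the wrong 'e'

# reversing grapheme clusters keeps each user-perceived character intact
clusters = regex.findall(r"\X", s)
print("".join(reversed(clusters)))  # 'eén'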


HappyHippo
Nov 19, 2003
Do you have an Air Miles Card?
actual use case of reversed strings:

https://en.wikipedia.org/wiki/GADDAG

Dont Touch ME
Apr 1, 2018

Soricidus posted:

isn’t that a pretty straightforward classification problem? accuracy would be proportional to the length of the string I guess, but we’ve come a long way since bush hid the facts

It ends up being more of a mental problem that taxes critical analysis and design skills. You need to be able to look at dozens of encodings, pick out what makes them unique, and figure out the best way to contrast them against each other. It's not the hardest thing in the world, it just requires you to actually think and design a solution. Unicode is easy, but it can be quite hard to distinguish CJK encodings.
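
(for the curious, the usual "best guess" approach is a statistical classifier over byte frequencies; a minimal sketch with the chardet library, which is just my pick of detector, and as noted the guesses get shaky on short strings)

code:

import chardet  # third-party: pip install chardet

samples = {
    "utf-8":     "こんにちは世界".encode("utf-8"),
    "shift_jis": "こんにちは世界".encode("shift_jis"),
    "gb2312":    "你好世界".encode("gb2312"),
}

for expected, raw in samples.items():
    guess = chardet.detect(raw)  # returns {'encoding': ..., 'confidence': ...}
    print(expected, "->", guess["encoding"], guess["confidence"])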

MononcQc posted:

you probably want to do it on a grapheme cluster basis rather than codepoints to avoid messing up combining marks, but you’ll also need to deal with issues of defining what it means to reverse text when it contains subgroups with LTR and RTL sequences in them.

If we're running into the problems of semantics, we're already in trouble because to me a string is any linearly ordered, atomic data. A true reversed string reverses the order of the bits, bytes or words, depending on the domain. I would happily explain this nuance using the Intel programming manual as a citation before being told that they'll call me back.

jesus WEP
Oct 17, 2004


the real world application of reversing a string (or of fizzing a buzz) isn’t the point, it’s a 10 minute check to see whether there’s any point doing the rest of the interview

Bored Online
May 25, 2009

We don't need Rome telling us what to do.
you call the reverse method bing bong problem solved

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



can you reverse a string?

yeah but I'm not gonna.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

jesus WEP posted:

the real world application of reversing a string (or of fizzing a buzz) isn’t the point, it’s a 10 minute check to see whether there’s any point doing the rest of the interview

"we're not looking for a programmer who thinks about their work, just take the ticket and stamp out a solution and shut the hell up"

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

HappyHippo posted:

actual use case of reversed strings:

https://en.wikipedia.org/wiki/GADDAG

uh well technically that's not a string, you're reversing a list of tiles from a board game :goonsay:

jk that's a good example

now I wonder what scrabble looks like in, say, korean

shoeberto
Jun 13, 2020

which way to the MACHINES?

Soricidus posted:

isn’t that a pretty straightforward classification problem? accuracy would be proportional to the length of the string I guess, but we’ve come a long way since bush hid the facts

Turns out a lot of standard Unix tools choke on this problem because of sampling strategies when checking larger documents. Efficient generalized heuristics are hard. (I think I had a 4GB vendor file that mostly appeared ASCII-encoded until you actually tried to load it row-by-row into Postgres)
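
(a toy reproduction of that failure mode, no relation to the actual vendor file: a prefix-only sniff calls it ASCII, the full decode dies near the end)

code:

data = b"id,name\n" + b"1,plain ascii\n" * 100_000 + b"100001,caf\xe9\n"  # one stray latin-1 byte

data[:65536].decode("ascii")   # what a sampling tool sees: succeeds, "looks like ASCII"

try:
    data.decode("utf-8")       # what the row-by-row load eventually hits
except UnicodeDecodeError as err:
    print(err)                 # invalid continuation byte near the end of the file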

alexandriao
Jul 20, 2019


Dont Touch ME posted:

You want a non-trivial challenge? Write an algo to make a best guess of a string's encoding. Is it Big5-HKSCS or GB2312?

Is this not the exact reason c2wiki died

Soricidus posted:

isn’t that a pretty straightforward classification problem? accuracy would be proportional to the length of the string I guess, but we’ve come a long way since bush hid the facts

lmbo

alexandriao fucked around with this message at 17:05 on Mar 31, 2021

Share Bear
Apr 27, 2004

side promoted (not sure what to call this), same level but different department that's more laid back :toot:

plangs: still good and are my friend, for my use case, which is real programming that pays my rent

PIZZA.BAT
Nov 12, 2016


:cheers:


lol

https://www.nchannel.com/blog/csv-file-based-integration-vs-api/

Valeyard
Mar 30, 2012


Grimey Drawer

there's obviously a lot to unpack here, but my choice is...

Since CSVs are plain-text files, it's easier for a web developer or other members of your team to create, view, and validate the data as a spreadsheet.

...
...
...

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



oh wow. there are so many things wrong with that that I don't know where to start

manual data validation via opening csv files in excel

tbh i don't think I can think up a more certain way to gently caress up your data than that

Sapozhnik
Jan 2, 2005

Nap Ghost
All the poo poo programming languages and data formats invented at the dawn of computing are going to reverberate a thousand years into the future with as much force as historical events like the Battle of Hastings

pretty hosed up if you think about it

FlapYoJacks
Feb 12, 2009
The best IPC method was created 30 years ago. A raw socket and serialized structs. :colbert:
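
(a bare-bones sketch of that in Python, with socketpair standing in for whatever transport you'd actually use and a made-up message layout)

code:

import socket
import struct

# made-up wire format: message id (u32), temperature (f64), flags (u16), big-endian
WIRE = struct.Struct(">IdH")

parent, child = socket.socketpair()          # any connected socket works the same way

parent.sendall(WIRE.pack(42, 98.6, 0b101))   # serialize the struct, shove it down the socket

raw = child.recv(WIRE.size)                  # toy example; a real reader loops until WIRE.size bytes arrive
print(WIRE.unpack(raw))                      # (42, 98.6, 5)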

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



Sapozhnik posted:

All the poo poo programming languages and data formats invented at the dawn of computing are going to reverberate a thousand years into the future with as much force as historical events like the Battle of Hastings

pretty hosed up if you think about it

lmao even if we stop reverberating data, we are definitely not gonna have a thousand years

abraham linksys
Sep 6, 2010

:darksouls:
http://digital-preservation.github.io/csv-schema/csv-schema-1.0.html

:shepface:

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

DoomTrainPhD posted:

The best IPC method was created 30 years ago. A raw socket and serialized structs. :colbert:

The number of people I've worked with that don't know the difference between a raw socket and a websocket is staggering.

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


oh my god the current argument on the clusterfuck AI project is about the data we'll eventually get back because the business users keep saying "but it's a csv why is it so hard?" do we have any idea what data will be in the csv? No.

FlapYoJacks
Feb 12, 2009

leper khan posted:

The number of people I've worked with that don't know the difference between a raw socket and a websocket is staggering.

Well of course they didn’t get it :buddy:

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang




if i ever get that kind of data i will destroy my job and become destitute

PIZZA.BAT
Nov 12, 2016


:cheers:



i love all these attempts at creating new schemas because it always follows the crab-cycle of xml and eventually just turns into a sub-par xsd

bob dobbs is dead
Oct 8, 2017

I love peeps
Nap Ghost
edn was intended to be a lisp from the start and that's why it just stayed a lisp. it's just quoted clojure lol

Agile Vector
May 21, 2007

scrum bored



PIZZA.BAT posted:

the crab-cycle of xml

:aaaaa:

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang




mods pizza.bat pls

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


PIZZA.BAT posted:

the crab-cycle of xml

this marks up the crab

distortion park
Apr 25, 2011


Carthag Tuek posted:

oh wow. there are so many things wrong with that that I don't know where to start

manual data validation via opening csv files in excel

tbh i don't think I can think up a more certain way to gently caress up your data than that

my previous place had special automation tasks that:
- saved attachments to emails sent to specific inboxes to various fileshares, with the eventual location based on some rules and the inbox/subject line/attachment name
- converted excel files (some of the attachments were excel files) into csvs
- imported the csvs into our databases

it worked about as well as you would expect. the particularly sad thing was when you realised that some of the people on the other end hadn't automated their side of things, so it went from being a frustrating story about inefficient integrations to one of a poor junior analyst hand curating excel files daily, and presumably getting shouted at every other week when they hosed it up.

mystes
May 31, 2006

quote:

11/07/18 Jillian Hufford, Marketing Analyst
Why CSV File-based Integration Can Be Better than API-based Integration

quote:

Jillian joined nChannel as their Marketing Analyst. Using both her writing and analytic skills, she assists the Marketing and Sales teams. Jillian performs competitor market research, provides analysis of key sales metrics, and writes informative posts on multichannel commerce trends. She holds a BA in Marketing from Otterbein University.

To be fair there are probably a lot of people without the title of "Marketing Analyst" who also hold this dumb opinion.

Anyway, even ignoring not having a schema, CSV files completely suck because there's no actual standard, and just trying to get them into and out of Excel you're immediately going to run into annoying problems with number formats and trying to get Excel to recognize utf8 and things like that (and that's assuming you're even doing the quoting in the way Excel accepts).

I can't even imagine having some business critical process relying on automatically reading csv files, especially ones created by humans in random different programs.
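
(fwiw the least-bad combination I know of when Excel is on the other end is a BOM plus quote-everything, roughly like below; it keeps the columns intact but does nothing about Excel's type guessing)

code:

import csv

rows = [
    ["id", "amount", "note"],
    ["0001", "1,000.50", "naïve café"],
]

# utf-8-sig writes a BOM so Excel picks up the encoding; QUOTE_ALL plus newline=""
# keeps embedded commas and newlines from shredding the columns
with open("out.csv", "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)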

mystes fucked around with this message at 15:05 on Apr 1, 2021

shoeberto
Jun 13, 2020

which way to the MACHINES?

mystes posted:

I can't even imagine having some business critical process relying on automatically reading csv files, especially ones created by humans in random different programs.

This has been a significant part of my job for the past 6 or so years. It gets old. Real old.

Share Bear
Apr 27, 2004

mystes posted:


I can't even imagine having some business critical process relying on automatically reading csv files, especially ones created by humans in random different programs.

i also do a fair bit of csv/excel wrangling aka "ETL"

you gotta establish a spec, say "use a lib that implements the RFC", and hope that whoever is shipping data to you actually does

people love csvs and excel

HamAdams
Jun 29, 2018

yospos
My last job had a significant number of small ETL processes written in PHP. It got to the point where nobody knew what was out there or where it lived until something would stop working because somebody kicked the power strip powering the PC that's been under their desk since they started. Their motto had always been "we'll accept any file from the customer"

abraham linksys
Sep 6, 2010

:darksouls:
is it just me or are all "observability platforms" completely incoherent? i have a little web app i'm trying to set up some real low stakes monitoring for - like, "holler at me if there's a wild spike in requests" - and surveying the tooling landscape is an incomprehensible mix of buzzwords

i have sentry set up for exception alerting and honestly that's probably 90% of what i need so maybe i shouldn't worry beyond that. digitalocean's vps monitoring will also yell at me if i am running out of disk space/memory or cpu usage is super high for a while

12 rats tied together
Sep 7, 2006

they are all bullshit, yeah

if all you want to know about is requests per time, that's reasonably classifiable as a metric, so you can cut a lot of it out by looking for metrics tooling instead of "observability". prometheus is the latest emerging standard here and to give them some credit it's quite easy to instrument your code with it, assuming that your webserver doesn't already have a prometheus exporter for requests/sec

observability in theory includes logs, events, metrics, and usually some kind of tooling for reading them all in a useful way to find correlations and poo poo. in practice, every monitoring-adjacent vendor slaps the term on their website for SEO
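
(to make "quite easy to instrument" concrete: the official prometheus_client is roughly this much code, handle_request being a made-up stand-in for your real handler)

code:

from prometheus_client import Counter, Histogram, start_http_server
import random, time

REQUESTS = Counter("app_requests_total", "Total requests handled", ["route"])
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")

def handle_request(route):
    with LATENCY.time():                  # records elapsed time when the block exits
        REQUESTS.labels(route=route).inc()
        time.sleep(random.random() / 10)  # pretend work

if __name__ == "__main__":
    start_http_server(8000)               # exposes /metrics for prometheus (or the grafana agent) to scrape
    while True:
        handle_request("/hello")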

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


pointsofdata posted:

my previous place had special automation tasks that:
- saved attachments to emails sent to specific inboxes to various fileshares, with the eventual location based on some rules and the inbox/subject line/attachment name
- converted excel files (some of the attachments were excel files) into csvs
- imported the csvs into our databases

it worked about as well as you would expect. the particularly sad thing was when you realised that some of the people on the other end hadn't automated their side of things, so it went from being a frustrating story about inefficient integrations to one of a poor junior analyst hand curating excel files daily, and presumably getting shouted at every other week when they hosed it up.

oh god people keep on suggesting using magic mailboxes for storing/identifying documents or to "automate a process" instead of actually doing something proper and I keep having to shoot it down.

usually it goes hand in hand with "we can use chat bots!"

Corla Plankun
May 8, 2007

improve the lives of everyone
in their defense, magic mailboxes sound really cool and fun if you don't know anything about the implementation

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


I think the biggest problem with excel is that it doesn't force you to define column types when you actually create a sheet, and now that it's too late to go back and force that behaviour they've built this whole type inference structure (which is, to be fair, less annoying than the SSIS one that only ever looked at the top 1000 rows so would continuously trip you up when doing flat file imports), so now nobody ever defines anything and excel guesses at it, and excel steals the csv file association then turbofucks any file you open with it until you just end up quoting every piece of data to force excel to treat it as text

Honorable mention is whoever coded RapidSql which i am forced to use and has a "feature" where it will include comma separators in integers for thousands so if you c&p you can get "1,000" back which excel then thinks is a piece of text.
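
(the defensive parse for that particular flavor of copy-paste garbage, assuming US-style comma grouping and not a locale where comma is the decimal point)

code:

def parse_int(cell):
    # strip the quotes and thousands separators that tools like RapidSql/Excel sneak in
    return int(cell.strip().strip('"').replace(",", ""))

print(parse_int('"1,000"'))  # 1000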

abraham linksys
Sep 6, 2010

:darksouls:

12 rats tied together posted:

if all you want to know about is requests per time, that's reasonably classifiable as a metric, so you can cut a lot of it out by looking for metrics tooling instead of "observability". prometheus is the latest emerging standard here and to give them some credit it's quite easy to instrument your code with it, assuming that your webserver doesn't already have a prometheus exporter for requests/sec

i'd always avoided prometheus since when i originally looked into it, it seemed like it required self-hosting - like, even if you used a "cloud" variant you're still supposed to have an on-prem prometheus database that you send stuff to the cloud from - but it looks like grafana labs recently(?) introduced a thing called the "grafana agent" which is "prometheus without the database since you're just sending all your poo poo to the cloud"

gonna i guess try their grafana cloud thing since it has a free tier. it's pretty incoherent since it's just a bunch of open source poo poo slammed together, but i think if i just stick with metrics and ignore the other stuff it'll be ok. the web framework i use (javalin) has a micrometer plugin and i can set up micrometer to export to prometheus which gets read by the agent, i think

they do also have logs (loki) and traces (tempo, which seems very new) which i might give a go. traces are kinda nice for me because while i only have one service, i do make a lot of requests to external apis, and it'd be nice to keep an eye on how slow those are, though i guess metrics could cover that just as well. and logs would be nice because right now i'm still sshing into my box to look at logs lol

Shaggar
Apr 26, 2006

Powerful Two-Hander posted:

I think the biggest problem with excel is that it doesn't force you to define column types when you actually create a sheet, and now that it's too late to go back and force that behaviour they've built this whole type inference structure (which is, to be fair, less annoying than the SSIS one that only ever looked at the top 1000 rows so would continuously trip you up when doing flat file imports), so now nobody ever defines anything and excel guesses at it, and excel steals the csv file association then turbofucks any file you open with it until you just end up quoting every piece of data to force excel to treat it as text

Honorable mention is whoever coded RapidSql which i am forced to use and has a "feature" where it will include comma separators in integers for thousands so if you c&p you can get "1,000" back which excel then thinks is a piece of text.

you can customize how the SSIS column suggestion thing works and also have it process way more rows during discovery. However, if you're sure you know what the columns are it's best to set the types manually.


Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


Shaggar posted:

you can customize how the SSIS column suggestion thing works and have it process way more rows during discovery as well. However, if you're sure you know what the columns are it's best to set the types manually.

yeah tbf last time I used it was like 10 years ago, I mainly liked the soothing way the row counts would move across the flow, that was nice

not so nice when it choked on a column length later so maybe the India team were on to something when they just set everything to varchar(max)
