Nomnom Cookie
Aug 30, 2009



bob dobbs is dead posted:

why are you fuckin touching hadoop in tyool 2018

EMR runs spark on yarn and this is significantly less of a pain in my balls than admining a bunch of poo poo to run spark jobs on

altho with spark gaining k8s integration using an EKS cluster w/autoscaler to run jobs on sounds p. good

Finster Dexter
Oct 20, 2014

Beyond is Finster's mad vision of Earth transformed.

Kevin Mitnick P.E. posted:

do you really think it's a good idea to replace go with a language that has worse usability than c++

Because it's hard to imagine something with worse usability than go.

redleader
Aug 18, 2005

Engage according to operational parameters

bob dobbs is dead posted:

why are you fuckin touching hadoop in tyool 2018

how else are you going to do, uh, stuff with, uhh, big data?

Nomnom Cookie
Aug 30, 2009



Finster Dexter posted:

Because it's hard to imagine something with worse usability than go.

its rust. rust has worse usability than go

bob dobbs is dead
Oct 8, 2017

I love peeps
Nap Ghost

redleader posted:

how else are you going to do, uh, stuff with, uhh, big data?

use spark

or do it all on one machine, lol

fritz
Jul 26, 2003

gonadic io posted:

agreed, but julia kinda looks interesting (if it counts as a plang, it's weakly and optionally typed)

1-based indexing

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
https://twitter.com/garybernhardt/status/600783770925420546

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer

bob dobbs is dead posted:

use spark

or do it all on one machine, lol

spark is trash and fails randomly for no discernable reason. I'm actually using crunch but it runs over a Hadoop cluster.

and drat I'm getting bit by a bunch of tiny bugs that are subtle and snuck through code review and are probably my fault. what a Friday.

Beamed
Nov 26, 2010

Then you have a responsibility that no man has ever faced. You have your fear which could become reality, and you have Godzilla, which is reality.


Kevin Mitnick P.E. posted:

its rust. rust has worse usability than go

lmao

ComradeCosmobot
Dec 4, 2004

USPOL July

bob dobbs is dead posted:

why are you fuckin touching hadoop in tyool 2018

lol like spark is any better to debug

The MUMPSorceress
Jan 6, 2012


^SHTPSTS

Gary’s Answer
I found my root issue and it wasn't my fault so there's that. I will have to resume digging on Monday tho

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av
between go and rust, I recommend swift

Nomnom Cookie
Aug 30, 2009



ComradeCosmobot posted:

lol like spark is any better to debug

run it standalone on your laptop and attach a debugger hth

toiletbrush
May 17, 2010
this but to say "don't use Salesforce"

ComradeCosmobot
Dec 4, 2004

USPOL July

Kevin Mitnick P.E. posted:

run it standalone on your laptop and attach a debugger hth

great let me just copy this multi-terabyte dataset that breaks on exactly one (unidentified) data item over to my laptop...

Corla Plankun
May 8, 2007

improve the lives of everyone
i know that feel

its always the last line bc the file was truncated when it was being pulled onto the cluster hth

Nomnom Cookie
Aug 30, 2009



ComradeCosmobot posted:

great let me just copy this multi-terabyte dataset that breaks on exactly one (unidentified) data item over to my laptop...

have you met my friend shitloads of logging, he really helps me out with stuff like that

distortion park
Apr 25, 2011


Corla Plankun posted:

i know that feel

its always the last line bc the file was truncated when it was being pulled onto the cluster hth

Or the vendor stuck a disclaimer in the last line lol

gonadic io
Feb 16, 2011

>>=
terrible games programmer status (i had time off work):



i have absolutely no idea why moving the camera even slightly completely breaks the mouse raycasting. as far as i can tell all the view/proj matrices get updated.

all in rust of course

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
spring has these really great concepts called interceptors and filters that can be applied before a request hits your controller. they seem like an ideal place to do hmac validation but apparently HttpServletRequest doesn't cache the request body in any way so once you read it it's gone forever and you can't reassign the body back into the request in any way

so now i'm making a hmac validation utility and calling it from the first line of all my controllers rather than doing it the way i want

suffix
Jul 27, 2013

Wheeee!

Blinkz0rz posted:

spring has these really great concepts called interceptors and filters that can be applied before a request hits your controller. they seem like an ideal place to do hmac validation but apparently HttpServletRequest doesn't cache the request body in any way so once you read it it's gone forever and you can't reassign the body back into the request in any way

so now i'm making a hmac validation utility and calling it from the first line of all my controllers rather than doing it the way i want

if they're like jetty filters you can make a new request that wraps the old one but with getReader etc. overridden and pass that down the chain

animist
Aug 28, 2018

quote:

Not only are existing tests not very good, most things aren't tested at all. You might point out that the coverage stats for a lot of packages aren't so bad, but last time I looked, there was a bug in the coverage tool that caused it to only aggregate coverage statistics for functions with non-zero coverage. That is to say, code in untested functions doesn't count towards the coverage stats!

:laffo:

Nomnom Cookie
Aug 30, 2009



suffix posted:

if they're like jetty filters you can make a new request that wraps the old one but with getReader etc. overridden and pass that down the chain

now i'm trying to remember if the servlet Filter interface exposes the filter chain. I think so!

Nomnom Cookie
Aug 30, 2009



Blinkz0rz posted:

spring has these really great concepts called interceptors and filters that can be applied before a request hits your controller. they seem like an ideal place to do hmac validation but apparently HttpServletRequest doesn't cache the request body in any way so once you read it it's gone forever and you can't reassign the body back into the request in any way

so now i'm making a hmac validation utility and calling it from the first line of all my controllers rather than doing it the way i want

Filter (javax.servlet.Filter) is provided by the servlet spec and runs before DispatcherServlet. interceptors are provided by spring and run after DispatcherServlet sees the request. sounds like this is global to your app so just do a filter and pass a wrapped request down the chain.

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

suffix posted:

if they're like jetty filters you can make a new request that wraps the old one but with getReader etc. overridden and pass that down the chain

Kevin Mitnick P.E. posted:

Filter (javax.servlet.Filter) is provided by the servlet spec and runs before DispatcherServlet. interceptors are provided by spring and run after DispatcherServlet sees the request. sounds like this is global to your app so just do a filter and pass a wrapped request down the chain.

i tried this but couldn't get the @RequestBody annotation to recognize the cached request body in the wrapped request. then i gave up and wrote a util :(

abigserve
Sep 13, 2009

this is a better avatar than what I had before
while we are on the subject of big data, what is the current consensus around storing and parsing structured data beyond 15-20 TB in size?

I want to do netflow storage, and I'm happy to write some code to do it because all the existing tools either suck or don't scale. But I look at stuff like Spark and it just seems overly complicated, maybe, I don't know?

Sapozhnik
Jan 2, 2005

Nap Ghost
Make a servlet filter that passes a subclass of HttpServletRequestWrapper down the chain. Cache your request body there.
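
For anyone following along, here is a rough sketch of the cache-and-replay idea, written in Python instead of against the real servlet API so it runs standalone. Everything named here (`Request`, `CachedBodyRequest`, `hmac_filter`, `controller`, the hex SHA-256 signature format) is made up for illustration. In an actual servlet app the wrapper would be an HttpServletRequestWrapper subclass overriding getInputStream/getReader, and the details will differ.

```python
import hashlib
import hmac
import io


class Request:
    """Hypothetical stand-in for a request whose body can be read only once."""

    def __init__(self, body: bytes, signature: str):
        self.signature = signature       # hex HMAC sent by the client (assumed format)
        self._stream = io.BytesIO(body)

    def get_input_stream(self):
        return self._stream              # same stream every time; empty after one read


class CachedBodyRequest:
    """Wrapper that reads the body once and replays it to every later reader."""

    def __init__(self, inner: Request):
        self.signature = inner.signature
        self._body = inner.get_input_stream().read()

    def get_input_stream(self):
        return io.BytesIO(self._body)    # fresh stream over the cached bytes


def hmac_filter(request, secret: bytes, next_in_chain):
    """Verify the body's HMAC, then hand the wrapped request down the chain."""
    wrapped = CachedBodyRequest(request)
    body = wrapped.get_input_stream().read()
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, wrapped.signature):
        return 401, b"bad signature"
    return next_in_chain(wrapped)


def controller(request):
    # The controller can still read the body because the wrapper replays it.
    return 200, request.get_input_stream().read()
```

The point is only that whatever sits after the filter must receive the wrapper, not the original request, or it will see an already-drained body.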

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat

Blinkz0rz posted:

i tried this but couldn't get the @RequestBody annotation to recognize the cached request body in the wrapped request. then i gave up and wrote a util :(

if you are using the webflux stuff you can alter the request in a filter by wrapping the Mono body in a MonoProcessor<T> and using an operator to essentially split the reactive stream into both your hmac poo poo and the original web filter chain.

christ i just read that over and that's a lot of loving bullshit words, but it works

floatman
Mar 17, 2009
Hey terrible programming thread let's talk about gender.
Like, I have a customer management program, CMS style thing. Because idiot hellfucker, it's one massive mono instance for multiple regions worldwide.
Different users are going to have different gender management requirements, i.e. region A just wants to collect M, F, X. Region B wants to collect M, F, X, Y, P.
Does anybody have any experience implementing such a pattern?

One issue is the system is basically very poo poo database driven, so we got all the old data storing columns of old gender data consisting of M and Fs with some Xs.
I was thinking something like:
Create one master set of gender codes (M, F, X, Whatever) along with default mapping master set (M is male, F is female, X is unspecified, etc)
Each organisation can specify from the master set the subset of gender codes they wish to represent their customers (i.e. org A uses M and F, org B uses M, F, X, org C uses M, F, X, A)
Each organisation can specify specific mapping overrides that will apply to their organisation only (M instead of Male is mapped to "Guy")
Create an abstraction which is basically a function that takes in the organisation id and returns a list of gender codes and mappings for that organisation id.

So far the main feeling I get from that idea is the gender codes are loosely enforced in the database i.e. you can get the configured acceptable gender codes, but you can store any gender code of the master set in the database record. I kinda don't care about this.
The other problem is custom mapping of labels i.e. if an organisation maps the code of "M", "Male" to "Guy" then that's fine, but the system technically allows them to map "M" to "Girl" as well, and I think that would gently caress things up when we need to aggregate across organisations gender information
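
A minimal sketch of the abstraction described above, assuming the config lives in plain dicts rather than the database. The org ids, the label for "A", and the function names are placeholders, not anything from the real system. One way around the "M mapped to Girl" worry is to treat overrides as display-only and always aggregate on the code with the master label.

```python
# Master set of codes with default labels (the label for "A" is a placeholder).
MASTER_GENDER_CODES = {
    "M": "Male",
    "F": "Female",
    "X": "Unspecified",
    "A": "Other (placeholder)",
}

# Per-organisation config: which subset is offered, plus optional label overrides.
ORG_CONFIG = {
    "org_a": {"codes": ["M", "F"], "labels": {}},
    "org_b": {"codes": ["M", "F", "X"], "labels": {"M": "Guy"}},
}


def gender_options(org_id):
    """Return {code: label} for one organisation, applying its overrides."""
    config = ORG_CONFIG[org_id]
    unknown = set(config["codes"]) - set(MASTER_GENDER_CODES)
    if unknown:
        raise ValueError(f"codes not in master set: {sorted(unknown)}")
    return {
        code: config["labels"].get(code, MASTER_GENDER_CODES[code])
        for code in config["codes"]
    }


def canonical_label(code):
    """Label to use when aggregating across organisations: ignore overrides."""
    return MASTER_GENDER_CODES[code]
```

With this shape, `gender_options("org_b")` feeds the dropdown for that org, while cross-org reports only ever call `canonical_label`.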

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat
you can use a lookup table like you suggest (join tables) but honestly you'll probably be fine just using a single character as your gender column and then using a separate table with no foreign key that identifies all the gender settings for a specific company, which will be a tuple like (company_id, code, description). obviously using a join table is "more correct" but constantly doing joins for some things is pointless. especially if they have changes, like say a company had A, B and C available, but then decided C isn't an option anymore. So now you'll also have to implement a "deleted" column and it all ends up being a lot of work when you could just use a column, and performance will probably be better avoiding the joins.

redleader
Aug 18, 2005

Engage according to operational parameters

abigserve posted:

while we are on the subject of big data, what is the current consensus around storing and parsing structured data beyond 15-20 TB in size?

I want to do netflow storage, and I'm happy to write some code to do it because all the existing tools either suck or don't scale. But I look at stuff like Spark and it just seems overly complicated, maybe, I don't know?

my dude have you heard of mongodb

abigserve
Sep 13, 2009

this is a better avatar than what I had before

redleader posted:

my dude have you heard of mongodb

I heard that it's Web Scale, can you confirm?

But actually is it good or

Sereri
Sep 30, 2008

awwwrigami

redleader posted:

my dude have you heard of mongodb

please do not promote self-harm outside of tcc

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

CRIP EATIN BREAD posted:

you can use a lookup table like you suggest (join tables) but honestly you'll probably be fine just using a single character as your gender column and then using a separate table with no foreign key that identifies all the gender settings for a specific company, which will be a tuple like (company_id, code, description). obviously using a join table is "more correct" but constantly doing joins for some things is pointless. especially if they have changes, like say a company had A, B and C available, but then decided C isn't an option anymore. So now you'll also have to implement a "deleted" column and it all ends up being a lot of work when you could just use a column, and performance will probably be better avoiding the joins.

Yes, I'm sure that querying for an id and then manually doing a join on the client is going to be way more efficient than getting your relational database to do it.

hailthefish
Oct 24, 2010

Sereri posted:

please do not promote self-harm outside of tcc




ORM reduction

gonadic io
Feb 16, 2011

>>=

hailthefish posted:

ORM reduction

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


CRIP EATIN BREAD posted:

you can use a lookup table like you suggest (join tables) but honestly you'll probably be fine just using a single character as your gender column and then using a separate table with no foreign key that identifies all the gender settings for a specific company, which will be a tuple like (company_id, code, description). obviously using a join table is "more correct" but constantly doing joins for some things is pointless. especially if they have changes, like say a company had A, B and C available, but then decided C isn't an option anymore. So now you'll also have to implement a "deleted" column and it all ends up being a lot of work when you could just use a column, and performance will probably be better avoiding the joins.


let's see, 26 characters a to z, make it case sensitive gives us 52, 0-9 takes us to 62, add in NULL and an empty string and boom, 64 genders which should be enough for anyone

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat

Jabor posted:

Yes, I'm sure that querying for an id and then manually doing a join on the client is going to be way more efficient than getting your relational database to do it.

the idea is you don't need a join at all, ever.

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


in all seriousness though, you should use a join because if you don't and just have a single column someone is gonna be posting in this thread in future going "how the hell do I unfuck this single column that has some mysterious and arbitrary meaning"

if you have Person with a column called CompanyGender_id (or whatever), key that to a table that has the short code, description and company_id and you can get available genders for a company, edit them, add new without affecting anything all without having to mess with the person table
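
Roughly what that could look like, sketched with Python's sqlite3 and an in-memory database. The table and column names (`company_gender`, `person.company_gender_id`) and the sample rows are assumptions picked to match the description, not a real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company_gender (
        id          INTEGER PRIMARY KEY,
        company_id  INTEGER NOT NULL,
        code        TEXT    NOT NULL,
        description TEXT    NOT NULL,
        UNIQUE (company_id, code)
    );
    CREATE TABLE person (
        id                INTEGER PRIMARY KEY,
        name              TEXT NOT NULL,
        company_gender_id INTEGER NOT NULL REFERENCES company_gender(id)
    );
""")
conn.executemany(
    "INSERT INTO company_gender (id, company_id, code, description) VALUES (?, ?, ?, ?)",
    [(1, 1, "M", "Male"), (2, 1, "F", "Female"), (3, 2, "M", "Guy")],
)
conn.executemany(
    "INSERT INTO person (name, company_gender_id) VALUES (?, ?)",
    [("alice", 2), ("bob", 1), ("carol", 3)],
)


def available_genders(company_id):
    """List (code, description) pairs configured for one company."""
    rows = conn.execute(
        "SELECT code, description FROM company_gender "
        "WHERE company_id = ? ORDER BY code",
        (company_id,),
    )
    return rows.fetchall()


def people_with_gender():
    """Resolve each person's gender label through the join."""
    rows = conn.execute(
        "SELECT p.name, g.description FROM person p "
        "JOIN company_gender g ON g.id = p.company_gender_id ORDER BY p.name"
    )
    return rows.fetchall()


# Renaming a label touches only company_gender; person rows are untouched.
conn.execute("UPDATE company_gender SET description = 'Man' WHERE id = 1")
```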

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat
obv it all depends on the use-case, but often times you'll want to pull from a pool of values so you can still do queries like "give me the count of people grouped by value X across all companies".
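
For comparison, a small sketch of the single-column version with a cross-company count, again using sqlite3 in memory. The table names and sample rows are invented. It assumes every company draws its codes from one shared pool, so the same letter means the same thing everywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL,
        gender     TEXT    NOT NULL   -- single-character code from the shared pool
    );
    -- Per-company settings, deliberately not referenced by a foreign key.
    CREATE TABLE company_gender_setting (
        company_id  INTEGER NOT NULL,
        code        TEXT    NOT NULL,
        description TEXT    NOT NULL,
        PRIMARY KEY (company_id, code)
    );
""")
conn.executemany(
    "INSERT INTO person (company_id, gender) VALUES (?, ?)",
    [(1, "M"), (1, "F"), (2, "M"), (2, "X"), (2, "M")],
)


def counts_by_gender():
    """Count people per gender code across all companies, no join needed."""
    rows = conn.execute(
        "SELECT gender, COUNT(*) FROM person GROUP BY gender ORDER BY gender"
    )
    return rows.fetchall()
```

The aggregate never touches `company_gender_setting`, which is the trade being argued about: cheap queries, but nothing in the database stops a code that no company has configured.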
