|
CPColin posted:Guess I'll go function-by-function! ![]()
|
![]() |
|
![]()
|
# ? Mar 29, 2023 09:31 |
|
tef posted:stripped out protobufs from python code and made them go faster and smaller with gzipped json (very string heavy data) Outside of C/C++ this is almost always going to be the case. Every serialization performance test, no matter how flawed, will show JSON performing faster than others. The gotcha is the size of the message and the transit cost that implies, so use, say, a reverse proxy to `gzip` transparently, and if the client is a browser you get "free" decompression.
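A minimal sketch of the "gzipped json" approach, using only the Python standard library. The payload shape and the `pack`/`unpack` helper names are made up for illustration; nothing here is tef's actual code, and it says nothing about how protobuf would fare on the same data.

```python
import gzip
import json


def pack(records):
    """Serialize to compact JSON, then gzip the UTF-8 bytes."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def unpack(blob):
    """Inverse of pack(): gunzip, then parse the JSON text."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical string-heavy payload: the same keys and values repeat a lot,
# which is exactly what gzip is good at squeezing out.
records = [
    {"id": i, "status": "active", "region": "eu-west-1", "note": "string heavy data"}
    for i in range(1000)
]

raw_size = len(json.dumps(records, separators=(",", ":")).encode("utf-8"))
packed = pack(records)
print("raw json bytes:", raw_size, "gzipped bytes:", len(packed))
```

If the transport is HTTP, the compression step would normally live in the server or reverse proxy rather than in application code like this.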
|
![]() |
|
to add: if your transport is http and you gzip in code, you hosed up
|
![]() |
|
is that brotli compression thing worth bothering with it was created by googleoids so i assume no
|
![]() |
|
we use a lot of zstd at current job and i don't think i've heard anyone complaining about it yet
|
![]() |
|
Sapozhnik posted:is that brotli compression thing worth bothering with Brotli is excellent for compressing static web text and you should do it. It is not a general purpose compressor. It has a big dictionary in the spec, so every browser has a giant brotli reference set shipping with it that lets you do insanely effective compression. Zstd is insanely good and general purpose.
|
![]() |
|
Spent half the day getting the applet used for the build system to run after corporate policy hosed around with our Java installs again, but now it’s Friday and my poo poo is heading off for prod.
|
![]() |
|
Carthag Tuek posted:e: it was 70% but still lmao I forgot about this lol
|
![]() |
|
Crossposting from Coding Horrors because I figured it out:CPColin posted:gently caress you, Groovy (highlighting as Java for readability):
|
![]() |
|
brotli (usually) is quicker to compress and compresses marginally better (compare brotli level 5 to zstd level 9 for a similar reference of “decently compressed”), and the differences we're talking about are a variable couple of % of compression ratio at a very variable compression speed difference. you can also do the dictionary thing with zstd if you're benefitting from min-maxing the numbers after comma on really small files. zstd decompresses much faster in literally any scenario, and is not a streamed format so you get such innovative features like checksums for your compressed files. in other words, brotli is great at sending a webpage into a web browser, but for everything else zstd should be the default choice these days
|
![]() |
|
in terms of real world users that aren't their home companies, iirc cloudflare does brotli on everything, and zstd ive heard is used by some bigger aws teams for, e.g., internal tooling log storage
|
![]() |
|
tef posted:stripped out protobufs from python code and made them go faster and smaller with gzipped json (very string heavy data) amazing and this also applies to me. was this a “this might be faster” hunch or you just tried it on a lark?
|
![]() |
|
MrMoo posted:Outside of C/C++ this is almost always going to be the case. Every serialization performance test, no matter how flawed, will show JSON performing faster than others. The gotcha is the size of the message and the transit cost that implies, so use, say, a reverse proxy to `gzip` transparently, and if the client is a browser you get "free" decompression. wait what json is _faster_ to (de)serialize than protobuf? is it cheaper in memory too? perf dumps of our kubernetes poo poo show that we spend a ton of resources converting to and from JSON for API server interaction, and i was thinking "surely there must be a better way and this poo poo doesn't reallllllllly need json" but
|
![]() |
|
It should not be read literally. What happens in higher level languages is that you end up with two levels of serialization: from the internal format to JSON objects, then to text. In JavaScript land, objects are already JS objects, which eliminates an entire stage of the pipeline in comparison. "Products" like MsgPack try to benefit from the lower memory usage in encoded form. Ultimately with encoded messages there is a massive performance and convenience trade-off. Random access to a field is an incredibly expensive operation and usually means (1) conversion to a map, or (2) reparsing on each access. The highest performance comes from a single pass through a structure. However, this only means something if your language has direct access to unpacked fields; any slow interpreted language is likely to require a map simply to access any variable, ideally with a JIT eliminating this.
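To make the two random-access options concrete, here is a small sketch with JSON standing in for "an encoded message". The message contents and function names are invented for illustration.

```python
import json

# Hypothetical encoded message as it would arrive off the wire.
encoded = json.dumps({"user": "tef", "codec": "gzip", "level": 6})


def field_by_reparsing(message, key):
    """Option (2): decode the whole message again on every field access."""
    return json.loads(message)[key]


# Option (1): pay the decode cost once, then every access is a dict lookup.
decoded = json.loads(encoded)


def field_from_map(key):
    return decoded[key]


print(field_by_reparsing(encoded, "codec"), field_from_map("level"))
```

Option (2) keeps only the compact encoded form in memory but repeats the full parse for every field; option (1) holds a second, larger representation to make lookups cheap.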
|
![]() |
|
Powerful Two-Hander posted:I guarantee you they do not know what that is or what a deploy tool is either. if they did they could have said "ok deploy the latest prod artefact build to the test env and also refresh the database back from production" which is actually dead simple to do but I'm not gonna do it unless asked and we have other stuff in that env anyway loooooool auditors looking at code, fantastic.
|
![]() |
cinci zoo sniper posted:brotli (usually) is quicker to compress and compresses marginally better (compare brotli level 5 to zstd level 9 for a similar reference of “decently compressed”), and the differences we're talking about are a variable couple of % of compression ratio at a very variable compression speed difference. you can also do the dictionary thing with zstd if you're benefitting from min-maxing the numbers after comma on really small files. For data at rest, does it still make sense to use lzma2 (e.g. with xz)? IIRC that had a significantly better compression ratio than zstd, but with significantly slower compression / decompression.
|
|
![]() |
|
VikingofRock posted:For data at rest, does it still make sense to use lzma2 (e.g. with xz)? IIRC that had a significantly better compression ratio than zstd, but with significantly slower compression / decompression. imo don't use xz or lzma2. https://www.nongnu.org/lzip/xz_inadequate.html explains why better than i will be able to, and https://lists.archlinux.org/pipermail/arch-dev-public/2019-March/029520.html serves as a real world supplement i guess. lzma, however, is fine, and still gets you the 2nd best compression ratio. due to threading support and the number of mit phds involved in coding these, you're basically looking at like 5% compression ratio reductions going from zpaq to lzma, and then maybe 10% going from lzma to zstd (pulling random numbers out of my rear end because this is source material-dependent). however, and it's up to you to decide if this however matters to you, lzma is several orders of magnitude faster at de/compressing than zpaq, whereas zstd is several orders of magnitude faster at de/compressing than lzma. yeah, it's one of those "big tech spends big money to do one thing well" things.
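Since the numbers above are admittedly source-dependent, the honest move is to measure on your own data. A rough sketch of such a measurement follows; it only covers codecs in the Python standard library (`zlib` and legacy-container `lzma`), so zstd and zpaq are not represented, and the log-like sample data is invented.

```python
import functools
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the ratio."""
    start = time.perf_counter()
    blob = compress(data)
    mid = time.perf_counter()
    restored = decompress(blob)
    end = time.perf_counter()
    assert restored == data  # sanity check: the round trip must be lossless
    return {
        "codec": name,
        "ratio": len(data) / len(blob),
        "compress_s": mid - start,
        "decompress_s": end - mid,
    }


# Hypothetical log-like sample; swap in a real file's bytes for real numbers.
sample = b"2023-03-29 09:31:00 INFO request served in 12ms\n" * 2000

# FORMAT_ALONE is the legacy .lzma container (LZMA1), not the xz container.
lzma_alone = functools.partial(lzma.compress, format=lzma.FORMAT_ALONE)

results = [
    measure("zlib", zlib.compress, zlib.decompress, sample),
    measure("lzma", lzma_alone, lzma.decompress, sample),
]
for row in results:
    print(row)
```

A single timed run on synthetic data is noisy and flatters every codec; repeated runs on representative files give a far better picture.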
|
![]() |
|
lzma balls
|
![]() |
|
zstd balls
|
![]() |
|
one part of zstd that intrigues me is the training and dictionary you can provide to improve compression. it theoretically fits a use case of ours very nicely, but haven’t had the opportunity to actually try it out yet. are there any other compression algorithms that have something similar?
|
![]() |
|
necrotic posted:one part of zstd that intrigues me is the training and dictionary you can provide to improve compression. it theoretically fits a use case of ours very nicely, but haven’t had the opportunity to actually try it out yet. brotli comes with a very large, pre-packaged dictionary, and with tooling to make your own dictionaries as well
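Plain old deflate also has a version of this: zlib accepts a preset dictionary that both sides must share. The sketch below uses Python's standard `zlib` module as a stand-in, so it is not zstd's trained dictionaries or brotli's built-in one, and the hand-written dictionary and event payload are hypothetical.

```python
import zlib

# Hand-written preset dictionary: byte strings expected to show up in most
# events. A real one would be built from a corpus of sample messages.
DICTIONARY = b'{"event_type":"page_view","user_id":,"region":"eu-west-1"}'


def compress_with_dict(payload, zdict):
    """Deflate the payload with the compressor primed by a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(payload) + compressor.flush()


def decompress_with_dict(blob, zdict):
    """The reader must supply the exact same dictionary bytes."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


event = b'{"event_type":"page_view","user_id":4211,"region":"eu-west-1"}'

plain = zlib.compress(event, 9)
primed = compress_with_dict(event, DICTIONARY)
print("without dictionary:", len(plain), "with dictionary:", len(primed))
```

The gain shows up mostly on tiny messages like this one, where the compressor otherwise has no earlier data to refer back to; the catch is that the dictionary becomes part of the format and has to be versioned alongside it.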
|
![]() |
|
Does Intel QAT hardware accelerate zstd yet? With this new generation of Xeons, almost every Intel server out there is going to have gzip hardware acceleration, which may tilt the scales back towards good ol gzip unless zstd can use it too.
|
![]() |
|
cinci zoo sniper posted:brotli comes with a very large, pre-packaged dictionary, and with tooling to make your own dictionaries as well cool, will add that to my list to check. for us the training is enticing because we have a huge set of event types, where some of the smaller events may benefit from some training. might end up being a total wash given what the data is, but worth a look at least.
|
![]() |
|
cinci zoo sniper posted:brotli comes with a very large, pre-packaged dictionary, and with tooling to make your own dictionaries as well drat straight ![]() (USER WAS PUT ON PROBATION FOR THIS POST)
|
![]() |
|
Twerk from Home posted:Does Intel QAT hardware accelerate zstd yet? With this new generation of Xeons, almost every Intel server out there is going to have gzip hardware acceleration, which may tilt the scales back towards good ol gzip unless zstd can use it too. yes, zstd and brotli both
|
![]() |
|
Bloody posted:zstd balls id rather not
|
![]() |
|
Bloody posted:zstd balls ![]()
|
![]() |
|
mystes posted:Is zstd pronounced "zested"? no, the name of the algo is "zstandard"
|
![]() |
|
also please stop scraping that lime, wtf
|
![]() |
|
zuh steed
|
![]() |
|
cinci zoo sniper posted:no, the name of the algo is "zstandard" required pronunciation is an american impersonating a german tho Zee Standard!
|
![]() |
|
I'm a proponent of zstding come ask me about life on the z free from government control
|
![]() |
|
cinci zoo sniper posted:also please stop scraping that lime, wtf you have a problem with zest?
|
![]() |
|
nudgenudgetilt posted:required pronunciation is an american impersonating a german tho as opposed to what, "zed standard"? lomarf redleader posted:you have a problem with zest? that looks very different from zesting stuff i'm used to, e.g., microplane
|
![]() |
|
cinci zoo sniper posted:as opposed to what, "zed standard"? lomarf
|
![]() |
mystes posted:I think they just mean that you should say it with a german accent so "zee" will sound like it's supposed to be "the" with a german accent That's zjoke
|
|
![]() |
Tar, eXtracten ze files
|
|
![]() |
|
I get so tired of this poo poo. i don't do java but a bit ago a servlet would crash. Logs printed a ton about threadlocals being created and not closed, causing a possible memory leak. Asking our java devs I was met with big ??? reactions but the ops team wanted me to check it out. Checked out the recent changes, found a threadlocal (very few of them around), made sure it was closed and asked the java peeps to make sure it all looked ok. It ran fine in dev now so all good right? Of course not. They started digging into the issue and eventually came to the conclusion that "it's obviously a bug in java.lang.error()", reverted the change and disabled logging. Pushed it to prod which caused the servlet to eat poo poo over the weekend. Suggested we add in the quick fix so senior java dev adds it back in, delivers the new .class files and it's broken again which I've got to hear about all morning. Pulled down the .class files w/ yesterday's change date into a decompiler and the change isn't there. It's the same stupid file we delivered last week. jfc
|
![]() |
|
Astroid posted:came to the conclusion that "it's obviously a bug in java.lang.error()" "It can't be *our* code that is buggy, it must be the code that's deployed on hundreds of millions of devices instead!!!"
|
![]() |
|
well that one time there was a bug in binary search so you see
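Presumably this refers to the widely reported midpoint overflow in `java.util.Arrays.binarySearch`, where `(low + high) / 2` goes negative once the sum exceeds a signed 32-bit int. Python ints do not overflow, so the sketch below simulates Java's 32-bit wraparound by hand; the helper names are made up.

```python
def to_int32(value):
    """Wrap an arbitrary Python int into Java's signed 32-bit int range."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def java_div2(value):
    """Java integer division by 2 truncates toward zero (Python's // floors)."""
    return -((-value) // 2) if value < 0 else value // 2


def buggy_mid(low, high):
    """Midpoint as (low + high) / 2 with 32-bit ints: the sum can wrap."""
    return java_div2(to_int32(low + high))


def safe_mid(low, high):
    """Midpoint as low + (high - low) / 2: the difference cannot overflow."""
    return low + (high - low) // 2


# Indices that only occur in arrays with over a billion elements.
low, high = 2**30, 2**30 + 2
print("buggy:", buggy_mid(low, high), "safe:", safe_mid(low, high))
```

For small indices both versions agree, which is why the bug went unnoticed for so long.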
|
![]() |