His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.

leper khan posted:

Fewer arithmetics. Readability. Real answer requires perf testing I'm too lazy to do right now for you :effort:

Satisfied with that reasoning, to my eyes I understood my own code more easily at first but it was fresh in my head.

FWIW they don't produce the same results, I'm picking over it now as another exercise.

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

His Divine Shadow posted:

Satisfied with that reasoning, to my eyes I understood my own code more easily at first but it was fresh in my head.

FWIW they don't produce the same results, I'm picking over it now as another exercise.

I haven't tested mine in any way. Fencepost is the most likely error, but it's also possible I missed something in the description

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
I mean, your implementation is flipping N+1 bits and Leper’s is flipping N, that’s the obvious difference.
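
To make the N versus N+1 difference concrete, here is a minimal sketch. It assumes the exercise is something like K&R's invert(x, p, n), which flips the n bits whose topmost bit is at position p (bit 0 being the rightmost). The names low_mask, invert and invert_off_by_one are made up for illustration and are not anyone's actual code from the thread.

```cpp
#include <cstdint>

// Hypothetical helper: a mask with the low n bits set (n in 0..32).
std::uint32_t low_mask(unsigned n)
{
    return (n >= 32) ? ~std::uint32_t{0} : ((std::uint32_t{1} << n) - 1u);
}

// Flip exactly n bits of x, the topmost of which is bit p.
// Assumes p < 32 and p + 1 >= n, as the exercise does.
std::uint32_t invert(std::uint32_t x, unsigned p, unsigned n)
{
    return x ^ (low_mask(n) << (p + 1 - n));
}

// The fencepost version: the mask is one bit too wide, so n + 1 bits flip.
std::uint32_t invert_off_by_one(std::uint32_t x, unsigned p, unsigned n)
{
    return x ^ (low_mask(n + 1) << (p + 1 - n));
}
```

With x = 0, p = 4, n = 3 the first version flips bits 2 through 4 and the second flips bits 2 through 5, which is the kind of mismatch you would see when comparing outputs.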

cheetah7071
Oct 20, 2010

honk honk
College Slice
In single-threaded code that isn't doing something truly bizarre, the volatile keyword should never change behavior, right?

I might have encountered my first actual compiler bug, possibly, tracking down why a test was passing in debug mode but not release mode. When I tried to track an optimized-away variable by printing the value, the test suddenly passed; declaring the variable volatile ended up having the same effect. The program isn't doing any weird external memory fuckery and the test was failing even when I ran it single-threaded.

go play outside Skyler
Nov 7, 2005


cheetah7071 posted:

In single-threaded code that isn't doing something truly bizarre, the volatile keyword should never change behavior, right?

I might have encountered my first actual compiler bug, possibly, tracking down why a test was passing in debug mode but not release mode. When I tried to track an optimized-away variable by printing the value, the test suddenly passed; declaring the variable volatile ended up having the same effect. The program isn't doing any weird external memory fuckery and the test was failing even when I ran it single-threaded.

i think the likelihood of your code somehow doing something weird with memory is a lot higher than that of a compiler bug

cheetah7071
Oct 20, 2010

honk honk
College Slice
Yeah probably. At least I fixed it, even if I have no idea what was causing it or why it only happened when the variable was optimized away

more falafel please
Feb 26, 2005

forums poster

I think I found a compiler bug in MSVC once, but I never tried to isolate it, and it might have just been UB. It's been a while so I don't remember the details, but it was using an aligned vector instruction on memory that was not guaranteed to be 16-byte aligned (and, in this case, I think it was actually #pragma pack 4). It wasn't in a tight loop, so I just disabled optimization around the offending code.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


I had a project where we found bugs in two compilers. That was fun!

One involved a #define just not getting applied, which was weird. The other was an optimizer doing something really bizarre, which was easy to fix by turning off optimization.

Xarn
Jun 26, 2015
If you don't have multiple bugs open against the big three compilers, are you even doing C++?


cheetah7071 posted:

In single-threaded code that isn't doing something truly bizarre, the volatile keyword should never change behavior, right?

I might have encountered my first actual compiler bug, possibly, tracking down why a test was passing in debug mode but not release mode. When I tried to track an optimized-away variable by printing the value, the test suddenly passed; declaring the variable volatile ended up having the same effect. The program isn't doing any weird external memory fuckery and the test was failing even when I ran it single-threaded.

Volatile shouldn't (but does on MSVC) have any relation to multithreaded code. Volatile is purely for marking memory that does not behave according to the memory model, and using it for multithreaded sync is asking for trouble.

What you are describing sounds... interesting. What printing and volatile are likely to do is force a write of the register into memory. So if the compiler has a bug in register liveness tracking, that can help. Or you are running into some fun unspecified behaviour.

Absurd Alhazred
Mar 27, 2010

by Athanatos
I found a MSVC bug that had to do with a weird interaction between lambda capture and templating. I created a small repro with some workarounds and they fixed it pretty quickly.

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

cheetah7071 posted:

In single-threaded code that isn't doing something truly bizarre, the volatile keyword should never change behavior, right?

Where's that code running? If the volatile variable is tied to hardware (interrupts, memory-mapped register), then it'll totally change behavior and would very much be the right thing to do. But if this is a PC doing userspace stuff, yeah, it shouldn't do anything normally and any side effects you're hitting are ghosts left behind from whatever got accidentally murdered in memory.

cheetah7071
Oct 20, 2010

honk honk
College Slice
I'm currently doing my best to track down whether it's my own dumbass fault and I'm somehow writing data where I shouldn't, in a way that happened to cause weirdness when the variable was optimized away but didn't when it was given normal stack allocation. Unfortunately, just about every line of relevant code is optimized into not being breakpointable, and the variables I'm most interested in are also optimized away, so I'm going to have a fun time.

I'm mostly just worried that, if I'm somehow writing data where I shouldn't, it might just be coincidentally passing tests and fail when done on some other dataset. Otherwise I'd just declare it volatile and move on with my life

e: it was close to that; one constructor of a class left a single variable uninitialized in a way that mattered when I thought it didn't. I guess just changing the memory layout by putting one extra variable on the stack just coincidentally made the test pass. And every other place in the code where that function was called I just got lucky, because this bug has been in the code for weeks and never caused any problems before today lol

cheetah7071 fucked around with this message at 23:14 on Mar 16, 2023
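
For anyone who hasn't hit this before, a minimal sketch of that kind of bug. The Window and SafeWindow types are hypothetical, not the poster's actual class. One constructor forgets a member, and reading that member later is undefined behavior, so the result can change with optimization level or with whatever else happens to be on the stack.

```cpp
// Hypothetical class, not the poster's code: two constructors, one of which
// forgets a member.
struct Window {
    int width;
    int height;
    bool dirty;

    Window() : width(0), height(0), dirty(false) {}

    // Bug: 'dirty' is never initialized here. Reading it later is undefined
    // behavior, and what you observe depends on leftover memory contents.
    Window(int w, int h) : width(w), height(h) {}
};

// One way to make this hard to get wrong: default member initializers.
// Any constructor that does not mention a member gets the default value.
struct SafeWindow {
    int width = 0;
    int height = 0;
    bool dirty = false;

    SafeWindow() = default;
    SafeWindow(int w, int h) : width(w), height(h) {}
};
```

Default member initializers don't find the existing bug for you, but they do mean a newly added constructor can't silently skip a field.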

cheetah7071
Oct 20, 2010

honk honk
College Slice
I guess the lesson is "volatile shouldn't change anything if you're doing something normal, but undefined behavior counts as abnormal"

I'm kind of annoyed that even with warnings as errors on, Visual Studio will still compile with uninitialized variable warnings, and those warnings got buried in the pile of warnings from external code which I'm intentionally ignoring. Maybe there's a good reason for those warnings to not be elevated even when you have /W3 /WX?

Xarn
Jun 26, 2015
The lesson actually is "Valgrind ffs!"

Or Dr.Memory if you are Windows only shop.

Xarn fucked around with this message at 23:51 on Mar 16, 2023

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

cheetah7071 posted:

I guess the lesson is "volatile shouldn't change anything if you're doing something normal, but undefined behavior counts as abnormal"

I'm kind of annoyed that even with warnings as errors on, Visual Studio will still compile with uninitialized variable warnings, and those warnings got buried in the pile of warnings from external code which I'm intentionally ignoring. Maybe there's a good reason for those warnings to not be elevated even when you have /W3 /WX?

Can you set up that external code as actual "external headers", so you can get the build system to ignore those warnings instead of needing to do it yourself when scanning the output?

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

Xarn posted:

The lesson actually is "Valgrind ffs!"

Or Dr.Memory if you are Windows only shop.

Yeah, this is the way.

cheetah7071
Oct 20, 2010

honk honk
College Slice

Xarn posted:

The lesson actually is "Valgrind ffs!"

Or Dr.Memory if you are Windows only shop.

I will look into this

Jabor posted:

Can you set up that external code as actual "external headers", so you can get the build system to ignore those warnings instead of needing to do it yourself when scanning the output?

and this as well

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
Address sanitizer works on Windows these days too.

Xarn
Jun 26, 2015
You need MSan for uninitialized memory reads though. And MSan is massive PITA due to having to recompile the stdlib as well. (Oh and it doesn't work on Windows)

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.

rjmccall posted:

I mean, your implementation is flipping N+1 bits and Leper’s is flipping N, that’s the obvious difference.

I did some + and - on the variables and now they both come out identical with the same data fed to them, so I guess what you said.

All this still feels obscure and hard to wrap my head around, even the easy to read function. Oh well onto exercise 2.8, wtf does it mean to rotate a bit? That was a rhetorical question. I already googled it. This actually looks easier though, I think I know how to do this... We'll see.
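
Spoiler for anyone still working on it: a minimal sketch of a bit rotation, assuming 2.8 is the K&R exercise asking for rightrot(x, n) and that a fixed 32-bit unsigned value is acceptable (the book uses plain int, so this is a simplification). Bits that fall off the right end come back in on the left.

```cpp
#include <cstdint>

// Rotate x right by n bit positions: bits shifted out of the low end
// re-enter at the high end. Sketch for 32-bit unsigned values only.
std::uint32_t rightrot(std::uint32_t x, unsigned n)
{
    n %= 32;                      // rotating by a multiple of 32 is a no-op
    if (n == 0) return x;         // avoid shifting by 32, which is undefined
    return (x >> n) | (x << (32 - n));
}
```

The n == 0 check matters because shifting a 32-bit value by 32 is undefined in C and C++, which is an easy fencepost to trip over here.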

Foxfire_
Nov 8, 2010

This was fun.

gcc 8.5 apparently has special knowledge about what snprintf() does, but its special knowledge is also wrong. The optimizer ends up with an incorrect range of possible values in the return, then it miscompiles later code.

It apparently thinks that all string parameters will be null terminated, even if the corresponding format string has an explicit precision, so that return (number of chars printed) will be strictly less than the length of the parameter plus fixed format string content.

Compiling
C++ code:
#include <cstdio>
#include <cassert>
#include <array>

std::array<char,200> buffer;

void String(const std::array<char,3>& data)
{
    buffer.fill(0xFF);
    const auto charsUsed = snprintf(
        buffer.data(),
        buffer.size(),
        "%.*s",
        (int)data.size(),
        data.data()
    );

    assert(charsUsed == 3);
}

int main()
{
    std::array<char, 3> text {'a','b', 'c'};
    String(text);
    printf("%s\n", buffer.data());
}
it doesn't even test the return value, it just unconditionally assert fails.

Looks like it's fixed in later gcc
https://godbolt.org/z/rreTKeMqn

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
drat, I joke about our Debian stuff using ancient stuff but that's older.

Edit: I am preemptively securing a spray bottle to use on the person that enters the chat over RHEL stuff.

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
Eh, you can go older, something I did for fun

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

His Divine Shadow posted:

Eh, you can go older, something I did for fun



I prefer modern tools with old compilers myself.

Computer viking
May 30, 2011
Now with less breakage.

Rocko Bonaparte posted:

drat, I joke about our Debian stuff using ancient stuff but that's older.

Edit: I am preemptively securing a spray bottle to use on the person that enters the chat over RHEL stuff.

I mean I did just test the New Shiny way we're all supposed to use work in the future at, uh, work. It's citrix into a windows server, and then putty (or rdp into Xrdp) into RHEL 8. I'd forgotten just how bad RHEL is as a general use distro. At least I've got sudo, so I'm currently testing distrobox to run a more reasonable distro on top of it.

Computer viking fucked around with this message at 15:09 on Mar 22, 2023

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Computer viking posted:

I mean I did just test the New Shiny way we're all supposed to use work in the future at, uh, work. It's citrix into a windows server, and then putty (or rdp into Xrdp) into RHEL 8. I'd forgotten just how bad RHEL is as a general use distro. At least I've got sudo, so I'm currently testing distrobox to run a more reasonable distro on top of it.

Horrors thread is over there.

Volguus
Mar 3, 2009
Just out of curiosity, would it be even possible to parse a line like this using boost spirit or some other sane-ish technologies (regex? antlr?):

pre:
some-key="some val "other thing" something" some-key2="some other "val""
where, yes, the values, while they are quoted, they can contain unescaped quotes in them?
The only way I can think of for how this could possibly be done would be to go backwards in the string and consider a value everything from " to =", and manually and painstakingly keep track of what we're doing, but maybe there are ways that I haven't thought of.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Volguus posted:

Just out of curiosity, would it be even possible to parse a line like this using boost spirit or some other sane-ish technologies (regex? antlr?):

pre:
some-key="some val "other thing" something" some-key2="some other "val""
where, yes, the values, while they are quoted, they can contain unescaped quotes in them?
The only way I can think of for how this could possibly be done would be to go backwards in the string and consider a value everything from " to =", and manually and painstakingly keep track of what we're doing, but maybe there are ways that I haven't thought of.

Should that be parsed as some-key being equal to some val "other thing" something and some-key2 being equal to some other "val"; or should it be parsed as some-key being equal to some val "other thing" something" some-key2="some other "val" and some-key2 being undefined?

Volguus
Mar 3, 2009

Jabor posted:

Should that be parsed as some-key being equal to some val "other thing" something and some-key2 being equal to some other "val"; or should it be parsed as some-key being equal to some val "other thing" something" some-key2="some other "val" and some-key2 being undefined?

Yes, the former. Both keys would have the value enclosed in quotes ... I guess as far as you can go.

pre:
key=some-key
value=some val "other thing" something

key=some-key2
value=some other "val"

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
What makes the second interpretation wrong?

Volguus
Mar 3, 2009

Jabor posted:

What makes the second interpretation wrong?

My desire to parse all the keys, no matter how messed up they are. I mean, yes, one could come and say that some-key2 is undefined, but I don't want to do that here. I want to be able to get its value, since I want to use it later.
What I wanted to do was store the line in a map, have a list of maps in an object (since the file can have multiple lines) and be able to filter things, search for them, etc.
Not throwing away keys is important for this little project.

Volguus fucked around with this message at 03:13 on Mar 27, 2023

Sweeper
Nov 29, 2007
The Joe Buck of Posting
Dinosaur Gum

Volguus posted:

My desire to parse all the keys, no matter how messed up they are. I mean, yes, one could come and say that some-key2 is undefined, but I don't want to do that here. I want to be able to get its value, since I want to use it later.
What I wanted to do was store the line in a map, have a list of maps in an object (since the file can have multiple lines) and be able to filter things, search for them, etc.
Not throwing away keys is important for this little project.

Imo, use a better format

I would reencode all of my data to avoid whatever nonsense is going on here

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

Volguus posted:

My desire to parse all the keys, no matter how messed up they are. I mean, yes, one could come and say that some-key2 is undefined, but I don't want to do that here. I want to be able to get its value, since I want to use it later.
What I wanted to do was store the line in a map, have a list of maps in an object (since the file can have multiple lines) and be able to filter things, search for them, etc.
Not throwing away keys is important for this little project.

What Jabor is getting at is, how do you know that some-key2 is actually supposed to be a key? How do you know that whatever created that string did not intend for it to define a single key with a value that includes =" within it? What information within the string actually defines where a value ends? It's not the " character since that's allowed unescaped inside a value, and there's nothing else there for it to be.

Fundamentally, the answer to your question "would it be even possible to parse a line like this using boost spirit or some other sane-ish technologies" is no because the answer to the broader question "is it possible to parse this format at all" is no. The string you posted fundamentally does not have enough information to decide what the keys and values it is trying to encode actually are, because it's too ambiguous about what the " character means.

If you actually need to extract keys and values from this, your step one should be "change the format". As presented, your format has too much ambiguity to be parsed, period.

Volguus
Mar 3, 2009

Sweeper posted:

Imo, use a better format

I would reencode all of my data to avoid whatever nonsense is going on here

Yes, that would be nice. Unfortunately I don't have control on whoever produced this monster.


ShoulderDaemon posted:

What Jabor is getting at is, how do you know that some-key2 is actually supposed to be a key? How do you know that whatever created that string did not intend for it to define a single key with a value that includes =" within it? What information within the string actually defines where a value ends? It's not the " character since that's allowed unescaped inside a value, and there's nothing else there for it to be.

Fundamentally, the answer to your question "would it be even possible to parse a line like this using boost spirit or some other sane-ish technologies" is no because the answer to the broader question "is it possible to parse this format at all" is no. The string you posted fundamentally does not have enough information to decide what the keys and values it is trying to encode actually are, because it's too ambiguous about what the " character means.

If you actually need to extract keys and values from this, your step one should be "change the format". As presented, your format has too much ambiguity to be parsed, period.

I know that some-key2 is supposed to be a key because other lines in the file have that key with a sane value inside. Just some lines, which I wouldn't want to throw out, have this junk in there. Yes, there's a lot of ambiguity, that's for sure.

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

Volguus posted:

I know that some-key2 is supposed to be a key because other lines in the file have that key with a sane value inside. Just some lines, which I wouldn't want to throw out, have this junk in there. Yes, there's a lot of ambiguity, that's for sure.
In order to write a parser, you will need to decide what the correct parse of this is:
code:
a="" b="" b="" a=""
Some possibilities:
  • a is " b="" b="" a=" and b is undefined
  • a is at first " b="" b=" and is then changed to an empty string. b is undefined
  • a is " b=" and b is " a="
  • a is an empty string and b is " b="" a="
Any or none of these interpretations might correspond to what the producer of the string intended. There's not enough information to tell what interpretation is correct.

The computer isn't smart enough to look at a string and decide what the "sane" values are, you need to tell it the actual rules for how the strings work. And given what you've shared, there just isn't a single set of well-defined rules that will actually "do the right thing" on any possible string following your format as shown.

If you can't change the format, then any solution you come up with is necessarily going to be a hack that at best approximates "correctness" on the inputs that you happen to care about, but is going to give you incorrect parses on other inputs that hopefully you just happen to not encounter. That's the best anyone could do with your format as shown.

Volguus
Mar 3, 2009
Oh, absolutely, I know that I have to make that decision, and it is me who's making it.
I was just wondering if there is a better than "manual" way to parse that. I had a spirit grammar that was doing fine with normal lines, which obviously failed on the weird ones and I was wondering if it's just me that doesn't see a way to beat it into submission.
I guess manual way it is, and as I posted earlier it would probably have to be going in reverse, from a " until it encounters =" and decide that this is a value, from there until a space it's a key, and move on from there.

go play outside Skyler
Nov 7, 2005


I would just recursively split the string by
code:
" some-key="
and call it a day. do this for each key that you expect to have.

if you get an array with 1 element it means the key is undefined

lovely format - lovely solution

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
There are all sorts of decent heuristics that will probably do a pretty good job with that ambiguity. The easiest is that the grammar outside of quoted strings seems both simple and unlikely to occur in strings (assuming those strings are coming from normal humans), so if you see a quote, you can just look to see if the immediately-following thing looks like another attribute.

This is the sort of thing that good parsers for programming languages do all the time for recovery, which is part of why they usually aren’t written with parser generators like Spirit. (And a lot of programming language grammars are also formally ambiguous, though usually the ambiguity is a lot narrower than “we don’t require escapes in string literals”.)

Yes, it’s just a heuristic, and it would be better to fix the escaping in the source, but if you’ve got no choice about it…

rjmccall fucked around with this message at 07:08 on Mar 27, 2023
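
A minimal hand-rolled sketch of that lookahead heuristic, under some loud assumptions: keys are made of letters, digits, '-' and '_' (guessed from the sample line), attributes are separated by spaces or tabs, and a quote only ends a value when it is followed by end of line or by whitespace and something shaped like key=". The function names are made up. It is a hack in exactly the sense discussed above and will misparse any value that itself contains something like ` foo="`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Characters allowed in a key. Assumption based on the sample line.
static bool is_key_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '-' || c == '_';
}

// If s[pos...] looks like  key="  store the key and return the index just
// past the opening quote; otherwise return std::string::npos.
static std::size_t match_key(const std::string& s, std::size_t pos, std::string& key)
{
    std::size_t i = pos;
    while (i < s.size() && is_key_char(s[i])) ++i;
    if (i == pos || i + 1 >= s.size() || s[i] != '=' || s[i + 1] != '"')
        return std::string::npos;
    key = s.substr(pos, i - pos);
    return i + 2;
}

// Heuristic: the quote at index 'quote' closes the value only if it is
// followed by end of line, or by whitespace and then another  key="
static bool closes_value(const std::string& s, std::size_t quote)
{
    std::size_t i = quote + 1;
    if (i == s.size()) return true;
    if (s[i] != ' ' && s[i] != '\t') return false;
    while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    if (i == s.size()) return true;
    std::string ignored;
    return match_key(s, i, ignored) != std::string::npos;
}

// Parse one line into (key, value) pairs, keeping duplicates in order.
std::vector<std::pair<std::string, std::string>> parse_line(const std::string& line)
{
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t pos = 0;
    while (pos < line.size()) {
        if (line[pos] == ' ' || line[pos] == '\t') { ++pos; continue; }
        std::string key;
        std::size_t start = match_key(line, pos, key);
        if (start == std::string::npos) break;   // not an attribute: give up
        std::size_t end = start;
        while (end < line.size() && !(line[end] == '"' && closes_value(line, end)))
            ++end;
        out.emplace_back(key, line.substr(start, end - start));
        pos = (end < line.size()) ? end + 1 : end;
    }
    return out;
}
```

On the sample line from earlier in the thread this yields some-key with the value some val "other thing" something and some-key2 with the value some other "val". On the a="" b="" b="" a="" example it picks the "everything is an empty string" reading, which may or may not be what the producer meant.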

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
I was gonna ask a question but I found the answer myself: why was this function, which I had simply copied and used earlier, using malloc when it hardcoded the length anyway? But as I found here:

quote:

https://www.programmingsimplified.com/c/source-code/c-program-convert-decimal-to-binary
We allocate memory dynamically because we can't return a pointer to a local variable (character array in this case). If we return it to a local variable, then the program may crash, or we get an incorrect result.

I do have a follow up however.

So I was using this function directly in a printf statement to format an integer, I never assigned it to a variable. By doing that I assume I have created a memory leak.

So the right way to handle this is to assign the return value to a variable that I can free later right? I guess it's not possible for a function that calls malloc to clean up after itself either or it would erase the pointer it was supposed to return.

Qwertycoatl
Dec 31, 2008

Yes, anything malloc'ed should be free'd later, so you need to keep the pointer that function returns around.

(Personally I don't like that function much, I'd rather an API where you allocate the string yourself and pass it in)
