Coding Horrors: You can gather all your technical debt into one easy framework!


Phayray: Feb 16, 2004

QuarkJets posted:

The second one could be someone's unfortunate conception of what a floor operator should do when applied to a string

I think simply reading this thread has made me a better programmer

But like most physicists, I don't actually have any real CS training

I studied engineering but am working on a PhD aka doing hard science and I really wish I had gotten some proper CS training. The only "programming" class I've ever had to take was for MATLAB. I learned C when I was 10 writing code for MUDs and have been actively C++ing for about 7 years and I *still* feel like I'm learning new stuff about how to do C++ on a regular basis, including from reading this thread.

I see a lot of garbage code and my greatest fear is writing something with a bug in it that ends up giving me an incorrect result that I interpret as correct and then publish. I recently found a bug in another (graduated) PhD student's code where he didn't sample over a sphere correctly to create an isotropic particle source so his results were biased towards the poles. My worst nightmare.

# ? Sep 6, 2017 02:03

TooMuchAbstraction: Oct 14, 2012; I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.; Fun Shoe

Hm, yes, nobody ever writes new C code these days. Or C++ or Java or Python (26 years old!) or C# (17 years old). It's all about chasing the new hotness, like Javascript (22 years old).

# ? Sep 6, 2017 02:05

Bongo Bill: Jan 17, 2012

Phayray posted:

I studied engineering but am working on a PhD aka doing hard science and I really wish I had gotten some proper CS training. The only "programming" class I've ever had to take was for MATLAB. I learned C when I was 10 writing code for MUDs and have been actively C++ing for about 7 years and I *still* feel like I'm learning new stuff about how to do C++ on a regular basis, including from reading this thread.

I see a lot of garbage code and my greatest fear is writing something with a bug in it that ends up giving me an incorrect result that I interpret as correct and then publish. I recently found a bug in another (graduated) PhD student's code where he didn't sample over a sphere correctly to create an isotropic particle source so his results were biased towards the poles. My worst nightmare.

It's normal to feel this way when working in C++, which is unfathomably vast and bad.

Broadly speaking, there is the problem of: how do we know that our implementation is correct? How do we know that the problem our code is solving is the problem we think/hope it is solving? You can gain confidence in two ways: by proof, ensuring the correctness of a subset of the code with e.g. a static type system; and by induction, demonstrating the correctness of a subset of the code with e.g. lots of good unit tests.

# ? Sep 6, 2017 02:09

QuarkJets: Sep 8, 2008

Meanwhile actual new things like julia seem to get a lot of initial interest but are left to wither and die on the vine because all of the legacy languages are already widely used

# ? Sep 6, 2017 05:24

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

Thermopyle posted:

If you're self-aware enough to read the coding horrors thread there's a good chance you're not one of The Bad Ones.

I'm just here to get some ideas.

# ? Sep 6, 2017 05:37

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

https://twitter.com/RosuGrigore/status/899837499299827713

# ? Sep 6, 2017 13:54

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

Wouldn't the smallest be something like

C code:

int main(){
  int* x = 0;
  return *x;
}

or could you just return *(int*)0; without the compiler complaining? If not, I bet there's a flag that'd shut it up.

# ? Sep 6, 2017 14:35

Meat Beat Agent: Aug 5, 2007; felonious assault with a sproinging boner

I assume they meant the smallest undefined program that always finishes normally.

# ? Sep 6, 2017 15:53

Jeb Bush 2012: Apr 4, 2007; A mathematician, like a painter or poet, is a maker of patterns. If his patterns are more permanent than theirs, it is because they are made with ideas.

Meat Beat Agent posted:

I assume they meant the smallest undefined program that always finishes normally.

if not, this is shorter (and I think the shortest possible, if you require that gcc will compile it without adding any flags)

code:

int main;

# ? Sep 6, 2017 15:58

leper khan: Dec 28, 2010; Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Meat Beat Agent posted:

I assume they meant the smallest undefined program that always finishes normally.

There's no guarantee for that to ever be the case for any undefined program..?

# ? Sep 6, 2017 15:59

Zopotantor: Feb 24, 2013; ...und ist er drin dann lassen wir ihn niemals wieder raus...

Zopotantor posted:

Today I found a bug in RHEL 7.2's rather old version of GCC that transforms a simple
code:
for (int i = 0; i < 8; ++i) {
  ...stuff...
}
into an endless loop. "i" is only used as a subscript inside the loop, no trickery is going on. It will be fun breaking this down to a minimal example tomorrow.

Most of the time I spent on this was to convince myself that yes, it's the compiler loving up and not me.

It was totally me. Undefined behavior (integer overflow) in the loop body. :shepicide:

# ? Sep 6, 2017 16:20

Jethro: Jun 1, 2000; I was raised on the dairy, Bitch!

Meat Beat Agent posted:

I assume they meant the smallest undefined program that always finishes normally.

Maybe shortest "non-trivial" program.

# ? Sep 6, 2017 17:57

Qwertycoatl: Dec 31, 2008

He's looking for the smallest undefined program that different compilers produce different results for.

# ? Sep 6, 2017 19:03

PhantomOfTheCopier: Aug 13, 2008; Pikabooze!

When there are hundreds of lines like this (names changed), do you wait for the crashes or just start rewriting it all?

code:

dev_num = self._hardware_client.get("devNumber")
dev_num = dev_num['ford']['graphics_components']['fordGraphicsServer']
dev_num = dev_num['fordGraphics-console']['device']['devNumber']

# ? Sep 6, 2017 21:34

Nude: Nov 16, 2014; I have no idea what I'm doing.

NihilCredo posted:

https://twitter.com/RosuGrigore/status/899837499299827713

Can someone explain why it could be 4? :psyduck:

# ? Sep 6, 2017 22:08

Red Mike: Jul 11, 2011

I know almost nothing of C minutiae, but I'm willing to bet that it's something silly like the optimiser for gcc doing its best and seeing the assignment to x, therefore running the assignments ahead of the return statement because you're not really meant to modify the same variable twice in the same statement.

Or some silliness like what happens when you make it run i++ twice in the same statement.

It's always something like this with C undefined behaviour.

# ? Sep 6, 2017 22:18

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Nude posted:

Can someone explain why it could be 4?

x = 1; x = 2; return x + x; is a perfectly sensible decomposition of that expression.

# ? Sep 6, 2017 22:32

Eela6: May 25, 2007; Shredded Hen

Just because you CAN be recursive doesn't mean you should.
I just had to refactor this:

code:

func main() {
	foo(0)
}
func foo(n int) error {
	if err := baz(n); err != nil {
		return err
	}
	if err := bar(2); err != nil {
		return err
	} //yes, literally just '2'

	if n++; n < 5 {
		if err := foo(n); err != nil {
			return err
		}
	}
	return nil
}
func bar(page int) error {
	if err := butt(page); err != nil {
		return err
	}
	if page++; page < 3 {
		if err := bar(page); err != nil {
			return err
		}
	}
	return nil
}

To this:

code:

func foo2() error {
	for i := 0; i < 5; i++ {
		if err := baz(i); err != nil {
			return err
		}
		if err := bar2(); err != nil {
			return err
		}
	}
	return nil
}

func bar2() error {
	for page := 2; page < 3; page++ {
		if err := butt(page); err != nil {
			return err
		}
	}
	return nil
}

# ? Sep 6, 2017 22:53

Gazpacho: Jun 18, 2004; by Fluffdaddy; Slippery Tilde

Plorkyeran posted:

x = 1; x = 2; return x + x; is a perfectly sensible decomposition of that expression.

... and it's the one that gcc generates, even without optimization flags. Here's the basic block output, edited for readability.

code:

main ()
{
  int x;
  int _4;
 
  x_1 = 0;
  x_2 = 1;
  x_3 = 2;
  _4 = x_3 + x_3;
  return _4;
}

# ? Sep 6, 2017 22:56

Nude: Nov 16, 2014; I have no idea what I'm doing.

Plorkyeran posted:

x = 1; x = 2; return x + x; is a perfectly sensible decomposition of that expression.

Oh ya don't know why that didn't occur to me. Thank you.

# ? Sep 6, 2017 23:06

Coffee Mugshot: Jun 26, 2010; by Lowtax

Isn't that just a consequence of LLVM IR necessarily being in SSA form versus Gimple's tree representation? Maybe that question is about the smallest program that demonstrates any computable difference in the Gimple/LLVM IR (with no optimizations).

# ? Sep 6, 2017 23:55

fritz: Jul 26, 2003

TooMuchAbstraction posted:

It's all about chasing the new hotness, like Javascript (22 years old).

Yeah but isn't that 1 year 22 times.

# ? Sep 7, 2017 02:23

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

Plorkyeran posted:

x = 1; x = 2; return x + x; is a perfectly sensible decomposition of that expression.

There are no sensible decompositions of that expression. C trying to pretend there are in the first place is the horror here. Compiler should just puke.

fritz posted:

Yeah but isn't that 1 year 22 times.

# ? Sep 7, 2017 12:36

zergstain: Dec 15, 2005

Plorkyeran posted:

x = 1; x = 2; return x + x; is a perfectly sensible decomposition of that expression.

Is modifying the same variable more than once in a statement UB? I'd have thought the UB would be in the order of evaluation of the subexpressions, which should only affect the final value of x, which is discarded anyway.

# ? Sep 7, 2017 14:18

Xerophyte: Mar 17, 2008; This space intentionally left blank

KernelSlanders posted:

There are no sensible decompositions of that expression. C trying to pretend there are in the first place is the horror here. Compiler should just puke.

C allows expressions to modify state, for better or worse (mostly worse). It's a lot simpler (as in: only really possible) to make a compiler evaluate non-ordered state-modifying expressions if it doesn't have to try to statically prove that those expressions have no inter-dependencies. To be strictly correct the compiler basically needs to prove that a program never results in a pre-expression state that would trigger the undefined behavior when evaluating it, which is hard when ?: exists. Very stupid toy example: (z ? y : x) = x++ can be well defined, if you know that z != 0. You could of course be conservative and make it an error if the expression smells undefined, but then you'd be rejecting valid programs and C doesn't do that sort of namby-pamby baby stuff.

This sort of insanity is also why later languages decided that having assignment be an expression and not a statement is not even remotely worth all the headaches it causes, but C is pretty much stuck with that decision.

Also, if we're golfing: int main() { int x = 0; return x = ++x; } for a shorter example of the same problem. Or just main() { int x; return x; } for something undetermined.

E:

zergstain posted:

Is modifying the same variable more than once in a statement UB? I'd have thought the UB would be in the order of evaluation of the subexpressions, which should only affect the final value of x, which is discarded anyway.

I wish it was as nice as once per statement, but no. Instead C has the concept of the sequence point, which includes statements, the two sides of ? in a selection operator, the , operator and so on. Basically, if the expression itself guarantees an order of evaluation on its subexpressions then you are allowed to modify the same variable in both of them.

E2: Fixed my examples. Clearly I should not be writing dumb example code on internet forums this early in the morning...
E3: Oh, hey, attempt at fixed example -- (x ? y : x) = x++ -- is also undefined behavior. Not sure if this proves that I'm bad, that the result of ?: possibly being an rvalue is bad, that the C assignment expression not being a sequence point is bad, that assignment expressions are bad, or all of the above.

Xerophyte fucked around with this message at 15:14 on Sep 7, 2017

# ? Sep 7, 2017 14:33

Gazpacho: Jun 18, 2004; by Fluffdaddy; Slippery Tilde

KernelSlanders posted:

There are no sensible decompositions of that expression. C trying to pretend there are in the first place is the horror here. Compiler should just puke.

Nope. It is not possible in the presence of pointer expressions to detect at compile time that an expression assigns a variable more than once. Requiring detection in expressions like this one, where it is possible, would still leave undefined situations and therefore would be something of a wasted effort and misleading assurance.

With the -Wall option GCC does detect it though.

e: This may sound contradictory, but all I'm saying is that while compilers are free to try to detect multiple assignments, they can't always do it, and the problem of specifying cases in the standard that must be detected is itself not well-defined, and the code generator has to be prepared to deal with it anyway.

Gazpacho fucked around with this message at 16:35 on Sep 7, 2017

# ? Sep 7, 2017 15:43

zergstain: Dec 15, 2005

Xerophyte posted:

This sort of insanity is also why later languages decided that having assignment be an expression and not a statement is not even remotely worth all the headaches it causes, but C is pretty much stuck with that decision.

I wish it was as nice as once per statement, but no. Instead C has the concept of the sequence point, which includes statements, the two sides of ? in a selection operator, the , operator and so on. Basically, if the expression itself guarantees an order of evaluation on its subexpressions then you are allowed to modify the same variable in both of them.

Does that mean in those languages, assignments can't be chained eg. a = b = c = 0?

I gather the + operator doesn't guarantee an order. I'm not sure that fully answers my question, though I recognize what GCC does would give the expected result in normal cases.

# ? Sep 7, 2017 19:30

Coffee Mugshot: Jun 26, 2010; by Lowtax

Xerophyte posted:

E3: Oh, hey, attempt at fixed example -- (x ? y : x) = x++ -- is also undefined behavior. Not sure if this proves that I'm bad, that the result of ?: possibly being an rvalue is bad, that the C assignment expression not being a sequence point is bad, that assignment expressions are bad, or all of the above.

Why is this undefined behavior? It seems pretty well-defined in the C++14 spec and AFAICT gets lowered into something similar to

code:

if (x) {
   y = x++;
} else {
   x = x++;
}

And neither of those expressions fits into UB, either. Perhaps you're referring to something more like "unspecified behavior": https://en.wikipedia.org/wiki/Unspecified_behavior#Implementation-defined_behavior.

# ? Sep 7, 2017 20:16

eth0.n: Jun 1, 2012

Isn't

C code:

x = x++

undefined behavior, by itself? I.e., should x be set to x + 1 (via the increment), or to the original x (since that's what the post-increment returns)?

# ? Sep 7, 2017 21:19

ulmont: Sep 15, 2010; IF I EVER MISS VOTING IN AN ELECTION (EVEN AMERICAN IDOL) ,OR HAVE UNPAID PARKING TICKETS, PLEASE TAKE AWAY MY FRANCHISE

eth0.n posted:

Isn't
C code:
x = x++
undefined behavior, by itself? I.e., should x be set to x + 1 (via the increment), or to the original x (since that's what the post-increment returns)?

It depends on your language. In C# and Java (and presumably most non-C languages) that is specified.

In C it is undefined. http://c-faq.com/expr/ieqiplusplus.html

# ? Sep 7, 2017 22:20

Xerophyte: Mar 17, 2008; This space intentionally left blank

Coffee Mugshot posted:

Why is this undefined behavior?

Since assignment doesn't create a sequence point it could just as well be lowered into

code:

tmp = x++;
if (x) {
  x = tmp;
}
else {
  y = tmp;
}

I believe that, in C, depending on the order of evaluation for expressions is unspecified and modifying a value twice between two sequence points is undefined. So (x ? y : x) = x++ is definitely unspecified, and possibly undefined. I admit it's not really a distinction I've worried about much so I could be completely wrong about that.

# ? Sep 8, 2017 01:08

Coffee Mugshot: Jun 26, 2010; by Lowtax

ulmont posted:

It depends on your language. In C# and Java (and presumably most non-C languages) that is specified.

In C it is undefined. http://c-faq.com/expr/ieqiplusplus.html

Hmm, I don't want to lawyer it up, so I'm asking y'all to help me understand. I'm looking at the C11 spec (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf).

Speaking about the postfix increment operator, in particular, 6.5.2.4 says

quote:

The result of the postfix ++ operator is the value of the operand. As a side effect, the
value of the operand object is incremented (that is, the value 1 of the appropriate type is
added to it)... The value computation of the result is sequenced before the side effect of
updating the stored value of the operand.

Under 6.5 Expressions, we see the following definition for undefined behavior:

quote:

If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined.

Under 6.5.16.1 Simple assignment:

quote:

If the value being stored in an object is read from another object that overlaps in any way
the storage of the first object, then the overlap shall be exact and the two objects shall
have qualified or unqualified versions of a compatible type; otherwise, the behavior is
undefined.

x = x++ is defined to be sequenced. The value is stored into the LHS first, which is a no-op, and then the value computation occurs, adding one.

They also produced a couple of (slightly) related examples as undefined statements:

code:

i = ++i + 1;
a[i++] = i;

Note this is because prefix increment is unsequenced as are subexpressions in array index evaluations.

# ? Sep 8, 2017 01:26

Eela6: May 25, 2007; Shredded Hen

I used to think I understood C, like, at all.

# ? Sep 8, 2017 01:28

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

All of these replies only further prove my point. The problem is C. To use Xerophyte's example, the fact that you can use the ternary operator to set the assignment location of a RHS expression is insane. That an assignment is a valid LHS returning the assigned to value is also insane.* It's hard to imagine a modern language that would allow such nonsense. The core problem though is that C allows tons of syntax that is nonsensical but is not a compile error.

I'd love to see someone go through the last thousand pages of this thread and count how many horrors by language. I'm guessing C is a majority.

* Side note: I don't want to completely rule out assignment returning success or failure. One could imagine a language in which a[4] = 0 evaluates to true if the assignment works and false on an array out of bounds error.

# ? Sep 8, 2017 01:58

Gazpacho: Jun 18, 2004; by Fluffdaddy; Slippery Tilde

- The overlap clause is not relevant to the expression at hand, which involves only one object.
- 6.5.16.1 ¶3 says that the effect of the assignment is sequenced after the value of x++ is calculated.
- 6.5.2.4 ¶2 says that the side effect of x++ is also sequenced after the value of x++ is calculated.
- Absent any constraint on how the two effects are sequenced relative to each other, 6.5 ¶2 leaves the behavior undefined.

KernelSlanders posted:

All of these replies only further prove my point. The problem is C. To use Xerophyte's example, the fact that you can use the ternary operator to set the assignment location of a RHS expression is insane. That an assignment is a valid LHS returning the assigned to value is also insane.* It's hard to imagine a modern language that would allow such nonsense. The core problem though is that C allows tons of syntax that is nonsensical but is not a compile error.

It's not insane if it happens to be what you want to do, which it occasionally is.

Gazpacho fucked around with this message at 02:08 on Sep 8, 2017

# ? Sep 8, 2017 02:05

Xerophyte: Mar 17, 2008; This space intentionally left blank

Coffee Mugshot posted:

Hmm, I don't want to lawyer it up, so I'm asking y'all to help me understand.

Oh, me too. I'm (clearly) not an expert on the C spec, but I do like to know when I'm shooting myself in the foot.

The sequence point annex in the C11 spec you linked specifies the sequence points as:

C11 Standard, Annex C posted:

- Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).
- Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).
- Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).
- The end of a full declarator: declarators (6.7.6);
- Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).
- Immediately before a library function returns (7.1.4).
- After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.28.2).
- Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).

The assignment operator is not listed, so my understanding is that the expression statement x = x++; has no internal sequence points. The side-effect of the postfix operator and the side-effect of the assignment operator are unsequenced and the "if a side effect on a scalar object is unsequenced relative to [...] a value computation using the value of the same scalar object [then] the behavior is undefined" clause applies. I'd be happy to be wrong about that, though.

I guess mostly my point is that reasoning about state changes is a lot harder when your language allows state to be modified by expressions with at least ambiguous sequence. C chose to allow that and can't really get out of it. Later C-like languages have often chosen to make assignment a statement specifically to dodge this type of headache.

# ? Sep 8, 2017 02:15

Gazpacho: Jun 18, 2004; by Fluffdaddy; Slippery Tilde

Seems I wasn't understood earlier about UB not always being statically detectable in the presence of pointer expressions. Consider a little spin on the tweet code:

main.c:

C code:

int *f(void);  /* defined in evil.c */
int *g(void);  /* defined in evil.c */

int main() {
  return (*f() = 1) + (*g() = 2);
}

evil.c:

C code:

static int x;

int * f() { return &x; }

int * g() { return &x; }

Both of these files get compiled and linked into one program. Each compilation unit (preprocessed file) in a C program gets compiled separately. How's the compiler going to detect that main.c is modifying the same object twice in an expression? It can't. The information it would need isn't present in the file. Heck, I could even add a coin toss to evil.c so that one couldn't know even with both files available.

As mentioned, the developers of GCC have helpfully included detection for some cases. But why write those cases into the standard, when they would be incomplete anyway?

You can extend your objections beyond the UB rule to cover assignment expression values, and pointers in general, but pretty soon you're not talking about the C language, which is what the standard exists to specify.

Gazpacho fucked around with this message at 02:31 on Sep 8, 2017

# ? Sep 8, 2017 02:28

vOv: Feb 8, 2014

If ternaries couldn't return an lvalue I bet you'd see people doing

code:

*(f() ? &g : &h) = "lmao";

or some poo poo.

# ? Sep 8, 2017 04:56

Bongo Bill: Jan 17, 2012

Euthanize C.

# ? Sep 8, 2017 05:00

Foxfire_: Nov 8, 2010

code:

int foo()
{
  int x = 0;
  return (x = 1) + (x = 2);
}

code:

def bar():
  x = []
  return (x.append(1) or len(x)) + (x.append(2) or len(x))

Only real difference is that C doesn't specify an order to its + operator and Python does. :shrug:

# ? Sep 8, 2017 05:44
