rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ShoulderDaemon posted:

The only reason that isn't type information is because languages like C don't have strong typing on typedefs. Being able to do typedef unsigned int index_t; typedef unsigned int age_t; and have the compiler warn when you use an index_t where it expects an age_t would completely remove even that flimsy justification for Hungarian notation, as you would encode all of your "usage context" in the type system where it belongs.

You can do this in C++, but it's an enormous pain in the rear end because you have to hand-code every operation you want to provide, even those which seem to be logically derivable from others. Haskell makes that substantially easier, but converting between types is still a pain in the rear end.

But yes, a shop that demands Hungarian notation would generally be better off demanding the consistent use of type-safe wrapping types.
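
For illustration, a minimal C++ sketch of what such wrapping types look like (Index, Age, and recordAge are invented names), and why every operation has to be hand-coded:

code:
struct Index {
  explicit Index(unsigned v) : value(v) {}
  unsigned value;
};

struct Age {
  explicit Age(unsigned v) : value(v) {}
  unsigned value;
};

// Every operation has to be spelled out per wrapper type; nothing is
// derived for you, which is the "enormous pain" part.
bool operator<(Index a, Index b) { return a.value < b.value; }
void recordAge(Age) {}

int main() {
  Index i(3);
  Age a(40);
  if (i < Index(4)) recordAge(a);  // fine
  // recordAge(i);                 // compile error: an Index is not an Age
  // recordAge(7);                 // compile error: the constructor is explicit
}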

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ShoulderDaemon posted:

I've never found converting between types in Haskell to be very painful - I just pattern match to get at the contained type I care about, or make my newtypes instances of the typeclasses I care about.

It's certainly much better than C++, but you and I seem to have different attitudes about obnoxiousness --- I think having to introduce little function calls everywhere is annoying. I fully understand why I have to do it, I just think it's annoying.

ShoulderDaemon posted:

I'm sure all these layers of indirection represent something useful within the context of the proof-checker, but when the code it's spitting out is so far removed from what any human would write even on simple examples like this, it's very hard to correctly use extracted code on complicated models, where it would be most useful.

If you're curious, list_rect is the axiom of structural induction for lists, derived from the structure of the type declaration. list_rec is probably there to catch some corner case in the code generation which doesn't happen to arise with this type --- possibly something with higher-rank polymorphism.

Also, while I agree that the code is far from what a human would write, I don't understand why it's any more difficult to use: the sum function, which is what you actually extracted, is called in exactly the usual manner.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Flobbster posted:

Not only that, but even in the case of appending a bunch of string constants and string variables together, the Java compiler is smart enough (that's a phrase you don't hear too often!) to compile that into a StringBuilder construction and sequence of append() calls. I never knew this until I had to decompile a class for something and saw that.

Actually, this (1) is mandated by the language specification and has nothing to do with the quality of the compiler, (2) applies to any value that you concatenate into a string, not just string values, and (3) works only within a single expression. Also, weren't we talking about JavaScript, not Java?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Painless posted:

I have no idea what you're talking about. It produces pretty much a direct translation from Java to bytecode with no optimization or anything. It's easy to be the "best" in a nearly completely trivial process.

EDIT: I just realized I skipped a page in my rush to yell at Java, oh well.

When javac's hands aren't tied by inter-class abstraction boundaries, it can actually do quite a lot; but of course that's a huge limitation.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Painless posted:

You might mean it could do. It doesn't actually do poo poo.

Bah, you're right. Apparently I haven't looked at this since they decided to deprecate -O.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ani47 posted:

My co-worker said that having a single 'int i' in the beginning of a function instead of one per for loop saves memory.

It has certainly been true in the past that GCC doesn't put a lot of effort into optimizing stack space, including (1) failing to re-use stack slots for different variables based on scope/liveness and (2) failing to "free" stack slots that were promoted to registers. Furthermore, a good compiler shouldn't be harmed by re-using the variable. That said, it's almost certainly not worth the readability hit to "optimize" this by hand.
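
For concreteness, a made-up side-by-side of the two styles; with optimization on, any reasonable compiler keeps the counter in a register either way:

code:
int demoLoops(const int *data, int n) {
  int total = 0;

  // The co-worker's style: one counter declared up front and reused.
  int i;
  for (i = 0; i < n; ++i) total += data[i];

  // The scoped style: a fresh counter per loop, visible only where it is used.
  for (int j = 0; j < n; ++j) total -= data[j];

  return total;  // always 0; the generated stack frame is the interesting part
}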

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

seiken posted:

I'm not a professional programmer or anything (nothing of the sort) but I never make functions const because I've always thought accidentally modifying an object when you didn't mean to would be a fairly uncommon thing, and I always forget to make some const that should be and then have to search through files adding const to other functions that were called by const ones. I've always gotten by fine, is this a horrible thing?

That's pretty much what always happens with const: you run into some code that doesn't tag its types properly, and you either (1) strip the consts out of your new code, (2) spend hours propagating consts through the old code, or (3) add a const_cast and make weeping promises to yourself that you'll clean them all up someday.
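
A tiny hypothetical example of how that situation arises (Widget and widgetId are invented):

code:
struct Widget { int id; };

// Legacy code that never bothered to const-qualify anything.
int widgetId(Widget &w) { return w.id; }

// New, properly-const code now faces the three options above.
int describe(const Widget &w) {
  // return widgetId(w);                     // won't compile until widgetId is fixed
  return widgetId(const_cast<Widget &>(w));  // the weeping-promises option
}

int main() {
  Widget w;
  w.id = 7;
  return describe(w) == 7 ? 0 : 1;
}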

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Mikey-San posted:

Don't get me wrong, I've seen really long functions in C programs before, but 3,000+ lines for one function? What's the weird case here that makes such a beast necessary?

Language interpreters regularly end up like this, although "large functions" tends to be the least of the craziness. But if you're worried at all about interpreter performance, you really can't afford to make a function call per instruction or AST node.
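
For the curious, a stripped-down, made-up sketch of the shape such a function takes; real interpreters have hundreds of cases in that one switch, which is where the line count comes from:

code:
#include <cstddef>
#include <vector>

enum Op { OP_PUSH, OP_ADD, OP_HALT };

// Every opcode is handled inline in one switch, so there is no function
// call per instruction or AST node.
int run(const std::vector<int> &code) {
  std::vector<int> stack;
  std::size_t pc = 0;
  for (;;) {
    switch (code[pc++]) {
    case OP_PUSH:
      stack.push_back(code[pc++]);
      break;
    case OP_ADD: {
      int b = stack.back();
      stack.pop_back();
      stack.back() += b;
      break;
    }
    case OP_HALT:
      return stack.back();
    }
  }
}

int main() {
  int program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT };
  std::vector<int> code(program, program + 6);
  return run(code) == 5 ? 0 : 1;
}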

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Zombywuf posted:

Your compiler should decide when to inline the functions. It's probably better at it than you are.

This is excellent advice for general programming, but it's not so hot for core interpreter loops.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Zombywuf posted:

Get a better compiler or write it in asm. If you're second guessing your compiler's interpretation of instruction cache usage, you have a problem.

"Large functions are hard to maintain, so interpreters should be written in assembly." Thanks, that's very helpful. And which compiler would you suggest? I need one that can calculate the occurrence rates of different cases in an arbitrary switch statement; sadly, my current inliner just uses sophisticated heuristics about calling context and callee size and behavior.

I'm really not sure why you're arguing with me here; all I'm saying is that portable interpreters sometimes need to sacrifice cleanness for performance. I would absolutely agree that second-guessing the optimizer is usually just shooting yourself in the foot, especially when moving to different platforms and compilers.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

TheSleeper posted:

What's the problem with, for instance, GCC's:
code:
inline void foo (char c) __attribute__((always_inline));
and equivalents?

A few things, some minor, some major. First, it's a gcc-ism without equivalents on a lot of compilers; if you're worried about compiler portability, that's a big issue, particularly if you start relying on inlining to get good performance out of re-used code (passing "invert the logic" flags to comparison functions, that sort of thing). Second, it restricts the data used by each case, which can make it slightly more awkward to redesign the core data structures; but that's not that important. Third, it's more awkward to implement control flow.

Overall, though, it's a good approach if you can rely on it.
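
For anyone who hasn't seen it, a hypothetical sketch of the pattern under discussion: each handler is its own function for readability, but forced back inline so the dispatch loop still compiles to one big function:

code:
struct VM { int acc; const int *operand; };

// Each handler is a separate function for readability...
static inline void doAdd(VM &vm) __attribute__((always_inline));
static inline void doAdd(VM &vm) { vm.acc += *vm.operand++; }

static inline void doNeg(VM &vm) __attribute__((always_inline));
static inline void doNeg(VM &vm) { vm.acc = -vm.acc; }

// ...but is pasted back into the loop by the compiler, so there is still
// no call per instruction. The attribute itself is the gcc-ism in question.
int run(VM &vm, const int *ops, int n) {
  for (int i = 0; i < n; ++i) {
    switch (ops[i]) {
    case 0: doAdd(vm); break;
    case 1: doNeg(vm); break;
    }
  }
  return vm.acc;
}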

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Ugg boots posted:

Where does it say that 00000001 is 1? What if hardware specifies that the sign bit is the least significant bit instead of the most significant bit? What if it stores the least significant bit in the highest significant bits place for integers?

The LSB that is the LSB is not the true LSB! Hail Discordia!

You're right in a global sense, of course, at least on some globe where you said something right.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

rotor posted:

for the record, (n&1) is about twice as fast as (n%2) in javascript

Good lord, it had better be a lot faster than that. n%2 is a floating-point operation in JavaScript.

EDIT: also, JavaScript obviously doesn't have an optimizing compiler.

rjmccall fucked around with this message at 01:37 on Jan 6, 2009

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

rotor posted:

iunno man, run the benchmark urself

Folly of microbenchmarks, I think. Your benchmark time is probably dominated by loop and general interpretation overhead. What did you run it in?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

narbsy posted:

Uhhhhhhh it's the same loop.

Right. So let's say operation A takes 1ms to do a million iterations, operation B takes 100ms to do a million iterations, and the loop overhead costs you 100ms. Then the benchmark takes 101ms for A and 200ms for B; voilà, a 100x difference turns into a 2x difference. And trust me that that sort of loop overhead is not unreasonable in naive interpreters.

There are tons of articles out there about how silly this sort of microbenchmarking is, but you should at least subtract out the empty-loop overhead.

The reason I asked about the browser you were using was just curiosity (also you should always mention the host platform/configuration when reporting benchmark results); Firefox's JS implementation leapt forward in 3.1.

rjmccall fucked around with this message at 01:55 on Jan 6, 2009

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

rotor posted:

it's the exact same loop, but like I said feel free to make a better one. I'm sorry, but simply saying "It must be X, regardless of what the benchmarks may say" is a little too close to faith-based computing for my tastes.

That's not what I'm saying at all. I'm saying your benchmark is crap and doesn't say anything useful, so we should ignore it.

Look, I adjusted your benchmark to also post the results of an empty loop, then got stable results for a bunch of different implementations on the machine I'm using to post this. There are some good reasons why that still doesn't capture the performance difference properly, but whatever, you don't care. All these results are on a 2.16GHz Core Duo, 2GB RAM, MacOS 10.5.6, fairly heavy background load.

Firefox 3.0.5: 600ms empty, 1150ms modulus, 800ms twiddle, i.e. 550ms modulus, 200ms twiddle.
Firefox Jan. 5th nightly: 45/172/54, i.e. 127ms modulus, 7ms twiddle.
Safari 3.2.1: 650/1450/1100, i.e. 800ms modulus, 450ms twiddle.
WebKit nightly: 50/163/68, i.e. 113ms modulus, 18ms twiddle.

So I see the performance difference as anything from 2x (when you would have said 1.3x) to 18x (when you would have said 3x).

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
A compelling display of the underlying principle.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
I'm curious if there's a difference between DynamicArray::Size() and DynamicArray::GetNumb().

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
If I had to guess, I would say a source-to-source compiler of some sort — possibly an extremely buggy one.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Janin posted:

References are supposed to guarantee that they point to a valid, allocated region of memory.

It's as much a guarantee as anything else in C++, which is to say that it's basically worthless except as a signal to other programmers.

I think there's a good case to be made that "silent" pass-by-reference is a bad idea, but complaining that references aren't totally sound in a language that's basically unsound in a thousand different ways is a bit over-dramatic.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Janin posted:

The problem is I'm working on code written by people who have been told "pointers are not safe, don't use them, always use references, they are safe!".

Yeah, this is a terrible attitude for plenty of reasons. It's even true in an extremely specialized sense: you can't make an initially-invalid reference without somehow going through a pointer, you can't easily access memory outside the referenced object, etc. But ultimately, you do have to treat a reference like the pointer it is.
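
A made-up example of that "initially-invalid reference" escape hatch, for anyone who hasn't been bitten by it yet:

code:
int readThrough(const int &r) { return r; }  // looks perfectly "safe" to callers

int main() {
  int *p = 0;       // null pointer
  (void)p;          // only here to feed the commented-out line below
  // int &bad = *p;     // undefined behavior: the reference is born invalid,
  // readThrough(bad);  // yet this call site looks identical to the one below

  int x = 42;
  int &good = x;
  return readThrough(good) == 42 ? 0 : 1;
}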

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

heeen posted:

Until you want to have a development database. Or until you want to have a second window. Or a second monitor.

Actually, this is the second of two problems that singletons are specifically designed to solve.

The first problem is the inherent "staticness" of a call to a global function; by using an instance method instead, you grant yourself the option of converting that to a virtual call without modifying any of the call sites. Obviously you could instead just modify the global function, but then you're stuck with an extra level of indirection; C++ would also allow you to turn the original function into a function-pointer variable of the same name, which is, er, not an appealing alternative.

The second problem is that global data often doesn't stay global, in which case you'll eventually need an instance anyway. The important point here is that you're not supposed to repeatedly assume the globalness of a singleton; if every access to the singleton goes through the global accessor, you're completely sabotaging this advantage of the pattern. Instead, you should have only a few places in the code that read the global reference, and those places should pass the reference around to everything else that needs it. That way, you only need to modify those few places if the data ever stops being global.
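
A hypothetical sketch of that discipline (the Database class is invented), with the global read confined to one place:

code:
#include <cstdio>

class Database {
public:
  static Database &instance() {  // the global accessor
    static Database theInstance;
    return theInstance;
  }
  void query(const char *sql) { std::printf("query: %s\n", sql); }
};

// Everything below takes a reference and never touches the accessor, so if
// Database ever stops being global, only main() has to change.
void generateReport(Database &db) {
  db.query("SELECT total FROM sales");
}

int main() {
  Database &db = Database::instance();  // the one place that assumes globalness
  generateReport(db);
}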

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ymgve posted:

Is this actually enough for a computer to be Turing complete?

Note that you do need either the memory cells to be unbounded or an unbounded number of cells. Models with bounded memory are never Turing-equivalent.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
C is Turing-complete. Turing-completeness does not require that an unbounded amount of memory be simultaneously addressable; after all, that is not true of Turing machines themselves.

j4cbo posted:

You need the cells themselves to be unbounded, since there's no way to refer to an unbounded number of cells with a value of bounded range.

With a single value, no, but that's not what I said.

EDIT: of course that's inapposite if you're assuming a single-instruction machine without indirect addressing, which you might be, since I suppose that's what I was directly responding to.

rjmccall fucked around with this message at 05:59 on Apr 13, 2009

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

TRex EaterofCars posted:

I think it depends on the definition of Turing Complete you use. In the strictest sense, a Universal Turing machine requires an unbounded tape and any resource-limited machine cannot fit that definition.

Unbounded, not infinite. If you formalize C as running on a theoretical machine with finite memory, then yes, your statement is correct; but if you formalize it as running on a machine with the full capabilities of, say, an IBM PC AT, then its space is theoretically unbounded, because if it runs out of space locally it can pause execution and say "Insert a new disk in drive A:".

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

TRex EaterofCars posted:

Yes but you'd run out of matter eventually. I feel dirty for how pedantic this is getting :(

Heh. That's clearly a limitation on a practical Turing machine, too — in fact, I seem to remember one of the fundamental papers addressing exactly this problem for Turing machines, although I can't find that section now.

There are models of computation requiring infinite simultaneous space, but I don't know much about them. Lenore Blum's work on real-number computability uses abstract operations on abstractly-represented numbers.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
If print were actually printf, then print a would indeed have format-string vulnerabilities. But it isn't; the printf-like thing in that code is Python's % operator, which is essentially sprintf.

EDIT: VV I think my reading is correct and twodot just doesn't understand the python code, but I will sharpen the idiot hat nonetheless.

rjmccall fucked around with this message at 18:31 on May 4, 2009

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

pokeyman posted:

I get why post/pre doesn't matter, but why is the overall behaviour undefined? To my (admittedly untrained) eyes, assuming i and j are numbers, that line increments i by one then ensures it didn't just grow larger than j, wrapping around to 0 if so. Did I get that wrong?

With a few exceptions, the order of execution of side-effects within expressions is not specified in C/C++, so it is permitted for the increment to logically happen after the assignment. The major exceptions are the comma operator (left before right), the ternary operator (condition before chosen expression), and call-like operations (function and arguments (in any order) before call). (n.b. this list is not guaranteed to be exhaustive)
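
As a made-up illustration (not the exact line from the original post), plus the boring rewrite that's well-defined:

code:
void wrap(int &i, int j) {
  // Undefined under the C/C++98 rules: i is modified twice (by ++ and by
  // the assignment) with no intervening sequence point, so the increment
  // may logically happen after the assignment.
  // i = ++i % j;

  // One modification per statement is well-defined everywhere.
  ++i;
  i %= j;
}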

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
I agree with you? But we can argue if you like, I am not picky. I just thought people might care why x = 7, ++x was well-defined and this wasn't.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Oh christ, if there is any language that does not need a secret cabal of elitist programmers, it is Java.

Anyway, Java is prolix. Three root causes of this:

1. Java culture encourages it. Package names are ridiculously prolix, but at least those are usually hidden at the top of the file. Whole-word naming conventions just inevitably pile up appositional modifiers; if it's worse in Java, it's only because the language makes it so easy to over-abstract every problem.

2. The import system lacks shortened qualified import, which turns name collisions into minor catastrophes (because everyone uses prolix package names), which means you see some really silly workarounds like Swing's J prefix on every single class in the library.

3. There are several missing language features that force/encourage you to name things that don't really need names, like all the silly interfaces required by the lack of first-class functions, and the AbstractFoo variants to provide default implementations for interface functions.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
The really terrifying thing is the people who take this horrible name-resolution hack and apply it in their own projects as if there were some deeply legitimate design principle behind it.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
pthread_cond_broadcast, and the pthreads library in general, is a C API; its calls should never throw exceptions. Moreover, even if pthreads calls did throw exceptions in general, pthread_cond_broadcast never fails on valid input: the spec lists only one failure mode, when you give it an uninitialized condition variable. On top of that, the call is surrounded by a generic exception-silencer, which raises the question of why: either (a) an exception from that spot once caused some bug that some fool couldn't be bothered to fix properly and wanted to make sure could never happen again (impossible here), or (b) someone just instinctually adds them around nearly every call they make without thinking about it.

rjmccall fucked around with this message at 17:40 on Jun 4, 2009

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Mustach posted:

What was this supposed to say?

Editing mistake. Edited for justice.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Otto Skorzeny posted:

By the way, does Java make any guarantees as to the layout of the class based on the order of declaration of members?

In case you weren't being ironic: no. IIRC there was a paper about implementing profile-directed field layout in the Jikes RVM.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Incoherence posted:

The only time I've ever seen a SIGILL was a situation where I managed to compile something whose dependencies included two classes with the same name, and somehow the linker got sufficiently confused that rather than, I don't know, complaining that there were two classes with the same name, or combining them sensibly, it decided to merge the two together into a single vtable that was missing several methods from each, so that when I called one of the missing methods the program of course crashed.

I would love for someone to explain to me (slowly) just why that happened and why the person who finally figured this out gave me the impression that this behavior wasn't a bug.

It's pretty easy to reproduce this with objects with C linkage (e.g. everything in C):

code:
a.c:
  int val = 0;

b.c:
  int val();
  int main() {
    if (val()) return 1;
  }
If you understand why this compiles and links, and why it doesn't link if you compile the files in C++ instead, you'll be a long way towards understanding what went wrong with your case.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Impressive, particularly since that entire thing is just:

code:
private static double getTaxRate(String strTaxMode, ServletContext application, HttpServletRequest request, String strPostalCode) {
  if (strPostalCode == null || strPostalCode.length() == 0)
    return 0;	
  try {
    Object[] args = { application, request, strPostalCode };
    Class[] argClasses = { ServletContext.class, HttpServletRequest.class, String.class };
    Method m = Tax.class.getDeclaredMethod("getTaxRate" + strTaxMode, argClasses);
    if (m != null) return ((Double) m.invoke(null, args)).doubleValue();
  } catch (Exception exc) {
    Utils.reportException(exc);
  }
  return 0;
}
(and even smaller if you're in Java 1.5)

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
If a variable's (or function's return value's) type is inferred as ArrayList instead of List, users of the variable/function can write code which only works with ArrayList; thus you get implicit dependencies on some type which was meant to be purely an internal implementation detail.

In practice I doubt this is a serious concern; I find it's very rarely useful to swap implementation types without making other structural/semantic changes that require client code to be updated anyway.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
So there are actually several strong advantages to breaking out of a loop instead of using goto. You don't have to invent a unique-in-the-function label; it's immediately obvious where the jump goes to, so your readers aren't forced to search for the label and hope it's in a sensible place; and most importantly, it guarantees that you can't skip variable initializations (*). These advantages are simply the general advantages of structured programming.

JavaScript does this very well.

You're still forced to use goto for state machines, though; tail-recursive calls to nested functions are better, but aren't available (or optimized reliably to jumps) in any mainstream imperative languages.

(*) C++ restricts this so that you can't goto (or switch) past a non-trivial initialization. C's only restriction is that you can't skip a VLA declaration.
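
A small made-up illustration of the initialization hazard in (*):

code:
#include <cstdio>
#include <string>

void demo(bool skip) {
  // Uncommenting the goto is ill-formed in C++: the jump would enter the
  // scope of s while bypassing its non-trivial initialization. In C, the
  // same jump over a plain initializer compiles and leaves the variable
  // holding garbage. A break can never do this; it only jumps out of scopes.
  // if (skip) goto done;
  (void)skip;
  std::string s = "initialized";
done:
  std::printf("%s\n", s.c_str());
}

int main() { demo(false); }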

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Flobbster posted:

I can see it being an issue if you're using preprocessor macros to do code generation, but the argument could be made that a goto with a well-named label provides better intent documentation than a mere "break;" statement with no other information.

Sure. On the other hand, if you would naturally give the same well-named label to two different loops in a function, it becomes really easy to make tragic mistakes.

Flobbster posted:

I'd say searching for a named label is easier than starting inside a loop where the break statement is, scanning up to find the nearest while/for loop, and then scanning back down to find the matching closing brace to see where you're going to end up. I can eyeball a named label far quicker than I can pinpoint which of any number of right curly braces correspond to the exit point of a block of code.

That's a fair point; break and continue are readable only if they're nested in pretty simple control flow. If you're literally just jumping out of the entire thing, goto works very well, especially if you give the label an obvious name that tells people where to start looking. On the other hand, functions do this better if you don't need to thread a huge amount of state through them (and if you aren't addicted to single-point-of-exit).

Flobbster posted:

I like that Java attempted to address the problem by providing a labeled break statement.

This is what I meant about JavaScript — JavaScript actually allows you to attach a label to an arbitrary statement, so you can break out of an arbitrary block statement if you want without introducing a fake loop. Unfortunately, IIRC some old versions of IE didn't actually implement the spec on this and just handled the familiar Java-like cases.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

ZorbaTHut posted:

Worth noting that Clang (which is written in C++) now builds itself, and that series of boxes has gotten substantially greener since I first inspected it. It wouldn't surprise me at all if it built the vast majority of non-template-magic programs totally fine by the end of the year, if not building the majority of Boost.

Actually, we should be fine on template-magic programs; metaprograms that don't compile are pretty firmly in the "individual bug" category rather than "serious architectural work required". Our major holes are all around virtual inheritance (lots of code-generation problems), access control (lots of implementation gaps), and exceptions (lots of rigorous testing needed).
