JoeNotCharles
Mar 3, 2005

Yet beyond each tree there are only more trees.
Ok, well, after telling that guy he's insane for wanting to work with the Windows API, guess what I need to ask about!

I'm bugfixing a client's code:

code:
	BOOL clipboardOpen = ::OpenClipboard(NULL);
	if (! clipboardOpen) 
	{
		OnBadUserInput ();
		return false;
	}

//#if 0
#if 1
	// A little debugging hack for windows - sometimes helpful to turn this on
	// to peek in the debugger at what is on the clipboard - LGP 960430

	long	clipFormat	=	0;
	while ( (clipFormat = ::EnumClipboardFormats (clipFormat)) != 0) {
		TCHAR	buf[1024];
		int		nChars	=	::GetClipboardFormatName (clipFormat, buf, Led_NEltsOf (buf));
		int		breakHere	=	0;
	}
	DWORD lastError = ::GetLastError();	// error codes at http://msdn.microsoft.com/en-us/library/ms681381(VS.85).aspx
#endif

	...
OpenClipboard returns 1. Then it skips the OnBadUserInput block (as it should) and goes into the debugging code, where EnumClipboardFormats returns 0 immediately and GetLastError returns 1418. The MSDN link reveals that 1418 is "Clipboard not open".

The docs for OpenClipboard couldn't be simpler - it returns non-zero on success and zero on failure. 1 is not 0. But immediately after the OpenClipboard call, I get an error saying it's not open (and calls to actually get data from the clipboard later fail with the same error.)

What the gently caress?

EDIT: apparently I need to pass it a window HWND, which isn't what MSDN says.

EDIT again: in case anybody cares, it turns out what was happening is that the clipboard state gets all screwed up when you're stepping through this code in the debugger, probably because it's flipping back and forth between the process that's manipulating the clipboard and Visual Studio.

JoeNotCharles fucked around with this message at 05:31 on Jul 2, 2008

floWenoL
Oct 23, 2002

That Turkey Story posted:

C++ nerd meltdown

Trap loving sprung!

Edit:
Actually kind of backfired because now I want to do a point-by-point rebuttal. :argh:

floWenoL fucked around with this message at 09:53 on Jun 30, 2008

beuges
Jul 4, 2005
fluffy bunny butterfly broomstick

Avenging Dentist posted:

I never understood people who advocate spaces in general. Tabs for indenting, spaces for vertical alignment. It's not hard to format code that's portable to different tab-widths.

I just got this email last monday:

quote:

Hi all,

I keep finding SQL programs with imbedded tabs in the SQL; this means that the program can not be SQL compiled with output to screen; you get wild cursor jumps.

I am now going to INSIST that all of you change the Visual Studio editor settings to replace tabs with spaces.

Go to Tools->Options->Text Editor->C/C++->Tabs. I prefer a tab size and indent size of 3. I will not insist on all of you keeping to this, but it does make life easier if we all adhere to the same standards.

Please re-read the ROT (attached) that I sent out 3 years ago, ESPECIALLY the sections about indentation. Specifically, BE CONSISTENT. I imported the XML_Receiver code this morning, and the indentation was all over the place, making it VERY difficult to follow the logic. Remember that you as author will not necessarily be the next person working on the code. Don't make it harder than necessary to figure out your logic.

To fix existing code, open a code, select all the code (ctrl-A), and do edit->advanced->untabify selection.

To be fair though, it is a (somewhat) valid point because of the platform we develop on (C with embedded SQL for HP non-stop Tandem servers).

Zombywuf
Mar 29, 2008

quote:

You should never use auto_ptr.

Wrong wrong wrong, bad and wrong. auto_ptr defines ownership of heap objects and will defend you from the problems of memory leaks caused by exceptions they complain about elsewhere. Also, exceptions, use them.

I think most of the rebuttal to those style guidelines is "Don't hire Java programmers to do C++."

Incoherence
May 22, 2004

POYO AND TEAR

Zombywuf posted:

Wrong wrong wrong, bad and wrong. auto_ptr defines ownership of heap objects and will defend you from the problems of memory leaks caused by exceptions they complain about elsewhere.
The implication is that scoped_ptr is used instead; given the aversion to copying seen elsewhere, this makes some sense.

quote:

I think most of the rebuttal to those style guidelines is "Don't hire Java programmers to do C++."
Java programmers would be using exceptions. :v:

Zombywuf
Mar 29, 2008

Incoherence posted:

The implication is that scoped_ptr is used instead; given the aversion to copying seen elsewhere, this makes some sense.
They do fundamentally different things. Sometimes auto_ptr is simply what you need, it's kinda like manual region inference.

quote:

Java programmers would be using exceptions. :v:
Not if they had to manage memory while using them. In fact they wouldn't use auto_ptr to manage the lifetimes of objects on the heap and thus avoid the problem of lack of a finally block. Instead, they'd probably not use exceptions.

vanjalolz
Oct 31, 2006

Ha Ha Ha HaHa Ha

Incoherence posted:


Java programmers would be using exceptions. :v:

I think he's trying to say that google made a lot of those rules to stop their clueless java programmers doing silly things in c++ :D

That Turkey Story
Mar 30, 2003

floWenoL posted:

Trap loving sprung!

Edit:
Actually kind of backfired because now I want to do a point-by-point rebuttal. :argh:

Let the games begin!

floWenoL
Oct 23, 2002

Zombywuf posted:

Also, exceptions, use them.

Boy howdy it sure must be nice to live in your world where performance doesn't matter!

That Turkey Story
Mar 30, 2003

Zombywuf posted:

Also, exceptions, use them.

I agree, but the explanation given in their decision section makes sense (their "con" list for exceptions on the other hand is terrible). The overall rationale for their decision is that a lot of the code they already have written is not exception safe and their new code needs to correctly operate with old code. I can only imagine the type of rewrite it would take to make all of the previously written C++ code at google exception safe.

As for auto_ptr, I agree with you. In particular, there really isn't an alternative to auto_ptr for returning a dynamically allocated object and having the result be properly automatically managed. You simply cannot do this with scoped_ptr. When C++0x comes around this will be a different story, but at the moment, auto_ptr is the ideal standard solution.

Edit:

floWenoL posted:

Boy howdy it sure must be nice to live in your world where performance doesn't matter!

That was initially what I thought, Flowenol, but if you read their rationale, performance isn't even mentioned:

quote:

On their face, the benefits of using exceptions outweigh the costs, especially in new projects. However, for existing code, the introduction of exceptions has implications on all dependent code. If exceptions can be propagated beyond a new project, it also becomes problematic to integrate the new project into existing exception-free code. Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions.

Given that Google's existing code is not exception-tolerant, the costs of using exceptions are somewhat greater than the costs in a new project. The conversion process would be slow and error-prone. We don't believe that the available alternatives to exceptions, such as error codes and assertions, introduce a significant burden.

Our advice against using exceptions is not predicated on philosophical or moral grounds, but practical ones. Because we'd like to use our open-source projects at Google and it's difficult to do so if those projects use exceptions, we need to advise against exceptions in Google open-source projects as well. Things would probably be different if we had to do it all over again from scratch.

There is an exception to this rule (no pun intended) for Windows code.

The bolded sentence at least makes me happy :p

That Turkey Story fucked around with this message at 17:57 on Jun 30, 2008

floWenoL
Oct 23, 2002

That Turkey Story posted:

That was initially what I thought, Flowenol, but if you read their rationale, performance isn't even mentioned:

It isn't? Perhaps I've said too much. :tinfoil:

fartmanteau
Mar 15, 2007

beuges posted:

I just got this email last monday:

quote:


Please re-read the ROT (attached) that I sent out 3 years ago, ESPECIALLY the sections about indentation. Specifically, BE CONSISTENT.

If you are working with a lot of different people's/projects' code, as in, not from within your group with the pristine coding conventions, BE CONSISTENT is a simple and effective rule of thumb. What pisses people off most is combining spaces and tabs. An unbelievable lot of people carelessly miss this. Consistency takes precedence over whatever convention agenda you want to push. Be considerate and make that simple adjustment in your editor whenever possible.

On the other hand, some OSS projects are just crap. Sometimes I would make a clean patch, but then mess it up before committing just to be consistent. :(

Dransparency
Jul 19, 2006

That Turkey Story posted:

Edit: I don't know why you're getting all upset about naming and indentation, etc. You can't really fault anyone for something that's nearly entirely a subjective choice.

Well, he asked for our opinions, so I gave mine. I really do think having a different naming scheme for functions that deal with member variables is bad, since it ties interface to implementation.

Zombywuf
Mar 29, 2008

floWenoL posted:

Boy howdy it sure must be nice to live in your world where performance doesn't matter!

How much of a performance hit are exceptions in modern compilers? Obviously throwing exceptions is a big hit, but does just having the exception handling code there cause much of a slowdown?

floWenoL
Oct 23, 2002

Zombywuf posted:

How much of a performance hit are exceptions in modern compilers? Obviously throwing exceptions is a big hit, but does just having the exception handling code there cause much of a slowdown?

It's more of a memory hit I believe, but yes, it's there (mostly from RTTI). Don't believe the C++ 'pay only for what you use' hype.

floWenoL
Oct 23, 2002

That Turkey Story posted:

This is C++. You can pass objects as references, that's how the language works. Using non-const references for parameters which modify an object as opposed to a pointer is the preferred C++ way of having functions which modify arguments as it removes the issue of making it unclear about the handling of null pointers and overall it simplifies syntax. If it is unclear that an argument to a function should take a non-const reference then that is generally a poorly named function and/or the person trying to use it doesn't understand its purpose, which either way is a programmer error. The only time I see people have issues with this are when they come from a C background where references just didn't exist so they just got used to always passing out parameters by pointer. There is a better solution in C++ so use it and stop clinging to an outdated C-style approach.

Just because it's the "preferred C++ way" doesn't mean it's the best way. Your argument as to the clarity of a function taking in a non-const reference equally applies to whether it can take in a NULL pointer argument. If the only time you've seen issues with this is with people from a strong C background (not necessarily a bad thing) then you obviously have not done much maintenance programming; I'd hate to have to grep through mounds of header files just to find where a variable is modified instead of simply searching for where a pointer to said variable is passed in, which would (mostly) suffice with this convention.

That Turkey Story posted:

As for auto_ptr, I agree with you. In particular, there really isn't an alternative to auto_ptr for returning a dynamically allocated object and having the result be properly automatically managed. You simply cannot do this with scoped_ptr. When C++0x comes around this will be a different story, but at the moment, auto_ptr is the ideal standard solution.

Leaving aside the brokenness of having assignment mean 'move' instead of 'copy' sometimes (C++ overloads so many things already, why not poach an operator for this? I suggest "x << y;"), honestly once you remove exception handling that removes maybe 80% of the need for smart pointers, and in the remaining cases auto_ptr is hardly an 'ideal' solution; look at all the people that have tried to write replacements for it, even Alexandrescu!

That Turkey Story
Mar 30, 2003

floWenoL posted:

Just because it's the "preferred C++ way" doesn't mean it's the best way. Your argument as to the clarity of a function taking in a non-const reference equally applies to whether it can take in a NULL pointer argument.
Wait what do you mean? What I meant with respect to null pointers here is that you can't legally work with a "null reference" without doing something that implies undefined behavior. On the other hand, null pointers are perfectly fine and allowable in the language. Therefore, if your function takes a reference it's clear without even it having to be explicitly documented that a constructed object must be passed. On the other hand, with a pointer, that restriction is not implicitly there. I'm sure that this is evident in code which uses this convention simply by looking at the body of the functions that take such out parameters -- is one of the first lines "assert( some_out_parameter );" or a similar kind of check? If not, it probably should be. With references, issues like that simply don't exist.

Aside from that, it's the "preferred" c++ way not by authority, but because it's the exact use the tool was made for and because it does everything the pointer version does for this situation but has more fitting requirements. If I personally saw code that was passing a pointer rather than the object itself, all I would have is more questions. If all you wanted to do is make it more explicit where data was being modified, which is what it seems from your argument, rather than using pointers, why not make it convention to use something like boost::ref instead, or a similar basic reference wrapper. This way you can visually see at the call-site what's an out parameter without removing the strict requirements that a reference implies. You get the explicitness that you desire without the side-effect of making it look like you want to pass a pointer to an object rather than simply the object itself as an lvalue, and you make it so that you can't legally pass a null pointer as well. Would this not be the best of both worlds? You can still easily grep -- perhaps more easily than with a pointer-pass convention.

In other words, just use the convention "some_function( a, b, out( c ) );" where "out" is a function template that returns a reference wrapper. Now you are making your coders be explicit as well as abide by the limitations that references have and which pointers do not.
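To make that convention concrete, here is a minimal sketch. It assumes a hand-rolled wrapper rather than boost::ref, because a plain reference wrapper converts implicitly to T&, so a function declared to take T& would still accept a bare lvalue. The names `Out`, `out` and `add` are hypothetical and only there for illustration.

```cpp
// Hypothetical wrapper type: a function that takes Out<T> cannot be called
// with a bare lvalue, so the caller has to write out(x) explicitly.
template <typename T>
class Out {
public:
    explicit Out(T& target) : target_(&target) {}
    T& get() const { return *target_; }
private:
    T* target_;  // can only be built from a reference, so there is no null case to check
};

// Helper so call sites read some_function(a, b, out(c)).
template <typename T>
Out<T> out(T& target) {
    return Out<T>(target);
}

// Example function with one out parameter (illustrative name).
inline void add(int a, int b, Out<int> sum) {
    sum.get() = a + b;
}
```

A call then looks like `add(2, 3, out(c));`, while `add(2, 3, c);` fails to compile because the constructor is explicit. That keeps the call site greppable for `out(` without allowing a null argument.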

Going off on a tangent for a second, what do you think about non-const member functions? By this I mean that calling a const member function is the same syntax as calling a non-const one (which boils down to are you passing your object by const or non-const reference once you remove the syntactic sugar). Do you see this as a similar problem to what you're describing? In other words, if you are using a non-const member function should it be explicitly noted at the call site by using a pointer to call the function instead of the object directly, such as what you do when it is a more traditional style function argument? I'd imagine you'd say that always using a pointer here would be ridiculous -- is that just because it's more expected that member functions modify the data or because there is one place to look for all member functions or is there some other reason entirely?

floWenoL posted:

Leaving aside the brokenness of having assignment mean 'move' instead of 'copy' sometimes (C++ overloads so many things already, why not poach an operator for this? I suggest "x << y;")
Firstly because without rvalue references you simply can't make that syntax work with temporaries as you can't bind a temporary to a non-const reference. For instance, "string a; a << ( some_string + some_other_string);" would not compile without rvalue references unless you made the operator's second argument a const-reference, which would incorrectly pull in const lvalues (not to mention that you'd have to use a potentially incorrect const_cast). Then there is the fact that "x << y" already has an entirely different meaning for many common types (I.E. unsigned << unsigned is a left shift and so it would conflict with a move operation for unsigned).

Not only that, but it's extremely beneficial to have move and copy dispatched with the same syntax based on whether you are working with an rvalue or lvalue since it makes things such as the above string statements work with exactly the same written code whether a move operation is implemented or not. This is good because at a high level, a copy of a temporary here is the same logical operation as a move of a temporary since you can't access the temporary again to observe a potentially different state. You can look at it as an implementation detail that pulls in an optimization based on type information known at compile-time.

This has huge benefits. In generic code, or simply in code that was written prior to a move constructor or move assignment operation being coded, this means that the program is able to instantly take advantage of move semantics once they become available without having to go back and change anything as it was written, since it will simply be picked up by overload resolution (as it logically should be). All you have to do is recompile and your code is suddenly faster with the same meaning. For instance, in modern C++ code, once standard C++ types such as strings adopt proper move support, users of those types instantly have faster code without having to change any of their written code at all, whereas if a separate operator were used for moves instead, you'd only be able to take advantage of move constructions and move-assignment operations if you explicitly went through and changed your own code to use this new "move" operator everywhere instead (not to mention that it would be error-prone to go back and do and you would likely miss some cases).
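A rough sketch of that dispatch, assuming a compiler with rvalue reference support (C++11 or later, which postdates this thread). `Tracked` and `concat` are hypothetical names; the type only records which assignment operator overload resolution picked.

```cpp
#include <string>
#include <utility>

// Hypothetical type that records which assignment operator ran last.
struct Tracked {
    std::string data;
    enum Op { None, Copy, Move } last;

    explicit Tracked(const std::string& s) : data(s), last(None) {}
    Tracked(const Tracked& other) : data(other.data), last(None) {}
    Tracked(Tracked&& other) : data(std::move(other.data)), last(None) {}

    Tracked& operator=(const Tracked& other) {  // chosen when the right side is an lvalue
        data = other.data;
        last = Copy;
        return *this;
    }
    Tracked& operator=(Tracked&& other) {       // chosen when the right side is a temporary
        data = std::move(other.data);
        last = Move;
        return *this;
    }
};

// Returns a temporary, like (some_string + some_other_string) in the post.
inline Tracked concat(const Tracked& a, const Tracked& b) {
    return Tracked(a.data + b.data);
}
```

With this in place, `c = a;` runs the copy assignment and `c = concat(a, b);` runs the move assignment, and the calling code is spelled the same way in both cases. Deleting the two move members would leave every call site compiling unchanged, just with copies.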

floWenoL posted:

honestly once you remove exception handling that removes maybe 80% of the need for smart pointers, and in the remaining cases auto_ptr is hardly an 'ideal' solution;
How does having no exception handling remove the need for smart pointers, even by "80%"? Maybe I'm weird, but I don't think I've ever used raw dynamic memory allocation without some kind of smart pointer in 4 or 5 years, and in many of those situations they are in code which can't throw exceptions at all anyway and where the code will likely never need exceptions in the future.

The idea of smart pointers is that the memory management is handled automatically so you don't have to manually keep track of ownership, and when all owners are gone, the memory is automatically deallocated. Without any exceptions at all, it's still extremely beneficial to use smart pointers since it makes it impossible to forget to deallocate your memory, especially if you are doing something like returning from a function in multiple places. No matter where you return, your memory is properly deallocated, whereas with explicit management you have to remember to deallocate at every path and you'll just get a silent memory leak if you happen to forget. This kind of memory leak goes away entirely with proper use of smart pointers.
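As an illustration of the multiple-return case, here is a small sketch. It uses std::unique_ptr, which is an assumption on my part: it arrived with C++11, after this thread, and scoped_ptr or auto_ptr would behave the same way for this particular function. `Widget` and `process` are made-up names.

```cpp
#include <memory>

// Hypothetical resource that counts live instances so a leak would be observable.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Several return paths and no explicit delete on any of them.
inline int process(int mode) {
    std::unique_ptr<Widget> w(new Widget);
    if (mode == 0) return -1;  // early return: w's destructor frees the Widget
    if (mode == 1) return 1;   // another exit, same automatic cleanup
    return 0;                  // normal exit
}
```

No exceptions are involved anywhere here. The smart pointer is earning its keep purely by covering every exit path, which is the point being made above.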

It's true that RAII helps make exception safe code, but saying that 80% of the use of RAII goes away without exceptions is a pretty big generalization, and even if it did happen to be true for many domains, the remaining 20% is still an important 20% that you shouldn't overlook. Even if it were somehow 1%, that's enough to make it important.

floWenoL posted:

look at all the people that have tried to write replacements for it, even Alexandrescu!
That's only because there is no way to replace it in current C++, at least not in the ways one would like, but that will all change come 0x. For now, you just have to use auto_ptr or something similar.

Alan Greenspan
Jun 17, 2001

Does anybody know of a C++ parser framework that can load BNF grammar files and generate a parser for them on the fly?

The situation is the following: I need to be able to parse assembler code of various platforms which I do not know during development (because it's actually our customers who create the grammar files, not me).

Basically I need to be able to do:

> parser.exe -grammar_file=x86.gra -input=some_asm_file.asm

as well as

> parser.exe -grammar_file=weirdo_cpu.gra -input=some_weirdo_asm_file.asm

The BNF grammars in the gra-files will have predefined constants like MNEMONIC, INTEGER_OPERAND, REGISTER_NAME, and so on and only the order of the tokens in each grammar rule changes.

bcrules82
Mar 27, 2003
what's the best way to extract the even bits (of a uint64_t) and stick them in a uint32_t? looking for a macro here

StickGuy
Dec 9, 2000

We are on an expedicion. Find the moon is our mission.

bcrules82 posted:

what's the best way to extract the even bits (of a uint64_t) and stick them in a uint32_t? looking for a macro here
Maybe an 8 bit lookup table?

schnarf
Jun 1, 2002
I WIN.

bcrules82 posted:

what's the best way to extract the even bits (of a uint64_t) and stick them in a uint32_t? looking for a macro here

I'll show you taking an 8-bit type and extracting the even bits, and you can extend it to 64:

code:
#define EVENBITS(a) (((a) & 1) | (((a) & (1<<2)) >> 1) | (((a) & (1<<4)) >> 2) | (((a) & (1<<6)) >> 3))
The pattern is, left shifts increase by two (this is the part that grabs the even bits), and right shifts increase by one (this part moves them over to where they belong).
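Extending that pattern to 64 bits by hand means writing 32 terms, so a small function with a loop may be easier to get right than a macro. A minimal sketch, assuming "even bits" means bit positions 0, 2, 4 and so on; `even_bits` is just an illustrative name.

```cpp
#include <cstdint>

// Collect bits 0, 2, 4, ..., 62 of `v` into bits 0..31 of the result.
// `even_bits` is a hypothetical name, not something from the thread.
inline std::uint32_t even_bits(std::uint64_t v) {
    std::uint32_t out = 0;
    for (int i = 0; i < 32; ++i) {
        // Bit 2*i of the input becomes bit i of the output.
        out |= static_cast<std::uint32_t>((v >> (2 * i)) & 1u) << i;
    }
    return out;
}
```

If this sits in a hot path, the lookup-table idea suggested earlier (process the input a byte at a time) is probably faster, but the loop is a handy reference to check a faster version against.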

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
I've been following this tutorial in order to learn some of the things I need to know. On the page I linked it talks about overloading functions, and gives an example of adding two vectors together. Something it doesn't appear to talk about, though, is overloading an operator so that it can be applied to objects a and b belonging to classes A and B respectively. For example, suppose I wanted to be able to multiply vectors by a scalar; that is, suppose I define "vector" to be a class with two public "int" members, x and y, and given a "vector" v and an "int" a, I want to be able to write a*v (a multiplied by v) which is to be a "vector" defined in the obvious way. Can this be done?

Vanadium
Jan 8, 2005

Yeah. The syntax is what you would expect.

tef
May 30, 2004

-> some l-system crap ->

Alan Greenspan posted:

Does anybody know of a C++ parser framework that can load BNF grammar files and generate a parser for them on the fly?

As mentioned on irc - it sounds like you're after a DSL for parsing, but you would be better off sticking to well known languages rather than inventing your own.

For your problem I would suggest using an api to allow the end-users to write their own parsers, and perhaps some sample ANTLR code for well known assembly.

If you do end up going down the dsl route, I would suggest PEGs or OMeta to look at, but when speed is your main concern it will be easier to get the end user to write a fast parser than to write a just in time grammar compiler.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Vanadium posted:

Yeah. The syntax is what you would expect.

Thanks.

I have another question. I need to be able to concatenate two strings, but none of the methods I find online seem to work. Most don't compile, one or two have compiled but the application crashes when it reaches the relevant part of the program. The problem, I guess, is one of matching together strings that are of different data types.

The problem is as follows: There are 4 players, referred to by their colours, and a start player is chosen randomly. I want to generate a message that says, "It's Red's first turn. Click OK when ready." or whoever it is to go first.

The line of code as it currently is:

code:
line1msg="It's "+*coloursay[(int)turnorder[0]]+"'s first turn. Click OK when ready.";
This line does not compile. As mentioned, I've tried a few different ways of doing this.

Here line1msg is of type LPSTR. turnorder is a 1-dimensional array of length 4 and of type colour, where colour is an enumeration with the possible values being red, yellow, green and purple. coloursay is also a 1-dimensional array of length 4, and of type LPSTR, given by the following:

code:
LPSTR coloursay[4] = {"Red","Yellow","Green","Purple"};
How do I change the above line so that it will work? I find it hard to work out what type of string data type I should be using / how to get it (LPSTR, CHAR*, char[36], const char, the list seems to go on!)

Smackbilly
Jan 3, 2001
What kind of a name is Pizza Organ! anyway?

Hammerite posted:

Thanks.

I have another question. I need to be able to concatenate two strings, but none of the methods I find online seem to work.

...

How do I change the above line so that it will work? I find it hard to work out what type of string data type I should be using / how to get it (LPSTR, CHAR*, char[36], const char, the list seems to go on!)

The + operator does not concatenate C-style strings (i.e. character arrays). You need to use the strncat function. And if you're in C++ and not C, you should probably use the std::string class instead of character arrays.
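For the C-string route, a minimal sketch of building that message with strncat might look like the following. The function name `build_turn_message` and its parameters are hypothetical; the caller supplies the buffer and its size.

```cpp
#include <cstring>

// Build the message in a caller-supplied buffer using strncat.
// `build_turn_message` and its parameters are hypothetical names.
inline void build_turn_message(char* out, std::size_t out_size, const char* colour) {
    if (out_size == 0) return;
    out[0] = '\0';  // start from an empty string so strncat has something to append to
    // strncat appends at most n characters and then writes a terminating '\0',
    // so n has to be the remaining space minus one.
    std::strncat(out, "It's ", out_size - std::strlen(out) - 1);
    std::strncat(out, colour, out_size - std::strlen(out) - 1);
    std::strncat(out, "'s first turn. Click OK when ready.",
                 out_size - std::strlen(out) - 1);
}
```

Called as `char buf[64]; build_turn_message(buf, sizeof buf, "Red");`. If the buffer is too small the message is silently cut short rather than overflowing, which is the behaviour you want from the n variants.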

StickGuy
Dec 9, 2000

We are on an expedicion. Find the moon is our mission.

Hammerite posted:

Thanks.
Be aware, though, that if you want to multiply an integer by a vector with that class, then that operator would not be a member function. It would look something like:
code:
CVector operator*(int a, const CVector &b) {
    ...
}
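A fuller sketch of the scalar-times-vector case, assuming a minimal `CVector` with two public `int` members as described in the question (the tutorial's real class may differ):

```cpp
// Hypothetical two-member vector class, matching the description in the question.
struct CVector {
    int x;
    int y;
};

// Non-member overload: handles `a * v`, where the int is on the left.
inline CVector operator*(int a, const CVector& v) {
    CVector result = {a * v.x, a * v.y};
    return result;
}

// Second overload so that `v * a` works too.
inline CVector operator*(const CVector& v, int a) {
    return a * v;
}
```

The int-on-the-left form has to be a non-member because a member operator always takes the class object as its left operand. The second overload just forwards to the first so both orders give the same answer.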

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Smackbilly posted:

The + operator does not concatenate C-style strings (i.e. character arrays). You need to use the strncat function. And if you're in C++ and not C, you should probably use the std::string class instead of character arrays.

Thanks. I followed part of your advice, in changing the types of all my strings to char arrays (with plenty of redundant length - there are only like 5 of them I'm using, so storage isn't an issue). I didn't follow your advice on changing to using std::string, because I couldn't work out how to make that work; if I write std::string I get errors about needing to define a constructor, and if I write string it just says string doesn't define a class. I guess I am missing a header reference at the top of the cpp file. The current header references I have are

code:
#include "resource.h"
#include <windows.h>
#include <time.h>
where time.h is just there so that I can seed the random number generator.

I haven't any problems at the moment, although it was hard to get the program to clear a string (so that it's blank, or appears blank) because except when initialising the string, it complains that the character array "" and the array I'm assigning it to aren't the same length. The workaround I found is to tell it the first element of the string is the terminating character, i.e.

code:
line1msg[0]=*""
As you can probably tell, I still don't really understand the way strings work.

Vanadium
Jan 8, 2005

You need to #include <string> to use std::string.

TSDK
Nov 24, 2003

I got a wooden uploading this one

Hammerite posted:

I didn't follow your advice on changing to using std::string, because I couldn't work out how to make that work;
The std::string class isn't all that difficult, so it's worth your while finding a tutorial.
code:
#include <string>

...

std::string coloursay[4] = { "Red", "Yellow", "Green", "Purple" };

...

std::string line1msg = "It's " + coloursay[turnorder[0]] + "'s first turn. Click OK when ready.";
Should work.

Smackbilly
Jan 3, 2001
What kind of a name is Pizza Organ! anyway?

Hammerite posted:

I haven't any problems at the moment, although it was hard to get the program to clear a string (so that it's blank, or appears blank) because except when initialising the string, it complains that the character array "" and the array I'm assigning it to aren't the same length. The workaround I found is to tell it the first element of the string is the terminating character, i.e.

code:
line1msg[0]=*""
As you can probably tell, I still don't really understand the way strings work.

Like the last two people have said, #include <string> and use std::string, it's much easier almost all of the time. However, it is worth your while to learn how C-style strings work, since you will encounter them:

A C-style string is an array of characters which is terminated by a special character called NUL. This has ASCII value 0, and is represented as '\0'. Anything after the NUL character in a C-string is ignored. This means that the 6-character array 'H' 'e' 'l' 'l' 'o' '\0' and the 10-character array 'H' 'e' 'l' 'l' 'o' '\0' 'x' 'y' 'z' '!' are both considered to be the same string "Hello". If a C-string lacks a terminating NUL, you've got problems because there's no way to find out how big an array is at run-time, so functions expecting a C-string won't know when to stop reading from the array.

Consequently, if you have some string char mycstr[10], to "clear it", it is sufficient to write mycstr[0] = '\0'. Note that when you write an initializer like char mycstr[10] = "Hello", the compiler will automatically add in the trailing \0, so long as your array is large enough to hold it. Note also that this means that an array which holds a C-string must always be at least one character larger than the number of printable characters that it intends to store.
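A small sketch of both points, using the same arrays as above. The function names are made up for the example.

```cpp
#include <cstring>

// True if the 6-character and 10-character arrays hold the same C-string.
inline bool same_hello() {
    char six[6]  = {'H', 'e', 'l', 'l', 'o', '\0'};
    char ten[10] = {'H', 'e', 'l', 'l', 'o', '\0', 'x', 'y', 'z', '!'};
    // strcmp stops at the first '\0', so the trailing "xyz!" is never looked at.
    return std::strcmp(six, ten) == 0;
}

// "Clearing" a C-string: only the first element needs to change.
inline std::size_t length_after_clear() {
    char mycstr[10] = "Hello";  // the compiler adds the trailing '\0'
    mycstr[0] = '\0';
    return std::strlen(mycstr); // the string is now empty
}
```

Note that after the clear the old characters "ello" are still sitting in the array. They are simply unreachable through any function that treats the array as a C-string.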

The key thing to remember about C-strings is that they are really only character arrays with this agreed-upon convention that they terminate with '\0'. The language does not treat them any differently than other arrays, and consequently operators like = and + don't do special string assignment and concatenation operations like you might want them to. For example, you can't do this:

code:
int intArrA[5] = {1, 2, 3, 4, 5};
int intArrB[7] = {10, 20, 30, 40, 50, 60, 70};

intArrA = intArrB; // doesn't work!
so you also can't do this:

code:
char charArrA[5] = "Bill";
char charArrB[7] = "Ernest";

charArrA = charArrB; // doesn't work either!

charArrA = "Eric"; // also doesn't work 
// ^^^ This kind of syntax only works in initializations. 
Therefore, the C standard library has functions like strncat (concatenate strings) and strncpy (copy a string into another array) and strlen (how many characters are before the \0 character?). The way these functions are implemented is dead simple. strncpy, for example, takes three parameters, a destination array, a source array, and a maximum number of characters to copy. All it does is go through a loop that copies each character from source to destination until either it copies a '\0' character or it copies the maximum number of characters. (Two details to watch: if it hits the maximum first, the destination is not NUL-terminated, and if the source is shorter than the maximum, the rest of the destination is padded with '\0'.) No special string magic involved.
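Here is a minimal sketch of a copy that is safe to use with the charArrA/charArrB example above. `copy_cstr` is a hypothetical helper, not a standard function; it exists because strncpy on its own leaves the destination unterminated when the source does not fit.

```cpp
#include <cstring>

// Copy `src` into `dst` (capacity `dst_size`) and guarantee NUL termination.
// `copy_cstr` is a hypothetical helper, not part of the standard library.
inline void copy_cstr(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return;
    std::strncpy(dst, src, dst_size - 1);  // copies at most dst_size - 1 characters
    dst[dst_size - 1] = '\0';              // strncpy does not terminate on truncation
}
```

So `copy_cstr(charArrA, sizeof charArrA, charArrB);` stands in for the assignment that does not compile. With a 5-element destination, "Ernest" gets truncated to "Erne" instead of running off the end of the array.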

But the thing to be careful about with all of this is making sure that your arrays have the appropriate size. Arrays in C do not automatically grow to accommodate new data. If you try to write 10 characters into a 5-character array, your program will crash if you're lucky or have other "interesting" and tough to find bugs if you're not.

This means that if you do not know at compile-time how long a string is going to be (user input, for example), you need to dynamically allocate memory for the character array. And then you have to make sure you keep track of that memory and free it later to avoid a memory leak, since C/C++ have no garbage collector.
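A minimal sketch of what that bookkeeping looks like, with hypothetical helper names (duplicate_c_string, round_trips) and plain malloc/free:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copy a string whose length is only known at run time
// into freshly allocated memory. The caller owns the result and must free() it.
char* duplicate_c_string(const char* src) {
    std::size_t len = std::strlen(src);
    char* copy = static_cast<char*>(std::malloc(len + 1));  // +1 for the '\0'
    if (copy == nullptr) return nullptr;                     // allocation failed
    std::memcpy(copy, src, len + 1);                         // includes the '\0'
    return copy;
}

// Usage sketch: allocate, use, and release. Forgetting the free() is the leak.
bool round_trips(const char* input) {
    char* copy = duplicate_c_string(input);
    if (copy == nullptr) return false;
    bool same = std::strcmp(copy, input) == 0;
    std::free(copy);
    return same;
}
```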

Oh, and then there's the interesting part where you will see C-strings referred to as char* (character pointers) instead of character arrays. In most cases in C, an array is interchangeable with a pointer to its first element. So it's perfectly legal to have a character pointer char* foo, and say foo[5] = 'x';. This is of course assuming that you know for sure that foo points to at least 6 elements, which you can't determine at run-time, which is why we have this whole business about '\0' characters earlier.
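For illustration only (the function name is invented), here is the pointer and the array being used interchangeably:

```cpp
#include <cassert>

// Writes through a pointer with exactly the syntax you would use on the array.
char write_through_pointer() {
    char buf[10] = "Hello";  // 10 elements, so index 5 is in range
    char* foo = buf;         // the array decays to a pointer to its first element
    foo[5] = 'x';            // same effect as buf[5] = 'x'
    return buf[5];           // the write landed in the array itself
}
```

This is only safe because the code that created buf knows it has 10 elements; nothing about foo itself records that.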

Phew. And that's really just the half of it.

In short, C strings are a huge pain in the rear end and are ridiculously error prone. Even many professional commercial programs have bugs and security holes because some programmer forgot to be careful when dealing with arrays (usually character arrays). You should be aware of how they work because you will encounter them, but if you are fortunate enough to be programming in C++, there's no reason to subject yourself to one of C's biggest annoyances. That's why God (Bjarne Stroustrup) invented std::string.

std::string works like you would expect a string to work. = copies/assigns, + concatenates, it has a .size() method to find out how long it is, and it grows and shrinks as necessary with no need for manual memory management. It even has a .c_str() method which will convert it to a C-string if you encounter a function that doesn't accept std::string. It really is worth it to learn how that works.
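A short sketch of those operations; greet and c_length are hypothetical helpers, the std::string calls are the standard ones:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper showing assignment and concatenation.
std::string greet(const std::string& name) {
    std::string result = "Hello, ";  // = assigns, no buffer size to pick
    result = result + name;           // + concatenates, growing as needed
    return result;
}

// c_str() exposes a NUL-terminated view for functions that want a C-string.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For example, greet("Bill") gives "Hello, Bill" with .size() of 11, and there is no array size or free() anywhere in sight.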

C++ even has other classes (e.g. std::vector) with which you can usually entirely avoid using C-style arrays in a C++ program. As you learn more about C and C++'s memory management, pointers and arrays will make more sense to you and you'll 'get' why C-style arrays and strings are the way they are, but even when you do understand it, there's no reason to subject yourself to all the inconvenience of working with them when there's a perfectly good and intuitive C++-style way to address the issue.

Smackbilly fucked around with this message at 15:39 on Jul 2, 2008

ZorbaTHut
May 5, 2005

wake me when the world is saved

That Turkey Story posted:

Google style guide posted:

You should not use the unsigned integer types such as uint32_t, unless the quantity you are representing is really a bit pattern rather than a number. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.

Holy moly. Goodnight.

I strongly agree with the Google style guide here :v:

I've seen a surprising number of bugs caused by people writing reasonable, sensible equations that break in unexpected horrifying ways when the types involved are unsigned. For example:

code:
string mystring = getdata();
for(int i = 0; i < mystring.size() - 1; i++) doThings(mystring[i]);
If mystring.size() is 0, it underflows and wraps around to 4.3 billion. Whee!
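A small sketch that isolates just the loop bound (loop_bound is a made-up name). The exact wrapped value depends on how wide size_type is on your platform; 4.3 billion is what you get with a 32-bit one:

```cpp
#include <cassert>
#include <limits>
#include <string>

// What the loop bound "mystring.size() - 1" evaluates to.
// size() returns an unsigned type, so the subtraction cannot go negative.
std::string::size_type loop_bound(const std::string& mystring) {
    return mystring.size() - 1;
}
```

For "abc" this is 2 as expected; for an empty string it is the largest value size_type can hold.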

Conversely, I've almost never seen a situation where adding a single bit to a number's magnitude provably fixes a bug. It just plain doesn't happen - if int isn't enough, unsigned int probably isn't going to be enough either. Real-world numbers tend to follow a power distribution, and in power-distribution-land, that extra bit increases your number's range by a piddling 3%. That's usually not anything worth mentioning.

As I see it, with signed numbers you have two "danger zones" you have to worry about - the area near overflow, and the area near underflow. With unsigned numbers you have the same danger zones, but the "underflow" zone is at 0, and 0 is a very common case with programming. That puts your normal working area way way way too close to "danger" for me to feel comfortable.

The only situation where I use unsigned values is when I actually know that the range is going to be precisely within 32 unsigned bits and when RAM is at a premium, or when I'm dealing with bitfields. The former case is extremely rare and the latter case is usually very obvious. I have found myself to be generally happy with this decision.

Super Mario Shoeshine
Jan 24, 2005

such improper posting...
oh god i posted in the wrong thread

Super Mario Shoeshine fucked around with this message at 17:05 on Jul 2, 2008

TSDK
Nov 24, 2003

I got a wooden uploading this one

ZorbaTHut posted:

For example:
code:
string mystring = getdata();
for(int i = 0; i < mystring.size() - 1; i++) doThings(mystring[i]);
Do you have something against the last character in a string or something?

To me, that just looks like a bug caused by someone mixing and matching C style string handling code with C++ string types. That class of bug is quite hard to introduce when using iterators as designed.
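Roughly what I mean by using iterators, as a sketch. visit_all is a hypothetical stand-in that just counts characters where the real code would call doThings:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the loop: visits every character via iterators
// and returns how many it saw. No index arithmetic, so nothing to underflow.
int visit_all(const std::string& mystring) {
    int visited = 0;
    for (std::string::const_iterator it = mystring.begin(); it != mystring.end(); ++it) {
        (void)*it;   // doThings(*it) would go here
        ++visited;
    }
    return visited;
}
```

With an empty string begin() equals end(), so the body never runs.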

more falafel please
Feb 26, 2005

forums poster

ZorbaTHut posted:

Real-world numbers tend to follow a power distribution, and in power-distribution-land, that extra bit increases your number's range by a piddling 3%. That's usually not anything worth mentioning.

You have absolutely no understanding of what you're talking about.

edit: OK, I think it's just phrased badly and you're talking about power distribution. I thought you were actually arguing that adding bits to a number increased the range of values the number could store linearly instead of exponentially.

more falafel please fucked around with this message at 21:23 on Jul 2, 2008

POKEMAN SAM
Jul 8, 2004

more falafel please posted:

You have absolutely no understanding of what you're talking about.

Doesn't adding another bit simply double the range (assuming you're only using non-negative numbers anyway)?

e.g. a signed byte is 0-127 (again, assuming you aren't using the negatives), but an unsigned byte can be 0-255. If you're adding another bit doesn't that, by definition, double your range?
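The limits can be checked directly; a quick sketch using the fixed-width 8-bit types (the two function names are just for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest values of the 8-bit types, widened to int for easy comparison.
int max_signed_byte()   { return std::numeric_limits<std::int8_t>::max(); }
int max_unsigned_byte() { return std::numeric_limits<std::uint8_t>::max(); }
```

These come out as 127 and 255, so the non-negative range goes from 128 values to 256.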

TSDK
Nov 24, 2003

I got a wooden uploading this one

more falafel please posted:

You have absolutely no understanding of what you're talking about.
I think ZorbaTHut probably just phrased that badly there, rather than being out and out wrong.

If the 'real world' distribution of numbers that your application is handling follows a power law distribution:
http://en.wikipedia.org/wiki/Power_law
Then let's suppose out of 100,000 cases your variable foo will be asked to handle values in the range 0-2147483647 in 96,000 of those cases. There will then only be something like 3,000 cases where it'll be asked to handle values in the range 2147483648-4294967295.

When talking about failures caused specifically by unchecked integer overflow, using an unsigned int instead of an int doesn't actually improve all that much on your rate of failure in general. You then have to weigh up the relatively small gain in failure rate versus the likelihood that unsigned ints could cause a problem.

In my opinion, the likelihood that an unsigned int would cause a problem relative to a signed int is very very small indeed, so I'll take that failure rate improvement happily and go on my merry unsigned way.

In a related anecdote, I've been bitten before by using video encoder software that just gave up after the first 2Gb of a 3Gb file and blanked the image (but not the sound) for the last 3rd of the movie clip. Clearly someone with an always-int mentality had written the file loader for the image compression part, and someone with an unsigned-int-mentality had written the file loader for the sound compression part :)

floWenoL
Oct 23, 2002

ZorbaTHut posted:

I strongly agree with the Google style guide here :v:

I've seen a surprising number of bugs caused by people writing reasonable, sensible equations that break in unexpected horrifying ways when the types involved are unsigned. For example:

I was going to reply to that (and I still plan to address TTS's other points) but I think TTS meant that he was just tired, and he wasn't replying to that specifically?

floWenoL
Oct 23, 2002

TSDK posted:

When talking about failures caused specifically by unchecked integer overflow, using an unsigned int instead of an int doesn't actually improve all that much on your rate of failure in general. You then have to weigh up the relatively small gain in failure rate versus the likelihood that unsigned ints could cause a problem.

In my opinion, the likelihood that an unsigned int would cause a problem relative to a signed int is very very small indeed, so I'll take that failure rate improvement happily and go on my merry unsigned way.

I'd say it's more probable that you'd get bitten by unsigned integer underflow than overflow. You also have to take into account that using unsigned inhibits compiler optimizations and so that 'free' range boost may come at a performance cost.

Really, 'signed' and 'unsigned' are truly horrible names as the differences between the two are more than just their signedness. I guess 'signed-with-undefined-overflow' and 'modulo-some-power-of-2' don't exactly roll off the tongue quite as easily.
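To illustrate the 'modulo-some-power-of-2' half of that, a minimal sketch (function names invented). The signed counterpart, INT_MAX + 1, is undefined behaviour, so it is deliberately not executed here:

```cpp
#include <cassert>
#include <climits>

// Unsigned arithmetic is defined to wrap modulo 2^N, where N is the bit width.
unsigned int one_below_zero() { return 0u - 1u; }        // wraps to UINT_MAX
unsigned int one_above_max()  { return UINT_MAX + 1u; }  // wraps to 0
```

Because the signed case is undefined, a compiler is allowed to assume it never happens, which is where the optimization difference comes from.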

quote:

In a related anecdote, I've been bitten before by using video encoder software that just gave up after the first 2Gb of a 3Gb file and blanked the image (but not the sound) for the last 3rd of the movie clip. Clearly someone with an always-int mentality had written the file loader for the image compression part, and someone with an unsigned-int-mentality had written the file loader for the sound compression part :)

That's not a problem with unsigned vs. signed; that's a problem with using 32 bits instead of 64 for file/memory sizes. :v
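A quick sketch of the arithmetic, with hypothetical helper names: a 3 GB byte count overflows a signed 32-bit integer, squeaks into an unsigned one, and neither survives a 5 GB file, while 64 bits handles all of them.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical check: does a byte count fit in a signed 32-bit integer?
bool fits_in_int32(std::int64_t bytes) {
    return bytes >= std::numeric_limits<std::int32_t>::min() &&
           bytes <= std::numeric_limits<std::int32_t>::max();
}

// Same check for an unsigned 32-bit integer.
bool fits_in_uint32(std::int64_t bytes) {
    return bytes >= 0 &&
           bytes <= static_cast<std::int64_t>(std::numeric_limits<std::uint32_t>::max());
}
```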

That Turkey Story
Mar 30, 2003

ZorbaTHut posted:

I've seen a surprising number of bugs caused by people writing reasonable, sensible equations that break in unexpected horrifying ways when the types involved are unsigned. For example:

code:
string mystring = getdata();
for(int i = 0; i < mystring.size() - 1; i++) doThings(mystring[i]);
If mystring.size() is 0, it underflows and wraps around to 4.3 billion. Whee!

First, before going into anything at all, I'd recommend using iterators here which as a side-effect avoids the issue of sign entirely, and if for some reason you didn't do that, I'd still say use string::size_type instead of int and just don't write an algorithm which relies on negative values. The reason is, if you write your loop correctly and your loop uses the proper size type, your code works for all strings. If, on the other hand, you use int, your loop variable will risk overflow on larger strings no matter how you write it. Your algorithm is not going to be correct, at least in the general sense, not including limits from context. In the end, I'd rather write the code properly and have it work for all cases as opposed to write it in a way that works only for some subset simply because of a fear and/or misunderstanding of working with unsigned types. I'd hope that in your example you're at the very least asserting that mystring.size() <= numeric_limits< int >::max().
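As a sketch of the size_type version, assuming the original really did mean to skip the last character (visit_all_but_last is a hypothetical stand-in that counts where the real code would call doThings):

```cpp
#include <cassert>
#include <string>

// Visits every character except the last and returns how many it visited.
// Written as i + 1 < size() so that nothing is ever subtracted from an
// unsigned value.
std::string::size_type visit_all_but_last(const std::string& mystring) {
    std::string::size_type visited = 0;
    for (std::string::size_type i = 0; i + 1 < mystring.size(); ++i) {
        (void)mystring[i];  // doThings(mystring[i]) would go here
        ++visited;
    }
    return visited;
}
```

The empty string is handled without any special case, and the index type can represent every valid position in the string.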

Either way, your for loop example is hardly indicative of why you should never use unsigned types even if that code actually were able to handle all strings. If anything, at least just recommend defaulting to signed if you are unsure rather than disallowing unsigned types completely for arithmetic calculations, though even that I'd disagree with. Use the type that makes sense for the job. Period. Strictly disallowing unsigned types for arithmetic calculations is foolish and is going to push you to write code that is incorrect in very subtle ways. Programming can be tricky, but things like forcing people to use signed types only gives the illusion of making your code safer.

floWenoL posted:

You also have to take into account that using unsigned inhibits compiler optimizations and so that 'free' range boost may come at a performance cost.
Wait, what? If anything I can think of cases where unsigned operations can be optimized whereas signed operations cannot, not the other way around. Maybe you know something I don't, but either way, that doesn't change the fact that using an unsigned type may be correct whereas a signed type isn't (or vice versa). Pick your type based on what makes your code more correct.
