pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Bahahaha .net doesn't come with an ini parser. That's awesome.

Kilson
Jan 16, 2003

I EAT LITTLE CHILDREN FOR BREAKFAST !!11!!1!!!!111!
Spring is the devil for XML configuration. It's basically like writing code, except it's not type-checked, spell checked, or anything else. So you have potentially thousands of lines of 'configuration', where if something is wrong it can silently (or noisily) blow up at some unspecified time in the future.

Additionally, here are a couple real pieces of XML we get back from a Nortel CS1000 webservice call:

code:
<KEY>
<KEY_VAL>28 RLT</KEY_VAL>
</KEY>
<KEY>
<KEY_VAL>29</KEY_VAL>
</KEY>
<KEY>
<KEY_VAL>30</KEY_VAL>
</KEY>
<KEY>
<KEY_VAL>31</KEY_VAL>
</KEY>
code:
<VIRTUAL></VIRTUAL>
<TYPE>2616</TYPE>
<CDEN>8D</CDEN>
<CTYP>XDLC</CTYP>
<CUST>0</CUST>
<AOM>0</AOM>
<ERL>0</ERL>
<FDN></FDN>
<TGAR>1</TGAR>
<LDN>NO</LDN>
<NCOS>0</NCOS>
<SGRP>0</SGRP>
<RNPG>0</RNPG>
<SCI>0</SCI>
<SSU></SSU>
<XLST></XLST>
<CLS>CTD FBD WTA LPR MTD FND HTD ADD HFD MWD LMPN RMMD SMWD AAD IMD XHD IRD NID OLD VCE DRG1 POD DSX VMD SLKD CCSD SWD LND CNDD CFTD SFD MRD DDV CNID CDCA MSID DAPA BFED RCBD ICDD CDMD LL
CN MCTD CLBD AUTU GPUD DPUD DNDD CFXD ARHD CLTD ASCD CPFA CPTA ABDD CFHD FICD NAID BUZZ AGRD MOAD AHD DDGA NAMA DRDD EXR0 USMD USRD ULAD RTDD RBDD RBHD PGND FLXD FTTC DNDY DNO3 MCBN CDMR
PRED MCDD T87D PKCH</CLS>
The whole thing is like this, going on for several hundreds or thousands of lines. It's almost completely flat, and everything is an abbreviation or acronym of some kind.

:eng99:

etcetera08
Sep 11, 2008

HappyHippo posted:

I used XML for a config file because the data is very basic and C# has built in libraries for serializing/deserializing it. Am I a bad person? :ohdear:

There is also a built in JSON parser so you could go that direction if you desired.

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

pokeyman posted:

Bahahaha .net doesn't come with an ini parser. That's awesome.

Jesus Christ I just assumed there was one cause win32 C API has GetPrivateProfileString/Int.

lamentable dustman
Apr 13, 2007

etcetera08 posted:

There is also a built in JSON parser so you could go that direction if you desired.

Have a good example of one because every json config file looks like a XML one to me with quotes tracking instead of tag tracking.

Either way I prefer a properties file for my configuration

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

etcetera08 posted:

There is also a built in JSON parser so you could go that direction if you desired.

Unfortunately they both (yup there's more than one) blow and the one that blows less isn't available in the client profile.

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

I used to bitch about old school EDI file formats (hard white space, type enforcement, magic numbers) but after having to try and untangle some of the grottier XML implementations I've seen I almost miss them.

Novo
May 13, 2003

Stercorem pro cerebro habes
Soiled Meat

Kilson posted:

The whole thing is like this, going on for several hundreds or thousands of lines. It's almost completely flat, and everything is an abbreviation or acronym of some kind.
:eng99:
I call this kind of thing "almost XML". We have a product which stores its database as a series of repeating tags, it looks a lot like your second example. There are no container tags, you just assume the next record has started when you see the first tag again. I don't even know why they bothered with XML.

Oh, the product also had different defaults for escaping HTML entities at different stages of its lifetime, so you can find both & and &amp; in the "tags". When stuff like this breaks the indexer the company says that it is "user error" because after all, you entered the data that way! (We don't edit the XML directly, you use their proprietary client).
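For what it's worth, the "next record starts when you see the first tag again" rule is easy enough to sketch. This is only an illustration in Python: the tag names, the sample data, and the `split_records` helper are all made up, and it assumes one `<TAG>value</TAG>` pair per line with no nesting.

```python
import re

# Hypothetical flat export: no container element, so a record is assumed
# to start whenever the first tag (NAME here) shows up again.
FLAT = """\
<NAME>alpha</NAME>
<TYPE>2616</TYPE>
<CUST>0</CUST>
<NAME>beta</NAME>
<TYPE>2008</TYPE>
"""

# Matches <TAG>value</TAG> where the closing tag repeats the opening one.
TAG_LINE = re.compile(r"<(\w+)>(.*?)</\1>")

def split_records(text, first_tag):
    """Group flat <TAG>value</TAG> pairs into one dict per record."""
    records = []
    for tag, value in TAG_LINE.findall(text):
        # Open a new record on the first tag (or if none is open yet).
        if tag == first_tag or not records:
            records.append({})
        records[-1][tag] = value
    return records

records = split_records(FLAT, "NAME")
```

A real XML parser would reject this input outright since there is no single root element, which is why the sketch falls back to a regex.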

gibbed
Apr 10, 2006

trex eaterofcadrs posted:

Jesus Christ I just assumed there was one cause win32 C API has GetPrivateProfileString/Int.
Which are deprecated and are only present for compatibility with 16-bit applications. :science:

.ini format is terrible anyway.

MasterSlowPoke
Oct 9, 2005

Our courage will pull us through
Reposting this because I don't know if there is a better way:

MasterSlowPoke posted:

I'm coding an app that reads data from user generated, but otherwise static, XML files like this one:

http://pastebin.com/As98K9S0

It is awfully verbose, but I don't know of anything else that would make sense to use. I don't really have an extensive coding background, though.

hobbesmaster
Jan 28, 2008

gibbed posted:

Which are deprecated and are only present for compatibility with 16-bit applications. :science:

.ini format is terrible anyway.

That's why I put everything in the registry!

(Qt :allears:)

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

hobbesmaster posted:

That's why I put everything in the registry!

The core idea of the registry isn't a bad one. If you have some other process that changes a setting, you can get a notification that the key was changed, and update your behavior. Unfortunately, it's a terrible execution, leading to everybody just doing their own thing.

Fun fact: did you know Steam invented their own binary format that emulates the Windows registry for some reason, including the STRING/DWORD/BINARY mess?

gibbed
Apr 10, 2006

Suspicious Dish posted:

Fun fact: did you know Steam invented their own binary format that emulates the Windows registry for some reason, including the STRING/DWORD/BINARY mess?
Are you talking about VDF? It's not just a binary format if you're talking about that.

Also I don't see how it emulates the Windows registry.

sklnd
Nov 26, 2007

NOT A TRACTOR
One time, while porting a Win32 C++ server app to Linux, I helped a guy implement the Windows registry API using XML and Xerces complete with key change notifications through inotify.

I still feel pretty dirty about this.

SixPabst
Oct 24, 2006

Just had a client tell me I was an idiot because I should be catching "the generic Java catch-all exception - NullPointerException - in my code."

The error message he's seeing is coming from javascript. In his browser. In a totally different web application. This man is the lead developer at a very large institution.

:negative:

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Suspicious Dish posted:

If you have some other process that changes a setting, you can get a notification that the key was changed, and update your behavior.
I've been tempted to use the registry just for this. Synchronizing changed settings across multiple running copies of a program is kinda annoying, and just discarding changes made in all but one is bad UX.

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

gibbed posted:

Which are deprecated and are only present for compatibility with 16-bit applications. :science:

.ini format is terrible anyway.

Well in my defense Win95 was big when I was last writing win32 software.

And sure .ini isn't as good as simple properties files but it's a far cry better than xml for configuration.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

gibbed posted:

Are you talking about VDF? It's not just a binary format if you're talking about that.

Also I don't see how it emulates the Windows registry.

No, I'm talking about the format that ClientRegistry.blob is in. It's a custom binary format, magics 0x5001 (uncompressed) and 0x4301 (zlib compressed). I don't think the format has been written up before. I, along with a few other people, reverse engineered it, so let me write a bit about it.

Basically, there's two formats in play here. There's the blob format, which is a nested binary format - a blob has a bunch of key/value pairs, and the value may itself be a blob, with magic and everything (yes, this means that you can nest zlib compressed blobs).

On top of that is the registry format, which uses blobs as storage. In the top-level blob there is a key called TopKey, which points to a RegistryKey.

A RegistryKey has children blobs 1 and 2 (keys are longs, \x01\x00\x00\x00, \x02\x00\x00\x00). Blob 1 contains the subkeys. Blob 2 contains RegistryValues.

A RegistryValue has children blobs 1 and 2. Blob 1 has the value's type as a long. 0 is STRING, 1 is DWORD, 2 is BINARY. Blob 2 contains the value.

Here's some Python code I wrote a while ago:


It's quite messy and most of it was written before I was proficient with Python.
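This is not the code mentioned above, just a minimal sketch of the registry layer as described in this post. It assumes the blob layer has already been decoded into nested dicts, that subkeys and values are looked up by name, that STRING payloads are NUL-terminated, and that DWORD payloads are 4-byte little-endian. None of those details are confirmed here, and the sample data is invented.

```python
import struct

KEY_1 = struct.pack("<l", 1)  # b"\x01\x00\x00\x00"
KEY_2 = struct.pack("<l", 2)  # b"\x02\x00\x00\x00"

TYPE_NAMES = {0: "STRING", 1: "DWORD", 2: "BINARY"}

def decode_value(reg_value):
    """Decode one RegistryValue: child 1 is the type, child 2 the payload."""
    (type_id,) = struct.unpack("<l", reg_value[KEY_1])
    raw = reg_value[KEY_2]
    kind = TYPE_NAMES[type_id]
    if kind == "STRING":
        # Assumption: NUL-terminated text.
        return raw.rstrip(b"\x00").decode("utf-8", "replace")
    if kind == "DWORD":
        # Assumption: 4-byte little-endian unsigned integer.
        return struct.unpack("<L", raw)[0]
    return raw  # BINARY: hand the bytes back untouched

def walk(reg_key, path=""):
    """Yield (path, value) for every value under a RegistryKey."""
    for name, value in reg_key.get(KEY_2, {}).items():
        yield f"{path}/{name}", decode_value(value)
    for name, subkey in reg_key.get(KEY_1, {}).items():
        yield from walk(subkey, f"{path}/{name}")

# Hand-built stand-in for an already-decoded blob tree (made-up data).
top_key = {
    KEY_1: {"Example": {KEY_1: {}, KEY_2: {
        "Count": {KEY_1: struct.pack("<l", 1), KEY_2: struct.pack("<L", 42)},
    }}},
    KEY_2: {"Name": {KEY_1: struct.pack("<l", 0), KEY_2: b"hello\x00"}},
}
settings = dict(walk(top_key))
```

Parsing the actual on-disk blob (magics, lengths, zlib nesting) is deliberately left out, since the byte layout isn't spelled out in the post.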

beuges
Jul 4, 2005
fluffy bunny butterfly broomstick

Otto Skorzeny posted:

256k of ram? What monster uC are you using?

The device actually has 1MB of ram. 512KB gets used by the manufacturer's OS. The other 512KB is available for code, stack and heap. My builds are around the 256KB mark, leaving 256KB to actually play around with. The fun part is that the heap just grows up from where your codespace ends, and the stack just grows down from the 1MB maximum, meaning a huge malloc can overwrite your stack, or a huge call stack can overwrite stuff on your heap. Also, if you try to malloc more memory than you have available, it just segfaults instead of returning null.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Suspicious Dish posted:

No, I'm talking about the format that ClientRegistry.blob is in. It's a custom binary format, magics 0x5001 (uncompressed) and 0x4301 (zlib compressed). I don't think the format has been written up before. I, along with a few other people, reverse engineered it, so let me write a bit about it.

Basically, there's two formats in play here. There's the blob format, which is a nested binary format - a blob has a bunch of key/value pairs, and the value may itself be a blob, with magic and everything (yes, this means that you can nest zlib compressed blobs).

On top of that is the registry format, which uses blobs as storage. In the top-level blob there is a key called TopKey, which points to a RegistryKey.

A RegistryKey has children blobs 1 and 2 (keys are longs, \x01\x00\x00\x00, \x02\x00\x00\x00). Blob 1 contains the subkeys. Blob 2 contains RegistryValues.

A RegistryValue has children blobs 1 and 2. Blob 1 has the value's type as a long. 0 is STRING, 1 is DWORD, 2 is BINARY. Blob 2 contains the value.

Here's some Python code I wrote a while ago:


It's quite messy and most of it was written before I was proficient with Python.

I would love to hear the story from Valve about why the hell they did this.

gibbed
Apr 10, 2006

Suspicious Dish posted:

No, I'm talking about the format that ClientRegistry.blob is in. It's a custom binary format, magics 0x5001 (uncompressed) and 0x4301 (zlib compressed). I don't think the format has been written up before. I, along with a few other people, reverse engineered it, so let me write a bit about it.
D'oh. I completely forgot about ClientRegistry.blob, yeah, that's one giant crapshoot.

I deal with it in a passive way (I wrote code for it ages ago, but no longer use that).

Thermopyle posted:

I would love to hear the story from Valve about why the hell they did this.
Valve have never professed to be good at coding things. :v:

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



pokeyman posted:

Bahahaha .net doesn't come with an ini parser. That's awesome.

Or a CSV parser.

Thermopyle posted:

I would love to hear the story from Valve about why the hell they did this.

Mac and Linux ports, obviously :)

ymgve
Jan 2, 2004


:dukedog:
Offensive Clock

Thermopyle posted:

I would love to hear the story from Valve about why the hell they did this.

Even better: Some of their new protocol stuff uses compression. Which they do by placing the data into a .zip file. No, not just a DEFLATE block, a full zip file with headers and stuff. And the file inside the .zip containing the original data is just called "zip".
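To put a rough number on the difference, here is a small Python comparison of a bare DEFLATE stream against a whole zip archive with a single member named "zip". The payload is made up and this is not Valve's code, just the standard library doing both.

```python
import io
import zipfile
import zlib

payload = b"some protocol message " * 50  # made-up stand-in data

# Raw DEFLATE stream: negative wbits means no zlib/gzip header or trailer.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw_deflate = compressor.compress(payload) + compressor.flush()

# Full zip archive holding a single member named "zip".
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("zip", payload)
zip_bytes = buffer.getvalue()

# Unpacking means opening the archive and reading that member back out.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    unpacked = archive.read("zip")

# Extra bytes spent on the local header, central directory and end record.
overhead = len(zip_bytes) - len(raw_deflate)
```

The archive costs extra bytes on every message for headers that carry nothing the receiver doesn't already know.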

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Funnily enough Python has an ini-ish parser in its standard library.
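That would be `configparser`. A minimal sketch, with a made-up section and made-up keys:

```python
import configparser

# Hypothetical settings, written inline so the example is self-contained.
INI_TEXT = """\
[server]
host = example.invalid
port = 8080
debug = no
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

host = config["server"]["host"]                # values are plain strings
port = config.getint("server", "port")         # typed getters convert them
debug = config.getboolean("server", "debug")   # accepts yes/no, on/off, true/false, 1/0
timeout = config.getint("server", "timeout", fallback=30)  # key is absent
```

Everything comes back as a string unless you ask for a typed getter, which is about as much type checking as the format offers.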

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
I know it's not nice to make fun instead of helping, but goddamn.

zeekner
Jul 14, 2007

Hammerite posted:

I know it's not nice to make fun instead of helping, but goddamn.

Oh god the varchar(2) that he will most assuredly use as the item count. 99 Erasers is enough for anyone!

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Hammerite posted:

I know it's not nice to make fun instead of helping, but goddamn.

The best part is (well, was) the guy with 22k reputation who comes in and says "I don't see any injection here nosiree" and then taking down his comment and question after you pointed out he doesn't know PHP for poo poo. It'd be kind of interesting if SO only displayed your reputation garnered for the tags relevant to the question you were answering.

The worst part is that I kind of envy his blissful ignorance :smith:

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

pokeyman posted:

Bahahaha .net doesn't come with an ini parser. That's awesome.
Why should it when you can cram all your configuration data in the app.config like it's a giant dictionary!

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Geekner posted:

Oh god the varchar(2) that he will most assuredly use as the item count. 99 Erasers is enough for anyone!

I didn't know whether to be more offended by the database design or the injection vulnerability.

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope
JSLint has a new option: "tolerate stupidity"

What it does is that without it, JSLint will warn about properties containing the substring 'Sync'.

I think I'll fork JSLint and rename all the options. The option to tolerate ++ will be renamed to "tolerate poopyheadedness" and the option to tolerate != and == will become "tolerate being a fartface"

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Wheany posted:

JSLint has a new option: "tolerate stupidity"

What it does is that without it, JSLint will warn about properties containing the substring 'Sync'.

Um? What is this for?

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope

Suspicious Dish posted:

Um? What is this for?

http://tech.groups.yahoo.com/group/jslint_com/message/2835

Douglas Crockford posted:

JSLint now warns when properties contain the substring 'Sync'. The use of that
substring in Nodejs is to identify methods that can cause blockage. Such methods
should never be used.

These useful warnings can be suppressed by using the new Tolerate stupidity
option.

Catalyst-proof
May 11, 2011

better waste some time with you
Douglas Crockford's shtick is really getting old.

Bhaal
Jul 13, 2001
I ain't going down alone
Dr. Infant, MD

MasterSlowPoke posted:

Reposting this because I don't know if there is a better way:
In terms of data format I wouldn't say XML is necessarily a terrible way to go in this case, BUT:

If these are being generated by the public, especially if the node structure your software is expecting is very dynamic and/or specific to its requirements, hand editing XML like that can be a real pain, especially for end users. (That being said, I don't think hand editing in any format, JSON, YAML, or whatever will be any better in this situation). For instance, say your software needs at minimum child nodes X, Y, Z defined whenever it encounters node N, and it's parsing a user-submitted document with a node N that has only X and Y defined. Do you have a fancy system of defaults to fill in any gaps, or will your internal data structures be incomplete without Z defined externally? If the latter, you'll want to make it easy for them to spot and fix this.

One solution is you could offer a validator so they can quickly fix any mistakes right away. User generates the document, runs it through the validator (web tool or whatever), and it comes back with "Hey dummy! Node N at line 123 needs child node Z defined!".
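A bare-bones version of that validator could look something like this in Python. The rule table and the `validate` helper are hypothetical, and it only checks for required children using the node names from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: element name -> child elements it must contain.
REQUIRED_CHILDREN = {"N": ("X", "Y", "Z")}

def validate(xml_text):
    """Return a list of human-readable problems; empty means it passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    for element in root.iter():
        present = {child.tag for child in element}
        for needed in REQUIRED_CHILDREN.get(element.tag, ()):
            if needed not in present:
                problems.append(f"<{element.tag}> is missing child <{needed}>")
    return problems

problems = validate("<doc><N><X/><Y/></N></doc>")
```

ElementTree doesn't keep track of line numbers for elements, so reporting "at line 123" would need a parser that records positions.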

A better solution, though involves a lot more dev time, is to make an editor for them. It's on the fly validation as you can have the editor demand input where input is required, generate required child nodes and so on, so your user doesn't get lost in the weeds on your data structures. Also they don't have to worry about syntax or any of that meta data and instead just fill in the pertinent stuff. You could do this in a native app or make a page that builds out form elements with js/jquery/etc, submits to your server, and you feed them back the resulting xml. Note: I don't know how much use those xml documents will see from your userbase, though, so putting in the time to make a tool like this may not be an efficient sub-project to work on if you don't anticipate a lot of user generated documents. If 1 in 10 of your dedicated userbase will ever seriously dabble in it, maybe it's best to provide clear documentation & examples and leave it at that.

Bhaal fucked around with this message at 20:29 on May 1, 2012

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


Suspicious Dish posted:

What if you're writing an XML parser for compatibility with a broken XML parser in a commercial product?

We have a vendor application that uses some lovely old encoding format in all its xml style sheets so anything that isn't part of whatever character set was hip back in 1980 or whenever (such as "foreign hyphens") causes it to vomit out a stream of random characters converted from raw hex all over the place. What was the vendor's response when we told them that they needed to change it to support UTF-8 because we're an international firm and get all sorts of crazy poo poo entered into our systems? "The XML is working as intended, if you have a problem with it, we suggest you take it up with Microsoft".

:stare:

Flobbster
Feb 17, 2005

"Cadet Kirk, after the way you cheated on the Kobayashi Maru test I oughta punch you in tha face!"
Back when I was a lowly undergrad, I spent a semester maintaining and adding some features to our crufty old automated grading system that we used in some of our intro C++ courses. One of the things I dealt with was the result reports that get generated for students and instructors.

Not knowing any better at the time and wanting to make use of "hip new technologies", I had the servlet generate the report in XML format that was then converted to HTML using an XSLT processor when the user requested it :gonk:

My shame...

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

Munkeymon posted:

The best part is (well, was) the guy with 22k reputation who comes in and says "I don't see any injection here nosiree" and then taking down his comment and question after you pointed out he doesn't know PHP for poo poo. It'd be kind of interesting if SO only displayed your reputation garnered for the tags relevant to the question you were answering.

The worst part is that I kind of envy his blissful ignorance :smith:
Stack overflow's reputation system is completely broken.

I started on SO during beta, and continued until a month or two after (when it had been overrun with horrible people). Despite having not seriously used it in almost three years, my reputation continues to go up, I keep getting all sorts of badges, and I've only dropped from page one to page seven on the user list.

My single top-scoring answer of all time is the difference between self and $this in PHP.

My second highest is how to cancel a setInterval call in Javascript.

Basically, StackOverflow's reputation is merely a metric of how common the questions are that you answer. If you spend an hour researching the answer to a thoughtful and interesting question, expect to get ~maybe~ 30 karma out of it. But if you rush for every "help how do I codes" question and drop a vaguely-related solution, you'll be rolling in karma forever.

tef
May 30, 2004

-> some l-system crap ->

Janin posted:

Stack overflow's reputation system is completely broken.

If you meet someone with a high stack overflow reputation, chances are they'd make a great copy editor. It is a game that rewards those who can cannibalise other people's answers in a way that garners approval from the idiot who can't google.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Fren posted:

Douglas Crockford's shtick is really getting old.

Yep. Go look at his commit messages. They're all terrible. I shouldn't have to explain what commit messages are for to someone like Douglas Crockford: they're there to summarize a change, and more importantly, detail why it changed. In this commit, what happened? Did JSLint crash? Did it give a false positive warning? Maybe it identified a real error, but botched up its error reporting output?

Now go look at the license. Despite evidence to the contrary, it is not an open-source license, as it contains the clause "This software should be used for good, not evil", and Crockford has the exclusive privilege of deciding who is good and who is evil.

Crockford doesn't accept changes from others, instead opting to duplicate work in his own name. He has never merged a single pull request from others. Take this pull request for example. He closed the request, said "Thanks", and then fixed it on his own along with a ton of other changes in a commit so big GitHub won't show a diff online, with the helpful commit message "comments". (You can compare the two files at the commit vs. the parent, or generate a diff offline to verify this is the case)

tef
May 30, 2004

-> some l-system crap ->
Read the JSON rfc sometime, and see if you don't go wat.

quote:

To escape an extended character that is not in the Basic Multilingual
Plane, the character is represented as a twelve-character sequence,
encoding the UTF-16 surrogate pair. So, for example, a string
containing only the G clef character (U+1D11E) may be represented as
"\uD834\uDD1E".
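The arithmetic behind that twelve-character sequence can be checked in a few lines of Python. The variable names are just for illustration; the `json` calls are the standard library's.

```python
import json

g_clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the Basic Multilingual Plane

# Work out the UTF-16 surrogate pair by hand.
offset = ord(g_clef) - 0x10000
high = 0xD800 + (offset >> 10)    # top 10 bits go in the high surrogate
low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go in the low surrogate
escaped = f"\\u{high:04X}\\u{low:04X}"

# The decoder stitches the pair back into a single code point.
decoded = json.loads(f'"{escaped}"')

# With the default ensure_ascii=True the encoder emits the same pair.
encoded = json.dumps(g_clef)
```

So one character goes in and two escapes come out, because the escape syntax only has room for four hex digits.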
