Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe
TopologicalSorterNonSquareMatrixException

:catstare:

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

Biowarfare posted:

Today I saw a MySQL database for an old internal webapp. It has one column that contains base64-encoded XML for data.

I got one up on that:

A CLOB column with an XML document inside that gave instructions to the Java app that selected that data to use reflection to load the specified class with the specified constructor arguments, which were always HttpServletRequest and something else. The moment I saw that I immediately ripped it out and claimed it "stopped working".

NotShadowStar
Sep 20, 2000

Biowarfare posted:

Today I saw a MySQL database for an old internal webapp. It has one column that contains base64-encoded XML for data.

And I'd put money down that if you showed the person who designed that horror NoSQL document-style databases, they'd dismiss them outright for some stupid reason.

xarph
Jun 18, 2001



brb changing every password ever.

POKEMAN SAM
Jul 8, 2004
From the IronPython mailing lists:

quote:

I have a large RE (223613 chars) that works fine in CPython 2.6, but
seems to produce an endless loop in IronPython (see below). I'm using
Mono 2.10 (.NET 4.0.x) on Ubuntu, with IronPython 2.7. Anyone have
pointers to the differences between them? Is
System::Text::RegularExpressions in .NET configurable in some fashion
that might help?

I'm a .NET newbie.

TIA,

Bill

code:
import sys, os, re

try:
    # we use the name lists in nltk to create person-name matching patterns
    import nltk.data
except ImportError:
    sys.stderr.write("Can't import nltk; can't do name lists.\nSee http://www.nltk.org/.\n")
    sys.exit(1)
else:
    __MALE_NAME_EXCLUDES = ("Hill",
                          "Ave",
                          )
    __FEMALE_NAME_EXCLUDES = ()
    __FEMALE_NAMES = [x for x in
                      nltk.data.load("corpora/names/female.txt", format="raw").split("\n")
                      if (x and (x not in __FEMALE_NAME_EXCLUDES))]
    __FEMALE_NAMES += [x.upper() for x in __FEMALE_NAMES]
    __MALE_NAMES = [x for x in
                    nltk.data.load("corpora/names/male.txt", format="raw").split("\n")
                    if (x and (x not in __MALE_NAME_EXCLUDES))]
    __MALE_NAMES += [x.upper() for x in __MALE_NAMES]
    __INITS = [chr(x) for x in range(ord('A'), ord('Z'))]

PERSON_PATTERN = re.compile(
    "^((?P<honorific>Mr|Ms|Mrs|Dr|MR|MS|MRS|DR)\.? )?"         # honorific
    "(?P<firstname>" +
    "|".join(__FEMALE_NAMES + __MALE_NAMES + __INITS) + # first name
    ")"
    "( (?P<middlename>([A-Z]\.)|(" +
    "|".join(__FEMALE_NAMES + __MALE_NAMES) +         # middle initial or name
    ")))?"
    " +(?P<lastname>[A-Z][A-Za-z]+)",             # space then last name
    re.MULTILINE)

print PERSON_PATTERN.match("Mr. John Smith")

Opinion Haver
Apr 9, 2007

So wait, he's assuming that every person is going to have a name in his corpus? That seems like a really bad assumption.

POKEMAN SAM
Jul 8, 2004

yaoi prophet posted:

So wait, he's assuming that every person is going to have a name in his corpus? That seems like a really bad assumption.

He has replied, and I don't know if it excuses him:

quote:

I'm used to working with a full-featured
finite-state machine (PARC's xfst; see
http://www.cis.upenn.edu/~cis639/docs/xfst.html), and was wondering if
we could do similar things with Python's RE machinery. Long lists like
these names are often used for lists of companies or cities or such.
People's names are actually a fairly simple and short example of this :-).
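
For what it's worth, the usual way around a six-figure alternation is to let a small regex match the shape of the name and check the captured words against a set. A minimal sketch of that idea, not the poster's code: the three-name set is a hypothetical stand-in for the nltk name lists, and `match_person` is an illustrative name.

```python
import re
import string

# Hypothetical stand-in for the nltk name lists; the real corpora hold thousands of names.
NAMES = {"John", "Mary", "Bill"}
KNOWN_NAMES = NAMES | {n.upper() for n in NAMES}
KNOWN_FIRST = KNOWN_NAMES | set(string.ascii_uppercase)  # bare initials allowed as first name

# The regex only describes the shape; the name lists stay out of the pattern.
SHAPE = re.compile(
    r"^(?:(?P<honorific>Mr|Ms|Mrs|Dr|MR|MS|MRS|DR)\.? )?"
    r"(?P<firstname>[A-Za-z]+)"
    r"(?: (?P<middlename>[A-Z]\.|[A-Za-z]+))?"
    r" +(?P<lastname>[A-Z][A-Za-z]+)"
)

def match_person(text):
    """Return the captured name parts, or None if the text is not a known name."""
    m = SHAPE.match(text)
    if m is None:
        return None
    first, middle = m.group("firstname"), m.group("middlename")
    if first not in KNOWN_FIRST:
        return None
    # A middle part is either an initial such as "Q." or a name from the list.
    if middle is not None and not middle.endswith(".") and middle not in KNOWN_NAMES:
        return None
    return m.groupdict()

print(match_person("Mr. John Smith"))
```

Set membership is a hash lookup, so the cost no longer depends on how the regex engine copes with thousands of alternatives. It does nothing about the objection that real people have names outside the corpus.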

Malloc Voidstar
May 7, 2007

Fuck the cowboys. Unf. Fuck em hard.
http://schema.org/docs/gs.html
code:
<div>
 <h1>Avatar</h1>
 <span>Director: James Cameron (born August 16, 1954)</span>
 <span>Science fiction</span>
 <a href="../movies/avatar-theatrical-trailer.html">Trailer</a>
</div>
It's better to write this as:
code:
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
  Director: <span itemprop="name">James Cameron</span> (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
I didn't know SEO companies had lobbyists...

pseudorandom name
May 6, 2007

I don't see what's so bad about marking up data so that search engines can identify, extract and index it.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

pseudorandom name posted:

I don't see what's so bad about marking up data so that search engines can identify, extract and index it.

1. It's ugly.

2. That's the whole job of a bloody search engine!

Brecht
Nov 7, 2009

pokeyman posted:

1. It's ugly.
Not really. It's extra, meaningful data provided in a canonical, straightforward way.

quote:

2. That's the whole job of a bloody search engine!
If search engines can know the context of data, they get a lot better. I don't see why this is controversial.

iow the coding horror is you

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Brecht posted:

Not really. It's extra, meaningful data provided in a canonical, straightforward way.

I see HTML as markup that's meant to be almost as human-readable as computer-readable. To my mind, that's why manually editing XML is usually hopeless, and why browsers interpret HTML so leniently. "Extra, meaningful data provided in a canonical, straightforward way" can be, and in this case is, done in an ugly way.

quote:

If search engines can know the context of data, they get a lot better. I don't see why this is controversial.

Nowhere did I say otherwise, your controversy is imagined.

1337JiveTurkey
Feb 17, 2005

pokeyman posted:

1. It's ugly.
Then don't look at the page source. Or if you really need to, use some XML schema that describes your data in a structured manner and then transform it. It's similar to how CSS is used to separate stylistic elements from the underlying document structure. Instead it's separating the data's logical structure from the document's structure so you don't have spans, divs, tables and the like cluttering up the biographical information of a film director.

quote:

2. That's the whole job of a bloody search engine!
Removing semantically meaningful data is something that should only be done with a reason. If a website were to use PNGs instead of text, it'd be impossible for a search engine to index it in any meaningful way. Removing metadata which is in a standardized format makes it harder for search engines and other tools to use the remaining data effectively.

Malloc Voidstar
May 7, 2007

Fuck the cowboys. Unf. Fuck em hard.

1337JiveTurkey posted:

Then don't look at the page source.
writing code without looking at it is some seriously zen poo poo

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

1337JiveTurkey posted:

Instead it's separating the data's logical structure from the document's structure so you don't have spans, divs, tables and the like cluttering up the biographical information of a film director.

This admirable goal has nothing to do with the posted example, which in fact adds one div and one span that were not present in the original.

quote:

Removing semantically meaningful data is something that should only be done with a reason. ... Removing metadata which is in a standardized format makes it harder for search engines and other tools to use the remaining data effectively.

We're talking about a new specification that adds data. As of a month ago, precisely zero websites were using it. What data is being taken away here?

Just as data should be removed only with reason, so should data be added only with reason. There is such a thing as too much.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Aleksei Vasiliev posted:

writing code without looking at it is some seriously zen poo poo

I write like this:

code:
read -s && echo $REPLY | gcc -x c -

Zombywuf
Mar 29, 2008

pseudorandom name posted:

I don't see what's so bad about marking up data so that search engines can identify, extract and index it.

code:
<div itemtype="movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="description">
    <a href="http://shittypills/">Buy cheap viagra online!</a>
  </div>
</div>

Brecht
Nov 7, 2009

pokeyman posted:

Just as data should be removed only with reason, so should data be added only with reason. There is such a thing as too much.
So you're saying if I'm writing a movie website, and I know this H3 contains the director's name, signaling that context with a type="director" tag is too much information?

Milotic
Mar 4, 2009

9CL apologist
Slippery Tilde

pseudorandom name posted:

I don't see what's so bad about marking up data so that search engines can identify, extract and index it.

It's not just search engines. Enterprise / public sector is also interested in this stuff. You can glean, or attempt to glean a lot of useful information about inflation and prices by screen-scraping information off Tesco.com and the like. And then they change their layout, or add multiples/weights, or add offers, and it turns into a bit of an arms race until you finally say "gently caress it". schema.org and whatnot would be a godsend.
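
To make the scraping point concrete, here is a rough sketch of what a consumer could do with itemprop markup using nothing but the Python standard library. It is a toy under stated assumptions: `MicrodataText` and `extract_props` are made-up names, it only collects element text (real microdata takes the value of a link from its href, which this ignores), and it is nowhere near a conforming microdata parser.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no end tag

class MicrodataText(HTMLParser):
    """Collect (property path, text) pairs for elements carrying itemprop."""

    def __init__(self):
        super().__init__()
        self.open = []   # open elements: [tag, itemprop or None, is_scope, text parts]
        self.items = []  # results in the order the elements are closed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        a = dict(attrs)
        self.open.append([tag, a.get("itemprop"), "itemscope" in a, []])

    def handle_data(self, data):
        if self.open:
            self.open[-1][3].append(data)

    def handle_endtag(self, tag):
        if tag not in [entry[0] for entry in self.open]:
            return  # stray end tag, ignore it
        while self.open:
            name, prop, is_scope, parts = self.open.pop()
            if prop and not is_scope:
                # Prefix with enclosing scoped properties, e.g. "director.name".
                scopes = [e[1] for e in self.open if e[1] and e[2]]
                self.items.append((".".join(scopes + [prop]), "".join(parts).strip()))
            if name == tag:
                break

def extract_props(html):
    parser = MicrodataText()
    parser.feed(html)
    return parser.items

SAMPLE = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
  Director: <span itemprop="name">James Cameron</span>
  </div>
  <span itemprop="genre">Science fiction</span>
</div>
"""

for path, text in extract_props(SAMPLE):
    print(path, "=", text)
```

The point is that nothing in it depends on the page layout, only on the property names, so a redesign of the site would not break it the way it breaks a screen scraper.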

NotShadowStar
Sep 20, 2000
Every single time I've seen extended attributes added to HTML, it has ended up terribly.
For instance, Drupal is really into FOAF and RDF. So much that every image tag it spits out looks like

<img src="beecock.jpg" typeof="foaf:Image" alt="penis">

OF COURSE IT IS TYPEOF IMAGE YOU loving DOLTS, IT IS A loving IMG TAG fffffffffffffclownpensifart

pseudorandom name
May 6, 2007

Keep in mind that Google already supports three different incompatible syntaxes for this kind of markup, so if anything, schema.org is an improvement over the current situation.

It is even more compact than their existing microdata schema, by virtue of "schema.org" being shorter than "data-vocabulary.org".

1337JiveTurkey
Feb 17, 2005

Aleksei Vasiliev posted:

writing code without looking at it is some seriously zen poo poo

This is generated HTML 99 times out of 100 so apart from reviewing developed code for correctness people won't be looking at it anyhow.

1337JiveTurkey
Feb 17, 2005

pokeyman posted:

This admirable goal has nothing to do with the posted example, which in fact adds one div and one span that were not present in the original.
HTML doesn't do structured data very well and it shows. XML with client-side XSLT does much better in that regard since then you can just have an XML document.
code:
<movie>
  <name locale="us">Avatar</name>
  <genre>Science Fiction</genre>
  <director ref="../directors/jamescameron.xml"/>
  <trailer ref="../movies/avatartrailer.mpeg"/>
</movie>
It's not perfect but it's usable as the source for a styled website as well as a REST API.

quote:

We're talking about a new specification that adds data. As of a month ago, precisely zero websites were using it. What data is being taken away here?

Just as data should be removed only with reason, so should data be added only with reason. There is such a thing as too much.
If nobody makes it available then nobody will ever be able to use it. Google's adopted similar microdata formats in the past. That's why you see stuff popping up on their search results that's for sale from various online retailers.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Brecht posted:

So you're saying if I'm writing a movie website, and I know this H3 contains the director's name, signaling that context with a type="director" tag is too much information?

Not necessarily, I'm saying adding ten attributes, a div, and a span to signal that context is too much.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

1337JiveTurkey posted:

HTML doesn't do structured data very well and it shows. XML with client-side XSLT does much better in that regard since then you can just have an XML document.

HTML does documents very well, and I'm not sure it was ever seriously offered as a language to describe structured data, so why should it do that very well? As you point out, there are better tools for that job. Yet another attempt to bolt some attributes on to HTML and call it a data description language seems wrongheaded. And ugly.

<structured data store> + <store to html transformation> will generally be better than HTML for any values of the two variables, if your problem is storing and presenting structured data.
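
A small sketch of what I mean, assuming the movie XML posted earlier in the thread is the store. The function names are made up, and the HTML output is only the plain version from the schema.org page, not anyone's real template.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

# The structured store: the movie document posted earlier in the thread.
MOVIE_XML = """<movie>
  <name locale="us">Avatar</name>
  <genre>Science Fiction</genre>
  <director ref="../directors/jamescameron.xml"/>
  <trailer ref="../movies/avatartrailer.mpeg"/>
</movie>"""

def load_movie(xml_text):
    """Parse the store into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "genre": root.findtext("genre"),
        "director": root.find("director").get("ref"),
        "trailer": root.find("trailer").get("ref"),
    }

def to_html(movie):
    """Store-to-HTML transformation: presentation for humans."""
    return (
        "<div>\n"
        f" <h1>{escape(movie['name'])}</h1>\n"
        f" <span>{escape(movie['genre'])}</span>\n"
        f" <a href=\"{escape(movie['trailer'], quote=True)}\">Trailer</a>\n"
        "</div>"
    )

def to_json(movie):
    """Same store, machine-facing representation."""
    return json.dumps(movie, sort_keys=True)

movie = load_movie(MOVIE_XML)
print(to_html(movie))
print(to_json(movie))
```

The HTML stays as plain as the original example because the structure lives in the store, and anything that wants the data asks for the other representation.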

Bozart
Oct 28, 2006

Give me the finger.
gently caress XSLT, seriously.

Zhentar
Sep 28, 2003

Brilliant Master Genius

NotShadowStar posted:

OF COURSE IT IS TYPEOF IMAGE YOU loving DOLTS, IT IS A loving IMG TAG fffffffffffffclownpensifart

Funny you should mention that...

code:
<a href="bob.html" itemprop="url">Bob Smith</a>
P.S. XSLT is awesome

1337JiveTurkey
Feb 17, 2005

pokeyman posted:

HTML does documents very well, and I'm not sure it was ever seriously offered as a language to describe structured data, so why should it do that very well? As you point out, there are better tools for that job. Yet another attempt to bolt some attributes on to HTML and call it a data description language seems wrongheaded. And ugly.

<structured data store> + <store to html transformation> will generally be better than HTML for any values of the two variables, if your problem is storing and presenting structured data.

Really in that respect I couldn't agree with you more. HTML really does suck at structuring data in any meaningful manner. If there's any reason people use HTML with microformats at all, it's institutional inertia. My issue is that the second half of <structured data store> + <store to html transformation> should be pushed back as far as possible and the structured data should be available as long as possible. Microformats at least keep some structured data available longer and therefore more useful to tools other than plain old web browsers.

SiliconCow
Jul 14, 2001
I built a poor man's 'Splunk' with PHP that basically dumps form fields into shell_exec(/usr/bin/ssh) to shell into multiple application servers with root SSH keys, grab logs (or anything really), and stream back the aggregated output with optional regex highlighting/filters. Oh, and the front end uses a jQuery plugin to continuously poll the script every second so users can interactively change logs/servers/search parameters without even hitting submit or anything. Every second, for every user, I'm spawning like 20 processes across multiple machines.

:negative:

(don't worry - it's cool - I sanitize the user input so they can't tamperdata the request and tail /etc/shadow or something)
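
For the record, the less scary shape of this is an allowlist plus an argument list, so nothing the user types ever becomes part of a command string. A sketch in Python rather than the PHP I actually wrote; the host names, log paths and function names here are hypothetical.

```python
import subprocess

# Hypothetical allowlists; users pick a key, they never supply a host or path.
ALLOWED_HOSTS = {"app1.example.internal", "app2.example.internal"}
ALLOWED_LOGS = {
    "app": "/var/log/app/app.log",
    "access": "/var/log/app/access.log",
}

def build_tail_command(host, log_key, lines=50):
    """Build the ssh argv from validated choices only."""
    if host not in ALLOWED_HOSTS:
        raise ValueError("unknown host")
    if log_key not in ALLOWED_LOGS:
        raise ValueError("unknown log")
    lines = int(lines)
    if not 1 <= lines <= 1000:
        raise ValueError("line count out of range")
    # ssh joins the trailing arguments and hands them to the remote shell,
    # so it is the allowlist, not the argv list, that keeps user text out.
    return ["/usr/bin/ssh", "-o", "BatchMode=yes", host,
            "tail", "-n", str(lines), ALLOWED_LOGS[log_key]]

def tail_log(host, log_key, lines=50):
    """Run the command without a local shell and return its output."""
    cmd = build_tail_command(host, log_key, lines)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=10, check=True)
    return result.stdout

print(build_tail_command("app1.example.internal", "app", 20))
```

It still spawns a process per host per poll, so the every-second-for-every-user part of the horror is untouched.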

SiliconCow fucked around with this message at 09:19 on Jun 5, 2011

Zombywuf
Mar 29, 2008

1337JiveTurkey posted:

Really in that respect I couldn't agree with you more. HTML really does suck at structuring data in any meaningful manner.

HTML has lists, sets and dictionaries. It also has a bunch of other stuff like semantically marked up hyperlinks, what more could you need?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Zombywuf posted:

HTML has lists, sets and dictionaries. It also has a bunch of other stuff like semantically marked up hyperlinks, what more could you need?

Any setup where the structure of data is inherently tied up and munged in with its presentation is pretty sucky. It basically means you have to choose between presenting well-structured data for the benefit of automated systems, or well-presented data for the benefit of humans looking at it, and it's a right pain in the rear end to get both.

Zombywuf
Mar 29, 2008

Jabor posted:

Any setup where the structure of data is inherently tied up and munged in with its presentation is pretty sucky. It basically means you have to choose between presenting well-structured data for the benefit of automated systems, or well-presented data for the benefit of humans looking at it, and it's a right pain in the rear end to get both.

CSS is for humans.

NotShadowStar
Sep 20, 2000
Or since it is a huge pain in the rear end to have text markup also provide the structure and hints to the data in the same document, you could, ya know, provide an alternate method of retrieving documents that's more suited to data processing. I believe kids call them 'APIs'

haveblue
Aug 15, 2005



Toilet Rascal

NotShadowStar posted:

Or since it is a huge pain in the rear end to have text markup also provide the structure and hints to the data in the same document, you could, ya know, provide an alternate method of retrieving documents that's more suited to data processing. I believe kids call them 'APIs'

Search engines are a very bad place for that. Any separation between the human- and machine-targeted representation will be exploited by SEO crooks, so they have to read the exact same bytes as a human would at all times. That's the entire driving force behind these embedding systems.

Zombywuf
Mar 29, 2008

NotShadowStar posted:

Or since it is a huge pain in the rear end to have text markup also provide the structure and hints to the data in the same document, you could, ya know, provide an alternate method of retrieving documents that's more suited to data processing. I believe kids call them 'APIs'

I've grabbed data from web scraping non-semantically-marked up content and I've grabbed the same data from APIs. The web scraping has nearly always been easier.

tef
May 30, 2004

-> some l-system crap ->

Zombywuf posted:

I've grabbed data from web scraping non-semantically-marked up content and I've grabbed the same data from APIs. The web scraping has nearly always been easier.

This is normally because the human facing content is actually checked and the third party api is not used internally.

Also, 'REST APIs' are often less RESTful (i.e. HATEOAS) than the website. :v:

Zombywuf
Mar 29, 2008

tef posted:

This is normally because the human facing content is actually checked and the third party api is not used internally.

Also, 'REST APIs' are often less RESTful (i.e. HATEOAS) than the website. :v:

Yup, it's also because a website is designed to give you information and an API is usually designed to restrict access to it.

Room Temperature
Oct 16, 2007

mildly amusing
Here, have a makefile I wrote about a year ago: http://pastebin.com/CLrVrday

I reopened it today because I wanted to see if I could re-use some of the concepts :v:. In my defense, it was for a school project, and I was the only one who was going to be maintaining it for the 4 months it was in use, so I took it as an opportunity to learn all about make. It was also an exercise in creating a non-recursive makefile that could support modules, automatically generate dependency files, and basically automate all the poo poo I needed to do between the dev/test cycles. It was hella fun, and once written it actually made the build process quite convenient. On the other hand, by now I've forgotten more than half the black magic that makes it work, and I die a little inside every time I look at it.

I think my favorite line is
code:
$(1) : %$(2) : $$$$(subst /$(BUILDDIR)/,/,$$$$*)$(3) depend.sh $(MAKEOPT) Makefile
if only because of the doubly escaped '$'s.

Zombywuf
Mar 29, 2008

Just fixed the craziest bug, we were getting a web page failing consistently on different browsers (worked in IE, not in FF or Chrome) across multiple servers. It only manifested in the production environment and not in the test environment and centred around this overcomplicated representation of the numbers from 1 to 5:
code:
class Base:
  def __lt__(self, other):
    return self.id < other.id

  def __eq__(self, other):
    return self.id == other.id

  ...

class A(Base):
  def __init__(self):
    self.id = 1

class B(Base):
  def __init__(self):
    self.id = 2

...
def factory(id):
  return [A, B, C, D, E][id]()
Yes really, these classes had no responsibilities other than representing ordinal values from 1 to 5. The problem came when you needed to compare the object to an instance of another class (presumably to store in a collection): .id wouldn't exist in the other class. So the following "fix" was implemented (presumably last thing on a Friday):
code:
class Base:
  def __lt__(self, other):
    if self.__class__ != other.__class__:
      return self.__class__ < other.__class__
    return self.id < other.id

  ...
Clearly the developer had not considered that __class__ will not be the base class but the child class, so the != test would always succeed when doing (for example) A() < B(). Miraculously though, this worked! Most of the time.

Solution spoilered for those who want to work it out :-)

It seems that when comparing two objects that do not define __lt__ (here, the two class objects), Python compares the pointers to those objects, which is a completely reasonable arbitrary order. It also seems that class objects will usually get assigned pointers in ascending order corresponding to their declaration order. However, minor changes in the memory layout due to, for example, the length of the User-Agent string received from the browser can cause effectively random pointers to be assigned. Hence, consistent failures across browsers.
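
For comparison, a sketch of how the same five ordinals could be ordered without leaning on memory addresses. This is not what the codebase does, just one way to do it in Python 3: `Ordinal` is a made-up name, one class replaces the five, and comparing against a foreign type returns NotImplemented so Python raises TypeError instead of inventing an order.

```python
from functools import total_ordering

@total_ordering
class Ordinal:
    """A value from 1 to 5 that orders by its number and nothing else."""

    def __init__(self, value):
        if value not in range(1, 6):
            raise ValueError("ordinal must be between 1 and 5")
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Ordinal):
            return NotImplemented  # let Python decide; never guess an answer
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, Ordinal):
            return NotImplemented  # ordering against other types is an error
        return self.value < other.value

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore one for dicts and sets.
        return hash(self.value)

def factory(value):
    return Ordinal(value)

print(sorted(o.value for o in [factory(5), factory(1), factory(3)]))
```

With this, A() < B() style comparisons depend only on the stored number, so the result cannot change with the length of a header.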

It's been a fun morning. :2bong:

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope
Our code is full of stuff like this (we use MooTools):

code:
function makeFart( className ) {
    var fart = $('fart');
    
    var negativeClassName = 'not-' + className;

    if( fart.hasClass( negativeClassName ) ) {
        fart.toggleClass(negativeClassName);
    }

    if( !fart.hasClass( className ) ) {
        fart.toggleClass(className);
    }
}
Why use .toggleClass()? Why not just use .removeClass() and .addClass()?

I know this in itself is a small horror, but these things add up.

Why the hell not:
There is also a shitton of functions that create side-effects for no reason, commented out code (checked in version control), outdated or incomplete comments, global variables, eval() and much more.
So poo poo like this:
code:
function doSomething( options ){
    if( !options.color ){
         options.color = globalOptions.defaultColor;
    }
    var opts = options;

    var color = opts.color;
    //var color = "brown";

    makeFart(opts.color);
}
