trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

Look Around You posted:

It's also only a 1-pass compiler, which I only found out after asking for help in the lisp thread because there's pretty much no documentation of it only being 1-pass at all. Also a functional language that flips poo poo because functions aren't defined in the right order is loving stupid and very counter-productive.

If you wanna see the reason: http://news.ycombinator.com/item?id=2467359


Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

The whole thread ought to see this IMO, Rich Hickey is cool as heck

Rich Hickey posted:

The issue is not single-pass vs multi-pass. It is instead, what constitutes a compilation unit, i.e., a pass over what?
Clojure, like many Lisps before it, does not have a strong notion of a compilation unit. Lisps were designed to receive a set of interactions/forms via a REPL, not to compile files/modules/programs etc. This means you can build up a Lisp program interactively in very small pieces, switching between namespaces as you go, etc. It is a very valuable part of the Lisp programming experience. It implies that you can stream fragments of Lisp programs as small as a single form over sockets, and have them be compiled and evaluated as they arrive. It implies that you can define a macro and immediately have the compiler incorporate it in the compilation of the next form, or evaluate some small section of an otherwise broken file. Etc, etc. That "joke from the 1980's" still has legs, and can enable things large-unit/multi-unit compilers cannot. FWIW, Clojure's compiler is two-pass, but the units are tiny (top-level forms).

What Yegge is really asking for is multi-unit (and larger unit) compilation for circular reference, whereby one unit can refer to another, and vice versa, and the compilation of both units will leave hanging some references that can only be resolved after consideration of the other, and tying things together in a subsequent 'pass'. What would constitute such a unit in Clojure? Should Clojure start requiring files and defining semantics for them? (it does not now)

Forward reference need not require multi-pass nor compilation units. Common Lisp allows references to undeclared and undefined things, and generates runtime errors should they not be defined by then. Clojure could have taken the same approach. The tradeoffs with that are as follows:

1) less help at compilation time
2) interning clashes

While #1 is arguably the fundamental dynamic language tradeoff, there is no doubt that this checking is convenient and useful. Clojure supports 'declare' so you are not forced to define your functions in any particular order.

#2 is the devil in the details. Clojure, like Common Lisp, is designed to be compiled, and does not in general look things up by name at runtime. (You can of course design fast languages that look things up, as do good Smalltalk implementations, but remember these languages focus on dealing with dictionary-carrying objects, Lisps do not). So, both Clojure and CL reify names into things whose addresses can be bound in the compiled code (symbols for CL, vars for Clojure). These reified things are 'interned', such that any reference to the same name refers to the same object, and thus compilation can proceed referring to things whose values are not yet defined.

But, what should happen here, when the compiler has never before seen bar?

(defn foo [] (bar))

or in CL:

(defun foo () (bar))


CL happily compiles it, and if bar is never defined, a runtime error will occur. Ok, but, what reified thing (symbol) did it use for bar during compilation? The symbol it interned when the form was read. So, what happens when you get the runtime error and realize that bar is defined in another package you forgot to import. You try to import other-package and, BAM!, another error - conflict, other-package:bar conflicts with read-in-package:bar. Then you go learn about uninterning.
In Clojure, the form doesn't compile, you get a message, and no var is interned for bar. You require other-namespace and continue.

I vastly prefer this experience, and so made these tradeoffs. Many other benefits came about from using a non-interning reader, and interning only on definition/declaration. I'm not inclined to give them up, nor the benefits mentioned earlier, in order to support circular reference.

Rich
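Tradeoff #1, runtime errors instead of compile-time checks, is easy to see in Python, which roughly takes the Common Lisp approach Rich describes: names in a function body are looked up when the function runs, not when it is compiled. A minimal sketch:

```python
# Forward reference works: bar only has to exist by the time foo is called.
def foo():
    return bar()

def bar():
    return 42

result = foo()  # bar is resolved in the module globals at call time

# A reference to something never defined still "compiles" fine...
def baz():
    return never_defined()

# ...and only blows up at runtime, CL-style, with a NameError.
try:
    baz()
    raised = False
except NameError:
    raised = True
```

This is exactly the "less help at compilation time" half of the bargain: you get definition-order freedom, and in exchange the typo in `baz` survives until someone actually calls it.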

Zombywuf
Mar 29, 2008

Using Python feels like traveling backwards in time http://bugs.python.org/issue6625

A bug report from 2010 posted:

pydoc fails with a UnicodeEncodeError for properly specified Unicode
docstrings (u"""...""") on the command line interface.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Zombywuf posted:

How does Python dump non-ASCII characters to the console (and they are non-ASCII because it doesn't bother looking at your locale settings unless you tell it to)? Like this: '\xff'. If I want to do that I have to register my own error handler; note that this is literally installing an error handler into the runtime, not just passing an error handling function to the encode method.

Python code:
>>> u'\xff'.encode('ascii', 'backslashescape')
'\\xff'

Zombywuf
Mar 29, 2008

Suspicious Dish posted:

Python code:
>>> u'\xff'.encode('ascii', 'backslashescape')
'\\xff'

code:
/usr/share/doc/python$ grep -ir backslashescape .
/usr/share/doc/python$ 

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Gah. It's backslashreplace. See this section for details.
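For the record, here is the corrected spelling in action (shown in Python 3 syntax, where all string literals are unicode):

```python
# 'backslashreplace' is a built-in error handler: unencodable characters
# are replaced with backslash escapes instead of raising.
encoded = '\xff'.encode('ascii', 'backslashreplace')

# The default handler is 'strict', which raises instead.
strict_fails = False
try:
    '\xff'.encode('ascii')
except UnicodeEncodeError:
    strict_fails = True
```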

tef
May 30, 2004

-> some l-system crap ->
str.encode([encoding[, errors]])
Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error(), see section Codec Base Classes. For a list of possible encodings, see section Standard Encodings.

New in version 2.0.

Changed in version 2.3: Support for 'xmlcharrefreplace' and 'backslashreplace' and other error handling schemes added.

Changed in version 2.7: Support for keyword arguments added.


is this coding horrors or zombywuf is too lazy to read the docs horrors

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
I just had to do this and honestly, what were the PHP devs thinking when they made array sorts operate in place?

PHP code:
uasort($aReport, 'SortEmployeeArray');
return $aReport;
It always grates on me that the sort functions don't return a sorted copy of the array. All they return is a boolean: true on success, false on failure. Looks like if you pass in a non-array, that counts as failure. Except... what about PHP's loose typing? And the fact that the prototype specifies array, and usually (but not always!) PHP will cast your variable to what the prototype expects?

PHP code:
$x = 'foobar';
var_dump(asort($x)); // Returns bool(false)

$x = (array)'foobar';
var_dump(asort($x)); // Returns bool(true)
And of course, since the sorting functions operate by reference, you can't run 'asort(array('a', 'c', 'b'))' because, well, it's sorting in place.

This probably falls under that PHP fractal of fail article but some things can't be restated enough.
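For contrast, Python offers both styles side by side, which makes the tradeoff Golbez is complaining about explicit:

```python
data = ['a', 'c', 'b']

# sorted() returns a new sorted list and leaves its argument alone,
# so you can `return sorted(report)` directly...
copy = sorted(data)
unchanged = (data == ['a', 'c', 'b'])

# ...while list.sort() is the PHP-style in-place version: it mutates
# the list and returns None rather than the sorted result.
result = data.sort()
```

The `return None` from the in-place version is deliberate: it stops you from writing `x = data.sort()` and silently getting nothing back while thinking you made a copy.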

Zombywuf
Mar 29, 2008

tef posted:

is this coding horrors or zombywuf is too lazy to read the docs horrors

Ah of course, the online docs differ wildly from the generated docs.

The only thing worse than no documentation is incorrect documentation.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
New to me, if not new to the world...

cmd.exe posted:

code:
>copy nul hi.txt
        1 file(s) copied.

>copy nul hi.txtcomeon
        1 file(s) copied.

>dir /b hi*.txt
hi.txt
hi.txtcomeon

>dir /b hi*.tx
File Not Found

And a little bonus (although this one sort of makes sense):

cmd.exe posted:

code:
>dir /b *1.txt
hi.txtcomeon

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

pokeyman posted:

New to me, if not new to the world...


And a little bonus (although this one sort of makes sense):

I'm guessing this is on a modern Windows system? So that hi.txtcomeon is probably internally represented as hi~1.txt? That's my only guess...

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Oh, I get it now. I knew the second one matched because it was searching short filenames, but extending that to the first one didn't occur to me.

ToxicFrog
Apr 26, 2008


tef posted:

str.encode([encoding[, errors]])
Return an encoded version of the string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace' and any other name registered via codecs.register_error(), see section Codec Base Classes. For a list of possible encodings, see section Standard Encodings.

Python has first-class functions, why are you expected to pass in a string describing the predefined error handler you want rather than the actual error handler :psyduck:
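To be fair to Python, the string is only a key into a registry; `codecs.register_error` does take an actual function, it just forces you to name it globally first, which is the part being objected to. A sketch in Python 3 syntax (the handler name 'angry' is made up for the example):

```python
import codecs

def angry(exc):
    # An encode error handler receives the UnicodeEncodeError and returns
    # (replacement string, position at which to resume encoding).
    return ('?', exc.end)

# The global registry step that a plain callback argument would avoid.
codecs.register_error('angry', angry)

out = 'na\xefve'.encode('ascii', 'angry')  # 'naïve' with the ï replaced
```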

pokeyman posted:

Oh, I get it now. I know the second one matched because searching short filenames, but extending that to the first one didn't occur to me.

The entire windows command line is a horror. I dare you to figure out consistent rules for wildcard expansion and quoting in that fucker. There aren't any because that isn't always handled by the shell but by the program you're invoking.

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

ToxicFrog posted:

The entire windows command line is a horror. I dare you to figure out consistent rules for wildcard expansion and quoting in that fucker. There aren't any because that isn't always handled by the shell but by the program you're invoking.

Close, but the truth is worse. I'm having a hard time finding a definitive source for this, but everything I've read says that wildcard expansion is handled by the filesystem driver :stonk:

EDIT: This might count as definitive: the FindFirstFile Win32 API function can take a wildcard as its argument. You're correct that programs are free to interpret wildcards themselves (and I bet a depressing number of programs do), but there is a common function available.

Lysidas fucked around with this message at 19:59 on Jul 20, 2012
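Python's `fnmatch` module plays a similar role to that common function: one shared implementation of the wildcard rules that programs can call instead of rolling their own. It matches against the names you give it only, which is a way to see why cmd.exe's short-name matching above is surprising:

```python
import fnmatch

names = ['hi.txt', 'hi.txtcomeon']

# One shared implementation of the pattern rules, applied to the
# long names only: no 8.3 short-name aliases get consulted here.
matched = fnmatch.filter(names, 'hi*.txt')
```

Against the long names alone, only `hi.txt` matches; cmd.exe returned both because the short name of the second file also fit the pattern.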

qntm
Jun 17, 2009

ToxicFrog posted:

The entire windows command line is a horror. I dare you to figure out consistent rules for wildcard expansion and quoting in that fucker. There aren't any because that isn't always handled by the shell but by the program you're invoking.

I still haven't figured out how to call a program called %PATH%.exe at the Windows command line.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Prefix each % with ^.

PrBacterio
Jul 19, 2000

Lysidas posted:

Close, but the truth is worse. I'm having a hard time finding a definitive source for this, but everything I've read says that wildcard expansion is handled by the filesystem driver :stonk:

EDIT: This might count as definitive: the FindFirstFile Win32 API function can take a wildcard as its argument. You're correct that programs are free to interpret wildcards themselves (and I bet a depressing number of programs do), but there is a common function available.
That's far from a horror. FindFirstFile/FindNextFile are just APIs offered by the operating system to handle wildcard expansion, which is precisely as it ought to be, and not part of the file system driver. Actually, the way this is done on Unix-like systems, with the expansion handled by a command shell which then passes entire lists of files on to individual commands, is a horror if you ask me.
Which is not to say that the Windows command prompt shell is not a horror, because it is.

Gazpacho
Jun 18, 2004

by Fluffdaddy
Slippery Tilde
The command prompt has to do wildcards that way to maintain compatibility with DOS, which had to maintain compatibility with CP/M, in which each command implemented its own file pattern rules. In CP/M it was possible for a filename to be valid to one command and invalid to another. :byodood: DOS provided the wildcard expansion APIs as a way to standardize the patterns but expanding them up front the way Unix does wasn't a feasible option.

ToxicFrog
Apr 26, 2008


PrBacterio posted:

Actually, the way this is done on Unix-like systems, with the expansion handled by a command shell which then passes entire lists of files on to individual commands, is a horror if you ask me.

Why? It ensures that - no matter what filesystem you're using, no matter what program you're running - you have a consistent mechanism for wildcard expansion, which is documented, and configured, in one place only.

If you leave it up to individual programs, you end up with different expansion behaviour for each command, each with its own documentation and configuration mechanism (if it can be configured at all). If you leave it up to the filesystem, suddenly rm behaves differently depending on whether you're using it on a local filesystem, a USB key, or a network mount.

Leaving it up to the shell, which any given user will be switching around far less frequently than program and filesystem - if at all - sounds like the only reasonable way to do it.

(I mean, yes, in principle you can have a standard library for wildcard expansion that all programs then use, which is what windows does; the problem is that in practice, not all of them use it, or they use it differently, and you end up with the clusterfuck that is the windows command line.)
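The Unix version of "a standard library for wildcard expansion" exists too, for programs that want expansion without involving a shell; Python's `glob` wraps it. A minimal sketch using a throwaway temporary directory:

```python
import glob
import os
import tempfile

# Set up a scratch directory with a few files to match against.
d = tempfile.mkdtemp()
for name in ('foo.txt', 'bar.txt', 'baz.log'):
    open(os.path.join(d, name), 'w').close()

# The same expansion rules the shell would apply, invoked as a library call.
matches = sorted(os.path.basename(p)
                 for p in glob.glob(os.path.join(d, '*.txt')))
```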

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

ToxicFrog posted:

Why? It ensures that - no matter what filesystem you're using, no matter what program you're running - you have a consistent mechanism for wildcard expansion, which is documented, and configured, in one place only.

The only reason I can think of is that you can exhaust the wildcard buffer; try rm'ing a directory with a few million files in it (which is also a horror in and of itself)

Gazpacho
Jun 18, 2004

by Fluffdaddy
Slippery Tilde

ToxicFrog posted:

If you leave it up to individual programs, you end up with different expansion behaviour for each command, each with its own documentation and configuration mechanism (if it can be configured at all).
DOS didn't just leave it up to the programs. It also set global rules on what characters could be valid in a filename, something that CP/M never did. You could not create filenames with * or ?. So yeah, maybe there were stupid programmers who didn't call the APIs right in front of their faces and made their own. I think we've all come across someone like that. But the valid filename rules limited the damage they could do.

ToxicFrog posted:

If you leave it up the filesystem, suddenly rm behaves differently depending on whether you're using it on a local filesystem, a USB key, or a network mount.
I would need to see some evidence that expansion is handled by the file system driver and not by the Windows file API. Microsoft documents the behavior of the wildcard patterns, something they wouldn't be able to do if expansion depended on the device.

Gazpacho fucked around with this message at 22:35 on Jul 20, 2012

raminasi
Jan 25, 2005

a last drink with no ice

trex eaterofcadrs posted:

The only reason I can think of is that you can exhaust the wildcard buffer; try rm'ing a directory with a few million files in it (which is also a horror in and of itself)

I've heard an argument that goes something like "shell wildcard expansion means that, for example, rm can't know when it's gotten a * so it can double-extra-verify that you want to do that." I don't know if it's a terribly compelling argument, but it's there.

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.

GrumpyDoctor posted:

I've heard an argument that goes something like "shell wildcard expansion means that, for example, rm can't know when it's gotten a * so it can double-extra-verify that you want to do that." I don't know if it's a terribly compelling argument, but it's there.

Both bash and zsh can intercept that command and prompt for input. I'm not a shell commando but I think bash uses an alias (rm -i) and zsh has some function hook.

raminasi
Jan 25, 2005

a last drink with no ice
Well rm is just an example. The general form is "it's easier for any given program to know whether using it with a wildcard is dangerous than it is for the shell to know that."

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

ToxicFrog posted:

Why? It ensures that - no matter what filesystem you're using, no matter what program you're running - you have a consistent mechanism for wildcard expansion, which is documented, and configured, in one place only.

I can put a file named ./-l in the current directory, and then when I do "ls *", it will be expanded to "ls -l", and coreutils will interpret that as a flag. That's insane. I can't guarantee anything about the command "ls *".

Now think that the entire UNIX philosophy is built around command concatenation and substitution. Shell injection attacks are up there with SQL injection attacks.
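Part of why the injection comparison sticks: any string handed to a shell gets re-parsed. Passing arguments as a list, the way Python's `subprocess` does without `shell=True`, skips that re-parse entirely, so a hostile string arrives as one literal argument (it can still be mistaken for an option by the program itself, which is the `./-l` problem, but it can't become extra commands):

```python
import subprocess

hostile = '*; rm -rf /'  # delivered as data, not as a command fragment

# No shell is involved: nothing expands the * or splits on the ;.
# echo just receives the string as its single argument and prints it.
out = subprocess.run(['echo', hostile], capture_output=True, text=True)
```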

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Suspicious Dish posted:

I can put a file named ./-l in the current directory, and then when I do "ls *", it will be expanded to "ls -l", and coreutils will interpret that as a flag.

:monocle: Holy poo poo.

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip
There's an old trick based off this where you put a file named -i in a directory to prevent yourself from accidentally rm -rfing it

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Don't know if this is laugh, cry, or just a disappointed, despairing sigh:
http://www.theregister.co.uk/2012/07/20/big_boobs_in_linux/

pseudorandom name
May 6, 2007

The best part is that Microsoft can't change the constant because they've already deployed it in Azure.

Their solution is to change it from hex to decimal and hope nobody notices their sexism.

ToxicFrog
Apr 26, 2008


GrumpyDoctor posted:

Well rm is just an example. The general form is "it's easier for any given program to know whether using it with a wildcard is dangerous than it is for the shell to know that."

I trust programs to reliably warn me that I'm about to do something dangerous just as much as I trust them to use a consistent wildcard expansion mechanism when left to their own devices, which is to say, not at all. Given that I'm going to have to double-check everything anyways, I'd rather have the shell handle wildcard expansion and only have to worry about one set of rules and configuration interface for it.

I'm not saying that shell-side wildcard expansion is without problems. Just that it's not as bad as the alternatives.

Golbez posted:

:monocle: Holy poo poo.

Otto Skorzeny posted:

There's an old trick based off this where you put a file named -i in a directory to prevent yourself from accidentally rm -rfing it

Of course, this only works if you're rm'ing * and not .:

code:
$ touch foo bar baz ./-l
$ ls *
-rw-r--r-- 1 ben ben 0 Jul 20 19:19 bar
-rw-r--r-- 1 ben ben 0 Jul 20 19:19 baz
-rw-r--r-- 1 ben ben 0 Jul 20 19:19 foo
$ ls .
bar  baz  foo  -l
$ ls -- *
bar  baz  foo  -l
$ ls ./*
./bar  ./baz  ./foo  ./-l

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Otto Skorzeny posted:

There's an old trick based off this where you put a file named -i in a directory to prevent yourself from accidentally rm -rfing it

The fact that people think that's a feature and not a bug is just astonishing to me.

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

yaoi prophet posted:

gently caress whoever decided that a protobuf with a field set to its default value shouldn't equal a protobuf with that field unset. gently caress them hard.
Technically, they're not equal, since a protobuf with an unset field might be incomplete.

Zamujasa
Oct 27, 2010



Bread Liar
New "policy" at work. We use Git now. Boss has no idea how this works, since I guess he wanted something magic that checks out only one file at any given time and then promptly reuploads it, warning anybody who tries to edit that file that someone else is using it. (Thinking about it, this sounds remarkably like Word documents.)

After trying to explain how it works, having him go through his first commit, push, and resync...

[image: his commit log]

Each of those commits changed one or two lines (or in a rare case, three!) in the same exact file each time.

His cycles are so fast that I was joking to a coworker that by the time you managed to successfully get your own commit ready, fetching the most recent changes, he'd already have pushed another one.
quote:

*nix and wildcard expansion poo poo :stare:

Outside of quoting all filename arguments, is there even anything you can do about that in a *nix shell? That just seems kind of dangerous. What if you had files named -r and -f and just tried to do rm * to clean up a directory?

het
Nov 14, 2002

A dark black past
is my most valued
possession

Suspicious Dish posted:

The fact that people think that's a feature and not a bug is just astonishing to me.
To be honest that's one of those things that's always mentioned but I'm always skeptical that anyone actually does it. I'm sure someone did it back in like the 80s or whatever but in my professional life I've definitely never run into anyone who does that.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Zamujasa posted:

Outside of quoting all filename arguments, is there even anything you can do about that in a *nix shell? That just seems kind of dangerous. What if you had files named -r and -f and just tried to do rm * to clean up a directory?

:ssh: Quoting wouldn't actually solve this.

het posted:

To be honest that's one of those things that's always mentioned but I'm always skeptical that anyone actually does it. I'm sure someone did it back in like the 80s or whatever but in my professional life I've definitely never run into anyone who does that.

I've seen it done a few times before on some servers that I adminned.

Opinion Haver
Apr 9, 2007

Zamujasa posted:

Outside of quoting all filename arguments, is there even anything you can do about that in a *nix shell? That just seems kind of dangerous. What if you had files named -r and -f and just tried to do rm * to clean up a directory?

As Suspicious Dish said, quoting won't stop this. There is literally no way for a program to distinguish rm -f -r foo from rm *, if there are files in the current directory named -r, -f, and foo (assuming that's the order in which they get passed). That's why most GNU programs have a -- option that stops all further arguments from being interpreted as options, so you can do rm -- *.
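Python's `argparse` follows the same getopt convention, so the `--` behaviour is easy to poke at directly:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', action='store_true')
parser.add_argument('files', nargs='*')

# Everything after '--' is treated as a positional argument,
# even strings that look exactly like options.
args = parser.parse_args(['--', '-f', '-r'])
```

Without the `--`, the same `-f` would flip the flag and `-r` would be rejected as an unknown option.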

DeciusMagnus
Mar 16, 2004

Seven times five
They were livin' creatures
Watch 'em come to life
Right before your eyes

yaoi prophet posted:

That's why most GNU programs have a -- option that stops all further arguments from being interpreted as options, so you can do rm -- *.

All POSIX utilities should have that. It's how a compliant getopt implementation behaves.

that awful man
Feb 18, 2007

YOSPOS, bitch

Gazpacho posted:

The command prompt has to do wildcards that way to maintain compatibility with DOS, which had to maintain compatibility with CP/M, in which each command implemented its own file pattern rules. In CP/M it was possible for a filename to be valid to one command and invalid to another. :byodood: DOS provided the wildcard expansion APIs as a way to standardize the patterns but expanding them up front the way Unix does wasn't a feasible option.

The DOS command interpreter couldn't expand wildcards because the entire space allocated to store command line parameters was 127 bytes.

Gazpacho
Jun 18, 2004

by Fluffdaddy
Slippery Tilde
That too, and the fact that its dominant use case was not batch scripting but starting up an interactive program and then sitting in the background. Many reasons, but CP/M compatibility guided the design.


PrBacterio
Jul 19, 2000

ToxicFrog posted:

Why? It ensures that - no matter what filesystem you're using, no matter what program you're running - you have a consistent mechanism for wildcard expansion, which is documented, and configured, in one place only.
The reason it is a horror is that expanding arbitrary file patterns on the command line and passing them as a list of file names to an individual program can lead to unintended behaviour by that program when there are oddly-named files, or no files matching a pattern. The individual application has no way of knowing whether any specific command line argument originates from a wildcard expansion or not, and there is no one-to-one correspondence between the actual command line arguments (as written by the user) and the command line parameter list (as seen by the program).
For instance, suppose you have a file that is called "--remove-files" in your directory:
code:
foo$ ls
-rw-r--r--  dumb   user         0 Jul 21 2012 --remove-files
-rw-r--r--  dbuser dbuser 7474585 Feb  2 2007 important-data.db
foo$ ls ..
drwxr-xr-x  dbuser dbuser       1 Feb  1 2007 important-data/
lrw-rw-rw-  dumb   user         0 Jul 20 2012 important-data.tar.bz -> /dev/null
foo$ tar -cjf ../important-data.tar.bz *
In this case, because there is no way for the program to distinguish between a command line option it was given on purpose by the user and something that exists only because of shell expansion, the tar command will remove the file "important-data.db" after having added it to the tar archive "../important-data.tar.bz", which is just a symlink to /dev/null; so the file is gone. The actual tar program only sees the command line "tar -cjf ../important-data.tar.bz --remove-files important-data.db", with no way of knowing that the third and fourth arguments originated from a wildcard expansion and should be treated as file names.

This is a contrived example, of course, but I think you can see where I am going with this. Arbitrarily manipulating the strings given as command line arguments to individual applications to deal with file name patterns is an ugly hack that should never have seen the light of day, imho.

(EDIT: fixed typos, added clarification)
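The core of that example, the expansion producing strings that are indistinguishable from options, can be reproduced without any shell at all; Python's `glob` shows the same flattening:

```python
import glob
import os
import tempfile

# Recreate the booby-trapped directory from the example above.
d = tempfile.mkdtemp()
for name in ('--remove-files', 'important-data.db'):
    open(os.path.join(d, name), 'w').close()

os.chdir(d)
# What the shell would hand to tar: the expansion results are plain
# strings, carrying no marker that says "this came from a wildcard".
argv = ['tar', '-cjf', '../important-data.tar.bz'] + sorted(glob.glob('*'))
```

The resulting argument vector contains `--remove-files` in exactly the position an option would occupy, which is all tar ever gets to see.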
