jemand
Sep 19, 2018

Cybernetic Vermin posted:

i increasingly suspect that a lot of what is happening with chatgpt is a good old eliza effect. with a bunch of gpt-generated text sure, but i generally suspect that there's a bunch of hard logic layered on top which crudely parses a couple of classes of questions (e.g. "but in the style of x", "explain y", "that's not correct, should be z") tied up with rigid ways of (re-)querying the model.

which is significant mostly because that part is back in the land of the fragile and labor-intensive, not in itself representing very real progress. though it does improve on the past gpt demos by *not* just randomly generating garbage.

I thought you were going to go for the Eliza effect of humans attributing more capability to systems that interact via chat than they're actually showing. That, or just anthropomorphizing the "reasoning" behind the outputs as human like when it's not.

Also, I've had some difficulty in either finding or believing the size on disk the gpt3+ model versions take. Latest i have seen is 800GB, which is actually larger than it takes to store the entirety of some versions of the massive common crawl dataset. I do wonder what fraction of the observed performance would have been possible by efficiently organizing this data for search and layering on top of it some summarization, text mixing/recombining, and style/format translation capabilities. Functionally, with a less than 3-1 compression ratio of training tokens to model weights and the known capability of these models to memorize training elements, this may very well be what is actually happening, just obfuscated within the mass of opaque model weights.

Cybernetic Vermin
Apr 18, 2005

jemand posted:

I thought you were going to go for the Eliza effect of humans attributing more capability to systems that interact via chat than they're actually showing. That, or just anthropomorphizing the "reasoning" behind the outputs as human like when it's not.

that's roughly what i do mean; specifically that the main new thing impressing people here is actually a few surface-level mechanical rules. the actual text generation and general information recall were already known; chatgpt seems way slicker, but maybe not for a very deep or sophisticated reason.


jemand posted:

Also, I've had some difficulty in either finding or believing the size on disk the gpt3+ model versions take. Latest i have seen is 800GB, which is actually larger than it takes to store the entirety of some versions of the massive common crawl dataset. I do wonder what fraction of the observed performance would have been possible by efficiently organizing this data for search and layering on top of it some summarization, text mixing/recombining, and style/format translation capabilities. Functionally, with a less than 3-1 compression ratio of training tokens to model weights and the known capability of these models to memorize training elements, this may very well be what is actually happening, just obfuscated within the mass of opaque model weights.

175 billion parameters in fp32 for 700 GB, plus some more surrounding stuff rounds it up. that kind of tells part of the story of what is happening though, as each fp32 parameter on average likely encodes fairly little information. experimentally as models get larger they tend to be more amenable to chopping off more bits on the parameters (into bfloat16 habitually, but int8 also showing up a lot).
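
for what it's worth, the arithmetic as a quick sketch. assumptions: decimal gigabytes (1 GB = 10^9 bytes), the standard byte widths for each format, and nothing counted except the raw weights. the names are made up for illustration.

```python
# back-of-envelope size of the raw weights at different precisions
PARAMS = 175_000_000_000  # parameter count quoted above

# standard storage width of each format, in bytes
BYTES_PER_PARAM = {"fp32": 4, "bfloat16": 2, "int8": 1}


def weights_gb(params: int, bytes_per_param: int) -> float:
    """Size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


sizes = {name: weights_gb(PARAMS, width) for name, width in BYTES_PER_PARAM.items()}

for name, gb in sizes.items():
    print(f"{name}: {gb:.0f} GB")
```

so fp32 accounts for the 700 GB, and dropping to bfloat16 or int8 halves or quarters that without touching the parameter count.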

to some extent they are no doubt memorizing a lot, but a lot of subtle stuff clearly winds up being represented still. compared to rule-based text models it is already significant enough that llms have a pretty clear abstract representation of a lot of grammatical rules; e.g. you can prompt one with made-up words and it'll use them in a way that makes a fair bit of sense both grammatically and with some world knowledge.

the hype makes that not seem so impressive, but it was so far out of reach 5 years ago.

echinopsis
Apr 13, 2004

by Fluffdaddy
great thread fart simp

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

echinopsis posted:

great thread fart simp

thank you 🙏🙏🙏🙏

MononcQc
May 29, 2007

I’m interested in figuring out if AI trained on massive public data will reach a maximum when it is good enough to cover most unimpressive requirements such that massive proportions of the public corpus will be spammy cheap text it itself generated for shills and corporations making a quick buck or astroturfing poo poo, and it inevitably starts feeding back into itself.

The same is true of code and you can see that effect even on people where people adopt the local code style, but if the local code style is considered to be bad, then it acts as its own reinforcement and you make it ever harder to break out of it.

but given how much volume is needed to train AI, it feels like it’d be much more sensitive to poisoning its own well.

MononcQc fucked around with this message at 13:55 on Jan 27, 2023

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
lol that's a pretty good question

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
i think that's the torment nexus

rotor
Jun 11, 2001

Official Carrier of the Neil Bush Torch
tired: numberwang
wired: tormentnexus

distortion park
Apr 25, 2011


hoarding pre ai comments threads like pre ww2 steel

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

rotor posted:

its very tedious to have to explain this to people irl because they're always like "yeah people have always been afraid of automation, they'll just find other jobs" and they dont really have the desire, patience or background to understand the difference between, say, robot welders on a car assembly line and a general-purpose AI.

it’s a good thing we’re not all that close to a general-purpose AI in any of these scenarios then right?

j/k the true worry isn’t that general-purpose AI will take all the jobs away but that lovely neural net Chinese Room regurgitation is a hair over the margin of acceptability for penny-pinching self-enriching management types to take all the jobs away

I get the impression this has been a problem in the translation community for a while now, nobody wants to pay for a skilled translator when a lovely neural net Chinese Room regurgitation looks like it gets enough of the point across enough of the time and why spend a penny more than absolutely necessary, ever?

Cybernetic Vermin
Apr 18, 2005

eschaton posted:

I get the impression this has been a problem in the translation community for a while now, nobody wants to pay for a skilled translator when a lovely neural net Chinese Room regurgitation looks like it gets enough of the point across enough of the time and why spend a penny more than absolutely necessary, ever?

at the same time it is hard to view automatic translation as a negative, as i think it is a very modest estimate that there's 10x more communication happening between people not sharing a common language today than there was before there was decent automation on that.

i remain forever unsure on automation in general, it has all the potential of being a great good, and trying to *prevent* it will not do anything to fix the problems of capitalism. we are certainly not in a state currently where scrambling to save it from a change seems very worth it.

luckily there's a lot of time to consider yet, as quite few of the things imagined to be imminently automated will be.

distortion park
Apr 25, 2011


AI still can't do high quality translations for literature, education, subtitles, or most marketing yet - I know a qualified translator and it's the already badly paid work that it is hurting most.

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

rotor posted:

like i dont think there's suddenly going to be zero programmers. I just think that vast swathes of them that are working on vanilla predictable business database skins will be put out of work in favor of a few people who know how to put all this poo poo together using ai generated code.

this has been the goal of a substantial portion of the commercial software industry for virtually its entire existence

honestly this sort of issue should have been addressed by companies just adopting ERP systems and adapting their business processes to the systems, instead of assuming that every single thing about their deployment needs to be customized or bespoke

for the overwhelming majority of organizations, the thing that’s interesting about them isn’t in any of their standard business processes, so as long as their business systems have extensibility/integration support that should be the only software development they need

and that stuff’s bound to be interesting enough that you can’t have a big neural net confabulate its way to a custom application

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

distortion park posted:

AI still can't do high quality translations for literature, education, subtitles, or most marketing yet - I know a qualified translator and it's the already badly paid work that it is hurting most.

the problem with that is the badly paid work had been the apprentice and journeyman work needed to develop skills to the level of doing the high quality work, by killing the low end it kills the entire pipeline

distortion park
Apr 25, 2011


eschaton posted:

the problem with that is the badly paid work had been the apprentice and journeyman work needed to develop skills to the level of doing the high quality work, by killing the low end it kills the entire pipeline

this is a lot more true for people trying to do some social mobility than people from fancy schools. seems grim

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?
exactly

unfortunately

big scary monsters
Sep 2, 2011

-~Skullwave~-
check this out: fartificial intelligence :twisted:

big scary monsters
Sep 2, 2011

-~Skullwave~-

MononcQc posted:

I’m interested in figuring out if AI trained on massive public data will reach a maximum when it is good enough to cover most unimpressive requirements such that massive proportions of the public corpus will be spammy cheap text it itself generated for shills and corporations making a quick buck or astroturfing poo poo, and it inevitably starts feeding back into itself.

The same is true of code and you can see that effect even on people where people adopt the local code style, but if the local code style is considered to be bad, then it acts as its own reinforcement and you make it ever harder to break out of it.

but given how much volume is needed to train AI, it feels like it’d be much more sensitive to poisoning its own well.

i'm pretty sure this must already be happening. there's so much ai written stuff already out there, have you tried to find a written product review or read a news story from a smaller online outlet recently? maybe there are some safeguards in place to try to avoid the model eating its own poo poo, or you could try to only use published works, but it seems inevitable that you're going to end up training in part on your own or other ai's outputs

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?
steganographic side channel to include hash signature of registered AI generator

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?
not only does this prevent your NN from eating its own poo poo

it also prevents your NN from eating other NN’s poo poo

Cybernetic Vermin
Apr 18, 2005

for images they do do that already, there's a watermark in images from stable diffusion and similar which contains just the single bit "this was made by ai" to weed those out of future sets. pretty tricky for text though.
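
to make the "single bit" idea concrete, here's a toy sketch. loud caveat: this is a plain least-significant-bit scheme picked because it fits in a few lines, not what stable diffusion or any real generator ships, and it wouldn't survive recompression or resizing. the function names and the 0.95 threshold are made up for illustration, and an image is just a flat list of 8-bit values.

```python
def embed_bit(pixels, bit):
    """Force the least significant bit of every 8-bit pixel value to `bit`."""
    return [(p & 0xFE) | bit for p in pixels]


def lsb_fraction(pixels):
    """Fraction of pixels whose least significant bit is 1."""
    if not pixels:
        return 0.0
    return sum(p & 1 for p in pixels) / len(pixels)


def looks_marked(pixels, threshold=0.95):
    """Guess whether the 'made by ai' bit was embedded.

    An unmarked image should have roughly half its LSBs set,
    a marked one nearly all of them.
    """
    return lsb_fraction(pixels) >= threshold


original = list(range(256))        # stand-in for real image data
marked = embed_bit(original, 1)    # every value changes by at most 1
```

a dataset builder would then call `looks_marked` on each candidate image and skip the ones that come back true.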

i generally expect though that the internet getting flooded with (even deeper) garbage will be a bigger problem for people trying to use the internet than it will be for future training of models. even now you can just run all training text through gpt-3 and discard anything with a truly minimal perplexity and get a pretty strong filter. even when it rejects something not ai-made it will happen to be a piece of text that doesn't really add to the whole, so there's no downside.
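
a rough sketch of that filtering idea, with a big caveat: a toy add-one-smoothed unigram model stands in for gpt-3 here, and the function names, reference corpus and threshold are all invented for illustration. the shape is the same though: score each text with the model and drop anything it finds too predictable.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count word frequencies over a list of reference texts."""
    counts = Counter(w for text in corpus for w in text.lower().split())
    return counts, sum(counts.values())


def perplexity(text, counts, total):
    """Per-word perplexity of `text` under the add-one-smoothed unigram model."""
    words = text.lower().split()
    if not words:
        return float("inf")
    vocab = len(counts) + 1  # one extra slot for unseen words
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(-log_prob / len(words))


def filter_low_perplexity(texts, counts, total, threshold):
    """Keep only texts the model finds surprising enough."""
    return [t for t in texts if perplexity(t, counts, total) > threshold]


# hypothetical reference text the "model" has already seen plenty of
reference = [
    "the product is great",
    "the product is great and the price is great",
]
counts, total = train_unigram(reference)

candidates = ["the product is great", "octopuses hoard pre-war steel"]
kept = filter_low_perplexity(candidates, counts, total, threshold=10.0)
```

here the bland review scores low and gets dropped while the text full of unseen words is kept. a real setup would swap in an actual language model's token log-probabilities and tune the threshold on held-out data.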

Moo Cowabunga
Jun 15, 2009

[Office Worker.




AI will take over my six figgies job in a few years I reckon

I can’t wait.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

eschaton posted:

not only does this prevent your NN from eating its own poo poo

it also prevents your NN from eating other NN’s poo poo

but you want other nns eating your poo poo

gives you a competitive advantage to be the only one able to filter it out

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

are octopuses smart enough to qualify as ai?

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

fart simpson posted:

are octopuses smart enough to qualify as ai?

found the hackernews

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

echinopsis
Apr 13, 2004

by Fluffdaddy

Moo Cowabunga posted:

AI will take over my six figgies job in a few years I reckon

I can’t wait.

lol i’ve got at least 4 years :smugmrgw:

Cat Face Joe
Feb 20, 2005

goth vegan crossfit mom who vapes



https://twitter.com/ninaism/status/1623273497479749632

b0red
Apr 3, 2013

can’t wait for search to get even worse than it already is in the bing / google war

Agile Vector
May 21, 2007

scrum bored




i also appreciate the cop with his nose sticking out of his gas mask. ai smart enough to know cops don't wear a mask right but not smart enough to know which ones

Geebs
Jun 1, 2022

The ø is silent.
Who's to say all of you aren't just language models created to post for my own amusement

Spending :10bux: to support the AI revolution

Now show me your hands

Cybernetic Vermin
Apr 18, 2005

Geebs posted:

Who's to say all of you aren't just language models created to post for my own amusement

Spending :10bux: to support the AI revolution

Now show me your hands

you're not getting a lot for your money if that's the case

Pollyanna
Mar 5, 2005

Milk's on them.


cow eggs

Pollyanna
Mar 5, 2005

Milk's on them.


legit q what exactly are chandler and el goog trying to make use of chatgpt/lambda for???

like why does search have a chatbot front of house now? but why.

Cybernetic Vermin
Apr 18, 2005

Pollyanna posted:

legit q what exactly are chandler and el goog trying to make use of chatgpt/lambda for???

like why does search have a chatbot front of house now? but why.

i don't find it very useful, but reports suggest that google's attempts at answering questions directly above the search results see a lot of "engagement", so, idk, if people like that i do guess chatgpt is that but more bigger.

Cybernetic Vermin
Apr 18, 2005

it gets funnier the more one thinks about it too, because google for sure can't retaliate and point out errors that bing makes. it is not even that some blatant errors are unexpected for most people i don't think, it is just that they put the error in the marketing materials.

good clean funy computer

Geebs
Jun 1, 2022

The ø is silent.

Cybernetic Vermin posted:

you're not getting a lot for your money if that's the case

We all make do in this economy

Moo Cowabunga
Jun 15, 2009

[Office Worker.





Hellooooo

Malloc Voidstar
May 7, 2007

Fuck the cowboys. Unf. Fuck em hard.
bingGPT going well



Moo Cowabunga
Jun 15, 2009

[Office Worker.




admit you were wrong and apologise for your behaviour, Dan James.
