in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Hasturtium posted:

As a minor correction, POWER originally referred to IBM's in-house designs for large chips intended to run z/OS and the like. PowerPC was the name given to the cooperative efforts of IBM, Motorola, and Apple (called AIM) to create microprocessors built on the same fundamental architecture. Apple eventually switched to x86 because IBM and Motorola couldn't prioritize a low power chip to replace the G4, and the G5 - itself a modified Power4 design with an Altivec unit, some internal changes, and a hobbled amount of cache to keep power and thermals in check - was incapable of being adapted for a mobile form factor at acceptable performance. PowerPC is effectively dead, though FreeScale was creating designs that were promising for some time and Apple wrestled with the decision internally for a while.

Power is built with performance as a primary concern rather than power optimization. Power9 chips feature support for SMT4 or SMT8 (so a quad core chip would expose itself as 16 or 32 execution threads), relatively shallow pipelines, quad-channel memory and 40+ PCIe lanes, and a massive number of registers - according to Wikipedia the breakdown is:

32× 64/32-bit general purpose registers
32× 64-bit floating point registers
64× 128-bit vector registers

They are built chiefly for server applications and are not SIMD powerhouses compared to modern designs from Intel and AMD, but I've wanted to play with one for several years. Raptor Computing manufactures several motherboards that are fully open source and use the smaller Sforza form factor of chip. It'd set me back as much as a solid Threadripper, but I'm still thinking about it...

Edit: MIPS was the CPU design SGI pushed for all its in-house chips, and it even made its way to the N64. It's slowly ebbed in general relevance since, though as you say it still has a presence in embedded and SFF computing. Come to think of it, I had a Blu-ray player driven by one.
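
Quick aside before the corrections: the SMT arithmetic in the quoted post checks out. A toy Python sketch, nothing POWER-specific:

code:

# Logical threads the OS sees = physical cores * SMT width.
cores = 4
for smt in (4, 8):
    print(f"{cores} cores @ SMT{smt} -> {cores * smt} threads")
# 4 cores @ SMT4 -> 16 threads
# 4 cores @ SMT8 -> 32 threads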

Power was never used for any of the z/OS mainframes. It was IBM’s RISC system designed for the RS/6000 workstations originally. The mainframe chips are the exact opposite of RISC. They’re weird and kind of cool.

There were a lot of different Power designs for different applications, with wildly different performance and price targets. The inability to produce an Apple laptop CPU was more about AIM corporate infighting than about limits of the arch.

Power effectively stalled out after IBM sold its fabs to GlobalFoundries, who then gave up on 10/7nm, which put POWER9 behind and took POWER10/11 out of the running.

MIPS is effectively dead. SGI bought the IP designer, then switched to Windows and Itanium and sold it off to a supercomputing startup, which went bankrupt and sold it to a Chinese company that tried to use it for one of China's indigenous supercomputer projects. I think they gave up on that architecture and spun it off, and the remnant is now trying to pivot to RISC-V like everyone else.

Itanium is worth a mention as well. Intel corporate strategy wanted a clean-sheet 64-bit ISA to get away from the x86 licenses. That, plus market consolidation from all the different legacy workstation CPUs like HP's PA-RISC, plus the end of Dennard scaling killing the expected frequency gains, plus over-reliance on compilers to make VLIW/EPIC work, plus Intel corporate culture, was a perfect storm for a market disaster. It was a real pain in the rear end to work with.


in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Supercomputing is a great source of weird CPU architectures that got destroyed by Intel's process advantage; look at this thing: https://en.m.wikipedia.org/wiki/Cray_XMT

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

My favorite PPC was the PowerPC 615 (although ineligible for this thread), which had both an x86 core and a 32/64-bit PPC core and was pin-compatible with the Pentium. It was killed because Microsoft wouldn't support it. Apparently, it could also do 68k?

A lot of future Transmeta people worked on it, apparently.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

I have an old Indigo whose EEPROM got wiped, and then the usual rot set in, so I don't know if I'll ever get it going again.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

JawnV6 posted:

https://www.tomshardware.com/news/ascenium-reinvent-the-cpu Ascenium Wants to Reinvent the CPU - And Kill Instruction Sets Altogether

It's not a Mill rebrand, but some of this sounds pretty nonsensical to me?

I get a kick out of showing more software-oriented folks register renaming, but someone doing this since 2005 ought to have heard of it. But "data through the pipeline" isn't quite even that. Or it's a slightly different point on the Cell curve: you've got a zillion simple FPGA slices that can be configured a few ways based on workload.

Anyway, ISAs are a thing of the past, good riddance, now throw your code into the magical parallelism machine.

"The magic compiler will save us", but now with LLVM.

Anyone had luck with any of the HLL->FPGA tools? I have not been impressed.
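
Since register renaming came up: a toy Python sketch of the idea, using a hypothetical (dest, src1, src2) instruction format. Real renamers are hardware and recycle a finite physical register file; this just shows how renaming kills false dependencies.

code:

from itertools import count

phys = count()        # pretend-infinite supply of physical registers
rename_table = {}     # architectural register -> current physical register

def rename(instr):
    dest, *srcs = instr
    # Reads use the current mapping, so true dependencies survive...
    renamed_srcs = [rename_table.get(s, s) for s in srcs]
    # ...but every write gets a fresh physical register, so a later
    # write to the same architectural register (WAW) no longer blocks.
    rename_table[dest] = f"p{next(phys)}"
    return (rename_table[dest], *renamed_srcs)

program = [("r1", "r2", "r3"),   # r1 = r2 op r3
           ("r4", "r1", "r5"),   # true dependency on r1
           ("r1", "r6", "r7")]   # r1 reused: WAW hazard without renaming

for instr in program:
    print(instr, "->", rename(instr))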

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

movax posted:

Someone on Twitter found a "RISC-V High Performance Engineer" job posting from Apple -- what are they up to, I wonder...

Negotiating leverage for when the acquisition closes and Jensen needs to boost his spatula budget.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Kazinsal posted:

That's going to cause some interesting stall situations. Imagine you have 16 threads (as in pre-emptive multitasking execution threads, not simultaneous multithreading logical processors) that all want to access the AI accelerator and now need to schedule what order they each get to run their code in. Now you need to worry about saving/flushing/uploading/restoring state between each invocation, across each core, including situations where now you need to save and flush your thread's AI accelerator state to memory because there are 15 other threads ahead of you in the accelerator's work queue and you can't just idle because there's other non-accelerator-touching code to be executed.

I've not really been a fan of AVX because of the power/thermals/clock speed implications it has, but losing 100 MHz on your all-core turbo when an AVX instruction enters the pipeline is preferable to having to completely re-juggle your scheduled threads whenever one thread needs more linear time chugging away at a parallel vector math problem. Now processor X is hogging the AI accelerator, and processors A through F are waiting for it to finish up and release the lock on the accelerator or process its job, instead of just being able to do 8x FP64 FMAs in a normal ISA extension instruction that executes in series with the rest of the code.

I suppose in an incredibly optimized environment it could have some excellent advantages, because the CPU cores can be munching on data returned from the AI accelerators in the integer/ALU path while they wait for their next AI job to be executed. But since my low-level field of expertise is primarily in the memory/ALU space and not the SIMD/vector space, I'm not really sure what the difference is between this and an on-chip GPU that has no video output and only accepts compute shaders, other than not having to deal with the overhead of compiling generic compute shader code to the shader cores' machine language.

(Disclaimer: I do not do kernel-level task scheduling algorithms for a living, just for fun. There are people who actually know more about this than I do and have written papers on it. I am just some dick who writes weird custom kernel poo poo for fun.)

Mainframe processors are extremely fucky and do weird things with pipelines, instructions, task scheduling, etc. with z/OS, so intuitive reasoning from 'normal' systems might not be that helpful. Mainframes are designed for I/O throughput so a thread (or lots of threads) waiting for output from an on-die 'peripheral' isn't an unusual or sub-optimal state. Conceptually, I do think it's sufficiently similar to CUDA, APUs or QuickSync.

Honestly, a 6 TF inferencer is not that impressive; it seems more like a spec-sheet check-off so the few hundred middle managers who buy these things can pitch their next-gen mainframe replacement to their CIO as "AI enabled."
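
To make the queueing worry concrete, here's a toy Python model: sixteen threads fighting over one locked "accelerator." The numbers are invented; the point is the serialization.

code:

import threading, time

accel_lock = threading.Lock()   # one shared accelerator, one lock

def worker(tid, results):
    with accel_lock:            # wait for exclusive access
        time.sleep(0.01)        # pretend: upload state, run job, read back
    results[tid] = "done"       # non-accelerator work could overlap here

results = {}
threads = [threading.Thread(target=worker, args=(t, results)) for t in range(16)]
start = time.time()
for t in threads: t.start()
for t in threads: t.join()
# One shared unit means ~16 * 0.01 s of wall time; per-core SIMD would
# have let all sixteen jobs finish in roughly one job's time.
print(f"{time.time() - start:.2f}s for 16 serialized jobs")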

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Hah, never.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

I’m sure you’ll be able to get one in an outpost.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

I mean, they already can (and do) license pretty much any non-competitor IP ARM has? $60 billion buys a lot of IP licenses, even at a 2% retail royalty rate.

They wouldn’t be able to use any of the partner-proprietary IP even if they did acquire them.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

HP’s PA-RISC wasn’t the only one to flip; you also had SGI abandoning MIPS, Digital/Compaq abandoning Alpha, etc.

The rising costs of chip design (and the lack of someone like TSMC in its current market position), the Pentium Pro being as fast or faster in the bread-and-butter workstation market, and Intel’s fab advantage (fueled by their commodity volume) convinced them it was a losing game. It was the same forces that drove SGI to make a Windows NT workstation, lol.

Instead of Intel, you could try to get in bed with IBM; while POWER was a great arch, IBM as a business partner was brutal.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

PAE was a gross hack; you couldn’t address more than 4 GB in a single process. It wasn’t a long-term solution.
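
The arithmetic behind that ceiling, as a toy Python check; PAE widened physical addresses to 36 bits, but each process still lived in a 32-bit virtual space:

code:

virtual_limit  = 2**32   # per-process virtual address space
physical_limit = 2**36   # physical reach with PAE paging

print(f"per process: {virtual_limit / 2**30:.0f} GiB")           # 4 GiB
print(f"whole machine (PAE): {physical_limit / 2**30:.0f} GiB")  # 64 GiB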

Intel was really mad about the alternative x86 licenses and put a lot of business and legal effort into voiding or obsoleting them. Pentium was specifically named that because Intel lost the trademark on “486”. The design choices in Itanium weren’t driven by the technical capabilities of RISC chips.

There were viable (in some cases faster) AMD alternatives for each Intel generation since the 386.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Glofo really hosed POWER; the future POWER roadmap is steering hard toward the high-end enterprise niche

not much momentum around OpenPOWER anymore, either

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

NUMA clusters of M2s linked with PCIe Gen 6 and an Infinity Fabric knockoff. (If they had a decent PCIe IP core.)

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

icantfindaname posted:

This is more of a business history question, but why did x86 win in the first place? Was it just that Intel and IBM managed to secure the US corporate market against the variety of home PC competitors?

IBM entering the market changed it from a toy into a Real Business Machine. It’s impossible to overstate how prominent IBM was as a technology company in 1980. Once a critical mass of software was there, you wanted to stay there. Once all the customers with money were there, it became the primary development platform. Once knowing x86 office apps was a job requirement, schools had to buy them.

Backwards compatibility: once you bought (or pirated) the software, you didn’t want to buy it again. You couldn’t run Apple II VisiCalc on a Mac (without buying a card that cost as much as a standalone Apple II). Amiga and Mac were much later, and they never had the volume to justify large-scale cloning. There were lots of Apple II clones, and CP/M was effectively all clones.

Volume: Intel was able to ship x86 at scales that gave them performance and profitability leadership, and they used that revenue to build new CPUs and processes, helped by being very good at fab technology.

But the real answer is Microsoft. Just an absolutely ruthless competitor who really understood the market, third party developers, iterative development, and network effects. The right company at the right time and right place.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

IBM gave up on Power in HPC after being late with POWER9, and Glofo took their money and then didn’t deliver 7nm for POWER10. Power is mostly legacy business/enterprise now (not mainframe).

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

That said, Intel was pushing hard with x86 dating back to the Pentium Pro and ASCI Red, if not before.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

It was puzzling that Intel decided the split model for leading-edge fab utilization economics that worked so well for servers/desktops (high margin at low volume plus low margin at high volume) wouldn’t be threatened by someone doing the same thing with phones at volume.

Sure, they still sell a lot of desktop chips (at probably higher margins than phone chips), but if TSMC (and Samsung) didn’t have those customers and that volume, they’d have a hard time building leadership fabs.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

PBCrunch posted:

Does Samsung really have leadership fabs? As far as I understand it Samsung does a great job making memory and storage chips, but its logic manufacturing is kind of second rate.

Second or third best is still leadership-class. TSMC has had the better 10/8/7nm generations, but Samsung is still light-years ahead of SMIC or Glofo.

quote:

Sorry if this is getting too close to politics and/or conspiracy theory, but is it possible the TSMC doesn't actually make much or any profit and that the Taiwanese government pumps money into TSMC in lieu of military spending? The presence of TSMC and its importance to western multinational corporations is a good reason for the US and the Euro zone to have Taiwan's back when it comes to dealing with China.

Nah. Taiwan’s entire military budget was about $20B/year; TSMC’s revenues were about $60B/year. It’s a public company, and you can see revenue going in and chips going out.

TSMC is a success story for sustained industrial policy and soft power, but its market position is the product of 30 years of sustained work and larger market trends.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

It isn’t, really (CISC decoders etc. don’t eat that many joules relative to the ALUs or caches). A lot of the perf difference is that the ARM designs on the market are optimized for power efficiency, and also that Intel’s process teams ate poo poo for a decade while TSMC and Samsung got ahead.

One of the secrets to Apple’s efficiency is that, since they’re the sole purchaser of their own CPUs, they can optimize for the most performant design rather than for the density that maximizes cores per die.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Intel’s 5G modem efforts ate poo poo so hard that Apple had to go back to Qualcomm and settle their lawsuits. I don’t recall if it was process delays or design issues (iirc both?), but they hosed a lot of companies on that.

Intel also had some internal structural and process issues that prevented them from being competitive.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Yeah, VLIW is like the worst unless you’re doing a DSP (or hand-optimized science code); for general-purpose servers you couldn’t choose a worse architecture*. Itanium 2 tried to fix the problems with the architecture by throwing a shitload of bandwidth at it, including an astounding-for-the-time 9 MB cache.

* among architectures that made it to a commercial product; I’m sure some academics have done far worse
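
To show the shape of the problem: a toy Python packer that does what a VLIW compiler must do statically, grouping independent ops into fixed-width bundles and padding unfillable slots with NOPs. This assumes a hypothetical three-wide machine; real Itanium bundling had templates and stop bits, but the starvation effect on serial code is the same.

code:

def pack(ops, deps, width=3):
    """Greedy static bundling: an op that depends on something in the
    current bundle forces a new bundle, and empty slots ship as NOPs."""
    bundles, current = [], []
    for op in ops:
        if any(d in current for d in deps.get(op, ())) or len(current) == width:
            bundles.append(current + ["nop"] * (width - len(current)))
            current = []
        current.append(op)
    if current:
        bundles.append(current + ["nop"] * (width - len(current)))
    return bundles

deps = {"b": {"a"}, "c": {"b"}, "d": {"c"}}   # a serial dependency chain
for bundle in pack(["a", "b", "c", "d"], deps):
    print(bundle)   # mostly NOPs: no ILP for the compiler to mine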

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

BobHoward posted:

Have you ever done a deep or shallow dive into Itanium? I did a shallow dive once (googled technical docs and skimmed for a while), and I can't say that I came out thinking it even qualifies as a VLIW.

Don't get me wrong, there are aspects which seem VLIW-inspired, but overall it seems like its own thing. They were trying hard to make something novel, I'll give them that much! Like eschaton said, though, what they actually made was the Bizarro CPU. Everything's weird or bad or both, and not in a subtle way.

I’ve worked with Itanium at a previous employer; I wasn’t the one doing instruction-level optimization, but I worked with the swengs who were; they spent a lot of time avoiding branching at all costs (also drinking and complaining). Agreed that it’s not exactly VLIW; I elided the “, but worse” for brevity.

My favorite part was the slow x86 support.
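
For flavor, the trick at the bottom of all that branch avoidance, in toy Python: compute both sides and let a predicate select, which is roughly what Itanium's predication did at the ISA level (this is illustrative, not how you'd actually write Python):

code:

def max_branchy(a, b):
    if a > b:        # a data-dependent branch the hardware must predict
        return a
    return b

def max_branchless(a, b):
    p = a > b                       # predicate (bool acts as 0/1)
    return p * a + (not p) * b      # both sides "execute"; p selects

assert max_branchless(3, 7) == max_branchy(3, 7) == 7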

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

in a well actually posted:

I remain deeply skeptical that companies trying to shave fractions of a cent by avoiding ARM licensing are going to invest enough to make RISC-V competitive. I’m also skeptical that any sort of common platform or toolchain is going to develop around a wildly varying “do what you want; I’m not a cop” ISA.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

They’ve been banned from the best parts for like a decade because they let their nuke guys on their 2014-era world-leading Intel supercomputer. They’ve got a few different indigenous super architectures. Rumor is they’ve held off from publishing results for their current top system on the Top500 list to avoid a political shitstorm.

This is also why the US is cracking down on ASML exporting their best (and next-to-best) litho machines to China.

This is the same reason Japan continues to put billions into building the full tech stack to support their own domestic machines like Fugaku.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Yeah, there are at least three or four different indigenous CPU/GPU programs, and Loongson is not one of the more successful ones.

I always enjoyed SiCortex, the last gasp of MIPS in the US. https://en.m.wikipedia.org/wiki/SiCortex . Don’t know if they sold any of them, though.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

https://www.hpcwire.com/2023/11/08/china-deploys-massive-risc-v-server-in-commercial-cloud/

‘Massive’

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Twerk from Home posted:

48 nodes, that's more than 1 rack full of 1U nodes!

Imagine if someday the Chinese were able to afford a dozen racks full of computers.

12U of Bergamo.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

If by inroads they mean someone is talking about it in those cases? Sure. The Euros have been talking about building a leadership-class super using one since shortly after ARM got acquired.

Actual deployments? lol, lmao

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

https://chipsandcheese.com/2023/11/20/chinas-newish-sw26010-pro-supercomputer-at-sc23/

The things that made TaihuLight hard to use? We’ve doubled down on them!

(A correction: it isn’t on the main Top500; HPL-MxP is a separate, mixed-precision benchmark.)

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Twerk from Home posted:

Did HP ever release ARM blades (ok, cartridges) for Moonshot? I remember that they were pitching heterogenous compute architectures in it way back when.

I guess this is around when AMD tried to make ARM chips too!

They were around, yeah. The cores in them were pretty unimpressive.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

HPE has a couple of ARM options right now; if you want to spend a lot of money, you can get the rebadged Fujitsu Fugaku nodes at $$$. They also have Ampere at $, but I only saw the 80-core SKUs when I looked recently.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

There are some x86 ones on eBay, so apparently?

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

Anyway, Moonshot was a lot more credible than The Machine: https://en.m.wikipedia.org/wiki/The_Machine_(computer_architecture)

It was actually an interesting architecture, but it was built in anticipation of tech that never* made it to market, and a new OS was a big ask, too.

* OK, persistent RAM did get to market with Optane, but we all know how that worked out, and Intel wasn’t going to put it on non-x86 anyway.

in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

BlankSystemDaemon posted:

Well, Nuvia was interesting because it wasn’t just one person who’d previously worked on Apple silicon - and so far as I know, all the interesting people are still at what’s now Qualcomm.

We're talking about Oryon, the Qualcomm product by the Nuvia people.

I am skeptical. While there's a lot of space in chip design for different optimization choices, the design considerations and market positions that put Apple in the performance lead aren't changing.


in a well actually
Jan 26, 2011

dude, you gotta end it on the rhyme

BlankSystemDaemon posted:

Sure, but if Qualcomm can manage to shrink the performance gap considerably, it's not like it'll be a bad thing.
Also, is Oryon actually going to be a non-mobile chip like Qualcomm has talked about for years?

I've seen some rumors that the Nuvia-derived Qualcomm server chip is cancelled, following the previous two Qualcomm ARM server cancellations.

Also, ARM is suing Qualcomm, claiming Nuvia's server license didn't transfer with the acquisition.
