Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

SwissArmyDruid posted:

Paul, you're more familiar with Intel than I am, are you aware of any functional benefits to die-stacking a la Foveros with regards to power consumption? Because I think there's still an AMD patent from a handful of years back when we were still speculating that the IO die was going to be an interposer that Zen chiplets and an HBM bump were stacked onto.

foveros is a big mystery to me as well; Intel hasn't said a lot in public to extrapolate from, and the problems with stacking multiple compute dies are pretty obvious in terms of thermals / etc.

obviously a lot of the power consumption from infinity fabric isn't inherent to the protocol itself; AMD uses monolithic dies with infinity fabric attaching various parts and that is fine. the power consumption comes from having to run a beefier PHY to overpower the parasitic inductance/capacitance of the bigger+longer wires that go off the die, through the interposer, and back on. I'm unclear what exactly Foveros has to offer here vs AMD's interposer technology, but I think that's going to be the relevant metric - how much Foveros decreases those parasitics, because that is directly related to how much power you need to drive them.

it may be that the innovation here is that, because Foveros is an "active interposer" technology, you need to drive it a lot less hard - because it's not driving a big giant wire with lots of parasitics, it's jumping the microbump (which, granted, still has a lot more parasitics than just a trace inside a monolithic die) and then going right into another transistor inside the active interposer, so the only "trace" involved is crossing the microbump. I would hazard a speculative guess that the active-interposer stuff that AMD has been talking about is functionally equivalent to Foveros from a design perspective here.
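
To put the parasitics-into-power link in concrete terms: the dynamic energy to flip a wire scales roughly as C·V², so whatever advantage Foveros has should show up as lower effective capacitance per hop. Here's a toy sketch; every capacitance value is an invented placeholder for illustration, not a measured Infinity Fabric or Foveros figure.

```python
# Back-of-envelope: dynamic switching energy scales with the parasitic
# capacitance of the wire being driven, roughly E ~ C * V^2 per bit.
# All capacitance values are assumed for illustration only.

def energy_per_bit_pj(cap_pf: float, vdd_volts: float) -> float:
    """Energy in picojoules to charge/discharge the wire once."""
    return cap_pf * vdd_volts ** 2  # pF * V^2 yields pJ

links = {
    "on-die trace (~1 mm)":                 0.2,   # pF, assumed
    "microbump into active interposer":     0.05,  # pF, assumed
    "off-die trace via passive interposer": 3.0,   # pF, assumed
}

for name, cap_pf in links.items():
    print(f"{name:38s} ~{energy_per_bit_pj(cap_pf, 1.0):.2f} pJ/bit at 1.0 V")
```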

Cygni posted:

That's a good point, I hadn't considered the little cores being off on their own ~Shame chiplet~. There is gonna be a lot of complexity in this next wave of chiplet/tile, big-little, fighting-ARM, fully-SoC tomfoolery we're gonna be entering.

"shame chiplet" :lol:

it's just a guess and I suppose that's maybe not as definite as I think it is. I just think servers will definitely want "all-big" configurations and if that chiplet exists then it becomes trivial to offer an enthusiast package with all-big as well. that would be consistent with how AMD has utilized the enthusiast lineup as a binning pressure relief for the server lineup so far.

but at the same time, there will be a power penalty to big chiplet+little chiplet. I know the ideal is that a lot of the time the "big" chiplet would be gated off, but who knows how possible that will really be. I guess it really depends on the performance - supposedly the new Tremont cores are pushing Skylake-esque performance already, and maybe with something like that you only power up the big chiplets on really big sustained tasks and just let the little cores handle the day-to-day.

mixed chiplets let you avoid that penalty: you could power down one chiplet entirely unless it's a really big workload, while still being able to gate the big cores on the mixed chiplet if there's not a whole lot to work on. But I doubt servers are going to bite heavily on that, since they don't care about idle power - servers are specced at some reasonable approximation of full load.

I guess talking it through, perhaps one all-big + one mixed chiplet would be an ideal configuration here. Maybe they don't do an all-little chiplet at all and just do an all-big chiplet and a mixed chiplet (and then APUs). That gives you better increments as far as powering the thing up: you have "one chiplet up, little only", "one chiplet up, all cores up", and "mixed chiplet up + big chiplet up".

edit: I think "all-big" and "all-small" (or mixed chiplets with disabled big cores) is also going to be important going forward for segregating caches to prevent timing side-channels, because it seems obvious at this point that shared cache in a speculative architecture is a bottomless pit of vulnerabilities. the fix is going to be segregating "secure tasks" (where you would be concerned with data leakage) onto a slower, secure chiplet that does much less speculation and ideally shares as little cache as possible between threads, likely with no SMT/hyperthreading (since that is also a bottomless pit of vulnerabilities), while letting CPU-intensive tasks that don't need security run on faster cores that do speculation/etc. this maps precisely onto the "big.LITTLE" model that both companies are now embracing - perhaps minus the OoO/speculation that Intel has adopted lately in Silvermont/Goldmont/Tremont. you just need to do the work of whitelisting which tasks are probably fine to run on a faster, insecure core (see the sketch below).
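
Here's a toy sketch of that default-deny whitelist idea; the core IDs and task names are made up, and a real scheduler would key off something sturdier than a process name.

```python
# Toy placement policy: tasks run on the fast, speculating cores only if
# explicitly whitelisted as not handling secret data; everything else is
# confined to the low-speculation "secure" cores. All names hypothetical.

BIG_CORES    = {0, 1, 2, 3}   # deep speculation, SMT, shared L3
SECURE_CORES = {4, 5, 6, 7}   # minimal speculation, no SMT

SAFE_ON_BIG = {"compiler", "renderer", "game_logic"}  # assumed whitelist

def allowed_cores(task: str) -> set:
    """Default-deny: anything not whitelisted stays on the secure cores."""
    return BIG_CORES if task in SAFE_ON_BIG else SECURE_CORES

for task in ("compiler", "ssh-agent", "password_manager"):
    print(f"{task:16s} -> cores {sorted(allowed_cores(task))}")
```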

AMD's model, where you have multiple basically independent chiplets, with their own caches, that just happen to share a memory controller (but no cache on the memory controller itself), fits this concept quite well. And they can mix architectures as well, since the only thing the IO die or CCDs care about is talking Infinity Fabric to the other side; all the caching is completely self-contained on the chiplet.

Paul MaudDib fucked around with this message at 03:57 on May 8, 2021

taqueso
Mar 8, 2004


:911:
:wookie: :thermidor: :wookie:
:dehumanize:

:pirate::hf::tinfoil:

I want to go to the adidasamd.com customization page and be able to select n x m chiplets to fill up the package.

Anime Schoolgirl
Nov 28, 2002

please don't bring back jaguar for the little cores :negative:

Icept
Jul 11, 2001
What's the theoretical use case for the shame chiplet? Running all the Windows / OS / services / background stuff on it and devoting the big boy package to the foreground application?

Truga
May 4, 2014
Lipstick Apathy

Icept posted:

What's the theoretical use case for the shame chiplet? Running all the Windows / OS / services / background stuff on it and devoting the big boy package to the foreground application?

theoretically that's possible, sure. but:
1) that requires placing trust in an OS scheduler to do its job properly. windows sometimes already doesn't do a terribly good job on a relatively homogeneous chiplet design like ryzen and requires manual intervention with thread/core affinity (see the sketch after this list)
2) you get 12-16 threads on a mid-range ryzen and they're all good threads lol
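
For reference, the manual intervention in point 1 looks something like this on Linux, where you can pin a process to a chosen set of logical CPUs yourself rather than trusting the scheduler. The core IDs are illustrative.

```python
# Restrict the current process to an assumed set of preferred cores.
# Linux-only API; core numbering is illustrative.
import os

PREFERRED_CORES = {0, 1, 2, 3, 4, 5, 6, 7}

os.sched_setaffinity(0, PREFERRED_CORES)  # pid 0 means "this process"
print("now restricted to CPUs:", sorted(os.sched_getaffinity(0)))
```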

Icept
Jul 11, 2001
Agreed... but why are AMD/Intel pursuing it for desktop CPUs? It makes sense for mobile because 80% of the time you're just texting or whatever so there's no reason to burn battery on the big cores.

Or is it just because the desktop CPUs are derived from a common stack with an APU/laptop focus so the smol cores got to happen just by association?

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Anime Schoolgirl posted:

please don't bring back jaguar for the little cores :negative:

your vile wishes can't blot out my pure love for the $50 Kabini AM1 combo

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
zen4 is also supposed to introduce AVX-512 support :chaostrump:

Pablo Bluth
Sep 7, 2007

I've made a huge mistake.

Paul MaudDib posted:

zen4 is also supposed to introduce AVX-512 support :chaostrump:
Linus Torvalds will be happy...

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

Icept posted:

What's the theoretical use case for the shame chiplet? Running all the Windows / OS / services / background stuff on it and devoting the big boy package to the foreground application?

Yes. Consider these approximations for Apple's M1 small cores relative to M1 big cores:

Area: ~0.25x
Power @ max freq: ~0.1x
Perf @ max freq: ~0.33x

The small cores have about 3.3x perf/W (:eyepop:) and 1.3x perf/area. You wouldn't want a chip with nothing but the small cores since high ST performance is quite important for general purpose computing, but having some small cores is awesome. Using less energy to run all those lightweight system threads frees up power to run the threads you want to go fast on the big cores.
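
The ratios fall straight out of those three approximations:

```python
# Small core vs big core, using the approximations quoted above.
area, power, perf = 0.25, 0.10, 0.33

print(f"perf/W:    {perf / power:.1f}x")  # ~3.3x
print(f"perf/area: {perf / area:.1f}x")   # ~1.3x
```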

That said, will AMD and Intel have small cores as good as Apple's? Seems very doubtful! Small cores are where you expect the advantages of a clean RISC architecture to be greatest, and Apple's been putting a lot of effort into their small core designs for a long time, while AMD and Intel have not.

And will Microsoft have a scheduler as good at using small cores as Apple's? Also doubtful.

ConanTheLibrarian
Aug 13, 2004


dis buch is late
Fallen Rib
Shame chiplet has to become the accepted nomenclature.

BobHoward posted:

The small cores have about 3.3x perf/W (:eyepop:) and 1.3x perf/area.
I think this is the key to their thinking. When I first heard AMD/Intel were looking at big/little designs, I was very sceptical. However, you can fit a lot of little cores in the space of a few big ones. Just taking games as an example, an engine may be better off with, say, 4 high-speed cores for critical-path logic and 16 little ones for worker threads than with 8 big cores. Plus, when someone is just browsing the web or writing word docs, the big cores can be powered down.
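
A crude throughput-per-area check of that trade, reusing the ratios quoted above (~0.25x area, ~0.33x perf per little core) and ignoring memory bandwidth and parallel-scaling limits entirely:

```python
# Compare 8 big cores against 4 big + 16 little in the same silicon area,
# using the quoted M1-ish ratios. Napkin math: assumes perfectly parallel
# work and no shared bottlenecks.
LITTLE_AREA, LITTLE_PERF = 0.25, 0.33  # relative to one big core

for big, little in [(8, 0), (4, 16)]:
    area = big + little * LITTLE_AREA
    throughput = big + little * LITTLE_PERF
    print(f"{big} big + {little:2d} little: area {area:.0f}, throughput {throughput:.2f}")
```

Same area, roughly 16% more aggregate throughput, at the cost of lower per-thread performance on the little cores.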

quote:

And will Microsoft have a scheduler as good at using small cores as Apple's? Also doubtful.
It's not like Apple are the only company who use big/little designs. MS can just rip off whatever Android does.

ConanTheLibrarian fucked around with this message at 12:36 on May 8, 2021

Bofast
Feb 21, 2011

Grimey Drawer

Anime Schoolgirl posted:

please don't bring back jaguar for the little cores :negative:

I rather like my old E-350 Bobcat-based server that is still running in my living room, but I agree :D

karoshi
Nov 4, 2008

"Can somebody mspaint eyes on the steaming packages? TIA" yeah well fuck you too buddy, this is the best you're gonna get. Is this even "work-safe"? Let's find out!

BobHoward posted:

Yes. Consider these approximations for Apple's M1 small cores relative to M1 big cores:

Area: ~0.25x
Power @ max freq: ~0.1x
Perf @ max freq: ~0.33x

The small cores have about 3.3x perf/W (:eyepop:) and 1.3x perf/area. You wouldn't want a chip with nothing but the small cores since high ST performance is quite important for general purpose computing, but having some small cores is awesome. Using less energy to run all those lightweight system threads frees up power to run the threads you want to go fast on the big cores.

A smol core EPYC with AVX-512 at half the area would mean 128 cores on 7nm, with 2048 SP ALUs ("CUDA cores"). At 3+ GHz that's 6 teraflops SP (fp32), 12 teraflops fp16. Double for 5nm. Double for FMA if your marketing department is watching. OFC without texture samplers graphics are out of the question, and without a coherency/sorting engine so is ray tracing.
6 teraflops would need up to 2 reads and 1 write per fp32 op, demanding peak bandwidths of 48 TB/s for reads and 24 TB/s for writes.
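
Reproducing that napkin math (the 48/24 TB/s figures come from rounding the 6.1 TFLOPS down to an even 6):

```python
# 128 cores, each with one 512-bit fp32 vector unit (16 lanes), at 3 GHz,
# not counting FMA as two ops.
cores, lanes, hz = 128, 512 // 32, 3.0e9

flops = cores * lanes * hz  # ~6.1e12 fp32 ops/s
print(f"fp32: {flops / 1e12:.1f} TFLOPS (x2 for FMA, x2 again for fp16)")

# Worst-case streaming: 2 reads + 1 write of 4 bytes per fp32 op.
print(f"reads:  {flops * 2 * 4 / 1e12:.1f} TB/s")
print(f"writes: {flops * 1 * 4 / 1e12:.1f} TB/s")
```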

Would such a machine be a good ML training workhorse?

Arzachel
May 12, 2012
I feel that until AMD can get Fabric/IO power down, small core and mixed chiplets just don't sound very appealing. Maybe if you add a small core cluster to the IO die and power down the IF links and chiplets, but then you probably have to fab it on an advanced node.

PC LOAD LETTER
May 23, 2005
WTF?!
Speculation: They could put the small cores on the IO chiplet and for low power operation just turn off the big cores/chiplet + IF bus and save power that way. Would give the small cores slightly better main system RAM latency too for a percent or 2 more performance.

Since the small cores are supposed to be actually small + low power, and the IO die is already a fairly decent size and required for their chiplet approach, squeezing them on shouldn't be too onerous.

Yeah, the process for the IO die is different ("12"nm GF process) and not nearly as good as TSMC's 7nm or 6nm processes, but it'll be good enough for some low-power-optimized, lower-priority cores at ~2-3 GHz, which is likely all that's necessary.

Seamonster
Apr 30, 2007

IMMER SIEGREICH
Wait, if the IMC is inside the IO die, will moving to DDR5 and its higher bandwidth increase power usage? That might necessitate dropping it down to 7nm.

PC LOAD LETTER
May 23, 2005
WTF?!
Maybe? edit: The IOD uses ~15W. I believe the IF bus power use is a bigger issue than the memory controller, especially on Epyc. We don't have any numbers for a DDR5 memory controller, so all we can do is guess. From what I recall, they're using GF's 12nm for the IOD right now because memory controller scaling is so abysmal on smaller nodes that they'd get nearly no benefit while paying much higher costs and dealing with more supply constraint issues.

It's always possible they could use TSMC's 10nm process instead if they really do have to use a more power-efficient, smaller-feature process.

PC LOAD LETTER fucked around with this message at 08:27 on May 9, 2021

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
Supposedly AMD is moving to a 6nm IO die at some point here.

ConanTheLibrarian
Aug 13, 2004


dis buch is late
Fallen Rib

PC LOAD LETTER posted:

Speculation: They could put the small cores on the IO chiplet and for low power operation just turn off the big cores/chiplet + IF bus and save power that way. Would give the small cores slightly better main system RAM latency too for a percent or 2 more performance.

Since the small cores are supposed to be actually small + low power, and the IO die is already a fairly decent size and required for their chiplet approach, squeezing them on shouldn't be too onerous.

Yeah, the process for the IO die is different ("12"nm GF process) and not nearly as good as TSMC's 7nm or 6nm processes, but it'll be good enough for some low-power-optimized, lower-priority cores at ~2-3 GHz, which is likely all that's necessary.

I don't think this will happen. Cores (especially small ones) occupy a surprisingly low proportion of the CPU's area. With small cores, cache would be the dominant feature. The IO die would have to be substantially larger to fit the compute elements.

For reference, here's a Zen 2 die. Purple is L3, orange is L2, green is the core.

PC LOAD LETTER
May 23, 2005
WTF?!

ConanTheLibrarian posted:

I don't think this will happen.

Why would the small low power cores have to have the same amount of cache or more, though? Would an L3 even make much sense if they don't have to hop over the IF bus to get to system RAM?

If it's mostly doing background or light-duty stuff anyway, won't the cache requirements be reduced a fair amount too?

If the cache requirements are as high as the main CPU's AND you need like 4 or 8 of them, then yeah, it starts to make less sense to put them on the IOD and it becomes more sensible to put them on the main die with the higher-power CPUs. edit: Or do a dedicated low power CPU die too, of course. Either could work.

PC LOAD LETTER fucked around with this message at 13:51 on May 9, 2021

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING

PC LOAD LETTER posted:

Why would the small low power cores have to have the same amount of cache or more, though? Would an L3 even make much sense

Cache and cache architectures are overwhelmingly important to the performance of modern CPUs. You don't dedicate 60%+ of your silicon to something if it ain't worth having.

It might be true that a small/light duty CPU can get away with less $L3 (compared to, say, a core which is tasked with HPC workloads) and still feel performant, but I genuinely don't think you'd enjoy using a CPU with none.

And that's before we get to the part where (I think) you'd need to re-architect the core fetching/scheduling logic.

But I feel that there's a common issue with all suggestions that core types should be blended (in any combination). AMD's stated reason for doing things the way they have with their current chiplet designs is that decoupling the compute cores from the ancillary functions of the CPU reduces the unit size for lithography purposes -- you're now just fabbing repeating tiles of core/cache which can be sliced up for maximum yield. As soon as you start blending non-compute functionality back into compute dies, or blending core types on a die, or blending compute into the IO die, you've undone that advantage. If you're gonna go big.LITTLE and use chiplets, it makes the most sense to fab the little cores on their own wafers, for even higher yield.
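
The yield half of that argument is easy to see with a simple Poisson defect model, where yield ≈ exp(-area × defect density). Both numbers below are rough public estimates (TSMC 7nm-era defect density, approximate Zen 2 CCD size), not official figures.

```python
# Chance of a die being defect-free drops exponentially with its area,
# which is why many small chiplets out-yield one big monolithic die.
from math import exp

defects_per_mm2 = 0.1 / 100  # assumed ~0.1 defects per cm^2

for name, area_mm2 in [("Zen 2 CCD, ~74 mm^2", 74),
                       ("hypothetical monolithic 16-core, ~300 mm^2", 300)]:
    print(f"{name}: ~{exp(-area_mm2 * defects_per_mm2):.0%} defect-free")
```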

But IANAACPCE (I Am Not An AMD Capacity Planner/Computer Engineer) and plans do change, so this is all just guesswork based on AMD's previous statements.

gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy

mdxi posted:

Cache and cache architectures are overwhelmingly important to the performance of modern CPUs. You don't dedicate 60%+ of your silicon to something if it ain't worth having.

IIRC Intel's Broadwell desktop parts from 2015 were still competitive with 10th-gen Comet Lake not just because they had 6 MB of L3 cache, but because they also still had 128 MB of (eDRAM) L4 cache

PC LOAD LETTER
May 23, 2005
WTF?!
My understanding was that for Zen the large L3s were there to help make up for deficiencies in their memory controller, plus the small added latency from the IOD, plus to mitigate latency from moving things over the IF bus. All of that is important for performance, of course, but for a low power/low performance CPU would any of that be a priority? Particularly if 2 of those 3 issues could be eliminated by moving the little CPU cores to the IOD itself?

Yeah, more cache is going to be better, but I don't see what makes the L3 so much more worth it vs, say, more L1 or L2, which I would assume would be more valuable performance-wise even if you were much more limited in how much you could cram in.

I know Intel has an L3 with Tremont, which is their 10nm low power chip... but it's a much smaller one (4MB), it's shared across all 4 cores, and it seems more relevant for its use in an SoC (to help coordinate things with the iGPU and chipset) than for straight CPU performance alone.

Anyways, I'm not a chip designer either, but going by that example, even if an L3 really is necessary to get reasonable performance out of the little CPU cores, it appears to be needed to a significantly lesser degree than with Zen, so they wouldn't necessarily be stuck blowing over half the die space on cache for low power/performance use.

PC LOAD LETTER fucked around with this message at 18:06 on May 9, 2021

LRADIKAL
Jun 10, 2001

Fun Shoe
Since we're just making poo poo up, why not replace one of the 8 BIG cores on a CCX with 4 small cores, and give them the exact same amount of cache to share?

Arzachel
May 12, 2012

gradenko_2000 posted:

IIRC Intel's Broadwell desktop parts from 2015 were still competitive with 10th-gen Comet Lake not just because they had 6 MB of L3 cache, but because they also still had 128 MB of (eDRAM) L4 cache

I feel like this meme has been perpetuated by Anandtech running their benches on JEDEC memory.

LRADIKAL posted:

Since we're just making poo poo up, why not replace one of the 8 BIG cores on a CCX with 4 small cores, and give them the exact same amount of cache to share?

The IO die pulls ~15W to drive the IF links, which is probably more than you'd reasonably save by using the small cores.

Kazinsal
Dec 13, 2011



For comparison, dual channel DDR4-3200 peaks at a ~51 GB/s transfer rate with ~60 ns latency. Zen 3's L3, the largest and slowest of its caches, has a 600 GB/s transfer rate. I can't remember what L2 speeds are like but I know they're north of 2 TB/s, and L1 read is something like 4 TB/s with sub-nanosecond latency.
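
Where the ~51 GB/s comes from:

```python
# DDR4-3200: 3200 million transfers/s over a 64-bit (8-byte) channel,
# times two channels on a typical desktop board.
peak = 3200e6 * 8 * 2
print(f"peak: {peak / 1e9:.1f} GB/s")  # 51.2 GB/s, before real-world losses
```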

Cache is *really* goddamn important.

Truga
May 4, 2014
Lipstick Apathy

Arzachel posted:

The IO die pulls ~15W to drive the IF links, which is probably more than you'd reasonably save by using the small cores.

yeah, ryzen master tells me my zen2 cores idle at sub-1W and run a bunch of firefox tabs or a game at sub-10W, but the SoC draw itself just sits there at a 15+ W baseline constantly.

there's a ton of savings to be made there, and next to no savings in the CPUs themselves. it's the price for having 1600+ MHz ram, a dozen pcie4 devices, etc, though.

Sidesaddle Cavalry
Mar 15, 2013

Oh Boy Desert Map

Arzachel posted:

I feel like this meme has been perpetuated by Anandtech running their benches on JEDEC memory.

As a former Broadwell and current Coffee Lake owner, I agree it wasn't a particularly revolutionary turn, especially with it being a victim cache

Quaint Quail Quilt
Jun 19, 2006


Ask me about that time I told people mixing bleach and vinegar is okay

SourKraut posted:

200mm fans are still not ideal though.
How so? What's all this about negative pressure?

I've had a silverstone 90°-rotated case with 3 180mm fans on the bottom and one 120mm on top for 8 years, and my temps are better than anyone's I've ever talked to.

Almost nothing makes it through the dust shields, I clean inside like every 3 years.

Canned Sunshine
Nov 20, 2005

CAUTION: POST QUALITY UNDER CONSTRUCTION



Quaint Quail Quilt posted:

How so? What's all this about negative pressure?

I've had a silverstone 90°-rotated case with 3 180mm fans on the bottom and one 120mm on top for 8 years, and my temps are better than anyone's I've ever talked to.

Almost nothing makes it through the dust shields, I clean inside like every 3 years.

200mm fans are great for quietly moving a good amount of air at a low static pressure. So if your case setup supports the hardware configuration you want and you're happy with temp and noise, then great!

In a lot of situations though, their low static pressure will end up hurting someone's use case, such as when there's a radiator on the fan mount, or depending on the type of dust filters being used. The 180/200 mm opening size is also less favorable for noise attenuation, so depending on the GPU and other components in the case, and where you place the case, you may end up hearing quite a bit more than if it were a 120/140 mm fan.

Also, a lot of the cases I've seen that support 180 or 200mm fans usually supported two 120 or 140mm fans in the same spot. Two 120mm fans won't give you the same level of airflow-vs-noise performance, but two 140mm fans will probably exceed a single 180/200mm fan as long as you get PWM fans and don't mind spending some time adjusting the fan curve profile.
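
Rough swept-area numbers behind that claim, ignoring hubs and blade geometry, so treat this as a loose sanity check rather than an airflow measurement:

```python
# Two 140 mm fans present roughly the same disc area as one 200 mm fan,
# which is part of why the pair can keep up on raw airflow.
from math import pi

def disc_area_cm2(diameter_mm: float) -> float:
    return pi * (diameter_mm / 2) ** 2 / 100

print(f"2 x 140 mm: {2 * disc_area_cm2(140):.0f} cm^2")  # ~308
print(f"1 x 200 mm: {disc_area_cm2(200):.0f} cm^2")      # ~314
```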

It sounds like your case is just about the perfect setup for using 180/200mm fans though, if it truly supports three of them, because that's also the issue with most cases that do support them: It's usually just one spot within the case that can fit the fan size, so then you're stuck with either trying to use it as the sole discharge fan that is quiet and will move a lot of air and doesn't need a filter, but may not be in the best location in terms of airflow/thermodynamics, or otherwise using it as intake but losing airflow performance once you put a filter in front of it, and probably still needing at least one more fan somewhere to help maintain positive pressure.

VorpalFish
Mar 22, 2007
reasonably awesome™

SourKraut posted:

200mm fans are great for quietly moving a good amount of air at a low static pressure. So if your case setup supports the hardware configuration you want and you're happy with temp and noise, then great!

In a lot of situations though, their low static pressure will end up hurting someone's use case, such as when there's a radiator on the fan mount, or depending on the type of dust filters being used. The 180/200 mm opening size is also less favorable for noise attenuation, so depending on the GPU and other components in the case, and where you place the case, you may end up hearing quite a bit more than if it were a 120/140 mm fan.

Also, a lot of the cases I've seen that support 180 or 200mm fans usually supported two 120 or 140mm fans in the same spot. Two 120mm fans won't give you the same level of airflow-vs-noise performance, but two 140mm fans will probably exceed a single 180/200mm fan as long as you get PWM fans and don't mind spending some time adjusting the fan curve profile.

It sounds like your case is just about the perfect setup for using 180/200mm fans though, if it truly supports three of them, because that's also the issue with most cases that do support them: It's usually just one spot within the case that can fit the fan size, so then you're stuck with either trying to use it as the sole discharge fan that is quiet and will move a lot of air and doesn't need a filter, but may not be in the best location in terms of airflow/thermodynamics, or otherwise using it as intake but losing airflow performance once you put a filter in front of it, and probably still needing at least one more fan somewhere to help maintain positive pressure.

I'm guessing he's talking about the... RV05? The FT05 also has 180s with the rotated layout, but it's 2 instead of 3. Either way I believe both cases make extremely effective use of the 180mm fans. They do bottom -> top airflow, probably with low-impedance nylon filters (afaik silverstone filters are very good), and not much in the airflow path between the fans and the CPU/GPU. I believe both cases test very well for both acoustic efficiency and absolute cooling, even by more modern standards.

Kibner
Oct 21, 2008

Acguy Supremacy
Yeah, the RV05 and FT02 use that layout. They are not as good for GPU cooling as some modern cases, but are still among the best for CPU cooling, iirc. I'm using an FT02 right now but am wanting to switch to an O11-Mini once I can actually get GPUs again.

Gwaihir
Dec 8, 2009
Hair Elf
The FT02 is still probably my favorite case of all time; I REALLY wish silverstone had kept refreshing that style.

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE
man, this is a bummer of a comparison.

https://www.asus.com/us/Displays-Desktops/Mini-PCs/All-series/Mini-PC-PN50/

https://www.asrock.com/nettop/AMD/DeskMini%20X300%20Series/index.asp#Overview

the PN50 is a much nicer device overall (DP 1.4 support, USB 3.2 Gen 2 10 Gbps with DP 1.4 alt mode, etc) but it's fundamentally a NUC-style device with a laptop processor and the limited boost behavior that entails. Also, Asus doesn't seem to be pushing to put Zen3 in it: the PN51 refresh only uses Lucienne (5700U/5500U/5300U), which is Zen2 again.

The Deskmini X300 is better as a mini-PC and you could put a 5700G in it (assuming they update BIOS), with an actual noctua cooler, but it's only DP 1.2 and the IO kinda sucks in comparison.

I have a DP 1.4 monitor and while I know I'm not gonna get super great fps in modern titles there are probably lightweight titles/older titles where I could go higher than DP 1.2 supports. And Zen3 would be preferable to Zen2, as would the quieter noctua cooling and the better boost behavior. One has the performance, the other has the IO to actually get it to the monitor.

Not really sure I want to go all the way up to a full mITX board in a Mini Box M350 or something, but I guess that's the other option.

gradenko_2000
Oct 5, 2010

HELL SERPENT
Lipstick Apathy
https://www.anandtech.com/show/16677/amd-and-globalfoundries-update-wafer-share-agreement-through-2024

quote:

In what AMD/GloFo are calling the “A&R Seventh Amendment”, the updated amendment sets wafer purchase targets for 2022, 2023, and 2024. The full details on these targets are not yet available, however according to the 8-K filing, AMD expects to buy approximately $1.6 billion in wafers from GlobalFoundries in the 2022 to 2024 period.

As with the previous agreement, these targets are binding in both directions. GlobalFoundries is required to allocate a minimum amount of its capacity to orders from AMD, and AMD in turn is required to pay for these wafers, whether they use this capacity or not. For finished wafers, the agreement sets new, undisclosed prices. Meanwhile for any capacity AMD does not use, they will once again be required to pay GlobalFoundries a portion of the difference. GlobalFoundries will be also getting pre-paid for some of these orders in 2022 and 2023, though the 8-K form does not disclose by how much.

Arguably the bigger news here is that, outside of AMD’s minimum wafer purchase requirements over the next three years, the latest amendment otherwise further separates AMD and GlobalFoundries going forward, as it removes all other exclusivity commitments. This leaves AMD free to place orders at any fab on any process node that the company wishes, as opposed to having to use GlobalFoundries for 12nm and beyond.

Now with that said, the net impact of this change is likely to be limited as AMD was already free to pursue other fabs for 7nm and smaller nodes – which will be the vast majority of AMD’s needs over the next three years. But it does underscore how AMD and GlobalFoundries are slowly moving farther apart, as GlobalFoundries has left the race for cutting-edge manufacturing nodes.

It should also be noted that the latest WSA does technically extend the agreement one last(?) time. The previous seventh amendment was set to expire March 31st, 2024. Whereas the new amendment expires on December 31st, 2024. However other than adjusting it to cover the full calendar year, there are no current signs that AMD plans to significantly extend their current agreement with GlobalFoundries. By dropping all exclusivity agreements – and especially in the midst of this chip crunch – it looks like AMD is slowly winding down its dealings with GlobalFoundries for high-performance logic chips.

In the meantime, however, AMD still has three years and $1.6 billion in wafer orders to place at GlobalFoundries. According to a separate statement from AMD, these 12/14nm wafer orders will be used to fulfill orders for trailing-edge logic products, as well as for I/O dies for AMD's current-generation Ryzen and EPYC CPUs. As with their trailing-edge products, the company will still need to keep producing their current-gen products for a time, even after they're supplanted with newer technologies. And, given the ongoing chip crunch, having a contractually-guaranteed supply of chips is no doubt a great relief to some executives within AMD.

I can see continued production of Zen+ parts going well into 2022 what with all the increased demand and the GPU shortage, since a Ryzen 2400G or Ryzen 1600AF is still plenty of horsepower for basic computing or even mid-range gaming, but what would AMD even do with 12nm production well into 2023 and 2024?

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

gradenko_2000 posted:

https://www.anandtech.com/show/16677/amd-and-globalfoundries-update-wafer-share-agreement-through-2024


I can see continued production of Zen+ parts going well into 2022 what with all the increased demand and the GPU shortage, since a Ryzen 2400G or Ryzen 1600AF is still plenty of horsepower for basic computing or even mid-range gaming, but what would AMD even do with 12nm production well into 2023 and 2024?

they likely have long-term support agreements on Zen2/Zen3 (especially Epyc) and those agreements may bind them to specific part revisions without ANY changes to sub-assemblies. 2024 is 5 years from 2019 (Zen2) and 3 years from 2021 (official Milan launch).

AMD may be moving the majority of their production to IO dies based on TSMC 6nm soon, but some vendors will not want to re-qualify the parts even with a "shouldn't affect anything" change like swapping the IO die. You may think it behaves exactly the same, but it'll have slightly different microcode with a new quirk, slightly different behavior at thermal extremes, etc.

I was reading something written by an automotive engineer who was complaining that they had to change some microcontroller in their vehicle thanks to the recent shortages. They'd done all this work to requalify on the new part and everything looked good; then the actual production samples they got were a slightly different revision, and this one had a thermal protection that cut in a little sooner, so when they started making vehicles with them they started having all kinds of problems... so I'd imagine that some vendors write their supply agreements so that NOTHING can change; even if you think it's inconsequential, it may not be to some user.

also there is likely a long tail of production for the developing world. Brazil or something is not going to pay $850 for the latest whiz-bang 5950X, but an $80 1600AF is right up their alley. tbh it's a little mystifying that they ever stopped production at all, especially after the shortages hit; they should really have cranked it back up, because right now what they have in that price range is garbage, like the 200GE and 3000G.

the PS3 and XB360 had incredibly long production tails for exactly that reason. PS3 production only stopped in 2017.

Paul MaudDib fucked around with this message at 05:44 on May 15, 2021

ConanTheLibrarian
Aug 13, 2004


dis buch is late
Fallen Rib
Would they use 12nm for chipsets?

Bofast
Feb 21, 2011

Grimey Drawer

Paul MaudDib posted:

man, this is a bummer of a comparison.

https://www.asus.com/us/Displays-Desktops/Mini-PCs/All-series/Mini-PC-PN50/

https://www.asrock.com/nettop/AMD/DeskMini%20X300%20Series/index.asp#Overview

the PN50 is a much nicer device overall (DP 1.4 support, USB 3.2 Gen 2 10 Gbps with DP 1.4 alt mode, etc) but it's fundamentally a NUC-style device with a laptop processor and the limited boost behavior that entails. Also, Asus doesn't seem to be pushing to put Zen3 in it: the PN51 refresh only uses Lucienne (5700U/5500U/5300U), which is Zen2 again.

The Deskmini X300 is better as a mini-PC and you could put a 5700G in it (assuming they update BIOS), with an actual noctua cooler, but it's only DP 1.2 and the IO kinda sucks in comparison.

I have a DP 1.4 monitor and while I know I'm not gonna get super great fps in modern titles there are probably lightweight titles/older titles where I could go higher than DP 1.2 supports. And Zen3 would be preferable to Zen2, as would the quieter noctua cooling and the better boost behavior. One has the performance, the other has the IO to actually get it to the monitor.

Not really sure I want to go all the way up to a full mITX board in a Mini Box M350 or something, but I guess that's the other option.

What resolution and refresh rate is your monitor, anyway? As far as I can see on Wikipedia, DisplayPort 1.2 already supports up to 17.28 Gbit/s, which should allow for 1080p @ 240 Hz, 1440p @ 165 Hz, or 4K @ 75 Hz.
https://en.wikipedia.org/wiki/DisplayPort#Resolution_and_refresh_frequency_limits

Paul MaudDib
May 3, 2006

TEAM NVIDIA:
FORUM POLICE

Bofast posted:

What resolution and refresh rate is your monitor, anyway? As far as I can see on Wikipedia, DisplayPort 1.2 already supports up to 17.28 Gbit/s, which should allow for 1080p @ 240 Hz, 1440p @ 165 Hz, or 4K @ 75 Hz.
https://en.wikipedia.org/wiki/DisplayPort#Resolution_and_refresh_frequency_limits

Acer X34GS, 3440x1440 @ 180 Hz. This is effectively full utilization of DP 1.4.

DP 1.2 limits you to 100 Hz without an overclock. Not sure if you can combine DSC above that (I'd hope?)

For the record, there are very lightweight titles where this would be perfectly fine even up to 180 Hz. Team Fortress 2 at 3440x1440 never ate more than about 30% of a 1060 3GB for me, for example; it was always CPU-bottlenecked, and Zen3's CPU prowess would do well at that.
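
Back-of-envelope for those limits, assuming uncompressed 24 bpp and ignoring blanking overhead (so real requirements run somewhat higher):

```python
# Required link rate vs DP payload rates. 17.28 and 25.92 Gbit/s are the
# usable payloads of DP 1.2 (HBR2) and DP 1.4 (HBR3) after 8b/10b coding.
def gbit_per_s(w, h, hz, bpp=24):
    return w * h * hz * bpp / 1e9

DP12, DP14 = 17.28, 25.92

for hz in (100, 180):
    need = gbit_per_s(3440, 1440, hz)
    verdict = "fits DP 1.2" if need <= DP12 else "needs DP 1.4"
    print(f"3440x1440 @ {hz} Hz: ~{need:.1f} Gbit/s ({verdict})")
```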

Paul MaudDib fucked around with this message at 14:43 on May 15, 2021

Bofast
Feb 21, 2011

Grimey Drawer

Paul MaudDib posted:

Acer X34GS, 3440x1440 @ 180 Hz. This is effectively full utilization of DP 1.4.

DP 1.2 limits you to 100 Hz without an overclock. Not sure if you can combine DSC above that (I'd hope?)

For the record, there are very lightweight titles where this would be perfectly fine even up to 180 Hz. Team Fortress 2 at 3440x1440 never ate more than about 30% of a 1060 3GB for me, for example; it was always CPU-bottlenecked, and Zen3's CPU prowess would do well at that.

Ah, yeah, then it makes sense. Those high refresh rate ultrawide monitors do use a lot of bandwidth.
