|
Because the x86 instruction set sucks.
|
# ? Apr 28, 2020 17:21 |
|
ConanTheLibrarian posted:I was surprised by this and looked up some performance comparisons. It's crazy how close A13 is to Intel and AMD's desktop CPUs. Does anyone have any insights regarding how they've squeezed that much performance out of ARM cores (especially considering its clocks are way lower than desktop CPUs)? E.g. a more efficient ISA? Infinite money and being able to optimize around a very narrow TDP range and core count. I don't think the ISA matters much, especially since Apple are cracking ARM instructions into micro ops.
|
# ? Apr 28, 2020 17:32 |
NewFatMike posted:Apple also make extremely performant ARM cores as well, and having locked down the hardware side, they have a distinct advantage to move to another CPU architecture. ConanTheLibrarian posted:I was surprised by this and looked up some performance comparisons. It's crazy how close A13 is to Intel and AMD's desktop CPUs. Does anyone have any insights regarding how they've squeezed that much performance out of ARM cores (especially considering its clocks are way lower than desktop CPUs)? E.g. a more efficient ISA? Whether that'll translate to an HPC ARM core, we'll see - but I suspect that's the plan.
|
|
# ? Apr 28, 2020 17:38 |
|
Seems like ARM in the data centre would be a lot more viable if the CPU cores performed like the A13's though.
|
# ? Apr 28, 2020 17:48 |
|
ConanTheLibrarian posted:I was surprised by this and looked up some performance comparisons. It's crazy how close A13 is to Intel and AMD's desktop CPUs. Does anyone have any insights regarding how they've squeezed that much performance out of ARM cores (especially considering its clocks are way lower than desktop CPUs)? E.g. a more efficient ISA? I found Anandtech's comparisons and there's a pretty obvious hole in them - they use the default, lowest-common-denominator compiler target for their SPEC builds which means it only uses ancient SSE2 instructions, no AVX, no AVX512, not even SSE4. They use those results to draw the comparison that the A13 outperforms a Xeon 8176 in single threaded performance but the gigantic SIMD units on the Xeon are barely being utilized. Show me the A13 keeping up with Skylake in a heavily optimized SIMD workload and I'll be more impressed repiv fucked around with this message at 18:21 on Apr 28, 2020 |
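As an illustration, here's a toy loop (my own sketch, not AnandTech's actual SPEC harness; the flags are standard GCC/Clang options) showing how the compiler target caps what SIMD the binary can use:

```cpp
// saxpy.cpp -- toy loop showing how the compiler target limits SIMD codegen.
#include <cstddef>

// Built against the default x86-64 baseline (e.g. g++ -O3 saxpy.cpp), the
// autovectorizer may only use SSE2, because SSE2 is the only SIMD the x86-64
// ABI guarantees on every CPU. Rebuilt with -O3 -march=skylake-avx512, the
// very same source is allowed to use 256/512-bit AVX registers.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Compile it both ways with -S and diff the assembly: the baseline build is stuck on 128-bit xmm registers, while the -march build is free to use ymm/zmm. A benchmark built the first way never touches the wide SIMD units being compared here.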
# ? Apr 28, 2020 18:08 |
|
Ok that's interesting alright. I was wondering if Apple had their own homebrew SIMD instructions, seems not. Still, a lot of applications wouldn't make use of wide instructions. I'd guess there could be a lot of other tasks that are hardware accelerated in x64 but not ARM.
|
# ? Apr 28, 2020 18:49 |
|
Apple does have SIMD but it's the usual 128-bit NEON instruction set common to ARM chips, same width as old-school SSE on desktop. Zen2 and consumer Intel have 256-bit units and Intel's big chips are up to 512-bit.
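To make the width parity concrete, here's the same four-float add in both 128-bit ISAs (a sketch using the standard NEON and SSE intrinsics):

```cpp
// One 128-bit vector add on either architecture: four floats at a time.
#if defined(__ARM_NEON)
#include <arm_neon.h>
float32x4_t add4(float32x4_t a, float32x4_t b) { return vaddq_f32(a, b); }
#else
#include <xmmintrin.h>  // SSE, assuming an x86 target
__m128 add4(__m128 a, __m128 b) { return _mm_add_ps(a, b); }
#endif
// On x86 the same idea scales up: _mm256_add_ps handles 8 floats (AVX) and
// _mm512_add_ps handles 16 (AVX-512). Baseline NEON tops out at the 128-bit
// version above.
```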
|
# ? Apr 28, 2020 18:53 |
|
BobHoward posted:I'm not a real expert on the topic, but I did do some experimental implementation work once on a Forward Error Correction (FEC) decoder for long haul fiber 100G networking. It was experimental in that my starting point was working ASIC source code, and I was asked to see if it was possible to port it to work at full 100G rate in FPGAs, for Reasons. I didn't succeed, the original design was too dependent on things ASICs do way better than FPGAs, and it was deemed not important enough to spend more effort on. What was the original ASIC RTL on, 28 nm? You'd basically have to reach to get the fattest SerDes / transceivers on a given FPGA family to get to a point where you could keep up with that traffic at maybe a ~200 MHz clock rate. D. Ebdrup posted:The ARM Morello, which underlies the Neoverse based Graviton2 chip, is making huge waves in the server market too. Not quite Graviton2 related but I recently learned about the AWS Nitro card / accelerator and that seems like cool kit. I had been bouncing around a project idea for like the past 10 years of putting an LPC/SPI snoop module in to monitor system firmware for unknown changes, and I guess when you have as many servers as AWS and can homegrow everything, a custom network controller, supervisor IC and crypto accelerator is a no-brainer. gradenko_2000 posted:is there such a thing as an ARM desktop CPU that you could buy and build into like an Intel Core CPU? If you mean actually integrated as a core into the CPU, unless things have changed radically (and they probably have), we all have traces of StarFox for the SNES running in our PCHs. IIRC, the ME firmware runs on a Synopsys ARC, which is the Argonaut RISC Core, which traces its heritage back to Jez San, Argonaut Games, and the SuperFX chip for the Super Nintendo. Thanks, trivia brain.
|
# ? Apr 28, 2020 19:27 |
|
repiv posted:I found Anandtech's comparisons and there's a pretty obvious hole in them - they use the default, lowest-common-denominator compiler target for their SPEC builds which means it only uses ancient SSE2 instructions, no AVX, no AVX512, not even SSE4. That's interesting; this is the first time I've heard about this in all of the A13 vs. desktop chip comparisons. What are some common desktop applications that really take advantage of AVX/SSE4? I could see a future where all consumer devices run ARM while servers stay on x86, but maybe I'm missing something big (besides legacy application support for ARM, of course) where keeping x86 still makes more sense for desktop/laptop.
|
# ? Apr 28, 2020 20:01 |
|
CFox posted:That's interesting; this is the first time I've heard about this in all of the A13 vs. desktop chip comparisons. What are some common desktop applications that really take advantage of AVX/SSE4? I could see a future where all consumer devices run ARM while servers stay on x86, but maybe I'm missing something big (besides legacy application support for ARM, of course) where keeping x86 still makes more sense for desktop/laptop. I mean, the biggest thing is legacy code — the Windows, Office, etc. codebase and APIs have been x86 for what, nearly 3 decades now? While parts of those applications are being re-written, there's absolutely core functionality (Excel still duplicates Lotus 1-2-3 bugs intentionally) that would likely require significant rewrite/revalidation to move. I've never hosed around with the ARM-based Office apps, but I do know the web apps suck and macOS Office still isn't at feature parity with Windows, even after they vastly improved it. OTOH, engineering software will never move off x86 — not a significant part of the market, I know, but EDA tools and computation (things still running FORTRAN kernels) are stuck there. Could fit into your model where you access them remotely via an ARM thin client on an x86 server, though.
|
# ? Apr 28, 2020 20:16 |
|
In Apple's favor is that none of this discussion matters at all to the target market of a MacBook Air replacement with even better battery life and, as a bonus, your entire iOS back catalog of apps built in. The rub is that they basically get everything from Intel at cost, so switching to their own arch (or AMD) doesn't necessarily help their bottom line. That sweetheart deal is dependent on Apple remaining a closed Intel shop and will end for all of their products the moment Apple announces an Arm MBA, so it is a big step for Apple to make. It will require going all in. And Intel has openly flexed at Apple in the past with things like the original Ultrabook initiative, so both have signaled they are willing to go to war. I assume Apple is also hesitant because of memories of getting stuck on their own proprietary, dying architecture while the rest of the market clobbered them. But like everyone has been saying for 5+ years now, I think it is a matter of "when" not "if" Apple makes the jump, and a question of how many re$ource$ Apple is willing to throw at the problem.
|
# ? Apr 28, 2020 20:39 |
movax posted:Not quite Graviton2 related but I recently learned about the AWS Nitro card / accelerator and that seems like cool kit. I had been bouncing around a project idea for like the past 10 years of putting an LPC/SPI snoop module in to monitor system firmware for unknown changes, and I guess when you have as many servers as AWS and can homegrow everything, a custom network controller, supervisor IC and crypto accelerator is a no-brainer. I believe it's how Netflix is able to serve ~200Gbps of TLS-encrypted video streams per server, using FreeBSD with the ccr(4) driver on a few NUMA domains.
|
|
# ? Apr 28, 2020 20:41 |
|
D. Ebdrup posted:Chelsio sticks the 100Gbps T6 crypto accelerator on quite a few NICs nowadays, so it's not exactly an unknown even outside the biggest butt provider. Huh, for whatever reason I thought AWS had totally rolled their own but in retrospect, what you said makes a lot more sense. Or, even if they did, I think Chelsio licenses the Terminator as a SIP core that Amazon could have rolled into a single-chip solution if they wanted to.
|
# ? Apr 28, 2020 20:46 |
|
repiv posted:Apple does have SIMD but it's the usual 128-bit NEON instruction set common to ARM chips, same width as old-school SSE on desktop. Zen2 and consumer Intel have 256-bit units and Intel's big chips are up to 512-bit. How good is ARM's variable-length SIMD extension? It should enable silicon vendors to tailor their SIMD performance (SIMD ALU width) to their market without requiring software changes. Has anybody gone full HPC with those extensions yet?
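That would be SVE, the Scalable Vector Extension, and vendor-tailored width without software changes is exactly the pitch: code is written vector-length-agnostic, so one binary runs on anything from 128-bit to 2048-bit implementations. A minimal sketch with the ACLE intrinsics, assuming an SVE-capable toolchain (e.g. gcc -O2 -march=armv8-a+sve):

```cpp
#include <arm_sve.h>
#include <cstdint>

// c[i] = a[i] + b[i] without ever hardcoding the vector width.
void vadd(const float* a, const float* b, float* c, int64_t n) {
    for (int64_t i = 0; i < n; i += svcntw()) {   // svcntw(): floats per vector, read from the hardware
        svbool_t pg = svwhilelt_b32(i, n);        // predicate also covers the loop tail
        svfloat32_t va = svld1_f32(pg, &a[i]);
        svfloat32_t vb = svld1_f32(pg, &b[i]);
        svst1_f32(pg, &c[i], svadd_f32_x(pg, va, vb));
    }
}
```

As for HPC uptake: Fujitsu's A64FX is the headline SVE chip so far, with 512-bit units, and it's the part the Fugaku supercomputer is built around.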
|
# ? Apr 28, 2020 20:50 |
movax posted:Huh, for whatever reason I thought AWS had totally rolled their own but in retrospect, what you said makes a lot more sense. Or, even if they did, I think Chelsio licenses the Terminator as a SIP core that Amazon could have rolled into a single-chip solution if they wanted to. It's already starting to show up from Google et al, so in a number of years there's gonna be a lot of retired gear showing up used. Will be interesting to see who gets their grubby little mitts on it. I'm certainly gonna try, but it'll almost certainly be out of my price league.
|
|
# ? Apr 28, 2020 21:02 |
|
D. Ebdrup posted:Chelsio sticks the 100Gbps T6 crypto accelerator on quite a few NICs nowadays, so it's not exactly an unknown even outside the biggest butt provider. The 200G result was with encryption on the CPU. Unrelated to that: https://twitter.com/whataintinside/status/1255027517787901952?s=21
|
# ? Apr 28, 2020 21:03 |
|
needs audio
|
# ? Apr 28, 2020 21:31 |
PCjr sidecar posted:The 200G result was with encryption on the CPU.
|
|
# ? Apr 28, 2020 21:42 |
|
The Comet Lake lineup has leaked; the embargo date is April 30 at 6 AM Pacific. No real surprises IMO, basically maintained the current i3/i5/i7/i9 price tiers apart from slightly deeper discounts on the F and KF variants than before. The i3 line continues to be garbage, the i5s fall somewhere between "clocked too low to decisively beat the 3600" and "too expensive compared to the 3600", and the i9 lineup is priced above the 3900X while you still get two fewer cores. The 10700F looks like the winner there. $298 for an 8C/16T part with a 4.6 GHz all-core turbo isn't bad; that's a nice chip for gaming.
|
# ? Apr 28, 2020 23:52 |
|
Paul MaudDib posted:No real surprises IMO, basically maintained the current i3/i5/i7/i9 price tiers apart from slightly deeper discounts on the F and KF variants than before. The i3 line continues to be garbage, the i5s fall somewhere between "clocked too low to decisively beat the 3600" and "too expensive compared to the 3600", and the i9 lineup is priced above the 3900X while you still get two fewer cores.
|
# ? Apr 29, 2020 00:52 |
|
I think you'll find that Intel's incredible R&D teams have revolutionized computing again and their hard work allows all processor markets access to this valuable technology which was invented by Intel the leader in processor innovations.
|
# ? Apr 29, 2020 01:43 |
|
The G5900 and G5920, 58W at $42 and $52, priced to match the 3000G? Do they still shift those parts? Also interesting that everything is rated 125W or 65W; curious what they'll really consume, since 10 cores at 4.8 GHz has to be thirsty, surely.
|
# ? Apr 29, 2020 02:09 |
|
snickothemule posted:The G5900 and G5920, 58W at $42 and $52, priced to match the 3000G? Do they still shift those parts? The only parts in that list with a chance of drawing at or below their rated TDP while sustaining their all-core boost are the 10400 and below. Those "65W" 8C/10C chips would be drawing north of 150W unless power limits are enforced in the BIOS.
|
# ? Apr 29, 2020 03:16 |
|
10400F vs 3600 will be very interesting at that price, too. Will also be interesting to see if Intel has the 14nm manufacturing capacity to actually bring all of these to market quickly, or if they end up mostly going to SIs and very rarely showing up in boxed form like a lot of the CLR parts.
|
# ? Apr 29, 2020 03:41 |
|
Do these require new motherboards or will they work with the same old motherboards meant for 8 and 9 series? They're the same 14nm++++ chips anyways...
|
# ? Apr 29, 2020 05:18 |
|
Ihmemies posted:Do these require new motherboards or will they work with the same old motherboards meant for 8 and 9 series? New socket. For the last decade, Intel has given you 2 generations per socket and that's it. So you can probably expect to get Rocket Lake on this same socket next year, and that's it. To be fair, after next year, both AMD and Intel will be forced to new sockets for DDR5.
|
# ? Apr 29, 2020 06:06 |
|
The low-end pricing seems... not great? 64 USD for a 4.0 GHz 2c/4t Pentium or 122 USD for a 4c/8t Core i3 is going to be tough to justify against the Athlon 3000G, the 1600 AF, the 3100/3300X, and the 2400G / 3400Gs.
|
# ? Apr 29, 2020 06:14 |
|
Cygni posted:New socket. For the last decade, Intel has given you 2 generations per socket and thats it. So you can probably expect to get Rocket Lake on this same socket next year, and thats it. To be fair, after next year, both AMD and Intel will be forced to new sockets for DDR5. Yeah, there are rumors about a 7nm Meteor Lake floating around already, with a new socket again (LGA1700). I don’t think the launch Z370 boards would be able to handle the new -K parts, I expect those power consumption numbers to be quite up there.
|
# ? Apr 29, 2020 07:15 |
|
gradenko_2000 posted:The low-end pricing seems... not great? 64 USD for a 4.0 GHz 2c/4t Pentium or 122 USD for a 4c/8t Core i3 is going to be tough to justify against the Athlon 3000G, the 1600 AF, the 3100/3300X, and the 2400G / 3400Gs. Intel offers a lot of side benefits like motherboard designs, marketing dollars, etc. to OEMs that make these prices pretty meaningless. You'll find plenty of examples of entire computers that are somehow only double the list price of the CPU, and it's because Intel kicks so much back.
|
# ? Apr 29, 2020 08:21 |
|
Palladium posted:The only parts in that list with a chance of drawing at or below their rated TDP while sustaining their all-core boost are the 10400 and below. Those "65W" 8C/10C chips would be drawing north of 150W unless power limits are enforced in the BIOS. Intel only rates TDP at base clocks anyway. By and large, these numbers say nothing at all about real power consumption.
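If you want actual draw numbers rather than the sticker, measure them. On Linux, Intel's RAPL energy counters are exposed via powercap sysfs; a rough sketch (the path below is the standard intel-rapl layout, but reading it may require root, and the counter periodically wraps):

```cpp
#include <chrono>
#include <fstream>
#include <iostream>
#include <thread>

// Reads the cumulative package energy counter, in microjoules.
static long long read_energy_uj() {
    std::ifstream f("/sys/class/powercap/intel-rapl:0/energy_uj");
    long long uj = 0;
    f >> uj;
    return uj;
}

int main() {
    long long e0 = read_energy_uj();
    std::this_thread::sleep_for(std::chrono::seconds(1));
    long long e1 = read_energy_uj();
    // Microjoules consumed over one second = microwatts of sustained draw.
    std::cout << (e1 - e0) / 1e6 << " W package power\n";
}
```

Run it next to an all-core load and you can see exactly how far past the rated "65W" these parts sail once turbo and motherboard power limits come into play.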
|
# ? Apr 29, 2020 13:48 |
|
The OEM partners are starting to put up pages for tomorrow's (likely fairly boring) Comet Lake launch. Biostar is apparently going to try to get back into higher-end boards, but still has some excellently bad copywriting. Huge image: https://www.biostar.com.tw/event/z490/img/Z490%20Series.jpg Some fav portions: If i pay more, can i get them over protected instead? I hate it when my motherboard dilates time. The everyday temp scale everyone uses for VRMs, Kelvin. Another excellent graph scale. So much to love in this image. I think i like the misaligned text on the monitor the most.
|
# ? Apr 30, 2020 04:34 |
|
the sata plugs... what
|
# ? Apr 30, 2020 04:45 |
|
the z490 is the high-end board, right? is there gonna be an equivalent to the h310?
|
# ? Apr 30, 2020 04:51 |
|
Unless the 'rona gets way worse in China and continues to interrupt supply chains, count on an ARM MacBook this year. /year of linux on the desktop, etc.
|
# ? Apr 30, 2020 04:52 |
|
Crunchy Black posted:Unless the 'rona gets way worse in China and continues to interrupt supply chains, count on an ARM Macbook this year. Tbh, the thing spurring me to get a PC build done now rather than later is concern for the economy cratering and loving up supply chain/retail or similar (I honestly haven’t a clue, but states are starting mass layoffs now in a way that we hadn’t even seen in March as everybody starts running out of $$$$ to make payroll and poo poo).
|
# ? Apr 30, 2020 05:37 |
|
repiv posted:I found Anandtech's comparisons and there's a pretty obvious hole in them - they use the default, lowest-common-denominator compiler target for their SPEC builds which means it only uses ancient SSE2 instructions, no AVX, no AVX512, not even SSE4. I'm on record elsewhere in the forums as thinking that ARM Macs are most likely a bit further away than this year. I believe that to do it, they'll need to have a plan to quickly transition the entire product line, including the Mac Pro (just as they did when it was PowerPC -> Intel). Unless they're OK with the (i)Mac Pro being a loss leader in a big way, it's hard to imagine how they decide it's worth it. On those products, Intel and AMD get to amortize their non-recurring engineering costs over orders of magnitude more volume than Apple will ever have for "Pro" desktop Macs, so even though Apple is paying Intel way over the marginal cost of production on workstation chips, they probably would do even worse with in-house chips. That said, who knows, maybe they've decided it's time to spend some of that cash mountain. If they do, I want to point out some things which counter your argument about SIMD performance. The first is that AVX512 doesn't matter all that much. There's a reason why Intel goes so far as to power gate a bunch of the hardware and turn the clock up when full-width instructions aren't in use, even though this mode-switching creates nasty startup/shutdown performance penalties. Programs which only use SSE2 are vastly more common than programs which use AVX512. (Furthermore, most SSE2 programs don't even use it for SIMD, just as a saner way of doing scalar FP than x87.) Second is that Apple provides the Accelerate framework, a library which provides a ton of the algorithms SIMD is most frequently used for. If your application can get what it needs from Accelerate, you don't have to worry about hand-coding with intrinsics or ASM, or optimizing for the specific processor in the user's machine - Apple did it for you. Any software which uses this will, upon being ported to ARM, automatically get the benefits of Apple's optimization work for ARM. (Possibly even unported software - ISTR that back in 2005, PowerPC apps which called Accelerate got the benefit of invoking a native x86 backend when run under emulation on an Intel Mac.) Third is that Apple is one of the few owners of an ARM architectural license, and is already known to be using the privileges it confers to add custom ARM ISA extensions to their Axx series chips for purposes like accelerating neural network code. That plus Accelerate as a frontend (to avoid needing to give apps direct access to custom, nonstandardized, and likely undocumented instructions) means that even if you think standard ARM SIMD is inherently terrible, Apple isn't completely limited by it. Fourth is that, to be honest, AVX512 has (mostly) proven to be a wet fart outside OS-provided libraries like Accelerate. As I understand it, adoption is much worse than 256-bit AVX2, which in turn is much worse than SSE. Most people who want really high performance on the type of math it's good at tend to push it out to GPGPU, and Apple will have plenty of options there (since they're designing their own GPUs too). AVX512's existence is more about Intel trying to keep the x86 ISA front and center (because Intel wants everything to be about x86) rather than it being the best solution for everything it tries to address.
(Also, Intel being Intel, AVX512 adoption got hurt by Intel dribbling out new AVX instructions piecemeal across many years, and instead of committing to support being cumulative, tried to use feature disable bits on many of them in their insane market segmentation game. In order to attract use outside of libraries like Accelerate, a SIMD ISA extension should be as close to universally supported as possible. Needing to figure out a maze of feature detection and fallback paths makes it so much harder for ISVs to deploy to random end user machines, so they don't bother unless there's a really compelling reason.)
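For the curious, the feature-detection maze looks roughly like this in practice, sketched with GCC/Clang's __builtin_cpu_supports (a real builtin; the three kernels are hypothetical placeholders an ISV would have to write, test, and support separately):

```cpp
#include <cstddef>

// Stand-ins for hand-tuned kernels: in real software each would be written
// with intrinsics for its ISA -- three codepaths to build and validate.
static void sum_avx512(const float* in, float* out, std::size_t n) { /* AVX-512 kernel */ }
static void sum_avx2(const float* in, float* out, std::size_t n)   { /* AVX2 kernel */ }
static void sum_sse2(const float* in, float* out, std::size_t n)   { /* SSE2 baseline */ }

// Runtime dispatch: pick the widest ISA this particular CPU actually has.
void sum(const float* in, float* out, std::size_t n) {
    if (__builtin_cpu_supports("avx512f"))        // real deployments also check avx512dq/bw/vl...
        sum_avx512(in, out, n);
    else if (__builtin_cpu_supports("avx2"))
        sum_avx2(in, out, n);
    else
        sum_sse2(in, out, n);                     // the x86-64 baseline every CPU supports
}
```

Libraries like Accelerate (or Intel's own MKL) exist largely so application developers never have to maintain this dispatch themselves.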
|
# ? Apr 30, 2020 09:17 |
|
I mean, why not just put an A13 right on the motherboard? What's the unit production cost on an ARM chip, even a ridiculously fat/wide one like the A13? Maybe 30 bucks? Hell, put it on an add-in card for older systems. Apple can do cryptographic signing so that it won't boot without talking to a TPM containing their signature or some poo poo. Think of it as the return of the 286 add-on card.
|
# ? Apr 30, 2020 09:33 |
|
Modern Macs already boot from an Apple-designed ARM chip. One could make an argument that the Intel CPUs in Macs are already co-processors. The co-processor approach is interesting to fantasize about, particularly with small x86-64 chips as co-processors that only spin up to natively run legacy code. This would retain backwards compatibility while teaching users to avoid the old "inefficient" (primarily for battery life) architecture. I imagine the current suppliers wouldn't be too pleased about this, but there are others that wouldn't mind selling a few cheap, small, low-power x86-64 cores as custom chips. eames fucked around with this message at 09:58 on Apr 30, 2020 |
# ? Apr 30, 2020 09:55 |
|
BobHoward posted:I'm on record elsewhere in the forums as thinking that ARM Macs are most likely a bit further away than this year. I believe that to do it, they'll need to have a plan to quickly transition the entire product line, including the Mac Pro (just as they did when it was PowerPC -> Intel). Unless they're OK with the (i)Mac Pro being a loss leader in a big way, it's hard to imagine how they decide it's worth it. On those products, Intel and AMD get to amortize their non-recurring engineering costs over orders of magnitude more volume than Apple will ever have for "Pro" desktop Macs, so even though Apple is paying Intel way over the marginal cost of production on workstation chips, they probably would do even worse with in-house chips. Tbf it’s less market segmentation and more a desperate attempt by the isa team to bolt on any (mostly ml) perf tweaks on 14nm skl core after the process team missed 10nm by 3 years.
|
# ? Apr 30, 2020 13:13 |
|
Buildzoid is livestreaming a teardown of a Z490 board on his Twitch channel.
|
# ? Apr 30, 2020 15:38 |