|
karoshi posted:Games won't support a feature for 0.001% of the market. In the productivity space support might trickle down from the Xeon line, like maybe AI apps that already support AVX512 on the ~cloud~. Or video encoding libraries used by the content providers. If consoles don’t support it (definitely CPU-wise), they won’t bother with it. Also, yes, Zoom virtual backgrounds are the first application I have encountered where I simply cannot run it on a 2600K. 10 years and it’s a goddamned virtual background feature on a videoconferencing application. I am not a graphics guy and I understand they are cross-platform but... points at OpenGL even iGPUs should be able to trivially do that task, right!?!
|
# ? Mar 31, 2021 17:44 |
|
|
If you have an RTX card you can use Nvidia Broadcast so that your GPU does the background replacement regardless of your CPU but... well...
|
# ? Mar 31, 2021 17:46 |
|
When I found out about the AVX requirement I got a huge chuckle because it took a global pandemic to finally make it a thing that was useful for the average everyday person. Congrats Intel. And then I chuckled again when I realized a lot of people had laptops that didn't support it.
|
# ? Mar 31, 2021 17:49 |
|
WhyteRyce posted:When I found out about the AVX requirement I got a huge chuckle because it took a global pandemic to finally make it a thing that was useful for the average everyday person. Congrats Intel. This is also why the next generation of Pentiums is getting AVX2 after having it gated behind the Core series for the longest time. It doesn't help, though, that Zoom's CPU support can be spotty; you still might not get their background to work even if your CPU is supposed to have those instructions. My Broadwell laptop and my Athlon 200GE couldn't do it.
|
# ? Mar 31, 2021 17:52 |
|
Epiphyte posted:What actually uses AVX-512 in home use? This is rather off to the side, but I learned yesterday from the ARMv9 announcement that there is an emerging standard for vector ops (alternative to Intel's AVX/2/512 and Arm's own Neon) called SVE2 (Scalable Vector Extensions 2). It allows vector ops on data of any width from 128 bits to 2048 bits, in 128-bit increments, on any CPU which implements SVE2, regardless of the native bit-ness of that CPU's vector circuitry. I assume that there are large performance hits for performing ops wider than the native width of your hardware -- as we saw with Zen/Zen+ CPUs, which could dispatch AVX2 ops but took two cycles to do so rather than one. At the moment there is exactly one CPU in the world which implements this: the Fujitsu A64FX, which is used in the Fugaku supercomputer. But in about 18 months it will be supported by all new Arm CPUs, and it would be cool if everyone settled on this rather than a never-ending series of vendor-specific extensions. (Lol.)
|
# ? Mar 31, 2021 17:55 |
|
I think the idea with SVE is that the application queries the natural vector width of the hardware and works with that, rather than hard-coding a particular width and expecting the hardware to deal with it. In the docs the vector types are defined like "svfloat32_t", which tells you it's a vector of 32-bit floats but not how many there are, because that's unknown until runtime.
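A sketch of what that loop shape looks like with the ACLE intrinsics (ARM-only, won't build for x86; the function name and specifics here are my own illustration of the model described above, not code from the SVE docs):

```c
#include <arm_sve.h>

/* Vector-length-agnostic scale-and-add: svcntw() reports how many 32-bit
 * lanes THIS cpu has, and the predicate masks off the ragged tail, so the
 * same binary runs on 128-bit mobile cores and the A64FX's 512-bit units. */
void scale_add(float *dst, const float *src, float k, long n)
{
    for (long i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32(i, n);        /* lanes past n go inactive */
        svfloat32_t v = svld1_f32(pg, src + i);
        svfloat32_t d = svld1_f32(pg, dst + i);
        svst1_f32(pg, dst + i, svmla_n_f32_x(pg, d, v, k));  /* d + v*k */
    }
}
```

No width appears anywhere in the source; the hardware decides at runtime, which is exactly the contrast with hard-coded AVX2/AVX-512 codepaths.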
|
# ? Mar 31, 2021 18:56 |
|
gradenko_2000 posted:AVX2 afaik Let's do virtual backgrounds with 3D particle movement then
|
# ? Mar 31, 2021 20:12 |
|
The 11400 is the one actually good RKL SKU but Intel must be very uninterested in selling it since they didn't give it to any reviewers. https://www.youtube.com/watch?v=upGjxnGaJeI
|
# ? Mar 31, 2021 20:42 |
|
the pissing and moaning about AVX-512 is going to look bad in retrospect in a couple years. AMD is implementing it on Zen 4 next year, meaning it'll finally be available in all product segments on both brands. The problem has always been "why write codepaths for hardware that doesn't exist": previously it's only been in servers, which is why you only saw HPC get written around it. It got added to laptops about 18 months ago, but only on quad-core ultrabooks, which aren't exactly where you do tons of heavy vector math, and only on Intel at that. This is the first time it's been available on the desktop outside the Skylake-X HEDT processors, which were lol and had a ton of performance gotchas that no longer exist on the new implementations.

GPUs can't fully replace AVX either. For example, nobody has ever been able to port x264 or x265 with their heavily branching codepaths; instead everyone in that segment has been forced to use hardware ASIC/SIP accelerator cores, which up until recently had significantly worse quality. Even today a deep motion search (eg veryslow) is still better than even the best NVENC cores (which approximate "medium" quality motion search). Which isn't to say that NVENC is bad, but there are certainly workloads where you can't just drop in a GPU and call it a day.

sending it off to a GPU also adds a lot of latency, which can be bad for something like inferencing where the inference is part of some larger computation. like, I don't know, maybe if you wanted to have a game where each unit runs an inference to decide what it should be doing. maybe if you are making a lot of runs of it you can batch it to amortize the transfer across a lot of units of computation, but maybe not. and these use-cases aren't particularly a problem anymore with downclocking, since that really no longer exists on ice lake or rocket lake. there is a reason that VNNI instructions were a specific focus to get added in AVX-512.
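For reference, the motion search being described boils down to sum-of-absolute-differences over pixel blocks, which is about the most SIMD-friendly loop there is. A toy scalar version (a hypothetical helper, nothing like x264's actual hand-written asm) that compilers will happily auto-vectorize with AVX2/AVX-512:

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over a 16x16 block: the inner loop of a
 * block-matching motion search. It's a pure reduction over byte data, so
 * the compiler can auto-vectorize it; with AVX-512BW one 64-byte register
 * holds four whole rows. Hypothetical helper, not x264's implementation. */
unsigned sad_16x16(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    unsigned sad = 0;
    for (size_t y = 0; y < 16; y++)
        for (size_t x = 0; x < 16; x++)
            sad += (unsigned)abs((int)cur[y * stride + x] - (int)ref[y * stride + x]);
    return sad;
}
```

The branchy part of an encoder is deciding *which* blocks to compare; that's what maps badly to GPUs, while the SAD itself maps beautifully to wide vectors.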
love the armchair experts (linus included) who think they know better than the experts at AMD and Intel who decided to write it and implement the instruction set. Even ARM, who had the opportunity to do it from scratch, is still doing NEON and SVE, because vector math is just very useful to have, as long as there's not a bunch of performance gotchas to using it. Paul MaudDib fucked around with this message at 22:31 on Mar 31, 2021 |
# ? Mar 31, 2021 22:12 |
|
movax posted:If consoles don’t support it (definitely CPU-wise), they won’t bother with it. you could trivially use a "virtual camera" which outputs some other video stream or some game as a virtual webcam, yes. figuring out where your head is in realtime, as you move, is the difficult part of the problem here, not the compositing. and that task is much more akin to something like a video-encoding motion search than a compositing task. and again, nobody has ever built a version of x264 or x265 motion search that works well on GPU architectures; everyone uses fixed-function hardware accelerators if they want to encode on a GPU, but it is very amenable to AVX acceleration.

it's probably valid to point out they should have written an SSE fallback (assuming SSE did what they needed; AVX, AVX2, and AVX-512 have all added new instruction types that go above and beyond just vector width), but I don't think anyone notable really cared about it as a feature until 12 months ago. maybe streamers, but it was always inferior to a greenscreen or a streaming camera with a depth sensor (at which point it's trivial, just composite over the areas where the measured depth is higher). it was probably a "nobody is going to use this anyway, why should we bother spending any time on it" and then lol Paul MaudDib fucked around with this message at 22:21 on Mar 31, 2021 |
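For what it's worth, the "SSE fallback" complaint is just runtime dispatch, which GCC and Clang make nearly free. A sketch (hypothetical function and path names, obviously not Zoom's code):

```c
/* Pick a codepath at startup instead of hard-requiring AVX the way Zoom's
 * background feature apparently did. Hypothetical dispatcher: each string
 * stands in for a function pointer to that build of the filter. */
const char *pick_filter_path(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();                  /* must run before the checks below */
    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("sse4.1"))
        return "sse4.1";
#endif
    return "scalar";                       /* always-correct fallback */
}
```

Call it once at startup, stash the result, and a 2600K quietly gets the SSE path instead of a greyed-out feature.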
# ? Mar 31, 2021 22:19 |
|
Kazinsal posted:So not much point in stepping up from my 8700K then, unless I want to spend $wtf on an 11900KF. Kinda disappointing. priznat posted:Yah it’s pretty good, what I got too. No rush to upgrade! It might go through 3 gpu generations in my system by the time I replace it! repiv posted:Same, I'll probably end up hanging on to this 8700K until DDR5 matures What up 8700K forever gang I'm also waiting for DDR5 and will make this 8700 into a bangin Plex server one day
|
# ? Mar 31, 2021 23:07 |
|
I've heard AVX-512 has some use cases for emulation software, nothing specific though.
|
# ? Mar 31, 2021 23:15 |
|
BaronVanAwesome posted:What up 8700K forever gang Going from a 2500K I am used to long lived machines. My 2500K is now my unraid/plex machine, until it dies!
|
# ? Mar 31, 2021 23:27 |
|
Paul MaudDib posted:love the armchair experts (linus included) who think they know better than the experts at AMD and Intel who decided to write it and implement the instruction set. Even ARM, who had the opportunity to do it from scratch, is still doing NEON and SVE, because vector math is just very useful to have, as long as there's not a bunch of performance gotchas to using it. love it when an armchair expert tries to armchair-expert a dude who once worked for Transmeta on their (admittedly unusual) x86 compatible CPU Torvalds has many faults, but you won't find many people better positioned to critique AVX512. His notorious "I hope AVX512 dies a painful death" rant was more or less immediately followed up by him admitting it was biased and performative, but he also had several actually interesting things to say about why AVX512 might not have been the greatest choice. IMO: 512-bit was clearly a good idea in its original context, which was an ISA extension designed for a special narrow market, HPC. (Yes, that's right, Larrabee was for HPC first and foremost - the GPU thing was a side project that the team was enthusiastic about but management wasn't.) But when it came time to push the Larrabee work into the mainstream x86 ISA, it's possible Intel should've reduced vector width. It's one thing to devote massive resources to SIMD when assuming the workload is nearly all SIMD, because HPC, but it's another when the applications are incredibly varied and relatively few can use SIMD.
|
# ? Mar 31, 2021 23:39 |
|
BobHoward posted:love it when an armchair expert tries to armchair-expert a dude who once worked for Transmeta on their (admittedly unusual) x86 compatible CPU and see, there's nothing wrong with the point that maybe 512b is too wide (implementing it in 2 cycles is fine), but that's not the nuance he made in his original post, or that everyone cites him over. What people cite him on is "AVX-512 = bad", not "dual 512-bits is too wide, but the instructions are a step forward in many respects and furthermore..."

it's linus, he's a complete shithead in general (the "oh I'm just blunt, it's just the way I am! maybe you're just too thin-skinned!" is the same thing every toxic engineer/manager always says), but with internet culture the way it is, the soundbite is all that matters. if he didn't think AVX-512 was a mistake he shouldn't have (true to his usual form) brashly stated exactly that in exactly as many words.

again, if AVX-512 was a mistake then AMD wouldn't be going ahead and implementing it too. They saw all the feedback and design flaws with the early implementations and pursued it anyway, because it's worth pursuing in general, even if maybe you don't go for 1024 bits' worth of vectors and you keep it so that it doesn't have to downclock. but oh, I guess linus worked on a failed processor that one time, so he's smarter than Jim Keller and Lisa Su, right? the people with billions of dollars of revenue riding on these design decisions, they don't know what they're talking about!

linus is a self-declared "filesystems guy" too, and yet he did the 'ZFS is a meme and nobody should use it, use BTRFS instead' thing too (just ignore the "not ready for production, may cause data loss" on half the features). What's that law about "when the news mis-reports some topic that you know about, you chuckle, but they're just as likely to mis-report on other topics and you don't realize it because you don't know about that topic"? Gell-Mann amnesia, that's the one.
Well, anyone who's used ZFS in production or knows the state of btrfs knows that Linus doesn't know what he was talking about there, and maybe it should give you pause when he opines on other things he thinks he's an expert on. He is a project manager and an engineer who works on kernel code; those are the things you should listen to him on.

not that any of this makes rocket lake good in general, it's obvious that AVX isn't the advantage Intel needs here, but it's going to be on both platforms next year whether people here like it or not, and we really should be moving past the stage where we have to care about whatever hyperbolic thing falls out of Linus's mouth this week, unless it's kernel-related. it doesn't matter what he thinks about global warming, it doesn't matter what he thinks about AVX-512, it's happening regardless of what he thinks.

anyway, I'm not armchair-experting anything, I'm deferring to the experts who think it's worth sinking a lot of money and silicon into implementing. Linus is the one making a positive claim that AVX-512 is benchmarketeering. I strongly doubt Lisa Su is doing it just for a couple benchmark wins if it's not going to be something that actually sells processors. Paul MaudDib fucked around with this message at 00:56 on Apr 1, 2021 |
# ? Mar 31, 2021 23:58 |
|
BobHoward posted:love it when an armchair expert tries to armchair-expert a dude who once worked for Transmeta on their (admittedly unusual) x86 compatible CPU man you dissed an intel product in the intel thread, that's the fuckin bat signal for paul to blow a big team blue load all over the thread that we'll be cleaning out of nooks and crannies for days (unironically would love to hear more about the innards of transmeta's CPUs if you're allowed to talk about it. might make a good discussion for the non-Intel non-AMD thread if I can get around to finishing an OP for it)
|
# ? Apr 1, 2021 00:32 |
|
Kazinsal posted:man you dissed an intel product in the intel thread, that's the fuckin bat signal for paul to blow a big team blue load all over the thread that we'll be cleaning out of nooks and crannies for days lol, who is defending an intel product? I'm defending an amd product here! The Intel one is kinda trash, but it seems likely AMD is going to do it better next year. AMD presumably thinks so too, seeing as they invested a lot of money to do it. Think Lisa Su is the type to waste money on winning a few benchmarks, if it's not going to be something that actually sells processors? anyway, if you don't want to read the forums then please don't, i wouldn't want that on my conscience Paul MaudDib fucked around with this message at 00:57 on Apr 1, 2021 |
# ? Apr 1, 2021 00:47 |
|
Intel are cheaper and available so whatever.
|
# ? Apr 1, 2021 00:54 |
|
Paul MaudDib posted:lol, who is defending an intel product? I'm defending an amd product here! The Intel one is kinda trash, but it seems likely AMD is going to do it better next year. Despite all the interesting things you know, your predictable rants and boring, pedantic walls of text make these threads worse. Your desire to be correct outweighs the positives of your knowledge.
|
# ? Apr 1, 2021 02:06 |
|
Linus Torvalds, who's that? Is he trying to be like the Tech Tips guy?
|
# ? Apr 1, 2021 02:30 |
|
MaxxBot posted:The 11400 is the one actually good RKL SKU but Intel must be very uninterested in selling it since they didn't give it to any reviewers. Now that B560 mobos also have fully unlocked memory OCing, the 11400F is a screaming deal for games. I certainly would take that over a Ryzen 3600.
|
# ? Apr 1, 2021 03:07 |
|
Linus is a smart guy who has done way more than I ever could, but I've worked with enough engineers like him that I will on principle never agree to any argument that attempts to appeal to his authority. All of a sudden I'm transported back to some conference room having the most aggravating conversations WhyteRyce fucked around with this message at 16:51 on Apr 1, 2021 |
# ? Apr 1, 2021 16:49 |
|
Paul is ultimately correct about AVX512. AVX512 sucked because of all the dumb decisions around its rollout and implementation. AVX512, and vector extensions in general, are very useful and it's great we are going to see a unified standard across the stack and from both companies. WhyteRyce posted:Linus is a smart guy who has done way more than I ever could but I've worked with enough engineers like him that I will on principle never agree to any argument made that attempts to appeal to his authority. Communication and correctly identifying what's going on are both real hard and are huge, constant problems in all aspects of life. I'm fairly sure we even suck at communicating directly with ourselves. Khorne fucked around with this message at 20:47 on Apr 1, 2021 |
# ? Apr 1, 2021 19:46 |
|
Khorne posted:"it just uses the navier-stokes equation so compare the code to that" by a top-of-their-field phd physicist who is an alleged co-author of the software. I've found that the key to dealing with these people or anyone who uses the phrase "it's just ..." is to put them solely in charge of fixing it. Either they're right, and the thing gets fixed, or they have to adjust their attitude and start collaborating.
|
# ? Apr 1, 2021 21:26 |
Linus probably isn't the best person to talk about this, since he's an OS kernel developer. It's extremely dubious whether AVX*, MMX, or even SSE is of any use in a kernel: almost everything you're doing there is short runs where latency matters more than throughput, and for those, the setup and state-save cost of the vector unit ends up adding more time than just doing a regular calculation on the ALU or FPU. Anything SIMD- or vector-like can be done in userspace, and as an added bonus, this makes it easy to set the affinity so that those processes never get moved off the CPU by the scheduler.
|
|
# ? Apr 1, 2021 21:28 |
|
BlankSystemDaemon posted:Linus probably isn't the best person to talk about this, since he's an OS kernel developer. Not emptyquoting. Besides, it's not like Linus is saying that Linux won't support AVX instructions, is he? As long as the operating system itself stays the hell out of the way of what is actually being done on said OS, everything's gravy. I don't think anyone ever expected Linux to use AVX for the kernel internals, after all.
|
# ? Apr 1, 2021 21:57 |
|
His point was more around the alternative uses that AVX512 silicon could be put to. It's a fair question to ask, and Apple's approach of lots of execution units shows how it can pay off in a way that benefits multiple workload types.
|
# ? Apr 1, 2021 23:13 |
|
Aren't GPUs better for a lot of the use cases that AVX-512 was designed for? i guess there's the case where the CPU<->GPU travel time is too expensive for a real-time application but i wonder how often that's a bottleneck. Updates for CUDA also seem a lot "cleaner" than the labyrinth of mapping Intel CPU to AVX support and coding specific paths for them
|
# ? Apr 2, 2021 00:07 |
|
Intel's math libraries use AVX512 and apparently it does bring good benefits to data science work. SCheeseman posted:I've heard AVX-512 has some use cases for emulation software, nothing specific though. https://www.reddit.com/r/emulation/comments/lzfpz5/what_are_the_implications_of_avx512_for_emulation/ And if AMD's implementing it, it might not get orphaned like TSX on rpcs3 lurksion fucked around with this message at 01:00 on Apr 2, 2021 |
# ? Apr 2, 2021 00:52 |
|
lurksion posted:Intel's math libraries use AVX512 and apparently it does bring good benefits to data science work. that thread kinda highlights the various issues with it -

quote:This is the issue with AVX-512; it's really a large family of loosely related instructions and should've been rolled out in smaller waves e.g. AVX-512A, 512B, etc. or even given different names. For example, BF16 is part of the AVX-512 suite despite seeing very bespoke implementations.

quote:Yeah, speaking off the record from professional experience: for our particular vector workloads on the hardware we happen to use it’s faster to disable AVX512 because the change in thermals causes clock throttling that leads to a net performance reduction.

quote:Oh, most important. Rocket Lake successor, Alder Lake, is rumoured to NOT have AVX-512 support. And I have no idea about Meteor Lake. So, AVX-512 has a non-zero risk of being orphaned on desktop (this is actually the second time that Intel tried to introduce AVX-512 to consumers, if you count the ill-fated 10nm Cannon Lake). Intel already announced AMX (Advanced Matrix Extensions) for the server Sapphire Rapids, and the HEDT line based on it will surely have it too, in the same way that AVX-512 was supported on HEDT while it took years to reach desktop.
|
# ? Apr 2, 2021 01:25 |
|
Kazinsal posted:man you dissed an intel product in the intel thread, that's the fuckin bat signal for paul to blow a big team blue load all over the thread that we'll be cleaning out of nooks and crannies for days If "allowed to talk about it" means you think I worked for Transmeta, just to be clear, I did not. I don't know a ton about Transmeta's architecture, other than it was a VLIW machine. They relied on their "Code Morphing System," a JIT, to translate x86 code to this proprietary VLIW ISA. The combo of CPU and low level firmware functioned like a real x86 - the native ISA wasn't documented, and iirc they took steps to prevent you from even trying to run native code yourself. Despite the protection, I recall people had some success at reverse engineering the native ISA. SwissArmyDruid posted:Not emptyquoting. Besides, it's not like Linus is saying that Linux won't support AVX instructions, is he? As long as the operating system itself stays the hell out of the way of what is actually being done on said OS, everything's gravy. I don't think anyone ever expected Linux to use AVX for the kernel internals, after all. ~Technically~ the OS does have to support AVX - the scheduler has to save and restore its registers when context switching. I think you're not allowed to use AVX registers inside the kernel, since that permits them to avoid saving/restoring context for every system call. AVX registers hold enough data that it's an important optimization (syscalls need to be very low latency).
|
# ? Apr 2, 2021 01:35 |
|
BobHoward posted:If "allowed to talk about it" means you think I worked for Transmeta, just to be clear, I did not. Technically you are allowed but you need a drat good reason. https://yarchive.net/comp/linux/kernel_fp.html Crypto code is about it iirc
|
# ? Apr 2, 2021 01:46 |
|
BobHoward posted:I think you're not allowed to use AVX registers inside the kernel, since that permits them to avoid saving/restoring context for every system call. AVX registers hold enough data that it's an important optimization (syscalls need to be very low latency). The main syscall handler doesn't preserve the state of any floating point or SIMD registers yeah, but you are allowed to use SIMD/FP instructions in the kernel. It just needs to be wrapped in code that pushes/pops that state manually. I know there's a bunch of AVX crypto code in the kernel and I think some of the software RAID stuff uses it as well. e: oops had this page open for a while and didn't refresh
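The pattern the kernel code uses looks roughly like this (illustrative fragment built on the real kernel_fpu_begin()/kernel_fpu_end() API; the function and its elided body are hypothetical, not actual kernel source):

```c
#include <asm/fpu/api.h>

/* Illustrative kernel-side XOR (the software-RAID use case): the vector
 * section is bracketed by kernel_fpu_begin()/kernel_fpu_end(), which save
 * and restore the task's FPU/SIMD state and disable preemption, because
 * the normal kernel entry path deliberately skips that work for speed. */
static void xor_blocks_simd(unsigned long words, unsigned long *a,
                            const unsigned long *b)
{
    kernel_fpu_begin();
    /* ... AVX loop doing a[i] ^= b[i] would go here ... */
    (void)words; (void)a; (void)b;   /* body elided in this sketch */
    kernel_fpu_end();
}
```

So the syscall fast path stays cheap, and the few kernel users of SIMD pay the save/restore cost explicitly and only when they actually vectorize.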
|
# ? Apr 2, 2021 03:21 |
|
shrike82 posted:Aren't GPUs better for a lot of the use cases that AVX-512 was designed for? i guess there's the case where the CPU<->GPU travel time is too expensive for a real-time application but i wonder how often that's a bottleneck. Of course this sort of stuff can be done on a GPU as well but these image operations are usually part of some bigger processing pipeline and it gets obnoxious to transfer the image back and forth between CPU and GPU for each pipeline step depending on how it's implemented, and a lot of filters aren't written for GPU processing, so there's a lot of value in doing these things on the CPU still. e: here's a spooky mix of C++ templates, C preprocessor macros and avx512 intrinsics if anyone is curious TheFluff fucked around with this message at 03:37 on Apr 2, 2021 |
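The image operations in question are mostly FIR filters: each output pixel is a small dot product of input pixels against precomputed coefficients. A toy scalar row resampler (hypothetical, far simpler than a real library's) whose inner multiply-accumulate loop is exactly what AVX FMA units eat:

```c
#include <stddef.h>

/* One row of a horizontal resampling filter. Each output pixel x is a
 * `taps`-long dot product of source pixels (starting at left[x]) with its
 * own coefficient set. The inner loop is a pure multiply-accumulate, so
 * it vectorizes cleanly. Hypothetical helper for illustration. */
void resample_row(float *dst, size_t dst_w,
                  const float *src, const float *coeffs,
                  const size_t *left, size_t taps)
{
    for (size_t x = 0; x < dst_w; x++) {
        float acc = 0.0f;
        for (size_t t = 0; t < taps; t++)
            acc += src[left[x] + t] * coeffs[x * taps + t];
        dst[x] = acc;
    }
}
```

Running a few of these stages back to back on the CPU avoids the PCIe round trip per stage that makes mixed CPU/GPU pipelines so obnoxious.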
# ? Apr 2, 2021 03:34 |
|
https://www.youtube.com/watch?v=oaB1WuFUAtw some perspective here from Dr Ian Cutress: Rocket Lake might be regarded as a win for Intel because it demonstrates that their design teams still have the chops to do new designs. The "backport" of the 10nm cores into 14nm is not an easy thing to do, regardless of the actual performance, and they're going to have to learn to do this sort of thing more often given that their plans involve working with other fabs beyond just their own. I don't know if I really buy that reasoning, mostly because A. Rocket Lake was already late/delayed in the first place, and B. the reason Intel needs to learn to work with other fabs is that they've had a hell of a time moving on from 14nm, but I thought the argument was interesting
|
# ? Apr 2, 2021 05:15 |
|
Everything's a win when you're a huge megacorp with too much inertia. If it wasn't a win, heads would be rolling
|
# ? Apr 2, 2021 06:22 |
|
They are clearly at the absolute edge of what 14nm can give them, so I agree that I don't think the architecture design team is really to blame honestly.

The design folks made Cannon Lake with the intention for it to launch on the original 10nm in 2015. That first iteration of 10nm pretty much completely failed as a node, so woops! Eat poo poo Cannon Lake, thanks for all the work design team, the thing is broken from day 1 thanks to manufacturing!

Then they made Ice Lake and Tiger Lake, both of which are good performers architecturally but on 10nm V2, which, while at least functional, never hit the targets it was supposed to hit. Ice Lake was supposed to launch in 2016... it still hasn't launched on server. Insane, and likely a result of the yields never ramping to make massive server dies profitable like planned. It also never hit the frequencies I believe they intended. So while better, still pretty much a huge letdown by manufacturing.

I personally think Rocket Lake exists mostly because the design team would otherwise be sittin on their rear end, cause the manufacturing side is half a decade behind. So now you get "codesign", because Intel has realized that going all in and betting on the manufacturing folks to deliver is a bad idea in a world where each manufacturing improvement is going to get harder and harder and riskier and riskier.

Pretty much the worst thing that I think you can lay at the feet of design is spectre/meltdown, but even that seems considerably mitigated if the nodes had actually gotten poo poo out on the intended schedule. Instead, they spent 6 years grafting additional cores onto Skylake and doing band-aid fixes on a design that was supposed to have been replaced years ago. (i likely dont know what im talking about, so take this whole thing as just a web forum rant)
|
# ? Apr 2, 2021 07:45 |
|
https://www.youtube.com/watch?v=LYdHTSQxdCM Gamers Nexus has a review up of the i5-11400, the non-overclockable Rocket Lake six-core, and it comes really dang close to a 5600X despite being over a hundred bucks cheaper or, put another way, is significantly faster than a Ryzen 5 3600 on top of being 20-40 bucks cheaper
|
# ? Apr 2, 2021 08:55 |
|
gradenko_2000 posted:https://www.youtube.com/watch?v=LYdHTSQxdCM Interesting. Over in euro land the i7 11700 non K is already available for the same price as a 5600X as well. Wonder if prices will start to fall soon.
|
# ? Apr 2, 2021 12:43 |
|
|
As someone who only has man-on-the-street level knowledge of chip fab, can someone explain to me what exactly it means for a node/process to fail, and how one does it for 5 years straight?
|
# ? Apr 2, 2021 14:02 |