|
BurritoJustice posted:now when I go to fold the CPU is running around the same but the GPUs are sitting on like, 10k each. Which is obviously not worth it for the electricity used. Does anyone have any ideas? If you return a work unit after Timeout, but before Expiry, you will still get some credit but the WU will be re-issued to someone else. And if you return before Timeout, your QRB is higher the faster you return it. COVID-19 Moonshot work units have especially short timeouts because they are so time-critical. Researchers are using the data to prioritize chemical compounds to synthesize and assay in vitro. More traditional projects have longer timeouts on them. Can you try pausing the CPU and see if the GPUs start working again? Two CPU threads should be dedicated to feeding the GPUs with work and sanity-checking their results. If your PPD permanently dropped, then I'd suspect some kind of hardware issue. Have you asked in the foldingforum.org forums?
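For the curious, the Quick Return Bonus math works out roughly like this: credit scales with the square root of how far ahead of the deadline you return the WU. A toy sketch -- the base points and k factor below are made-up example values, and in practice the bonus also requires a passkey and a high WU success rate:

```python
import math

def fah_points(base_points: float, k: float, deadline_days: float,
               elapsed_days: float) -> float:
    """Estimated credit for a returned WU, including the Quick Return
    Bonus. Hypothetical numbers: real base points and k vary per
    project, and the bonus never drops credit below the base value."""
    if elapsed_days <= 0:
        raise ValueError("elapsed time must be positive")
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)

# Halving your return time multiplies credit by sqrt(2), not by 2:
fast = fah_points(base_points=10_000, k=0.75, deadline_days=3, elapsed_days=0.5)
slow = fah_points(base_points=10_000, k=0.75, deadline_days=3, elapsed_days=1.0)
```

The square root is why fast GPUs earn disproportionately more PPD on short-timeout projects like Moonshot, but it isn't linear in speed.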
|
# ? Oct 10, 2020 14:28 |
|
BurritoJustice posted:I started folding the other day, with the following specs: I've sometimes had spots where I had low PPD and letting it get through a work unit or two fixed it. Vir's answer was better, I'm just +1 on it basically
|
# ? Oct 10, 2020 19:21 |
|
Also, if Folding uses the same model that BOINC projects do, you don't get credit for WUs when you return them; you get credit when they are validated. Sometimes you're waiting on the wingman; sometimes (much more rarely) you're waiting on the validator process.
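To make the wingman idea concrete, here's a toy sketch of quorum validation -- not BOINC's actual validator code, just the shape of the replication model:

```python
from collections import Counter
from typing import List, Optional

def validate(results: List[str], quorum: int = 2) -> Optional[str]:
    """Toy BOINC-style validator: credit is only granted once `quorum`
    returned results agree (the 'wingman' model). Returns the canonical
    result, or None while the WU is still waiting for agreement."""
    if not results:
        return None
    answer, count = Counter(results).most_common(1)[0]
    return answer if count >= quorum else None

# One result in hand: no credit yet, still waiting on the wingman.
pending = validate(["0xBEEF"])
# A matching second result arrives: the WU validates, both hosts get credit.
granted = validate(["0xBEEF", "0xBEEF"])
```

This is also why the slow path is usually the wingman: your credit is held until some other, possibly much slower, host returns a matching result.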
|
# ? Oct 10, 2020 19:29 |
|
Folding work units are not supposed to be done by two folders. The only cases where a work unit is issued to more than one folder are when it passes its timeout, crashes, gets dumped, or there's something wrong with the work server. The servers validate the work on their own, because it takes relatively little computation to validate the folding work once it has been done.
|
# ? Oct 11, 2020 07:27 |
|
BurritoJustice posted:I first ran it for a while and I was getting around 180k per day on my CPU and around 800-820k on each of the GPUs, but I had a system crash because of something unrelated (bethesda engine bullshit) and now when I go to fold the CPU is running around the same but the GPUs are sitting on like, 10k each. Which is obviously not worth it for the electricity used. Does anyone have any ideas? Could it be that the GPUs are not actually doing anything (waiting for new work units) and the rolling average WU/day calculation is a bit flaky?
|
# ? Oct 12, 2020 04:04 |
|
Folding@home has quietly released a new client version (7.6.20), and there is now an ARM Linux / Raspberry Pi version as well: https://foldingathome.org/alternative-downloads/ e: Also they posted some science news: https://foldingathome.org/2020/10/20/covid-moonshot-sprint-4/ quote:Just as we were doing this, something extremely unexpected happened: The chirally-separated version of the compound we started with turned out to be 100 nM! That’s 25 times more potent (meaning you need 25x less drug to get the same effect in shutting down the protease) than what we thought we started from! Vir fucked around with this message at 11:39 on Oct 20, 2020 |
# ? Oct 20, 2020 11:22 |
|
I have released Homefarm v2.8.0. This release's big features are multi-arch support (farms can now be x86_64, armv7h/armv7l, or heterogeneous), and user-defined lists of packages to be added to the local repo and installed on compute nodes.
|
# ? Oct 25, 2020 22:38 |
|
Posted my October WCG update. Tons of detail from OpenPandemics and Smash Childhood Cancer this month.
|
# ? Oct 30, 2020 05:04 |
|
Folding at Home is running a 20 year anniversary stream on Twitch: https://twitch.tv/videos/807519293 e: Link to replay. Vir fucked around with this message at 13:34 on Nov 19, 2020 |
# ? Nov 18, 2020 19:20 |
|
Sometime in the past few days I hit 100 years of CPU time for WCG's Mapping Cancer Markers project
|
# ? Nov 22, 2020 07:19 |
|
There is a security vulnerability that affects those who use the Folding@Home GUI prior to version 7.6.20 to control remote folding instances over insecure networks: https://foldingathome.org/2020/11/23/update-for-those-using-advanced-remote-client-management-configurations/ Basically, the folding client or an attacker on the folding client's network could trick the GUI into running arbitrary code. This doesn't matter to those who fold on their own computers, but it might if you're running folding on your webserver or something and have opened it up to the world. mdxi posted:Sometime in the past few days I hit 100 years of CPU time for WCG's Mapping Cancer Markers project
|
# ? Nov 24, 2020 11:10 |
|
Vir posted:I'll have you know that I've simulated a whopping 0.000042 seconds of proteins jiggling. Every wiggle counts!
|
# ? Nov 24, 2020 22:21 |
|
I tested my 5800X out in folding vs half of my 3950X; it's about 44% faster, which is considerably more than I expected. I guess the changes to the FPU, such as the reduction in latency of FMA instructions, really make a big difference. I also tested them in Prime95 and found a similar performance difference.
|
# ? Nov 30, 2020 05:22 |
|
MaxxBot posted:I tested my 5800X out in folding vs half of my 3950X, it's about 44% faster which is considerably more than I expected. I guess the changes to the FPU such as reduction in latency of FMA instructions really make a big difference. I also tested them in Prime 95 and found a similar performance difference. That's pretty cool. I'm looking forward to getting my hands on some 5000-series CPUs. I don't know what Folding's workunits are like, but I find World Community Grid's projects to be fiendishly difficult to benchmark. Even within a single batch, WUs can (and do) require vastly different amounts of computation. My best effort at "fair" benchmarking was to do 24-hour runs of single projects, then examine the min, max, average, and do quintile bucketing. I'm not going to do that this time around; I just wanna see the numbers go up a bit more. I think it's good that you saw similar results using a completely different test (Prime95).
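The 24-hour-run bucketing described above is easy to script with nothing but the stdlib -- a minimal sketch, where the per-WU runtimes are made-up numbers:

```python
import statistics
from typing import Dict, List

def summarize(runtimes: List[float]) -> Dict[str, object]:
    """Min/max/mean plus quintile cut points for one project's per-WU
    runtimes from a 24-hour benchmark run."""
    return {
        "min": min(runtimes),
        "max": max(runtimes),
        "mean": statistics.mean(runtimes),
        # n=5 yields 4 cut points, i.e. 5 quintile buckets
        "quintiles": statistics.quantiles(runtimes, n=5),
    }

# Hypothetical per-WU runtimes (minutes) from a single project/day:
stats = summarize([42, 44, 47, 51, 55, 63, 71, 88, 95, 120])
```

With stats like these from both machines, a "fair" comparison is quintile-against-quintile rather than a single average that the odd monster WU can skew.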
|
# ? Nov 30, 2020 06:11 |
|
mdxi posted:I don't know what Folding's workunits are like, but I find World Community Grid's projects to be fiendishly difficult to benchmark. Even within a single batch, WUs can (and do) require vastly different amounts of computation. My best effort at "fair" benchmarking was to do 24-hour runs of single projects, then examine the min, max, average, and do quintile bucketing. I'm not going to do that this time around; I just wanna see the numbers go up a bit more Yeah there's a lot of variance, I'm gonna have to do this for a week or so on both systems to get better data.
|
# ? Nov 30, 2020 17:58 |
|
Rambling BOINC-related-crap chat time. You may (or may not; it's fine either way) remember that I built a super-janky mini-ITX micro-rack type thing out of steel L-channel from Lowe's, in an attempt to pack as much compute into as little physical space as I could. In and of itself, it was a successful experiment, but viewed through the lens of future upgrades and expandability, it was less great. There were several small irritations, and the biggest downsides were:

* Mini-ITX boards are now priced at a premium, because it's cool to build a monster gaming rig in a 1.7L enclosure
* When I built the chassis, I never intended to have GPUs in those nodes. I've since added GPUs to some of them, and it's a huge PITA (they have to be slotted after the tray is in the rack)
* OMG the dust

For a while I was working on a second chassis design, which would address all the issues that had cropped up, and provide better cooling to boot. But I am a terrible craftsman, and eventually got disenchanted with the whole thing. Over the weekend I decided to put everything back in standard mini-tower cases. Not all at once, but opportunistically. I just migrated the first node out of the rack and back into a case. So far things are looking good and temps are stable after several hours of crunching: code:
code:
And that's the fact that on all but one of my six nodes, Microbiome Immunity Project WUs started failing, 100% of the time, following a seemingly-innocuous OS update. On the last node they kept right on completing successfully -- about 90% of the time. I'll spare you the gory and infuriating details, but it's looking like -- and I am very much still working this out -- a memory speed/XMP issue. After spending months chasing down what looked like a software issue, I was left with the only realistic explanation being a kernel update that changed how the hardware was being treated, so I wrote down a bunch of BIOS settings from the one working node and started comparing it to the other nodes. The first thing I found was that memory speeds on that machine had been reset to the default of 2133MHz, with no XMP. Obviously on a Zen 2 CPU that's deeply suboptimal, since turning down RAM speed also turns down the Infinity Fabric speed, but that node was returning MIP workunits while all the other nodes segfaulted over and over. So I cranked the clocks back up to the rated speed of the RAM, but left XMP off, and MIP WUs continued to clear. I've made the same adjustment on the node I just rebuilt, and so far it has completed 2 MIP WUs -- the first since late September. If that's the problem, I will be simultaneously relieved and disgusted. All these machines had been crunching MIP (and all other) WUs just fine, for over a year, with no settings changes. MIP uses the same Rosetta suite that the Rosetta@Home project does. Earlier in the year I had also been running Rosetta, but when it was overwhelmed with volunteers and WCG's OpenPandemics project came online I wrapped up that work, so I can't say if it would have also been affected by this. Such a pain in the rear end. My hobby, ladies and gentlemen!
|
# ? Dec 1, 2020 00:00 |
|
nice. i know somewhere along the way i found your blog with the pics of the L-channel “rack” and it was aspirational to see what you were doing and think about trying something similar. but seeing all the management details and problem solving reminds me too much of work. I already spend a ton of time comparing performance of servers and looking at details like that. sadly as a hobbyist you can really only spend money on consumer gear. in the semi-professional market both dell and hp have software that consumes .xml config files and sets all the bios settings to whatever you need. easily driven by config management things like puppet or ansible. easy to monitor too. but second-hand last-generation servers are still expensive and not really all that competitive for driving your points per day. if you're going for value then consumer ryzen cpus can't be beat
|
# ? Dec 1, 2020 00:58 |
|
yummycheese posted:in the semi-professional market both dell and hp have software that consumes .xml config files and sets all the bios settings to whatever you need. easily driven by config management things like puppet or ansible. easy to monitor too. but second-hand last-generation servers are still expensive and not really all that competitive for driving your points per day. if you're going for value then consumer ryzen cpus can't be beat i manage all my machines -- from the OS, down to BOINC, and even individual BOINC projects -- with a mixture of ansible and custom scripting. being able to manage BIOS at a fleet level would be nice, but not $18k HPE server nice. also, i did the rackmounts-in-the-house thing, and it is just not my jam anymore. BOINC has gotten me back into enjoying hardware again, but i still want my machines to be in the background as much as is possible, and not take over my house. also, yes, for these kinds of tasks, if you're trying to do "hobbyist grade HPC" then nothing can beat Ryzen right now. to be honest, this is the point in time where i had expected intel to get their head on straight and come back with something that actually competed with Zen. but that's definitely not happening in 2021 or possibly even 2022. on the topic of my MIP1 segfault problem, i think that XMP really may have been the issue. the node that i tweaked yesterday has since completed 20 MIP1 workunits, with one failure. That's not perfect, but 95% completion is far better than 0%. I feel like turning RAM speeds down more might get me to 100%, but that would be at the cost of slowing down IF, which slows down everything. i'm going to make the same change on the remaining four x86_64 nodes today and see what happens.
|
# ? Dec 1, 2020 19:13 |
|
Have you all discussed this yet? I saw it and immediately thought of folding@home https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology quote:In a major scientific advance, the latest version of our AI system AlphaFold has been recognized as a solution to this grand challenge by the organizers of the biennial Critical Assessment of protein Structure Prediction (CASP) assessment. This breakthrough demonstrates the impact AI can have on scientific discovery and its potential to dramatic... Professor Venki Ramakrishnan, Nobel Laureate and President of the Royal Society posted:This computational work represents a stunning advance on the protein-folding problem, a 50-year-old grand challenge in biology. It has occurred decades before many people in the field would have predicted. It will be exciting to see the many ways in which it will fundamentally change biological research. Not sure how this changes things for folding@home (don't personally understand how it works as it is), perhaps the grid will now run AlphaFold? Or it will be used to verify AlphaFold's predictions?
|
# ? Dec 3, 2020 04:58 |
|
SETI@home is probably not coming back for a very long time because of what's been going on at the Arecibo telescope in Puerto Rico, which it depends on. _________________ I haven't done one of these in a couple months, but here's where the goons stand. We've been going down in activity a lot, but that's something all groups are experiencing in general and not just a goon thing, since those who are here for COVID-19 probably know already that the focus is on in vivo testing through the vaccine candidate trials. Also, we could probably use a new team motivator; maybe the goons could do something like work on HIV next, or work towards out-computing rivals like we were doing with Asteroids before. Asteroids@Home is having database issues and isn't reporting team rankings currently. In Folding@home, we are still #68 in rankings. Its compute power is now about 240 petaFLOPS, a shadow of the 2.62 exaFLOPS it was back towards the end of April. In Rosetta@Home, goons have dropped to #86 in Recent average credit, and we have risen to #164 in Total credit. In TN-Grid, which goons are still going strong in, our Recent average credit ranking is now #36, and our Total credit ranking has risen to #52.
|
# ? Dec 3, 2020 05:18 |
|
Here is where things stand with my MIP1 weirdness, 24h after making all nodes as identical as possible in terms of both software and hardware config: code:
For the record, applying just the hardware fixes -- turning off XMP and slightly clocking down the RAM -- or applying just the software fixes -- installing the same packages everywhere, including some that don't seem to make any sense (automake, autoconf, binutils, elfutils, gc, gcc, gcc-fortran, guile, make) -- left things broken. It had to be both, to get MIP1 WUs to stop segfaulting. I've been running Linux a long time and this is one of the least explicable things I've ever seen. And again, for the record, all six nodes ran MIP1 successfully for over a year before this cropped up. And no other WCG subproject, or any other project (GPUGrid, Einstein@Home) was affected. Unrelated edit: just for fun, 2 of my RasPis died overnight. I'm sure it's the SD card getting hosed. I'd be very happy if they brought out a variant with an M.2 slot mdxi fucked around with this message at 17:48 on Dec 3, 2020 |
# ? Dec 3, 2020 08:38 |
|
I've posted my monthly WCG update for November.
|
# ? Dec 3, 2020 20:36 |
|
I'm currently only running BOINC on my desktop, but I was curious about some of your mentalities about grid computing. For those of you with clusters or dedicated computing setups, do you live in areas with cheap electricity/solar, or are you just willing to pay for electrical costs as support for your hobby/as a charitable cause?
|
# ? Dec 3, 2020 23:37 |
|
pseudorandom posted:are you just willing to pay for electrical costs as support It's this for me. It's a form of giving back/helping that's a good fit for my talents. Let's face it, despite all the early promise, computing hasn't actually done a whole lot to directly make the world a better place, and working in tech almost never translates into helping others. So yeah, I build machines that I don't use, and then I pay for the power and cooling that those machines require. I don't expect anyone else to do the same, or do it for the same reasons, but that's what compels me.
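For anyone weighing the electricity question, the back-of-the-envelope math is simple; the wattage and rate below are made-up examples, not anyone's actual bill:

```python
def monthly_cost(watts: float, cents_per_kwh: float, hours: float = 730.0) -> float:
    """Rough monthly electricity cost in dollars for a node crunching
    24/7 (730 h is roughly one month). Inputs are example values."""
    kwh = watts * hours / 1000.0
    return kwh * cents_per_kwh / 100.0

# A ~200 W node at 12 c/kWh works out to roughly $17.50 a month:
cost = monthly_cost(watts=200, cents_per_kwh=12)
```

Multiply by the number of nodes and you have the real price tag of a hobby farm, which is why cheap power (or a landlord who pays the bill) changes the calculus.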
|
# ? Dec 4, 2020 03:48 |
|
I fold on my game computer while I'm not gaming, and fold on some e-waste machines for additional heating in the winter time. Tuxide posted:In Folding@home, we are still #68 in rankings. Its compute power is now about 240 petaFLOPS, a shadow of the 2.62 exaFLOPS it was back towards the end of April.
|
# ? Dec 6, 2020 01:00 |
|
pseudorandom posted:I'm currently only running BOINC on my desktop, but I was curious about some of your mentalities about grid computing. For those of you with clusters or dedicated computing setups, do you live in areas with cheap electricity/solar, or are you just willing to pay for electrical costs as support for your hobby/as a charitable cause? My landlord pays the power bill But even at my last place where I had to pay electricity costs I did it just because I enjoy it.
|
# ? Dec 6, 2020 03:21 |
|
Dr. Bowman shares some of the F@H results so far. They are moving into animal testing of candidates for therapeutic compounds. https://www.youtube.com/watch?v=6w8xb__A8Gc
|
# ? Dec 8, 2020 00:01 |
|
Had an unexpected evening of computer janitoring. Got the second minitower case in, to continue de-comm of my homebrew rack. It's a simple job:

* Swap fans from case with nicer fans that I already have on hand
* Disassemble tray
* Move mobo and PSU into case

Very straightforward and all the components have already had over a year of burn-in, so no problems expected. And with the first machine, no problems had. This time though, CPU temps were at 90.5C by the time I could walk across the room and SSH into it to check things out. Popped the case open, plugged in a monitor, and dropped into BIOS, where temps looked fairly normal, and all fans were spinning. Absent any obvious issues, I wondered if I had managed to somehow introduce an air gap between the CPU and HSF -- I did rotate it around in my hand while giving it a good air-dusting, and the orientation had switched from horizontal mount to vertical. The TIM was cakey, but not truly concrete yet; I cleaned it, re-greased, and reseated everything anyway. Back to BIOS where... hang on, why's the RAM running at 2400MHz? Oh, because this board's poo poo-rear end BIOS just randomly reset itself again (a problem that hasn't reared its head since I got the 3900X running stably back around August 2019). And it was a complete reset, which meant that my fan and PPT settings were gone. So in BIOS everything was fine, but let the OS boot and BOINC start, and suddenly it's 24 threads running full-tilt, with "quiet" fan settings, and the factory default PPT, and the stock cooler, inside a cheap case. Put all the BIOS settings back like I had them before, and this time let it run a bit before buttoning it back up. Everything looked good, so mystery solved (and the TIM should be good until it's time to do a rebuild, next upgrade cycle). Current uptime: 37 minutes; Tdie of 64.8C.
|
# ? Dec 8, 2020 07:15 |
|
Personal milestone today: 200y of CPU time to World Community Grid. I'm also close to new badges for MIP and ARP, which means I'm jonesing for the silly things. I don't care until I notice that they're about to ding over, then I obsessively check the stats every morning and evening. Mmmmmmm, the Skinner box of science...
|
# ? Dec 19, 2020 06:16 |
|
Folding@Home simulations might have found a way to disable a bunch of coronaviruses, like SARS-CoV-2 (Covid-19), SARS-CoV-1 and MERS. They might even have found a therapeutic for the common cold. https://foldingathome.org/2020/12/16/sars-cov-2-nsp16-activation-mechanism-and-a-cryptic-pocket-with-pan-coronavirus-antiviral-potential/
|
# ? Dec 19, 2020 17:18 |
|
Vir posted:Folding@Home simulations might have found a way to disable a bunch of coronaviruses, like SARS-CoV-2 (Covid-19), SARS-CoV-1 and MERS. They might even have found a therapeutic for the common cold. https://foldingathome.org/2020/12/16/sars-cov-2-nsp16-activation-mechanism-and-a-cryptic-pocket-with-pan-coronavirus-antiviral-potential/ That is work right there. Protein engineering is getting really, really interesting.
|
# ? Dec 19, 2020 18:18 |
|
Not starting any upgrades yet, but doing a little work in support of the upcoming upgrades:

-- 3 of 4 nodes are out of my homebuilt micro-rack, and the case for the 4th has been ordered, so I can hurry up and throw that thing out
-- I built those 4 nodes with wifi, but moving forward I'm going back to wired connections (mobos without wifi are a bit cheaper)
-- The three nodes which are back in standard cases have already been swapped to wired ethernet
---- As a side note, 16 port gigabit switches are cheap as hell these days
-- Since node04 will be in line for an upgrade to 5900X, I decided to go ahead and give it a new mobo when it gets moved into its case
---- I think switching from mITX to uATX will allow enough space for a 140mm top exhaust fan, due to the changed CPU position
---- It's also getting an RX 550, which will bring me to 5 of 6 nodes with GPUs
-- Finally, I found a "children's folding activity table" on Amazon for $30, which is exactly the right size to hold 3 machines, with the other three under it
---- This means no more compute nodes in the bedroom, and provides an impetus to go ahead and move those two machines -- which currently exist as motherboards held up with d6s, sitting next to PSUs -- into cases before their rebuilds
|
# ? Dec 27, 2020 23:26 |
|
December's WCG update is up, including the first real data/analysis update from ARP.
|
# ? Jan 1, 2021 21:03 |
|
mdxi posted:December's WCG update is up, including the first real data/analysis update from ARP. Thanks for posting these. I'm just running OPN and ARP on a 3600 right now, which isn't much. It's good to see progress being made by the projects as a whole rather than just my number going up
|
# ? Jan 5, 2021 05:48 |
|
The NPC posted:Thanks for posting these. I'm just running OPN and ARP on a 3600 right now, which isn't much. It's good to see progress being made by the projects as a whole rather than just my number going up You're welcome. I started doing them both to surface the information (which at the time was buried on the WCG forums; they've since started publishing updates as news stories), and to have something to point at for the very intelligent individuals who do "WHY EVEN BOTHER NO ONE HAS CURED SCIENCE" drive-bys every few months on r/BOINC. Since starting though, I look forward to putting them together for the same reason you like reading them: I want to see the bigger picture. And it's all good. The 3600 is a great CPU for BOINC, and you're contributing cycles. That's all that matters.
|
# ? Jan 5, 2021 16:36 |
|
The January WCG update has been posted. It was kind of a slow news month. Edit: Removed big slab of CPU talk. I started a thread in project.log where I can blather about my BOINC hardware and (un)associated projects, being as wordy as I feel like, without worrying about making GBS threads up this place. mdxi fucked around with this message at 06:58 on Feb 6, 2021 |
# ? Feb 5, 2021 06:56 |
|
WCG's OpenPandemics did its first GPU beta last weekend, and seemed pretty successful. A second beta happened Friday and was halted very quickly due to a 100% error rate on Intel GPUs. One interesting tidbit came up in the discussion, from an admin: "[GPU WUs] are equivalent to the CPU work units. They solve the same problem just a different way. But for each CPU work unit, they would run multiple ligands in a work unit (sometimes less if they were difficult). For GPU they run lots of ligands in a single work unit due to the speed of the calculations." At least something is happening other than "tech team security review"
|
# ? Feb 28, 2021 10:47 |
|
WCG's OPNG went through a third beta test today, and I learned something about my AMD GPUs. OPNG requires OpenCL 1.2, which was published as a standard in 2012. Three of my machines have AMD GPUs: a RX 550 (released 2017) and two WX 3200s (released 2019). All of them use lower-power GCN4/Polaris cores, which basically no one liked for graphics but which have a decent reputation for compute. Except that none of them support OpenCL 1.2; they're all 1.1-only. Meanwhile, my 750 Ti, bought in 2014, is still crunching like a goddamned champ. Mildly peeved about this, but I guess I'll put those GPUs back to work on Einstein@Home WUs for now. Edit: Dug deeper. This isn't an AMD hardware problem; it's a Linux/Mesa problem. https://www.phoronix.com/scan.php?page=news_item&px=Mesa-20.3-OpenCL-1.2-Clover https://mesamatrix.net/ mdxi fucked around with this message at 08:44 on Mar 6, 2021 |
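If you want to check your own GPUs before pointing OPNG at them, the version string reported by clinfo or pyopencl is easy to gate on -- a small sketch that only assumes the spec-mandated "OpenCL <major>.<minor>" prefix (the example strings below are illustrative, not from my machines):

```python
import re
from typing import Tuple

def supports_opencl(version_string: str,
                    required: Tuple[int, int] = (1, 2)) -> bool:
    """Check a reported OpenCL platform/device version string against a
    minimum version. The spec requires the 'OpenCL <major>.<minor>'
    prefix, followed by vendor-specific text."""
    m = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if not m:
        return False
    return (int(m.group(1)), int(m.group(2))) >= required

# Mesa's Clover driver on those Polaris cards only reports 1.1,
# so OPNG's 1.2 requirement rules them out:
polaris_ok = supports_opencl("OpenCL 1.1 Mesa 20.2.6")
maxwell_ok = supports_opencl("OpenCL 1.2 CUDA 11.1.114")
```

Comparing `(major, minor)` tuples keeps the check correct for future versions like 3.0, where a plain string comparison would fail.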
# ? Mar 6, 2021 07:53 |
|
mdxi posted:That is work right there. Protein engineering is getting really, really interesting. Here's a status update on this. They hope to have a clinical candidate this month. https://www.youtube.com/watch?v=fH0hIXC3bUg
|
# ? Mar 12, 2021 15:04 |
|
There was another (the 5th, I think?) WCG OPNG beta at the end of last week and over the weekend. Some people are still having problems, but the admins report that overall things are looking good, and there may be just one more beta before OPNG moves into general availability.
|
# ? Mar 29, 2021 17:44 |