mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



There will be at least one more beta for OPNG, likely happening this afternoon (US time). Also, it looks like the OPNG points issue is finally getting sorted out: they will be scaled to 20X compared to CPU points, to account for the vastly shorter runtimes. I'm in the "who cares; we're doing science" camp on this, but there has been a vocal "why am i not being awarded points" camp as well, all the way through the series of betas.

I admit that I've been confused at the amount of noise people have made about this. I know that people enjoy the gamification aspect (Exhibit A: the title of this thread), but points are the single least useful metric there is. Pretty much everyone awards badges based on runtime, and WCG's points are particularly meaningless, being scaled approximately 7X from every other BOINC project in existence, so you can't even use them to talk about cross-project performance without doing math.
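For anyone who does want to compare across projects, the "math" is a single division. A minimal sketch, assuming the roughly 7X ratio mentioned above (it's approximate, not an official constant); the function and constant names are made up for illustration:

```python
# Approximate ratio between WCG points and standard BOINC credit, per the
# post above. This is a rough figure, not an official constant.
WCG_POINTS_PER_BOINC_CREDIT = 7.0


def wcg_points_to_boinc_credit(wcg_points: float) -> float:
    """Convert WCG points to an approximate BOINC-credit equivalent."""
    return wcg_points / WCG_POINTS_PER_BOINC_CREDIT


print(wcg_points_to_boinc_credit(700_000))  # 100000.0
```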

Edit: This ended up not being done as a separate beta run, but just as a new batch of WUs for the existing run. Currently running are batches of WUs which have been generated from already complete CPU WUs, to ensure that the results are approximately identical.

I'm personally happy about this, because I have unbroken my AMD cards by installing the OpenCL portions of the AMDGPU-PRO drivers on them -- this being OCL 1.2 compliant, in contrast to the Mesa OCL driver. I didn't want the hassle, but my desire to crunch more science won out after a half-day of arguing with myself about it. I've got a really simple bash script that I'd be happy to share in the incredibly unlikely event that anyone else has also had this problem.

Edit 2: Really interesting post from a WCG admin on how they build and "score" WUs for this project:

quote:

For a CPU work unit, we estimate they can run X jobs based on what each job has inside of it. This is based off how many atoms are in a given ligand.

( 0.0000000122 * Atoms^2 + 0.0000000751 * Atoms + 0.0000105946 ) * ga_num_evals * ga_run = how long we estimate it'll take for an average cpu.

Each job has a different number of atoms and structure, which changes the equation by evals being different and higher generally with more atoms in a ligand. This is 100% just an estimate but gets us a pretty good average runtime on similar processors.

When a work unit is created, we package multiple jobs together or split them up based on how difficult they are. We try to target say 3 hours per CPU work unit. For the GPU version, we create them with 20 times the difficulty as CPU version. These are split the exact same way, thus they get 20 times more points because they were originally created 20 times harder.

If we ran one of the GPU work units on CPU, it would on average take them 60 hours to complete the same task.
We also learned today that OPN WUs "short circuit", in much the same way as how a chain of 'or' statements short-circuit by halting evaluation as soon as a single true value is found: when OPN(G) WUs find a good match on a ligand (based on some threshold), they halt work and declare themselves complete rather than exhaustively testing all possible values.
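
The admin's estimate and the 20X sizing can be sketched in a few lines. This is only an illustration: the coefficients are copied from the quote, but the unit of the result (seconds) is my assumption, and the greedy packing routine is a hypothetical stand-in, since the quote says jobs get packaged or split by difficulty without saying how.

```python
def estimate_job_seconds(atoms: int, ga_num_evals: int, ga_run: int) -> float:
    """Estimated runtime of one docking job on an 'average' CPU.

    Coefficients are copied from the admin's post. The unit of the result
    (seconds) is an assumption; the post doesn't state it.
    """
    per_eval = 0.0000000122 * atoms**2 + 0.0000000751 * atoms + 0.0000105946
    return per_eval * ga_num_evals * ga_run


def package_jobs(job_seconds, target_seconds):
    """Greedily group consecutive jobs into work units near a target runtime.

    Hypothetical strategy for illustration only; not WCG's actual packager.
    """
    work_units, current, total = [], [], 0.0
    for secs in job_seconds:
        # Close the current work unit if adding this job would overshoot.
        if current and total + secs > target_seconds:
            work_units.append(current)
            current, total = [], 0.0
        current.append(secs)
        total += secs
    if current:
        work_units.append(current)
    return work_units


CPU_TARGET = 3 * 3600          # "say 3 hours" per CPU work unit
GPU_TARGET = 20 * CPU_TARGET   # GPU work units are built 20 times harder

print(GPU_TARGET / 3600)       # 60.0 CPU-hours, matching the admin's figure
```

Packing the same job list with GPU_TARGET instead of CPU_TARGET yields work units about 20 times larger, which is where the 20X points come from.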

mdxi fucked around with this message at 05:56 on Mar 31, 2021


mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



WCG OPNG progress is moving really fast now, after a glacially slow bring-up and a cautious early beta period. From one of the admins this afternoon, after yet another batch of beta WUs:

quote:

We are having some virtual high fives here in communications with the researchers. They have some additional checks they want to look at, but currently we are getting really good results and things look awesome.

With this latest beta batch, I have changed the assimilator to be production ready with how it packages results for the researchers.

So next steps in my mind are this:

1. Eat pizza for lunch
2. Send additional results back to researchers for validation (tomorrow)
3. Await final thumbs up from researchers (Hopefully we'll have this by Friday)
4. Build batches and upload 7.28 to opng (Tomorrow)
5. Perform other final checklists (ongoing)
6. Have go/no go conversation with the researchers (TBD)
Then, a later update, regarding a release of WUs later tonight:

quote:

I plan on adding 10 batches just to make sure the points and everything match what I'm expecting to see when we go live during production. As far as I can tell, it is, but to be 100% sure instead of 99%, I'm running these 10 extra batches. These are going to be the last 10 batches for beta as I do not plan on running any more.
So if this set looks good, OPNG beta is in the bag.

Vir
Dec 14, 2007

Does it tickle when your Body Thetans flap their wings, eh Beatrice?


Will this allow MacOS and ARM Linux users to use GPUs for Open Pandemics?

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



Vir posted:

Will this allow MacOS and ARM Linux users to use GPUs for Open Pandemics?

There is no Mali GPU support. Nvidia, AMD, and Intel only.

People have reported both success and failure for OPNG WUs on Macbooks, so it's available -- at least for Intel GPUs. Didn't find anyone crunching on a desktop Mac in my quick search of the forums.

Vir
Dec 14, 2007

Does it tickle when your Body Thetans flap their wings, eh Beatrice?


There are some old Mac Pros with discrete AMD GPUs in them, and even some with Nvidia GPUs before that, but Folding@Home doesn't support any of them: there was a bug in OpenCL for MacOS back when it was more worthwhile, and MacOS has since deprecated OpenCL in favor of Apple's own Metal API.

Binary Badger
Oct 11, 2005

Trolling Link for a decade




Curiously, Apple has kept the legacy OpenCL (v1.2) even in its latest version of macOS; they even re-wrote it to run natively under the new Apple Silicon CPUs.

Apple did this so that the current plethora of scientific software written for Intel chips would either run without modification under Rosetta, or just require a recompile with the latest Xcode with another flag set for M1 code.

I doubt you'll ever find anyone on the forums willingly running Folding@Home on a Mac. The Folding software authors are either totally uninterested in writing old OpenCL code or feel that they don't need to give Macs GPU support, which pisses Mac users off, so they join a project that DOES support Apple GPUs, like Einstein. The CPU client isn't really optimized on Mac, either.

Binary Badger fucked around with this message at 03:49 on Apr 4, 2021

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



OPNG is live, but rather than the flood everyone was expecting, the current WU release rate is 1700 every 30-ish minutes (there's some randomness built in to keep people from fetching WUs on a clock, because yes that is a thing that people will do -- looking at you HSTB).

Currently unknown if that's gonna increase or not.

My machines have crunched 97 so far. They're very unevenly distributed across the farm (low: 0; high: 43), as you might expect for something with limited and somewhat-irregular availability. I'll be evaluating how this goes, and turning Einstein@Home back on as a low-priority project if my GPUs have consistent downtime.
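
The "randomness built in" is presumably some kind of jitter on the release timer. A minimal sketch of the idea; the 5-minute spread and the names are my assumptions, since the actual scheduler isn't public:

```python
import random

BASE_INTERVAL_MIN = 30      # "every 30-ish minutes"
JITTER_MIN = 5              # assumed spread; the real value isn't public
BATCH_SIZE = 1700           # WUs per release, per the post


def next_release_delay(rng: random.Random) -> float:
    """Minutes until the next release: base interval plus uniform jitter."""
    return BASE_INTERVAL_MIN + rng.uniform(-JITTER_MIN, JITTER_MIN)


rng = random.Random(42)
delays = [next_release_delay(rng) for _ in range(5)]
print([round(d, 1) for d in delays])   # values scattered around 30

# Symmetric jitter leaves the long-run average alone: 48 releases per day.
print(BATCH_SIZE * (24 * 60 // BASE_INTERVAL_MIN))  # 81600
```

Because the delay changes every cycle, a cron job that fetches on the half hour gains nothing over a client that just asks whenever it runs dry.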

Vir
Dec 14, 2007

Does it tickle when your Body Thetans flap their wings, eh Beatrice?


The F@H statistics page has been given a makeover. It now displays team logos for those that have it, and it has also broken the client's web frontend.

SAGoons is ranked 69, but lacks a logo: https://stats.foldingathome.org/team/150

Old page: https://statsclassic.foldingathome.org/team/150

Rexxed
May 1, 2010

Dis is amazing!
I gotta try dis!



Vir posted:

The F@H statistics page has been given a makeover. It now displays team logos for those that have it, and it has also broken the client's web frontend.

SAGoons is ranked 69, but lacks a logo: https://stats.foldingathome.org/team/150

Old page: https://statsclassic.foldingathome.org/team/150

Chikimiki
May 14, 2009


Vir posted:

The F@H statistics page has been given a makeover. It now displays team logos for those that have it, and it has also broken the client's web frontend.

SAGoons is ranked 69, but lacks a logo: https://stats.foldingathome.org/team/150

Old page: https://statsclassic.foldingathome.org/team/150

Oh so that's why I couldn't see my total score on the web client, I was gonna ask in this thread as to why that is

Vir
Dec 14, 2007

Does it tickle when your Body Thetans flap their wings, eh Beatrice?


F@H has rolled back the new stats pages, because they broke the web client. This premature rollout also revealed that many third party websites were hammering the statistics API with excessive calls. F@H has only one programmer on staff, which I guess is a typical symptom of how research grants pay for PhD projects while infrastructure and support functions stay under-funded.

Here's the beta page: https://statsbeta.foldingathome.org/team/150

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



WCG Monthly Update - March 2021

I had been writing this up on Reddit, because Markdown is so much nicer than bbcode (can I pay to have this added now that Lowtax is dead?). But Reddit sucks balls as a community, so I quit. But I really enjoyed doing it, so I'm putting it here now.

It was a quiet month overall, except for OpenPandemics becoming GPU-enabled.

OpenPandemics
  • OPN1 is now GPU-enabled (see Betas section for more info)

Africa Rainfall Project
  • Research team member Camille Le Coz was recently accepted as a presenter at the EGU General Assembly 2021, a virtual conference for the European Geosciences Union. The conference is currently scheduled for late April.
  • The project's principal investigator, Professor Nick van de Giesen, will be giving a presentation about the project on March 11 at an IBM event.

Microbiome Immunity Project
  • Dr. Julia Koehler Leman, one of the research team members, will be speaking about the project at Winter RosettaCon 2021, a virtual conference for users of the Rosetta biodynamics suite. [Ed: Yes, WCG's MIP1 and Rosetta@Home use the same software]
  • Researchers are working simultaneously on three papers that are at various stages in the creation process. One of the papers has already been submitted to an academic journal for review.

Help Stop TB
No update.

Mapping Cancer Markers
  • Researchers continue to process work on World Community Grid while working on a paper about lung cancer markers.

Smash Childhood Cancer
SCC is on hiatus from WCG's perspective, but researchers are working on the next set of targets, and working in the lab with proteins targeted by previous WCG work:
  • Beta catenin -- The research team has decided to move forward with further testing on three compounds that show promise against this protein.
  • Osteopontin -- Testing continues on several compounds that may be effective at targeting this protein.
  • PAX3:FOXO1 -- The researchers are conducting lab testing on a compound that may be effective at targeting this protein.

Betas
This month was full of OPN GPU testing -- the first WCG project to use a GPU in several years, for a variety of reasons. OPN1 on GPU has been given the short-name OPNG, so it's easy to tell WUs apart. Here's some info from the beta:
  • In addition to systemic testing, ten batches of completed WUs were rebuilt as GPU WUs (batch 30010 through 30019)
  • On GPU, a batch took an average of 3.5 days of compute time, vs 1162 days on CPU
  • This represents an average speedup of 336X (max speedup for a batch was 516X, but OPN WUs exit early when they find a "good enough" match so there's no apples-to-apples comparison)
  • OPNG uses Autodock GPU, which uses a modified algorithm that exhibits a greater probability of finding strong interactions between the molecules and viral proteins, and is well suited to dock larger or more complex molecules.
  • Overall, Autodock GPU exhibited a 1.6X increase in efficiency compared to Autodock 4 (this is algorithmic efficiency, not the raw speedup from parallelization on GPU)
  • Future OPNG WUs will test more complex compounds, while CPU will continue to focus on the current work
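
A note on the numbers above: dividing the two averages gives about 332X, not 336X. One plausible explanation (my assumption; the update doesn't say how the figure was computed) is that 336X is the mean of the per-batch speedups, which generally differs from the ratio of the mean runtimes. The per-batch values below are invented purely to show the effect:

```python
cpu_days_avg = 1162
gpu_days_avg = 3.5
print(cpu_days_avg / gpu_days_avg)  # 332.0

# Invented per-batch (cpu_days, gpu_days) pairs, for illustration only.
# They are chosen so the means match the published averages.
batches = [(1000, 4.0), (1200, 3.0), (1286, 3.5)]

ratio_of_means = (sum(c for c, _ in batches) / len(batches)) / (
    sum(g for _, g in batches) / len(batches)
)
mean_of_ratios = sum(c / g for c, g in batches) / len(batches)

print(round(ratio_of_means, 1), round(mean_of_ratios, 1))  # 332.0 339.1
```

Same averages, different "average speedup", depending on which one you compute.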

mdxi fucked around with this message at 23:19 on Apr 18, 2021

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



OPNG stress test currently underway:

quote:

This afternoon around 18:00 UTC, we'll begin an extreme stress test of the World Community Grid infrastructure with the help of the OpenPandemics - COVID-19 research team in the Forli Lab. We're grateful for the support and hard work of the Forli Lab team in co-creating and refining this test, and we all look forward to seeing what our entire system can do.

For the purposes of this stress test, we have been given 30,000 batches of work units to run on the GPU version of OpenPandemics - COVID-19. These are real work units that will provide data for the project.

We anticipate this test will take approximately 3 days to run through the 30,000 batches that have been provided. However, the test will end as soon as the 30,000 batches have been processed, whether this processing takes less than 3 days or more than 3 days.

The stress test will involve all parts of the World Community Grid pipeline, from generating batches to post-analysis. This will help us identify bottlenecks, and see where and how we can improve. Below is an outline of the pipeline:
  • Researchers identify targets and/or ligands to compute
  • Researchers create batches of work units to be run by World Community Grid volunteers
  • World Community Grid downloads work units from the researchers' server
  • World Community Grid builds work units and loads them into BOINC (Berkeley Open Infrastructure for Network Computing)
  • Volunteer computers and devices download these work units
  • Volunteer computers process the work units
  • Volunteer computers upload the files back to World Community Grid servers
  • World Community Grid validates results
  • World Community Grid assimilates the results
  • World Community Grid packages batches into tar files for researchers
  • World Community Grid uploads the packages to the research server
  • Researchers re-hydrate the results and place data into their database
  • Researchers perform analysis on the results
Three more important points for those who want to participate in the stress test:
  • After the stress test is complete, we will revert to sending out results at a pace of 2,000 work units every 30 minutes. Depending on the researchers' needs, we may modify this in the future, but for the present our plan is to continue at the 2,000 per 30-minute pace.
  • GPU work units for OpenPandemics - COVID-19 are designed to run on OpenCL version 1.2 and above. However, there are certain cards that still have issues due to having GPU drivers that aren't 100% compatible with OpenCL 1.2. Most of the issues are with cards that were released before 2016.
  • Please post any issues or questions in this thread where we can see them more easily, rather than creating new threads that may be harder for us to track.
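
For checking a card against the OpenCL 1.2 requirement, the version string a driver reports (the one tools like clinfo display) starts with "OpenCL", then major.minor, then vendor-specific text. A minimal sketch of parsing it; the function names are mine and the example strings are illustrative, not copied from real hardware:

```python
import re


def opencl_version(version_string: str) -> tuple[int, int]:
    """Parse a device version string such as 'OpenCL 1.2 <vendor info>'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"not an OpenCL version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_opng_minimum(version_string: str) -> bool:
    """True if the reported version is at least OpenCL 1.2."""
    return opencl_version(version_string) >= (1, 2)


# Illustrative strings only; real drivers report their own vendor suffix.
print(meets_opng_minimum("OpenCL 1.1 Mesa 20.3.5"))        # False
print(meets_opng_minimum("OpenCL 1.2 AMD-APP (3180.7)"))   # True
```

As the quote warns, a reported 1.2 is necessary but not sufficient: a driver can claim the version and still not be fully compatible.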
Edit: I'm reading through the thread for good info to add to this post.

- This test is working on a specific target:

quote:

The batches for the stress test are targeting the spike protein, the most important surface protein of the virus, using a structure that was determined using cryo-electron microscopy (cryoEM) by our collaborators at the Ward lab at Scripps Research. Approximately 280 million small molecules from the ZINC database will be docked against a promising, hypothetical binding site. Our goal is to identify a few of these molecules that will bind with sufficient affinity to the spike to interfere with the replication process of SARS-CoV-2.
- When the stress test went live, 8000 WUs per minute were being created.
- There's enough traffic that WCG's LBs were dropping connections and techs were/are working on it.

That's all the news for now
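
To put the stress test in perspective, here's the arithmetic on the rates quoted above. One caveat: 8000 per minute is a creation rate and 2,000 per 30 minutes is a release pace, so this is a loose comparison rather than like-for-like.

```python
NORMAL_WUS_PER_30_MIN = 2000   # post-test release pace, from the quote
STRESS_WUS_PER_MIN = 8000      # creation rate when the test went live

normal_per_min = NORMAL_WUS_PER_30_MIN / 30
print(round(normal_per_min, 1))                    # 66.7
print(round(STRESS_WUS_PER_MIN / normal_per_min))  # 120
print(NORMAL_WUS_PER_30_MIN * 48)                  # 96000 WUs/day at the normal pace
print(STRESS_WUS_PER_MIN * 60 * 24)                # 11520000 WUs/day if the peak rate held
```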

mdxi fucked around with this message at 05:30 on Apr 27, 2021

mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



Spot the GPU betas and stress test:


mdxi
Mar 13, 2006

to JERK OFF is to be close to GOD... only with SPURTING



Pacing update on the WCG OPNG stress test:

quote:

We crunched through about 7.5k batches in the first 36 hours. We will continue with this pace until the full 30k batches have been completed.
I'm curious what they're going to do as a result of this exercise. They've already said that, initially, nothing will change. But there's no reason to do something like this other than some form of capacity planning.
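
Quick projection from those numbers, assuming the pace stays constant and taking "about 7.5k" at face value:

```python
TOTAL_BATCHES = 30_000
done_batches = 7_500      # "about 7.5k", so treat the result as approximate
elapsed_hours = 36

rate = done_batches / elapsed_hours                       # batches per hour
total_hours = TOTAL_BATCHES * elapsed_hours / done_batches

print(round(rate, 1))     # 208.3
print(total_hours / 24)   # 6.0 days
```

So at the current pace the full run takes about six days, double the three days originally anticipated.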
