BlankSystemDaemon
Mar 13, 2009

System Access Node Not Found



I think one of the things that set HPC apart is the way storage is handled.

As far as I know, most cloud services tend to have local or near-local/nearline storage for each of the racks they use, not unlike mainframes.
HPC uses distributed storage, and none of their compute nodes have ANY storage aside from as much RAM as they can possibly fit.

Heck, Lustre on ZFS is the entire reason Linux has ZFS - I'm not sure it would've happened otherwise.

Antigravitas
Dec 8, 2019

Outside Context Problem


Most cloud services are engineered to use object storage as well. That's entirely unworkable when your model requires fast random access across a 10TB dataset. A 10TB dataset isn't even large.

e: This gets funny when I tell people that no, "the cloud" is a terrible fit for us. I've run into so many people who cannot comprehend that anything outside the realm of web crap exists. Like this one dude who kept asking me about our "app" and how we need to refactor our "app" and I just could not get through to him that we don't have an "app", we have a bunch of fuckoffhuge datasets and people write code as required to run against them. We can't put that in S3, we need fast low-latency access that can saturate 10Gbps ethernet for testing before moving to big iron for the final run.

Antigravitas fucked around with this message at 10:58 on Feb 24, 2021

Pablo Bluth
Sep 7, 2007

I've made a huge mistake.


AWS are trying! But of course this is with HPC-focused datacentres, not just buzzwords.

https://www.nas.nasa.gov/publications/ams/2020/12-17-20.html

BlankSystemDaemon
Mar 13, 2009

System Access Node Not Found



Antigravitas posted:

A 10TB dataset isn't even large.
It really isn't.

NAME     USED   AVAIL  REFER  MOUNTPOINT
storage  25.0G  38.4T  2.65G  /mnt/storage

xzzy
Mar 5, 2009



pfft if your columns don't end with P or E is it even really data??

RFC2324
Jun 7, 2012

http 418



xzzy posted:

pfft if your columns don't end with P or E is it even really data??

really more like a loose collection of anecdotes

Cardiac
Aug 28, 2012



xzzy posted:

Pretty much every government HPC cluster was given a mandate to start running protein folding jobs in the last year too. I bet you can't guess why!

I would assume Alphafold.

BlankSystemDaemon
Mar 13, 2009

System Access Node Not Found



xzzy posted:

pfft if your columns don't end with P or E is it even really data??
What do P or E mean in zfs list?
That's this pool on my server. All drives are a minimum of 2TB, and will be replaced with bigger drives as money permits (minimum 8TB drives, as those aren't SMR).

BlankSystemDaemon fucked around with this message at 07:35 on Feb 25, 2021
