  • Locked thread
Finster Dexter
Oct 20, 2014

Beyond is Finster's mad vision of Earth transformed.
Thank You for Your Help NoSQL, but We Got It from Here

quote:

It’s time for us to admit what we have all known is true for a long time; NoSQL is the wrong tool for many of the modern application use cases, and it’s time that we move on.

Bloody
Mar 3, 2013

Jabor posted:

project manager: thanks for the feedback, we don't have time to address it all so we're going to launch as-is and think about these issues for v2

boss?

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


lmao at this bit:


quote:

Eventual consistency is a reasonable trade-off of durability risk versus availability. If your business is consumer engagement and latency has a direct impact on your income (which is true for all content, community, and commerce applications), you want the most responsive UI you can get. If you have to scale to millions of concurrent users you can’t tolerate any bottlenecks. What you trade-off by adopting eventual consistency in your database architecture is occasionally losing someone’s post or a comment, which is an acceptable risk for these types of applications.

this is like the "we'll make it up in volume!" of application development. doesn't matter if data randomly disappears as long as it does it fast!

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Powerful Two-Hander posted:

lmao at this bit:


this is like the "we'll make it up in volume!" of application development. doesn't matter if data randomly disappears as long as it does it fast!

he’s correct, your posts don’t matter and having a single consistent view of the world on every client isn’t a requirement for those applications

Bloody
Mar 3, 2013

availability is usually (but not always) more important than consistency imo

Shaggar
Apr 26, 2006
if you're talking about posts then sure but if you're talking about stuff that matters consistency is critical.

HoboMan
Nov 4, 2010

i still think users would generally be more upset at their post getting lost than the service being slow but that is just me

Bloody
Mar 3, 2013

Shaggar posted:

if you're talking about posts then sure but if you're talking about stuff that matters consistency is critical.

in realtime systems availability can trump consistency

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

Bloody posted:

in realtime systems availability can trump consistency

lomarf

jony neuemonic
Nov 13, 2009

Jabor posted:

project manager: thanks for the feedback, we don't have time to address it all so we're going to launch as-is and think about these issues for v2

aaaaaaAAAAAAAAAAAAAAAAAAAAAAAAA

Arcsech
Aug 5, 2008

Jabor posted:

project manager: thanks for the feedback, we don't have time to address it all so we're going to launch as-is and think about these issues for v2


gonadic io posted:

after launch: okay so these are all the new features and integrations we promised the investor and or client for v2. if we get time we can refactor and look at those bugs

extremely relatable

gonadic io
Feb 16, 2011

>>=

Arcsech posted:

extremely relatable

you guys can rewrite working code in your own time i'm focused on adding value

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
two years later: why didn't you ever get around to fixing all these problems we hit before launch?

raminasi
Jan 25, 2005

a last drink with no ice

HoboMan posted:

i still think users would generally be more upset at their post getting lost than the service being slow but that is just me

facebook loading fast is wayyyyy more important than facebook perfectly retrieving every photo comment from eight years ago

(or would be if facebook were important)

JawnV6
Jul 4, 2004

So hot ...

CPColin
Sep 9, 2003

Big ol' smile.
Counterpoint: Groovy/Grails.

Finster Dexter
Oct 20, 2014

Beyond is Finster's mad vision of Earth transformed.
counterpoint: jarvascript

Workaday Wizard
Oct 23, 2009

by Pragmatica

if the easy path is not the correct path in your language it’s poo poo hth

MononcQc
May 29, 2007


it is safer to keep the ABS brake handling going than the radio volume adjustments. Lots of time-critical systems willingly shed and drop non-critical operations on purpose when capacity constrains the system, in order to keep critical functionality going.

DONT THREAD ON ME
Oct 1, 2002

by Nyc_Tattoo
Floss Finder
we use mongo with critical medical data in an application that has modest performance requirements.

the person who made that decision is a terrible engineer, but ive never been able to figure out who is responsible.

anyhow only a couple weeks left here thank god. mongo with relational medical data is a waking nightmare.

HoboMan
Nov 4, 2010

lomarf if you guys really think anything should be allowed to be non-consistent without a really good loving reason ("webscale" is not a good reason)

gonadic io
Feb 16, 2011

>>=

HoboMan posted:

lomarf if you guys really think anything should be allowed to be non-responsive without a really good loving reason ("cloud" is not a good reason)

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine
lomarf if you think the options are either consistent or available with no in betweens

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

Ploft-shell crab posted:

lomarf if you think the options are either consistent or available with no in betweens

the in between is that if data isn't consistent then it can hardly be called available

MononcQc posted:

it is safer to keep the ABS brake handling going than the radio volume adjustments. Lots of time-critical systems willingly shed and drop non-critical operations on purpose when capacity constrains the system, in order to keep critical functionality going.

putting mongodb in cars might kill more people than using mongodb for medical data, good job

anthonypants fucked around with this message at 01:28 on Jul 14, 2018

MononcQc
May 29, 2007

anthonypants posted:

the in between is that if data isn't consistent then it can hardly be called available
putting mongodb in cars might kill more people than using mongodb for medical data, good job

MongoDB is not for 'real time' systems in any sense of the term other than web folks using it to mean 'live streaming updates'

Real-time systems otherwise are those like phone switches (soft real-time), which have service constraints in terms of latency and quality (such as not dropping calls to 911) but can otherwise miss some events, whereas you have hard real-time systems such as cars' computers, where the example with ABS brakes comes from. Think also of airplanes, where some signals and computations cannot be late.

Toyota's issues a few years ago with cars crashing themselves had to do with an inversion of priority, with high-criticality events being classified as less important than low-priority ones, which ended up endangering people's lives.

In such systems, where demand may exceed the total system capacity, load shedding and dropping data may prove more important than locking up and not doing the job you're supposed to do, either because locking up results in millions of dollars in fines, or because it might kill people.

In terms of databases in the "I'm making a CRUD app" sense, yeah, you don't really get into these issues, but the moment you start to consider a message bus as a database, real-time life-critical systems for sure have measures in place to drop data on the floor.
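To make the load-shedding idea concrete, here is a minimal sketch of a bounded queue that throws away its least critical message when it fills up, instead of refusing the critical one. This is not how a real CAN bus arbitrates anything; the `SheddingQueue` class, the priority numbers, and the message names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Message:
    priority: int                        # lower number = more critical (assumption)
    seq: int                             # tie-breaker: older messages sort first
    payload: str = field(compare=False)


class SheddingQueue:
    """Bounded queue that sheds the least critical message when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items: list[Message] = []
        self._seq = 0
        self.dropped: list[str] = []

    def offer(self, priority: int, payload: str) -> bool:
        """Try to enqueue; returns False if the new message itself was shed."""
        msg = Message(priority, self._seq, payload)
        self._seq += 1
        if len(self._items) < self.capacity:
            self._items.append(msg)
            return True
        # Queue is full: find the least critical (and newest) queued message.
        worst = max(self._items)
        if msg < worst:
            # The newcomer matters more, so evict the worst queued message.
            self._items.remove(worst)
            self.dropped.append(worst.payload)
            self._items.append(msg)
            return True
        # The newcomer is no more critical than anything queued: drop it.
        self.dropped.append(payload)
        return False

    def drain(self) -> list[str]:
        """Return queued payloads, most critical first."""
        return [m.payload for m in sorted(self._items)]


bus = SheddingQueue(capacity=3)
for _ in range(3):
    bus.offer(priority=9, payload="ac_toggle")    # the toddler at work
bus.offer(priority=0, payload="deploy_airbag")    # still gets through
print(bus.drain())
print(bus.dropped)
```

The point of the design is that "buffer full" is answered by deciding what to lose, rather than by rejecting whatever happened to arrive last.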

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

MononcQc posted:

MongoDB is not for 'real time' systems in any sense of the term other than web folks using it to mean 'live streaming updates'

Real-time systems otherwise are those like phone switches (soft real-time), which have service constraints in terms of latency and quality (such as not dropping calls to 911) but can otherwise miss some events, whereas you have hard real-time systems such as cars' computers, where the example with ABS brakes comes from. Think also of airplanes, where some signals and computations cannot be late.

Toyota's issues a few years ago with cars crashing themselves had to do with an inversion of priority, with high-criticality events being classified as less important than low-priority ones, which ended up endangering people's lives.

In such systems, where demand may exceed the total system capacity, load shedding and dropping data may prove more important than locking up and not doing the job you're supposed to do, either because locking up results in millions of dollars in fines, or because it might kill people.

In terms of databases in the "I'm making a CRUD app" sense, yeah, you don't really get into these issues, but the moment you start to consider a message bus as a database, real-time life-critical systems for sure have measures in place to drop data on the floor.

ok now do one where your car or plane or w/e doesn't return a consistent result instead of just being temporarily unavailable

MononcQc
May 29, 2007

anthonypants posted:

ok now do one where your car or plane or w/e doesn't return a consistent result instead of just being temporarily unavailable

consistency is not just about whether individual records get corrupted, but about the integrity of the total data set. If I'm dropping 10% of values/records/messages, I am giving you an inconsistent mechanism overall.

But sure for the sake of the exercise, let's do one where you become unavailable.

Assume that your car's CAN bus is filled with low-priority "A/C on/off" events because your dipshit toddler is playing with the buttons. You're on cruise-control as well, which fills the bus with further events. You're changing lanes and there happens to be a car in there so the events coming from lane assistance fill the bus entirely until there's no space left.

You get in a collision, but the controller in charge of deploying the airbags gets a response from the CAN bus: "sorry, buffer full. Retry later, transaction aborted". You do not die of your injuries, but get severe brain damage. Your dipshit toddler now has to care for you for the rest of your life.

MononcQc fucked around with this message at 02:02 on Jul 14, 2018

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

anthonypants posted:

the in between is that if data isn't consistent then it can hardly be called available

the in between meaning that there are a bunch of consistency levels between strongly serializable and total garbage that are acceptable for many applications that can offer greater performance or availability characteristics

Progressive JPEG
Feb 19, 2003

HoboMan posted:

i still think users would generally be more upset at their post getting lost than the service being slow but that is just me

does youtube really need to display an accurate number of views on every video load?

Night Shade
Jan 13, 2013

Old School
terrible programmers: Your dipshit toddler now has to care for you for the rest of your life

JawnV6
Jul 4, 2004

So hot ...

Progressive JPEG posted:

does youtube really need to display an accurate number of views on every video load?

who wrote the blustery new-media piece about how display timestamps losing precision the further back they are (1d, 2d, 1w, 6mo, etc.) would have massive ramifications?

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Night Shade posted:

terrible programmers: Your dipshit toddler now has to care for you for the rest of your life

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


Progressive JPEG posted:

does youtube really need to display an accurate number of views on every video load?

it's more that if I click like I expect the result to be saved and not silently dropped. likes and poo poo (like my posts) are a simplified example but you could best describe it as "every user's activity is important to them" whereas that eventual consistency approach is more like a "the important thing is the average experience, regardless of whether at an individual level it sucks because your insightful post was ignored"

i mean the key takeaway from the piece was "don't use nosql [for anything important]" but it inexplicably uses commerce as an example where performance trumps consistency and that's stupid because you absolutely need consistency if you're dealing with purchases, stock checks etc.

MononcQc
May 29, 2007

Powerful Two-Hander posted:

i mean the key takeaway from the piece was "don't use nosql [for anything important]" but it inexplicably uses commerce as an example where performance trumps consistency and that's stupid because you absolutely need consistency if you're dealing with purchases, stock checks etc.

you are in a hospital after the car crash caused partially by your lovely driving and your dipshit toddler's handling of AC buttons within your strongly consistent car. While the doctors are trying to save what's left of your mangled body, they need access to your medical history. You were on a road trip out of the region you usually live in, so your regular practitioners and related medical records are out of town.

Fortunately, you would think (had your brain not been shaken into jelly) that since you live in a country with national health records, these remote doctors will be able to access your data as if it had been local. You would be right, but in the crash, your car took down a utility pole that carried the lines from the local hospital's regional data center to the rest of the network. You inadvertently caused a netsplit that isolated your actual caregivers from your usual ones.

Now, since your site is offline to a majority of the cluster, and since you want the medical records to be fully consistent, your medical records have been made unavailable. You see, you wouldn't want to be able to have stale data, or to be able to get more prescriptions than you are allowed by driving rapidly across regions to get the same prescription filled multiple times (you drug fiend). As such, your medical history is safely unavailable to the doctors, preventing you from cheating the system, being administered medicine that ignores the latest (seconds to minutes) updates to your profile, or getting your brain saved.

This is good, because had you been living in a country like the Netherlands, which uses the eventually consistent NoSQL database Riak for some of its national healthcare storage, your emergency caregivers would have been able to both access and update your medical records, which would only have been re-synced later once the network was back up. But good thing hard consistency is enforced where your mindless body (and dipshit toddler) live: your brain is fully damaged for good -- that's a durable write that won't be forgotten anytime soon.

Shaggar
Apr 26, 2006
how would they access stuff in NoSQL if the NoSQL isn't available because their network is down.

or are you suggesting everyones medical records are stored at every provider location

MononcQc
May 29, 2007

Concerned about your well-being and how to care for you, the significant other with whom you had dipshit toddler is now frantically looking for books on how to deal with this situation.

Your significant other enters a bookstore that's part of a national chain, flips some pages, and finds some books that look interesting. At the checkout, the transaction is refused. This god drat utility pole you took down also disconnected this store from the rest of the national chain, and since the inventory is shared with the chain's website, they won't let the local store sell books they have on-location since that would cause an inconsistent inventory view between systems.

You see, that would cause issues since all other stores and the website may think that the local store has 3 copies in inventory, whereas the local store would know only 2 of them are left. Now either the website can't sell the local copies or the local store can't get orders delivered from the other stores' inventories when locally the books are all gone.

Rather than having an inventory system that would delay the shipping of the book to online customers only to send an e-mail saying "the book is unfortunately out of stock. You can wait X days for a re-stocking, or get a refund" the local store is not allowed to sell its books. Majority wins, ensuring consistency, and the majority is the rest of the store locations.

Your significant other must unfortunately buy their book online, ensuring full consistency. Knowing medical bills could be tough, they choose to use regular shipping, which is 3-5 business days. Since your accident happened on a Friday night motherfucker, the book might only come in over a week later. At which point your significant other may have lost precious time getting prepared to deal with your jelly brains.

If only some inconsistencies had been allowed in this god forsaken commerce world.

MononcQc
May 29, 2007

Shaggar posted:

how would they access stuff in NoSQL if the NoSQL isn't available because their network is down.

or are you suggesting everyones medical records are stored at every provider location

Yes, it's a thing called data replication. People can work with local redundant copies, which get synced, repaired, and updated across regions once the network heals.

If you have ever been to a pharmacy like CVS to get a prescription and the pharmacy had their own independent system instead of using the exact same records as your general practitioner, you may have inadvertently used one such system, where the pharmacists work on their own copy of the data rather than updating your doctor's records directly in their computers.
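As a rough sketch of "work on local copies, sync once the network heals": each site below keeps its own copy of a patient's notes as a grow-only set, and syncing is just a set union. That is a deliberately simplified stand-in, not how Riak or any pharmacy system actually resolves conflicts; the `Replica` class, site names, and notes are hypothetical.

```python
class Replica:
    """One site's local copy of patient notes, modeled as grow-only sets."""

    def __init__(self, name: str):
        self.name = name
        self.notes: dict[str, set[str]] = {}

    def add_note(self, patient: str, note: str) -> None:
        # Writes are always accepted locally, even during a netsplit.
        self.notes.setdefault(patient, set()).add(note)

    def merge_from(self, other: "Replica") -> None:
        # Set union is commutative, associative and idempotent, so replicas
        # converge no matter how often or in what order they sync.
        for patient, notes in other.notes.items():
            self.notes.setdefault(patient, set()).update(notes)


def sync(a: Replica, b: Replica) -> None:
    """Two-way sync, run whenever the sites can reach each other."""
    a.merge_from(b)
    b.merge_from(a)


home = Replica("home_clinic")
er = Replica("remote_er")

home.add_note("p1", "allergic to penicillin")
sync(home, er)                                  # network is up

# Netsplit: both sides keep working on their own copy.
er.add_note("p1", "admitted after car crash")
home.add_note("p1", "prescription renewed")

sync(home, er)                                  # network heals
print(sorted(er.notes["p1"]))
```

Union only works because notes are never edited or deleted here; anything with updates needs a real conflict-resolution rule, which is where most of the hard design work goes.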

MononcQc
May 29, 2007

I'm gonna cut it short on all these stories.

"Consistency" is a really overloaded term, and usually people take it to mean what you get in ACID transactions:

A: Atomicity
Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely, or fails completely

C: Consistency
Consistency ensures that a transaction can only bring the database from one valid state to another, maintaining database invariants: any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof.

I: Isolation
Isolation ensures that concurrent execution of transactions leaves the database in the same state that would've been obtained if the transactions were executed sequentially.

D: Durability
Durability guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure

Having all four of these at the same time implies a globally consistent mechanism: all records for all pieces of data must be kept in sync, fully visible to each other.
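For a small, local illustration of the A and C parts, here is a sketch using Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are made up for the example; the `CHECK` constraint plays the role of a database invariant, and the transaction makes the two updates succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"   # the invariant
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 30))    # within alice's balance
print(transfer(conn, "alice", "bob", 500))   # would break the invariant
print(balances(conn))
```

The second transfer credits bob before the debit fails, and the rollback undoes that credit, which is the "fails completely" half of atomicity.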

That ACID property can be local to a database (like SQL), or you can also see it as a systemic view (all databases). ACID is generally something you would like to see on a single instance of your database since it's technically easier to do locally with all the data.

But it's also something you unknowingly want to be relaxed at a distributed scope. You don't want your bank to be unable to give you money from your account because an ATM 3 cities over is offline. Instead you expect to possibly reduce Consistency and maybe Isolation.

The bank will take some risk by storing two sets of Atomic and Durable transactions, writing in a ledger that you have gotten money out twice in both areas. They may eat a loss, but upon full system repair and visibility, they will notice you have overdrawn your account, and add a third durable transaction saying you now owe overdraft fees.

None of the transactions were lost, the data wasn't consistent, but they instead made the system able to recover from these inconsistencies. The transactions and records were not corrupted nor lost (they are durable!), but durability says little about consistency.
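The ledger story can be sketched in a few lines. This is a toy, not how any bank reconciles: the `Entry` type, the `reconcile` function, the transaction ids, and the fee amount are all assumptions for illustration. Each ATM appends durable entries locally; once both ledgers are visible, they get merged and a compensating fee entry is added if the account went negative.

```python
from dataclasses import dataclass

OVERDRAFT_FEE = 35  # hypothetical fee, in whole currency units


@dataclass(frozen=True)
class Entry:
    txn_id: str
    account: str
    amount: int  # negative = withdrawal, positive = deposit


def reconcile(account, opening_balance, *ledgers):
    """Merge per-site ledgers for one account and settle any overdraft."""
    # Deduplicate by transaction id so re-syncing a ledger is harmless.
    merged = {
        e.txn_id: e
        for ledger in ledgers
        for e in ledger
        if e.account == account
    }
    entries = sorted(merged.values(), key=lambda e: e.txn_id)
    balance = opening_balance + sum(e.amount for e in entries)
    if balance < 0:
        # Nothing is rewritten or deleted: the fix is one more durable entry.
        fee = Entry(f"fee-{account}", account, -OVERDRAFT_FEE)
        entries.append(fee)
        balance += fee.amount
    return entries, balance


# Two ATMs that could not see each other both approved a withdrawal.
atm_a = [Entry("a-001", "acct-1", -80)]
atm_b = [Entry("b-001", "acct-1", -80)]

entries, balance = reconcile("acct-1", 100, atm_a, atm_b)
print([e.txn_id for e in entries])
print(balance)
```

Both withdrawals stay on the books exactly as they happened; the inconsistency is repaired after the fact by appending, not by refusing service up front.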

People fear Mongo because it sucked rear end at all parts of ACID. Its client would declare a transaction successful after sending it to the TCP socket. Writes would be dropped, and be non-durable. Redis can be the same since it can accept writes that it never flushes to disk.

But you can use SQL without serializable transactions and get AD. Hell if you use postgres, the default isolation level (Read Committed) does not even provide full isolation since you can read records weirdly across transactions. https://www.postgresql.org/docs/9.5/static/transaction-iso.html is a great page to get some details.

MononcQc fucked around with this message at 13:47 on Jul 14, 2018

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

MononcQc posted:

But it's also something you unknowingly want to be relaxed at a distributed scope. You don't want your bank to be unable to give you money from your account because an ATM 3 cities over is offline. Instead you expect to possibly reduce Consistency and maybe Isolation.

The bank will take some risk by storing two sets of Atomic and Durable transactions, writing in a ledger that you have gotten money out twice in both areas. They may eat a loss, but upon full system repair and visibility, they will notice you have overdrawn your account, and add a third durable transaction saying you now owe overdraft fees.

None of the transactions were lost, the data wasn't consistent, but they instead made the system able to recover from these inconsistencies. The transactions and records were not corrupted nor lost (they are durable!), but durability says little about consistency.

usually i think people solve this by not being able to get money out of the atm when it's offline

MononcQc
May 29, 2007

Jabor posted:

usually i think people solve this by not being able to get money out of the atm when it's offline

You can get money out of ATMs not even in your bank's network.

E: to be clear, you need a parent system to be online at some point, mostly to validate the cardholder data (your PIN), authorize a transaction, and so on, but there is a concept called Automated Clearing Houses that does async batch processing on transactions, and the full financial transaction having to do with shifting money around all required accounts can take up to a full business day to process.

MononcQc fucked around with this message at 15:00 on Jul 14, 2018
