Notorious b.s.d.
Jan 25, 2003

by Reene

Rex-Goliath posted:

no i just think it's comparing apples and oranges

mongodb is a bad apple, and very few people who select nosql technologies have any loving idea what they are doing

PIZZA.BAT
Nov 12, 2016


:cheers:


yeah mongo is a trash fire you won't get any argument from me there

Notorious b.s.d.
Jan 25, 2003

by Reene
you should default to a mature sql-based rdbms unless you have specific reasons that the behavior of a nosql alternative is important

there are applications for which cassandra or hbase are the only viable choices, and it's important to figure out when you're looking down the barrel at one of those

don't pick cassandra just for fun, because boy howdy

Beve Stuscemi
Jun 6, 2001




use your work's MS enterprise licensing to stand up a bunch of sql servers and keep them perpetually in test

Mao Zedong Thot
Oct 16, 2008


nosql has always been good

mongodb has always been bad

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

Notorious b.s.d. posted:

ok so don't use sql "unless the app needs joins"

that's one of the most moronic sentences i've ever seen committed to paper

this is the kind of discussion that leads to people using mongodb

The many-to-many part is important. You can model most data relationships that would normally require a join in SQL in a document format. The main kind of relationship that document databases are generally bad at, and where SQL has a huge leg up, is many-to-many relationships between sets of data.
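
A minimal sketch of the many-to-many case, in case it helps to see it. It uses Python's built-in sqlite3 as a stand-in for any SQL database; the students/courses/enrollments tables and both helper functions are made up for illustration. The point is that one junction table answers the question from either side, where a document model would have to embed or duplicate one side inside the other.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical many-to-many schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE courses  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- junction table: one row per (student, course) pair
        CREATE TABLE enrollments (
            student_id INTEGER NOT NULL REFERENCES students(id),
            course_id  INTEGER NOT NULL REFERENCES courses(id),
            PRIMARY KEY (student_id, course_id)
        );
        INSERT INTO students VALUES (1, 'ada'), (2, 'grace');
        INSERT INTO courses  VALUES (10, 'databases'), (20, 'compilers');
        INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    return conn


def courses_for(conn, student_name):
    """All course titles a given student is enrolled in."""
    rows = conn.execute(
        """
        SELECT c.title
        FROM courses c
        JOIN enrollments e ON e.course_id = c.id
        JOIN students s    ON s.id = e.student_id
        WHERE s.name = ?
        ORDER BY c.title
        """,
        (student_name,),
    )
    return [row[0] for row in rows]


def students_in(conn, course_title):
    """All student names enrolled in a given course (the same join, other side)."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM students s
        JOIN enrollments e ON e.student_id = s.id
        JOIN courses c     ON c.id = e.course_id
        WHERE c.title = ?
        ORDER BY s.name
        """,
        (course_title,),
    )
    return [row[0] for row in rows]
```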

Notorious b.s.d.
Jan 25, 2003

by Reene

ThePeavstenator posted:

The many-to-many part is important. You can model most data relationships that would normally require a join in SQL in a document format. The main kind of relationship that document databases are generally bad at, and where SQL has a huge leg up, is many-to-many relationships between sets of data.

unless your app is just a k/v store, you are eventually gonna have a many to many join

and then you will wish you had used the document features of a fully fledged sql rdbms, instead of using a nosql database because you already knew how it worked from your last fart app at the prior startup

Fiedler
Jun 29, 2002

I, for one, welcome our new mouse overlords.

Notorious b.s.d. posted:

why would you stand up a separate service for this, though?

postgres, ms sql, and oracle will all happily slurp up a json document for you, and then let you query the json

and then once you need some actual database functionality, that is still there for you

the only compelling argument for building anything new on oracle is "i'm leaving this company very soon and want to burn it to the ground"

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

Notorious b.s.d. posted:

unless your app is just a k/v store, you are eventually gonna have a many to many join

and then you will wish you had used the document features of a fully fledged sql rdbms, instead of using a nosql database because you already knew how it worked from your last fart app at the prior startup

You can use more than 1 database at a time. NoSQL is not just rockstar coding ninjas building WebScale apps and piping their data into the DB equivalent of /dev/null. SQL, document, graph, key/value, and columnstore databases all exist and have valid use cases depending on what you need them for.








Yes I know this is a Wendy's drive through, I'll have a Baconator and a Frosty.

Sapozhnik
Jan 2, 2005

Nap Ghost
new-style databases (some of which have optional sqloid front ends) are something you use when you have no other choice: when you have a massive database that needs to be geographically distributed. their capabilities are powerful but the price (not just monetary) of those capabilities is severe operational pain that your organization has to be prepared to endure in exchange

for most applications it's literally the computing equivalent of buying a ford f150 to do a 15 mile highway commute because that's what Real Men drive (except the tradeoffs are way worse than getting mediocre fuel economy purely for the sake of being a stupid rear end in a top hat)

and no your "clickstream analytics" or whatever terabytes of worthless garbage you collect are not valuable business data. i mean yeah sure stick them in some garbage freeform data store, it's not like they actually matter, but that's not an endorsement of said garbage data stores or any of their non-existent merits. if you don't mind your "database" making GBS threads its pants and losing data from time to time i can write you one that's as fast as you could possibly want!

Fiedler posted:

the only compelling argument for building anything new on oracle is "i'm leaving this company very soon and want to burn it to the ground"

Notorious b.s.d.
Jan 25, 2003

by Reene

ThePeavstenator posted:

You can use more than 1 database at a time. NoSQL is not just rockstar coding ninjas building WebScale apps and piping their data into the DB equivalent of /dev/null. SQL, document, graph, key/value, and columnstore databases all exist and have valid use cases depending on what you need them for.

a regular ole' sql database will happily handle your document, graph, and k/v needs, most of the time. why would you want to maintain two services and two schemas, with separate transaction models, when you could just ... not.

if you have a mongodb and a postgres db next to each other, you have to spend time syncing data between them. or, you could just put all of the data into postgres, and have a single query in a single transaction that references document data AND table structured data

you need a really special reason to abandon all the good stuff that comes in an ordinary sql rdbms
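
a rough sketch of the "single query in a single transaction" idea, hedged heavily: it uses python's built-in sqlite3 instead of postgres so it runs anywhere, and it assumes a SQLite build with the JSON functions (the default in recent versions). the customers/events tables and the helper names are invented for illustration. postgres would use its own json operators (`->`, `->>`) and ms sql would use `JSON_VALUE`, but the shape is the same: document data and table data joined in one query.

```python
import json
import sqlite3


def open_demo():
    """In-memory database with one relational table and one table of JSON docs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE events (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            doc         TEXT NOT NULL  -- freeform JSON document
        );
        """
    )
    return conn


def load_demo(conn):
    """Write relational rows and JSON documents in one transaction."""
    docs = [
        (1, {"type": "refund", "amount": 250}),
        (2, {"type": "purchase", "amount": 40}),
        (1, {"type": "purchase", "amount": 900}),
    ]
    with conn:  # commits on success, rolls back everything on error
        conn.executemany(
            "INSERT INTO customers (id, name) VALUES (?, ?)",
            [(1, "acme"), (2, "globex")],
        )
        conn.executemany(
            "INSERT INTO events (customer_id, doc) VALUES (?, ?)",
            [(cid, json.dumps(doc)) for cid, doc in docs],
        )


def big_events(conn, min_amount):
    """Join table data (customer name) with fields pulled out of the JSON doc."""
    rows = conn.execute(
        """
        SELECT c.name,
               json_extract(e.doc, '$.type'),
               json_extract(e.doc, '$.amount')
        FROM events e
        JOIN customers c ON c.id = e.customer_id
        WHERE json_extract(e.doc, '$.amount') > ?
        ORDER BY e.id
        """,
        (min_amount,),
    )
    return rows.fetchall()
```

no second service and no sync job between stores: the documents and the rows commit or roll back together.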

----

column stores are a whole different kettle of fish, i'll give you that one.

Notorious b.s.d.
Jan 25, 2003

by Reene

Fiedler posted:

the only compelling argument for building anything new on oracle is "i'm leaving this company very soon and want to burn it to the ground"

no i would never invite oracle into my business, that's like inviting a vampire into the house

but if your poo poo is already invested in oracle, oracle db is genuinely a very nice product. you just never, ever want to think about how much it costs

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Notorious b.s.d. posted:

ok so don't use sql "unless the app needs joins"

that's one of the most moronic sentences i've ever seen committed to paper

this is the kind of discussion that leads to people using mongodb

The key word was "complex", you can do joins. Get with the times grandpa

If you set up your data model for it, NoSQL can make it super easy to scale a web app (never tried mongo though)

Horizontal auto scaling owns

Notorious b.s.d.
Jan 25, 2003

by Reene

tak posted:

If you set up your data model for it, NoSQL can make it super easy to scale a web app (never tried mongo though)

Horizontal auto scaling owns

are you loving kidding me?

what web property are you running that needs scalability greater than what a single database server can provide

keep in mind that all of stack overflow uses a single ms sql pair, serving requests live. because that's all they need. and when they outgrow it, they'll buy a bigger box

Notorious b.s.d.
Jan 25, 2003

by Reene
tak is the "web scale" guy

i knew one of these guys would show up as soon as we started nosql chat

https://www.youtube.com/watch?v=b2F-DItXtZs

Gildiss
Aug 24, 2010

Grimey Drawer

Notorious b.s.d. posted:

are you loving kidding me?

what web property are you running that needs scalability greater than what a single database server can provide

keep in mind that all of stack overflow uses a single ms sql pair, serving requests live. because that's all they need. and when they outgrow it, they'll buy a bigger box

People that choose MongoDB are doing a great service for jobs, and job security for the field.
How can we get paid to look good and competent and put out fires if there are no fires to put out?
If everyone else was handsome and intelligent then I'd be a loving hobo down by the river.

stuffed crust punk
Oct 8, 2004

by LITERALLY AN ADMIN
i enjoy job interviews and job interview discussion

Schadenboner
Aug 15, 2011

by Shine

See, I was nodding along until this point.

But, uh, go MongoDB?

:confused:

Schadenboner fucked around with this message at 01:24 on Sep 13, 2019

crepeface
Nov 5, 2004

r*p*f*c*
turns out i didn't do as badly as i thought in my interview, i'm on to the next stage. :)

Fiedler
Jun 29, 2002

I, for one, welcome our new mouse overlords.

Notorious b.s.d. posted:

a regular ole' sql database will happily handle your document, graph, and k/v needs, most of the time. why would you want to maintain two services and two schemas, with separate transaction models, when you could just ... not. ...column stores are a whole different kettle of fish, i'll give you that one.

mssql does all of that including column stores

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Notorious b.s.d. posted:

are you loving kidding me?

what web property are you running that needs scalability greater than what a single database server can provide

keep in mind that all of stack overflow uses a single ms sql pair, serving requests live. because that's all they need. and when they outgrow it, they'll buy a bigger box

Lots

We're running around 20 GKE nodes on average, hundreds of terabytes of data, hundreds of millions of api calls per day. lots of mapreduce pipelines and secondary indexing in different technologies depending on requirements (elastic and bigquery mainly, with idempotent pubsub and idempotency-centric data schemas to keep things sane), lots of different teams and products

It's super easy to rapidly prototype, and when we need SQL we use it (mostly cloudsql, based on postgres, around 20 TB. And bigquery for analytics and stuff that doesn't need sub 100ms latency)

And absolutely zero computer touching

Lol at the thought of ever touching a db config file manually ever again

tak fucked around with this message at 02:44 on Sep 13, 2019

barkbell
Apr 14, 2006

woof
the company that asked for my references and called them a couple weeks ago seems to have ghosted me. cool.

barkbell fucked around with this message at 02:45 on Sep 13, 2019

Notorious b.s.d.
Jan 25, 2003

by Reene

tak posted:

We're running around 20 GKE nodes on average, hundreds of terabytes of data, hundreds of millions of api calls per day. lots of mapreduce pipelines and secondary indexing in different technologies depending on requirements (elastic and bigquery mainly), lots of different teams and products

you are still within the reasonable bounds of a single db server

hundreds of millions of api calls per day... so like 1000 queries per second? that's on the big end of a single sql rdbms, but it is doable
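
the back-of-envelope math, as a sketch. it assumes load is spread evenly over the day and that one api call means one query, neither of which is stated in the thread; real traffic peaks well above the average. under those assumptions, 100 million calls a day is a bit over 1,000 per second, and the upper end of "hundreds of millions" is closer to 10,000.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(calls_per_day):
    """Average request rate, assuming calls are spread evenly over 24 hours."""
    return calls_per_day / SECONDS_PER_DAY


# "hundreds of millions" could be anywhere in this range
for calls in (100_000_000, 500_000_000, 900_000_000):
    print(f"{calls:>13,} calls/day ~ {average_per_second(calls):>7,.0f} per second")
```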

it seems like the primary reason you need "horizontal auto scaling" is you are on lovely cloud nodes that are individually way too small and expensive to be useful

you basically created a problem for yourself that you then needed to solve

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Notorious b.s.d. posted:

you are still within the reasonable bounds of a single db server

hundreds of millions of api calls per day... so like 1000 queries per second? that's on the big end of a single sql rdbms, but it is doable

it seems like the primary reason you need "horizontal auto scaling" is you are on lovely cloud nodes that are individually way too small and expensive to be useful

you basically created a problem for yourself that you then needed to solve

This is not in the bounds of a single db server

How do you handle frequent schema changes and large migrations across dozens of products and teams with tight SLOs without spending millions on licenses, hardware, and sr db admins?

NoSQL is a tool for certain jobs

A centralized db infra is another

Notorious b.s.d.
Jan 25, 2003

by Reene

tak posted:

This is not in the bounds of a single db server

How do you handle frequent schema changes and large migrations across dozens of products and teams with tight SLOs without spending millions on licenses, hardware, and sr db admins?

NoSQL is a tool for certain jobs

A centralized db infra is another

nice goalpost moving

so now the problem isn't load or scaling, it's "schemaless" schema migration and coordination across dozens of teams. totally different claim than two seconds ago

guess what: that is a hard problem no matter how you store data. nosql doesn't magic away coordination on dozens of teams

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Notorious b.s.d. posted:

nice goalpost moving

so now the problem isn't load or scaling, it's "schemaless" schema migration and coordination across dozens of teams. totally different claim than two seconds ago

guess what: that is a hard problem no matter how you store data. nosql doesn't magic away coordination on dozens of teams

It's both. And I didn't say schemaless, you're projecting mongomania on the concept of nosql in general

Data storage isn't the interesting part of my job, and I'm glad I can work with technologies where I don't have to think much about the uninteresting parts unless I need to. We have a small infrastructure team that handles most of that

You may be right, maybe we should move over to something more centralized, and maybe it would save money on storage costs overall in a few years. But I'd rather just keep making more features in software that people are happy to pay for, and the board would tend to agree

Notorious b.s.d.
Jan 25, 2003

by Reene

tak posted:

It's both. And I didn't say schemaless, you're projecting mongomania on the concept of nosql in general

Data storage isn't the interesting part of my job, and I'm glad I can work with technologies where I don't have to think much about the uninteresting parts unless I need to. We have a small infrastructure team that handles most of that

You may be right, maybe we should move over to something more centralized, and maybe it would save money on storage costs overall in a few years. But I'd rather just keep making more features in software that people are happy to pay for, and the board would tend to agree

mongodb is web scale

Qtotonibudinibudet
Nov 7, 2011



Omich poluyobok, skazhi ty narkoman? ya prosto tozhe gde to tam zhivu, mogli by vmeste uyobyvat' narkotiki

Notorious b.s.d. posted:

the main reason to use cassandra is not that it's "nosql" it's that you get very interesting scalability and consistency guarantees for extremely specific use cases

while we're still on db chat, the reason to use cassandra is that you hate yourself. it's a wonderful thing that people think they can just run like a postgres server because it's "just a database, but also free distributed systems magic with no extra work!"

i pray that azure cosmos cassandra poo poo gets to a good state so that i never have to deal with any "we want multi-site distributed systems but we don't need to hire anyone with cassandra experience" customers ever again.

PIZZA.BAT
Nov 12, 2016


:cheers:


tak posted:

This is not in the bounds of a single db server

How do you handle frequent schema changes and large migrations across dozens of products and teams with tight SLOs without spending millions on licenses, hardware, and sr db admins?

NoSQL is a tool for certain jobs

A centralized db infra is another

hate to memepost but those who don't know just don't know

don't bother too much with the unenlightened. they'll figure it out eventually

PIZZA.BAT
Nov 12, 2016


:cheers:


but can nbsd see why kids LOVE the taste of cinnamon toast crunch??

School of How
Jul 6, 2013

quite frankly I don't believe this talk about the market
In my experience, it's much more common for perfectly scalable code to be accused of being "unscalable" than it is for something actually unscalable to exist. I can't tell you how many times I've come across something that is perfectly usable, but someone comes along and says "this won't scale, it has to be redone". It all comes from this simplistic belief that "nosql == scalable" and "sql == not scalable". Before that it was "compiled language == scalable" and "non-compiled language == not scalable".

the right way is to just build the drat thing using whatever tools you are comfortable with, and then not worry about scalability until the time comes that scalability actually starts to affect the product. The problem with this approach is that the point where it starts to affect the product often never comes. When you're doing a job interview it's common to get asked "what did you do to handle the scale". If you answer the question with "I didn't do anything", then you won't impress the interviewer. The people that overengineered their solution to solve problems that don't exist have a crazy story to tell the interviewer and have an easier time impressing interviewers. Interviewers want to hear crazy stories about how you had to roll your own load balancer, or had to modify the nginx source code or something like that.

tak
Jan 31, 2003

lol demowned
Grimey Drawer

School of How posted:


the right way is to just build the drat thing using whatever tools you are comfortable with, and then not worry about scalability until the time comes that scalability actually starts to affect the product. The problem with this approach is that the point where it starts to affect the product often never comes. When you're doing a job interview it's common to get asked "what did you do to handle the scale". If you answer the question with "I didn't do anything", then you won't impress the interviewer.

I agree with the first part but the secret to the second is to build something that makes money for your customers and sell it a lot while spending dev resources adding features with big contracts behind them

But regardless it's not hard to spin

"I made sure to prioritize solving real problems experienced by our customers through constant communication between developers, product management, and stakeholders; and by maintaining a rapid and efficient feature development feedback cycle. Performance improvements and gold plating were never prioritized because they were never necessary, and the product scaled to do what we needed without any significant redesign"

qhat
Jul 6, 2015


School of How posted:

In my experience, it's much more common for perfectly scalable code to be accused of being "unscalable" than it is for something actually unscalable to exist. I can't tell you how many times I've come across something that is perfectly usable, but someone comes along and says "this won't scale, it has to be redone". It all comes from this simplistic belief that "nosql == scalable" and "sql == not scalable". Before that it was "compiled language == scalable" and "non-compiled language == not scalable".

the right way is to just build the drat thing using whatever tools you are comfortable with, and then not worry about scalability until the time comes that scalability actually starts to affect the product. The problem with this approach is that the point where it starts to affect the product often never comes. When you're doing a job interview it's common to get asked "what did you do to handle the scale". If you answer the question with "I didn't do anything", then you won't impress the interviewer. The people that overengineered their solution to solve problems that don't exist have a crazy story to tell the interviewer and have an easier time impressing interviewers. Interviewers want to hear crazy stories about how you had to roll your own load balancer, or had to modify the nginx source code or something like that.

is there like anything at all that you are correct on

Notorious b.s.d.
Jan 25, 2003

by Reene
sure we could have a single ACID database on a pair of nodes but isn’t it more fun to troubleshoot a distributed system running on twenty or more VMs in the cloud?

just think of all the time spent writing Hadoop jobs for secondary indices! spark, hive, map/reduce! our resumes will be stuffed for the next fart app

if anyone thinks this is stupid, just use a baffling combination of buzzwords until they go away

qhat
Jul 6, 2015


i don't think any interviewers want to hear a story about you rolling your own load balancer or hideously overengineering a solution for christ's sake. they want to know whether you considered the problem of scale and as such made appropriate decisions on architecture based on reasonable assumptions of usage. just being like "whatever just going to make this all in butts.js who cares scale later" is exactly the kind of bullshit that generates incredible amounts of technical debt that someone definitely has to pick up later down the line.

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Notorious b.s.d. posted:

sure we could have a single ACID database on a pair of nodes but isn’t it more fun to troubleshoot a distributed system running on twenty or more VMs in the cloud?

just think of all the time spent writing Hadoop jobs for secondary indices! spark, hive, map/reduce! our resumes will be stuffed for the next fart app

if anyone thinks this is stupid, just use a baffling combination of buzzwords until they go away

You have no idea what requirements I have or what the performance constraints are

I don't use any of those except mapreduce, but do you think those technologies are just pointless cloud marketing jargon or something? Processing hundreds of millions of entities daily with mapreduce is easy and easily scales to whatever you need. And my median query and write latency stays below 100ms the whole time. There's a reason Google uses this stuff

We have microservices that share a common node pool, and we are generally sitting around 70% cpu in the whole cluster. But there's a huge variance in load, and proper autoscaling just works without ever needing to worry about over- or under-provisioning

If we had to provision physical machines to handle peak load, we'd easily be spending 20x as much on hardware alone

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
the funny thing about load is that when one service is seeing its peak amount of user-facing traffic, so is every other service.
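
a toy illustration of that point, with entirely made-up hourly load numbers (none of them come from this thread). when services peak in the same hour, the shared pool's peak equals the sum of the individual peaks, so pooling saves nothing at the worst moment. pooling only shrinks peak capacity when the peaks land at different times.

```python
def pooled_peak(*services):
    """Peak of the combined load when all services share one pool."""
    return max(sum(hour) for hour in zip(*services))


def sum_of_peaks(*services):
    """Capacity needed if each service is provisioned for its own peak."""
    return sum(max(load) for load in services)


# hypothetical load per hour, arbitrary units
web = [10, 20, 80, 30]
api_same_peak = [5, 10, 40, 15]     # user-facing: peaks in the same hour as web
batch_off_peak = [40, 10, 5, 15]    # not user-facing: peaks in a different hour

# correlated peaks: sharing the pool does not lower the peak
print(pooled_peak(web, api_same_peak), sum_of_peaks(web, api_same_peak))

# uncorrelated peaks: the shared pool needs less than the sum of the peaks
print(pooled_peak(web, batch_off_peak), sum_of_peaks(web, batch_off_peak))
```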

tak
Jan 31, 2003

lol demowned
Grimey Drawer

Jabor posted:

the funny thing about load is that when one service is seeing its peak amount of user-facing traffic, so is every other service.

Exactly

horizontal scaling meshes nicely with that

jesus WEP
Oct 17, 2004


Notorious b.s.d. posted:

you basically created a problem for yourself that you then needed to solve

can we just change yospos’s name to this

Progressive JPEG
Feb 19, 2003

Notorious b.s.d. posted:

you basically created a problem for yourself that you then needed to solve

my resume is going up uP UP
