Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

floatman posted:

There was recently a jobs queue system implemented in SQL.
Jobs went in a table, then something picked jobs out. Jobs were then done.
Because there were multiple users who could queue jobs, a random picking algorithm was required so that the job system wouldn't get stuck doing all the jobs for user A before user B (gotta be fair).
BUT then there was the concept that some jobs were of higher priority than other jobs, so there must be a random picking of jobs, but "randomly" pick the higher priority jobs first. And it was decided that random was not fair enough, so we must pick user jobs by round robin (i.e. high priority job for user A, high priority job for user B, normal job for user A, etc).

So how did this pan out? Some loving poo poo giant rear end SQL statement embedded in the codebase that used recursive CTEs. Since jobs were either waiting status or completed status, whenever a job was picked, the job record needed a lock to lock the job so if the SQL did run again while this job was running it wouldn't pick up the same job. This lock also touched the other tables that were involved in the job (i.e. updating of user details, etc.; you can imagine the loving shitshow). The recursive CTE query also had this group/order by statement that did JACK poo poo and was not needed but formed 90% of the query plan. Database just loving mysteriously died one day once the amount of data that needed to be scanned surpassed the query planner's limits and the SQL server just loving gave up and decided to chug along row by row.
And here's the kicker:
The codebase called that SQL query to pick up a "semi-random" job if it wanted to get the next job to process. But if the codebase KNEW the primary key of the job it wanted to process and needed to get that job, it would still call this idiot hellfucker SQL query but with the primary key specified, so even though you're getting just that one record you could have gotten with a straight-up select, you're still going to pay the costs of the recursive CTE and the 90% group/order by statement.

All these loving problems could have been avoided if we just did a naive random select from a queue of jobs. "But what about higher priority jobs," you may ask.
Welp, JUST USE TWO SEPARATE QUEUES and empty the priority one first duhhhhhhhhhhhhhh

lol

please tell me this was in sql server or oracle where there are proper queues built into the loving database engine that are miles better than trying to write your own from scratch :allears:
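
for reference, a rough sketch of that two-queues idea in T-SQL (table and column names are made up, this is just the shape of it):

code:
-- pull one job: drain the priority queue first, fall back to the normal one.
-- READPAST skips rows other workers have locked, and the DELETE takes the job
-- out of the queue entirely, so nothing lingers in "waiting" status.
DECLARE @job TABLE (job_id int, payload nvarchar(max));

DELETE TOP (1) FROM priority_jobs WITH (ROWLOCK, READPAST)
OUTPUT deleted.job_id, deleted.payload INTO @job;

IF NOT EXISTS (SELECT 1 FROM @job)
    DELETE TOP (1) FROM normal_jobs WITH (ROWLOCK, READPAST)
    OUTPUT deleted.job_id, deleted.payload INTO @job;

SELECT job_id, payload FROM @job;  -- hand whatever came back to the worker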

abigserve
Sep 13, 2009

this is a better avatar than what I had before
for you sql gurus how good is it at handling billions/trillions of very small rows (like <100 byte average length)

I am writing an application that does tens of thousands of rows per second and a standard interface is much nicer than writing it raw to disk and also writing some handlers for retrieving it.

The code is modular so I can write to anything, I'm going to try mongodb as well because it is web scale, maybe I'll do hadoop as well but I don't even know where to start with something like that

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice
wait

"Since jobs were either waiting status or completed status, whenever a job was picked, the job record needed a lock to lock the job so if the SQL did run again while this job was running it wouldn't pick up the same job."

they left the jobs in the "queue" ?!?!?!

:cripes:

cinci zoo sniper
Mar 15, 2013




abigserve posted:

for you sql gurus how good is it at handling billions/trillions of very small rows (like <100 byte average length)

I am writing an application that does tens of thousands of rows per second and a standard interface is much nicer than writing it raw to disk and also writing some handlers for retrieving it.

The code is modular so I can write to anything, I'm going to try mongodb as well because it is web scale, maybe I'll do hadoop as well but I don't even know where to start with something like that

heavily depends on data structure and intended uses

abigserve
Sep 13, 2009

this is a better avatar than what I had before
it's network flow data, consists of a single solitary table and looks like this (sorry for the formatting)

code:
Field, Type, Null, Key, Default, Extra
last_switched,datetime,YES,MUL,NULL,
src_ip,"int(4) unsigned",YES,,NULL,
src_port,"int(2) unsigned",NO,,NULL,
dst_ip,"int(4) unsigned",YES,,NULL,
dst_port,"int(2) unsigned",NO,,NULL,
in_bytes,"int(8) unsigned",NO,,NULL,
in_pkts,"int(8) unsigned",NO,,NULL,
protocol,"int(1) unsigned",NO,,6,
src_ipv6,varbinary(16),YES,,NULL,
dst_ipv6,varbinary(16),YES,,NULL,

I guess the intended use is to store and have the ability to search through the data within reasonably narrow timeframes.

cinci zoo sniper
Mar 15, 2013




abigserve posted:

it's network flow data, consists of a single solitary table and looks like this (sorry for the formatting)

code:
Field, Type, Null, Key, Default, Extra
last_switched,datetime,YES,MUL,NULL,
src_ip,"int(4) unsigned",YES,,NULL,
src_port,"int(2) unsigned",NO,,NULL,
dst_ip,"int(4) unsigned",YES,,NULL,
dst_port,"int(2) unsigned",NO,,NULL,
in_bytes,"int(8) unsigned",NO,,NULL,
in_pkts,"int(8) unsigned",NO,,NULL,
protocol,"int(1) unsigned",NO,,6,
src_ipv6,varbinary(16),YES,,NULL,
dst_ipv6,varbinary(16),YES,,NULL,

I guess the intended use is to store and have the ability to search through the data within reasonably narrow timeframes.

this is going to be a really lovely table to query like that, assuming you'll use last_switched, a nullable field, as the key

abigserve
Sep 13, 2009

this is a better avatar than what I had before

cinci zoo sniper posted:

this is going to be a really lovely table to query like that, assuming you'll use last_switched, a nullable field, as the key

thanks for noticing that, that was an error in my schema as it wasn't supposed to be a nullable field

cinci zoo sniper
Mar 15, 2013




abigserve posted:

thanks for noticing that, that was an error in my schema as it wasn't supposed to be a nullable field

if it's a non-nullable, monotonically increasing or decreasing field that will be used in every single query, then in postgres you can use last_switched as a partitioning key or a brin index key. that should make this table decently manageable at least up to the low-to-mid billions of records
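
rough postgres sketch of both options, reusing your column names (the table name is made up and the partition bounds are just an example):

code:
-- option 1: plain table plus a BRIN index on the timestamp. tiny index,
-- works well because rows arrive in roughly last_switched order.
CREATE INDEX flows_last_switched_brin ON flows USING brin (last_switched);

-- option 2: declarative range partitioning on last_switched (postgres 10+),
-- e.g. one partition per day, so old data can be dropped by dropping partitions.
CREATE TABLE flows (
    last_switched timestamp NOT NULL,
    src_ip        bigint,
    src_port      integer  NOT NULL,
    dst_ip        bigint,
    dst_port      integer  NOT NULL,
    in_bytes      bigint   NOT NULL,
    in_pkts       bigint   NOT NULL,
    protocol      smallint NOT NULL DEFAULT 6,
    src_ipv6      bytea,
    dst_ipv6      bytea
) PARTITION BY RANGE (last_switched);

CREATE TABLE flows_20181221 PARTITION OF flows
    FOR VALUES FROM ('2018-12-21') TO ('2018-12-22');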

abigserve
Sep 13, 2009

this is a better avatar than what I had before

cinci zoo sniper posted:

if it's a non-nullable, monotonically increasing or decreasing field that will be used in every single query, then in postgres you can use last_switched as a partitioning key or a brin index key. that should make this table decently manageable at least up to the low-to-mid billions of records

it'd be pretty easy to write psql as another supported datastore. I guess I should probably just bite the bullet, pay the :tenbux: for a db.t2.xlarge instance and a couple terabytes of storage then just see what happens

cinci zoo sniper
Mar 15, 2013




abigserve posted:

it'd be pretty easy to write psql as another supported datastore. I guess I should probably just bite the bullet, pay the :tenbux: for a db.t2.xlarge instance and a couple terabytes of storage then just see what happens

eh, for testing maybe. for production im not a fan of db.t2.xlarge or db.t2.2xlarge because they are postgresql 9.x. for you, 10 is more or less needed and 11 is desired (which is only on rds preview right now i think) if you want to do partitions rather than brin indexing.

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


floatman posted:

There was recently a jobs queue system implemented in SQL.
Jobs went in a table, then something picked jobs out. Jobs were then done.
Because there were multiple users who could queue jobs, a random picking algorithm was required so that the job system wouldn't get stuck doing all the jobs for user A before user B (gotta be fair).
BUT then there was the concept that some jobs were of higher priority than other jobs, so there must be a random picking of jobs, but "randomly" pick the higher priority jobs first. And it was decided that random was not fair enough, so we must pick user jobs by round robin (i.e. high priority job for user A, high priority job for user B, normal job for user A, etc).

So how did this pan out? Some loving poo poo giant rear end SQL statement embedded in the codebase that used recursive CTEs. Since jobs were either waiting status or completed status, whenever a job was picked, the job record needed a lock to lock the job so if the SQL did run again while this job was running it wouldn't pick up the same job. This lock also touched the other tables that were involved in the job (i.e. updating of user details, etc.; you can imagine the loving shitshow). The recursive CTE query also had this group/order by statement that did JACK poo poo and was not needed but formed 90% of the query plan. Database just loving mysteriously died one day once the amount of data that needed to be scanned surpassed the query planner's limits and the SQL server just loving gave up and decided to chug along row by row.
And here's the kicker:
The codebase called that SQL query to pick up a "semi-random" job if it wanted to get the next job to process. But if the codebase KNEW the primary key of the job it wanted to process and needed to get that job, it would still call this idiot hellfucker SQL query but with the primary key specified, so even though you're getting just that one record you could have gotten with a straight-up select, you're still going to pay the costs of the recursive CTE and the 90% group/order by statement.

All these loving problems could have been avoided if we just did a naive random select from a queue of jobs. "But what about higher priority jobs," you may ask.
Welp, JUST USE TWO SEPARATE QUEUES and empty the priority one first duhhhhhhhhhhhhhh

lmao what the gently caress

there was some process somewhere another team had that ran on a "last in first out" queue and apparently started backlogging itself around 2008 but nobody noticed, then when they moved it to a non-potato server a few years later it started blasting out 5 year old confirmation documents to clients who were pissed

for my part I set off a massive load of alerts because I refactored a big old mess of sql data views/procs and made it about 10x faster which caused our exporter to blast out messages so fast the receiver poo poo itself, that was pretty funny

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt


for the sake of discussion, let's assume that we actually needed a complicated priority system (otherwise the whole thing is trivial as you noted)

seems like the right approach would have been to separate the priority scoring from the actual job queue

for the former, you can run that overly-complicated query at the READ UNCOMMITTED transaction isolation level so it doesn't acquire any locks. it's not displaying data to the customer or anything, so dirty reads aren't a problem

then, with the job_id in hand from the query, you can set the isolation level back to the highest and make full use of every locking feature your database provides to actually update that job to "processing" status and ensure that it gets run one and only one time. you're only touching the job table at this point so it doesn't affect the rest of the program
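
in rough T-SQL terms, something like this (names invented, and the race on the claim step is what gets hashed out in the replies below):

code:
DECLARE @chosen_id int;

-- phase 1: run the big priority/round-robin query without taking locks
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- ... heavy query goes here, read-only; SELECT @chosen_id = job_id FROM ... ;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- phase 2: claim just that one row, with proper locking
UPDATE jobs
   SET status = 'processing'
 WHERE job_id = @chosen_id
   AND status = 'waiting';

IF @@ROWCOUNT = 0
    PRINT 'someone else claimed it first; go pick another job_id';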

MrMoo
Sep 14, 2000

abigserve posted:

it's network flow data, consists of a single solitary table and looks like this (sorry for the formatting)

code:
Field, Type, Null, Key, Default, Extra
last_switched,datetime,YES,MUL,NULL,
src_ip,"int(4) unsigned",YES,,NULL,
src_port,"int(2) unsigned",NO,,NULL,
dst_ip,"int(4) unsigned",YES,,NULL,
dst_port,"int(2) unsigned",NO,,NULL,
in_bytes,"int(8) unsigned",NO,,NULL,
in_pkts,"int(8) unsigned",NO,,NULL,
protocol,"int(1) unsigned",NO,,6,
src_ipv6,varbinary(16),YES,,NULL,
dst_ipv6,varbinary(16),YES,,NULL,

I guess the intended use is to store and have the ability to search through the data within reasonably narrow timeframes.

Make the IPv6 entries only 12 bytes long and merge with the IPv4 columns to create the final result.
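
One way to read this, in MySQL terms: a single 16-byte column per side instead of separate v4/v6 columns (table name assumed):

code:
-- one address column per side that holds either family; INET6_ATON accepts
-- both dotted-quad and IPv6 text and returns a binary value
ALTER TABLE flows
    ADD COLUMN src_addr VARBINARY(16),
    ADD COLUMN dst_addr VARBINARY(16);

INSERT INTO flows (last_switched, src_addr, dst_addr, src_port, dst_port,
                   in_bytes, in_pkts, protocol)
VALUES (NOW(), INET6_ATON('10.0.0.1'), INET6_ATON('2001:db8::1'),
        51234, 443, 1500, 3, 6);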

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


that would leave open the case of "job got set to processing and never finished because it halted halfway" though which yeah i guess you could solve via a time out or whatever or through a db transaction or something buutttt why not just have your queue read by a single process dispatcher that passes the jobs to workers that do the actual thing and write success when they're done using the job id directly ?

any time the dispatcher fails it'd just start up and pick up the next x jobs and carry on and you don't have to worry about messing with db locking at all and the workers don't need to know or care about the priority.

Shaggar
Apr 26, 2006

NihilCredo posted:

for the sake of discussion, let's assume that we actually needed a complicated priority system (otherwise the whole thing is trivial as you noted)

seems like the right approach would have been to separate the priority scoring from the actual job queue

for the former, you can run that overly-complicated query at the READ UNCOMMITTED transaction isolation level so it doesn't acquire any locks. it's not displaying data to the customer or anything, so dirty reads aren't a problem

then, with the job_id in hand from the query, you can set the isolation level back to the highest and make full use of every locking feature your database provides to actually update that job to "processing" status and ensure that it gets run one and only one time. you're only touching the job table at this point so it doesn't affect the rest of the program

read uncommitted (aka nolock) won't work for that since multiple processes performing the same query will get the same dirty data and then all attempt to pull the same row for processing. you need to use readpast and updlock, which will read past locked rows and then lock only the row you intend to process.
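
roughly (T-SQL, names invented):

code:
DECLARE @job_id int;

BEGIN TRAN;

-- READPAST skips rows already locked by other workers; UPDLOCK holds the row
-- we picked so nobody else can claim it before our UPDATE commits
SELECT TOP (1) @job_id = job_id
FROM jobs WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE status = 'waiting'
ORDER BY priority DESC, queued_at;

UPDATE jobs SET status = 'processing' WHERE job_id = @job_id;

COMMIT;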

Shaggar fucked around with this message at 16:42 on Dec 21, 2018

Shaggar
Apr 26, 2006

Powerful Two-Hander posted:

that would leave open the case of "job got set to processing and never finished because it halted halfway" though which yeah i guess you could solve via a time out or whatever or through a db transaction or something buutttt why not just have your queue read by a single process dispatcher that passes the jobs to workers that do the actual thing and write success when they're done using the job id directly ?

any time the dispatcher fails it'd just start up and pick up the next x jobs and carry on and you don't have to worry about messing with db locking at all and the workers don't need to know or care about the priority.

why would the single process dispatcher be any less likely to fail in the middle of processing than independent dispatchers? what if it assigns a process to a worker and the worker crashes? you have to handle retries somehow and a single dispatcher doesn't really help.

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


Shaggar posted:

why would the single process dispatcher be any less likely to fail in the middle of processing than independent dispatchers? what if it assigns a process to a worker and the worker crashes? you have to handle retries somehow and a single dispatcher doesn't really help.

the dispatcher doesn't have to worry about "other" processes touching the queue so you don't have to mess with row locks to ensure that a job only gets processed once, you take away the issue of "workers A and B both try to process job 1" because you've assigned job 1 to a specific worker first.

if the workers are distributed in different components that can fail independently then yeah you could restart the dispatcher and reassign a job but you can solve that by tagging a job to the worker or just run the workers as separate threads in a single service, depends on volume I guess.

i mean you could still get the dreaded "worker finished the job but failed before it could write completion"

edit: ofc at some point someone will complain it's slow and install a second dispatcher and you're back to square one

edit 2: why the gently caress am I posting all this I'm supposed to be on holiday

Shaggar
Apr 26, 2006
the row locking problem is actually pretty easy to handle and the failure handling is far and away the hardest issue, especially when you need to avoid reprocessing effects. also if you design for multiple processes in the queue then you can easily handle scaling up by adding more processes.

Nomnom Cookie
Aug 30, 2009



abigserve posted:

for you sql gurus how good is it at handling billions/trillions of very small rows (like <100 byte average length)

I am writing an application that does tens of thousands of rows per second and a standard interface is much nicer than writing it raw to disk and also writing some handlers for retrieving it.

The code is modular so I can write to anything, I'm going to try mongodb as well because it is web scale, maybe I'll do hadoop as well but I don't even know where to start with something like that

better than handling trillions of large rows?

mongo will poo poo itself. don't use mongo. it is possible but you will regret it (actually this is true anytime mongo is under consideration for anything)

what i would do is write to elasticsearch, export completed indexes to parquet, and delete indexes from elasticsearch after a day or a week. kibana for NRT queries and hive/presto serves historical queries. just like log aggregation

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Powerful Two-Hander posted:

that would leave open the case of "job got set to processing and never finished because it halted halfway" though which yeah i guess you could solve via a time out or whatever or through a db transaction or something buutttt why not just have your queue read by a single process dispatcher that passes the jobs to workers that do the actual thing and write success when they're done using the job id directly ?

any time the dispatcher fails it'd just start up and pick up the next x jobs and carry on and you don't have to worry about messing with db locking at all and the workers don't need to know or care about the priority.

when you set it to processing, also set a column with the unique ID of the dispatcher. if that ID is the same between restarts, then a dispatcher that crashes can pick up its own jobs when it restarts. failing that, have a way to get the IDs of all active dispatchers (can be just a table with a last_seen time updated every minute) and treat as free any job marked as 'processing' by an absentee dispatcher
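
sketch of the ownership/heartbeat bookkeeping (postgres-ish, names invented):

code:
ALTER TABLE jobs ADD COLUMN owner_id text, ADD COLUMN claimed_at timestamp;

CREATE TABLE dispatchers (
    dispatcher_id text PRIMARY KEY,
    last_seen     timestamp NOT NULL   -- each dispatcher bumps this every minute
);

-- a 'processing' job is up for grabs again if its owner stopped checking in
SELECT j.job_id
FROM jobs j
LEFT JOIN dispatchers d ON d.dispatcher_id = j.owner_id
WHERE j.status = 'processing'
  AND (d.dispatcher_id IS NULL OR d.last_seen < now() - interval '5 minutes');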

Shaggar posted:

read uncommitted (aka nolock) won't work for that since multiple processes performing the same query will get the same dirty data and then all attempt to pull the same row for processing. you need to use readpast and updlock, which will read past locked rows and then lock only the row you intend to process.

you use read uncommitted when selecting the job_id to process

you use for update skip locked (or the mssql equivalent) when you actually update the job_id you found and set it to 'processing'

if the update fails, you assume someone who read your same data beat you to the job and goto step 1. you may have to run the heavy query multiple times but this way you never actually lock the non-job-queue tables

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

cinci zoo sniper posted:

eh, for testing maybe. for production im not a fan of db.t2.xlarge or db.t2.2xlarge because they are postgresql 9.x. for you, 10 is more or less needed and 11 is desired (which is only on rds preview right now i think) if you want to do partitions rather than brin indexing.

i think he's a perfect fit for https://www.timescale.com/how-it-works

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


NihilCredo posted:

when you set it to processing, also set a column with the unique ID of the dispatcher. if that ID is the same between restarts, then a dispatcher that crashes can pick up its own jobs when it restarts. failing that, have a way to get the IDs of all active dispatchers (can be just a table with a last_seen time updated every minute) and treat as free any job marked as 'processing' by an absentee dispatcher



yeah that's kind of what i was thinking. pretty sure we actually had a system like this years ago but the workers were set up so that if one died, they all refused to work until it was restarted. must have been unionised I guess.

also I apologise for derailing the terrible programmer stories and taking us all to the bikeshed

cinci zoo sniper
Mar 15, 2013





never seen this before, curious. off the bat seems like a decent fit yeah, since the target workload is rather pristine - not sure if there can be much partitioning in space done however

Nomnom Cookie
Aug 30, 2009




oh that's a good one. closest thing i've ever used is influxdb which is not suitable

akadajet
Sep 14, 2003


timecube has grown up

Shaggar
Apr 26, 2006

NihilCredo posted:

when you set it to processing, also set a column with the unique ID of the dispatcher. if that ID is the same between restarts, then a dispatcher that crashes can pick up its own jobs when it restarts. failing that, have a way to get the IDs of all active dispatchers (can be just a table with a last_seen time updated every minute) and treat as free any job marked as 'processing' by an absentee dispatcher


you use read uncommitted when selecting the job_id to process

you use for update skip locked (or the mssql equivalent) when you actually update the job_id you found and set it to 'processing'

if the update fails, you assume someone who read your same data beat you to the job and goto step 1. you may have to run the heavy query multiple times but this way you never actually lock the non-job-queue tables

wouldn't this cause a race condition wrt the update? like if 2 processes have selected the same job id and then there's a delay between when one attempts to update and its selection of job id, you run the risk of the faster process successfully updating and committing before the latter tries its update. so when the second does its update the first process's lock will be gone so the update will work.

or are you locking the row for the entire processing of the job? In which case there's no real reason to set it to a temporary status of "processing" because that status would be removed by the final job status, or in the event of a failure the transaction will roll back to before it was selected.

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice
i seriously don't understand why you wouldn't just use a service broker queue if you're using mssql

Shaggar
Apr 26, 2006
at some point im gonna use service broker to do pub sub instead of polling, but i'll still have the same table structure for reporting and app needs. service broker would basically be running the same query to get the list of items to queue tho.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Shaggar posted:

wouldn't this cause a race condition wrt the update? like if 2 processes have selected the same job id and then there's a delay between when one attempts to update and its selection of job id, you run the risk of the faster process successfully updating and committing before the latter tries its update. so when the second does its update the first process's lock will be gone so the update will work.

each dispatcher runs something like "update jobs set processing = true, owner = @my_id where job_id = @chosen_id and processing = false"

I'm on phone so pretend i added the various id columns and skip locked clauses

first dispatcher will see 1 row affected and do the job. second one will see 0 rows and go look for another chosen_id
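
spelled out a bit (postgres flavour; mssql would use readpast/updlock hints instead of skip locked):

code:
-- claim the chosen job only if it's still unclaimed; skip it entirely if
-- another dispatcher currently holds a lock on that row
UPDATE jobs
   SET processing = true,
       owner      = :my_id
 WHERE job_id = (
       SELECT job_id FROM jobs
        WHERE job_id = :chosen_id
          AND processing = false
        FOR UPDATE SKIP LOCKED);
-- 1 row affected: the job is yours. 0 rows: go pick another chosen_id.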

Shaggar
Apr 26, 2006
ok i got you. that works. i wouldn't have thought to do it by rows affected.

CRIP EATIN BREAD
Jun 24, 2002

Hey stop worrying bout my acting bitch, and worry about your WACK ass music. In the mean time... Eat a hot bowl of Dicks! Ice T



Soiled Meat

abigserve posted:

it's network flow data, consists of a single solitary table and looks like this (sorry for the formatting)

code:
Field, Type, Null, Key, Default, Extra
last_switched,datetime,YES,MUL,NULL,
src_ip,"int(4) unsigned",YES,,NULL,
src_port,"int(2) unsigned",NO,,NULL,
dst_ip,"int(4) unsigned",YES,,NULL,
dst_port,"int(2) unsigned",NO,,NULL,
in_bytes,"int(8) unsigned",NO,,NULL,
in_pkts,"int(8) unsigned",NO,,NULL,
protocol,"int(1) unsigned",NO,,6,
src_ipv6,varbinary(16),YES,,NULL,
dst_ipv6,varbinary(16),YES,,NULL,

I guess the intended use is to store and have the ability to search through the data within reasonably narrow timeframes.

what would be your search parameters for this?

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



drop table imo

hackbunny
Jul 22, 2007

I haven't been on SA for years but the person who gave me my previous av as a joke felt guilty for doing so and decided to get me a non-shitty av

:bahgawd::manning:

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Shaggar posted:

at some point im gonna use service broker to do pub sub instead of polling, but i'll still have the same table structure for reporting and app needs. service broker would basically be running the same query to get the list of items to queue tho.

i was more responding to the op, but in your case why not just push messages to your queue when you update the operational table instead of updating an operational table to query later to queue messages? seems like an extra pointless step to me.

if the problem is they can't be processed until a later time, use a conversation timer.

if you're dealing with multiple dependencies then check and only queue the message when all dependencies are satisfied.

if the messages need to be "cancellable" things get hairy but you can still update the operational table to say "don't process that message we nulled the process timestamp" and then double check when processing that message you queued up for 1 am, and since you're always looking poo poo up via the pk of the table the check is fast.

basically i don't see the point of turning operational tables into queues and i really have to stretch to come up with scenarios where it would be the only choice
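
e.g. the push-on-update plus conversation timer bit, very roughly (service/contract/message type names are all made up):

code:
DECLARE @job_id int = 42;          -- whichever row you're touching
DECLARE @h uniqueidentifier;

BEGIN TRAN;

-- update the operational row and queue the message in the same transaction
UPDATE jobs SET status = 'ready' WHERE job_id = @job_id;

BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//app/JobService]
    TO SERVICE '//app/JobService'
    ON CONTRACT [//app/JobContract]
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @h
    MESSAGE TYPE [//app/JobMessage]
    (CAST(@job_id AS varbinary(8)));

-- or, for "don't process this until later" cases, arm a timer instead:
-- BEGIN CONVERSATION TIMER (@h) TIMEOUT = 3600;

COMMIT;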

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice
all of that said, queues are way overused too. i mostly use them in async scenarios such as file drops and api requests coming in waves to smooth out processing. ideally your system can just make whatever api calls it needs to and everything just works, i.e. read this i guess: https://ferd.ca/queues-don-t-fix-overload.html

Shaggar
Apr 26, 2006

Cold on a Cob posted:

i was more responding to the op, but in your case why not just push messages to your queue when you update the operational table instead of updating an operational table to query later to queue messages? seems like an extra pointless step to me.

if the problem is they can't be processed until a later time, use a conversation timer.

if you're dealing with multiple dependencies then check and only queue the message when all dependencies are satisfied.

if the messages need to be "cancellable" things get hairy but you can still update the operational table to say "don't process that message we nulled the process timestamp" and then double check when processing that message you queued up for 1 am, and since you're always looking poo poo up via the pk of the table the check is fast.

basically i don't see the point of turning operational tables into queues and i really have to stretch to come up with scenarios where it would be the only choice

its easier to just handle control and statusing in a table vs an opaque message queue, and ive never seen a purpose built message queue that isn't fragile as hell and doesn't rely on the ability to dump everything in the queue and restart when things go wrong.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
well guys we better separate the queue from the database because what if the database server goes down? then how will the workers fetch jobs from the queue to process?? what do you mean the work of those jobs requires further queries to the database, i don't see your point???

abigserve
Sep 13, 2009

this is a better avatar than what I had before

CRIP EATIN BREAD posted:

what would be your search parameters for this?

It's at least last_switched but can be any combination of columns.

Im writing the timescaledb backend now, I'll let you guys know how that goes!

Nomnom Cookie
Aug 30, 2009



Shaggar posted:

its easier to just handle control and statusing in a table vs an opaque message queue, and ive never seen a purpose built message queue that isn't fragile as hell and doesn't rely on the ability to dump everything in the queue and restart when things go wrong.

SQS is super robust. my job has been relying on SQS for about five years now and had 0 problems. billions of messages per month, 1000x traffic spikes, once 300 misconfigured workers short polling in a loop, never had a problem

probably helps that the features you get are put message, receive message, and delete received message

floatman
Mar 17, 2009

Cold on a Cob posted:

lol

please tell me this was in sql server

Yes.

Cold on a Cob posted:

wait

they left the jobs in the "queue" ?!?!?!

:cripes:

Yes
