Xae
Jan 19, 2005

Mox Hombre posted:

yo what's up YOSPOS long time no see.

I have to get some data out of a few weirdly designed tables in postgres. We process tips for hairdressers and whatnot. We have a Ledger table where we track positive and negative additions to someone's balance. There's also a Payout table, and a payout is us sending money out, and a payout covers several tips, for several people. Once someone has been paid out, we add negative balance to their Ledger and mark it with a payoutId to know which payout resulted in that person's balance going down.

So far so good. I'm writing some analytics for that cause the db guy is not currently available. I have to get the average number of tips per payout for one person. Now you would imagine that there would be a M2M table or something that tells you which tips were paid out in which payout. But there isn't. Apparently, the way to get all the Ledger entries that were paid out in one payout (say payoutId = 99) is to find the Ledger entry with payoutId = 98 and then just go up by ids until we get to a Ledger entry with payoutId = 99.

so I have
code:
Ledger or whatever
id    reason    amount    payoutId
9     tip       $4
8     payout    -$32      9
7     tip       $4
6     tip       $4
5     tip       $4
4     payout    -$52      8
3     tip       $4
2     payout    $3        7
And I need to get a result that tells me that payout 9 had 3 tips, payout 7 had 1 tip, and so on for each payout. Or even better, annotate each tip with a column that says which payout it was paid out in; then I can do grouping and aggregates. Getting this would be pretty easy if the data were more normalized or if I were doing this in application code, but with sql this is just hosed
code:
SELECT reason, SUM(amount), AVG(amount), COUNT(amount)
  FROM Ledger
 WHERE id BETWEEN (SELECT id FROM Ledger WHERE payoutId = #input)
              AND (SELECT id FROM Ledger WHERE payoutId = (#input + 1)) - 1
 GROUP BY reason
Alternately gently caress around with Window Functions or write some stored procedures.
https://www.postgresql.org/docs/11/tutorial-window.html
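For what it's worth, the window-function route does work without stored procedures: annotate each row with the payoutId of the first payout row that comes after it, then group. A runnable sketch in SQLite via Python (table and column names taken from the sample above; needs SQLite 3.28+ for the frame clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ledger (id INTEGER PRIMARY KEY, reason TEXT, amount REAL, payoutId INTEGER);
INSERT INTO Ledger (id, reason, amount, payoutId) VALUES
  (9, 'tip',     4, NULL),
  (8, 'payout', -32, 9),
  (7, 'tip',     4, NULL),
  (6, 'tip',     4, NULL),
  (5, 'tip',     4, NULL),
  (4, 'payout', -52, 8),
  (3, 'tip',     4, NULL),
  (2, 'payout',   3, 7);
""")

# For every row, look only at the rows AFTER it and take the smallest payoutId
# among the payout rows there: for a tip, that is the payout that swept it up
# (NULL = not yet paid out). Then count tips per payout.
rows = conn.execute("""
    SELECT paidOutIn, COUNT(*) AS tips
    FROM (
        SELECT id, reason,
               MIN(CASE WHEN reason = 'payout' THEN payoutId END)
                   OVER (ORDER BY id
                         ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
                   AS paidOutIn
        FROM Ledger
    )
    WHERE reason = 'tip' AND paidOutIn IS NOT NULL
    GROUP BY paidOutIn
    ORDER BY paidOutIn
""").fetchall()
print(rows)  # [(8, 1), (9, 3)]
```

Drop the GROUP BY and you get the "annotate each tip with its payout" version instead, which is what you'd join back for other aggregates.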


Kuule hain nussivan
Nov 27, 2008

Mox Hombre posted:

yo what's up YOSPOS long time no see.

I have to get some data out of a few weirdly designed tables in postgres. We process tips for hairdressers and whatnot. We have a Ledger table where we track positive and negative additions to someone's balance. There's also a Payout table, and a payout is us sending money out, and a payout covers several tips, for several people. Once someone has been paid out, we add negative balance to their Ledger and mark it with a payoutId to know which payout resulted in that person's balance going down.

So far so good. I'm writing some analytics for that cause the db guy is not currently available. I have to get the average number of tips per payout for one person. Now you would imagine that there would be a M2M table or something that tells you which tips were paid out in which payout. But there isn't. Apparently, the way to get all the Ledger entries that were paid out in one payout (say payoutId = 99) is to find the Ledger entry with payoutId = 98 and then just go up by ids until we get to a Ledger entry with payoutId = 99.

so I have
code:
Ledger or whatever
id    reason    amount    payoutId
9     tip       $4
8     payout    -$32      9
7     tip       $4
6     tip       $4
5     tip       $4
4     payout    -$52      8
3     tip       $4
2     payout    $3        7
And I need to get a result that tells me that payout 9 had 3 tips, payout 7 had 1 tip, and so on for each payout. Or even better, annotate each tip with a column that says which payout it was paid out in; then I can do grouping and aggregates. Getting this would be pretty easy if the data were more normalized or if I were doing this in application code, but with sql this is just hosed

My intuitive solution to this is to group your ledger rows by person and add a row number to them, then filter that result to just payouts.

Then you'd get something like...
code:
id    reason    amount    payoutId    row
8     payout    -$32      9           9
7     tip       $4
6     tip       $4
5     tip       $4
4     payout    -$52      8           5
3     tip       $4
2     payout    $3        7           3
Then you can subtract the row number of the previous payout + 1 from the current payout to get the number of tips in between the two.
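That row-number arithmetic can be sketched with ROW_NUMBER plus LAG (SQLite via Python; same hypothetical Ledger shape as the post above, one person assumed so there's no partition):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ledger (id INTEGER PRIMARY KEY, reason TEXT, amount REAL, payoutId INTEGER);
INSERT INTO Ledger VALUES
  (9, 'tip', 4, NULL), (8, 'payout', -32, 9), (7, 'tip', 4, NULL),
  (6, 'tip', 4, NULL), (5, 'tip', 4, NULL),   (4, 'payout', -52, 8),
  (3, 'tip', 4, NULL), (2, 'payout', 3, 7);
""")

# Number every ledger row, keep only the payout rows, and then the gap between
# consecutive row numbers (minus 1) is the number of tips each payout swept up.
rows = conn.execute("""
    SELECT payoutId,
           rn - LAG(rn) OVER (ORDER BY id) - 1 AS tips
    FROM (
        SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
        FROM Ledger
    )
    WHERE reason = 'payout'
    ORDER BY payoutId
""").fetchall()
print(rows)  # [(7, None), (8, 1), (9, 3)]
```

The NULL for payout 7 is the "no previous payout visible" edge case; in real data you'd decide whether that means "count from the start of the ledger" instead.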

However, are you sure that getting the number of tips per payout is even possible? Do payouts always empty the whole balance, or are partial payouts possible?

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
I have a MySQL table like so:

CREATE TABLE `user_names` (
`user_id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(75) NOT NULL,
`team` bigint(20) NOT NULL,
PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1114096 DEFAULT CHARSET=latin1;

with three columns: user_id, username, teamname

I am trying to pull a user/team combo at random, so I have a statement:

select user_id, username, team from user_names where user_id = (SELECT FLOOR(RAND()*((select count(1) FROM user_names)-1+1)+1))


that should theoretically grab a single user_id anywhere from user_id 1 to the maximum user_id which is count(1) (1,081,480) and its respective username and team. Right?

However, when I repeatedly run the command, often I get nulls, sometimes I get a single user_id (as expected) and other times I get two or three user_ids returned.

What's going on here?

Agrikk fucked around with this message at 06:30 on Jul 26, 2020

Kuule hain nussivan
Nov 27, 2008

Agrikk posted:

I have a MySQL table like so:

CREATE TABLE `user_names` (
`user_id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(75) NOT NULL,
`team` bigint(20) NOT NULL,
PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1114096 DEFAULT CHARSET=latin1;

with three columns: user_id, username, teamname

I am trying to pull a user/team combo at random, so I have a statement:

select user_id, username, team from user_names where user_id = (SELECT FLOOR(RAND()*((select count(1) FROM user_names)-1+1)+1))


that should theoretically grab a single user_id anywhere from user_id 1 to the maximum user_id which is count(1) (1,081,480) and its respective username and team. Right?

However, when I repeatedly run the command, often I get nulls, sometimes I get a single user_id (as expected) and other times I get two or three user_ids returned.

What's going on here?
No clue what's going on with your query, especially it returning multiple rows, but the MySQL way of getting a random row is...

SELECT x FROM y ORDER BY RAND() LIMIT 1;
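The same pattern works verbatim in SQLite (RANDOM() instead of RAND()); a runnable sketch with made-up rows, mostly to show it never produces the NULL/multi-row weirdness:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_names (user_id INTEGER PRIMARY KEY, username TEXT, team INTEGER);
INSERT INTO user_names (username, team) VALUES
  ('alice', 1), ('bob', 2), ('carol', 1), ('dave', 3);
""")

# ORDER BY RANDOM() LIMIT 1 always yields exactly one existing row -- no NULLs,
# no multi-row surprises (at the cost of a full sort on very large tables).
for _ in range(20):
    row = conn.execute(
        "SELECT user_id, username, team FROM user_names "
        "ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    assert row is not None and 1 <= row[0] <= 4
```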

hey mom its 420
May 12, 2007

Thanks for the responses. Yeah, getting them for only one payout would be doable, just limit the ids to be between two payoutIds. However, I need aggregate data for all of them. I'll look into stored procedures and window functions; my intuition is that this can't be done with just plain SQL, because it looks like you need something akin to a map-reduce to do it.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Kuule hain nussivan posted:

No clue what's going on with your query, especially it returning multiple rows, but the MySQL way of getting a random row is...

SELECT x FROM y ORDER BY RAND() LIMIT 1;

I think maybe the mid-level query (the SELECT FLOOR one) is being evaluated for each row of the outer query, rather than just once as you intended. You could perhaps force it to do it the way you want by doing a join instead. Anyway don't do that, just do what Kuule hain nussivan said.
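The diagnosis above is the key point: a RAND() inside WHERE can be re-evaluated per row, so each row is compared against a different random number, which is exactly how you get zero, one, or several matches. The fix is to evaluate the random value once. Sketched here app-side against SQLite (same effect as forcing a single evaluation via a join; assumes contiguous ids from 1, which the original query also assumes):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_names (user_id INTEGER PRIMARY KEY, username TEXT, team INTEGER);
INSERT INTO user_names (username, team) VALUES
  ('alice', 1), ('bob', 2), ('carol', 1), ('dave', 3);
""")

# Evaluate the random number ONCE, outside the query, then look the row up.
# The WHERE clause now compares every row against the same constant instead of
# a fresh RAND() per row.
n = conn.execute("SELECT COUNT(*) FROM user_names").fetchone()[0]
rid = random.randint(1, n)   # assumes ids run 1..n with no gaps
rows = conn.execute(
    "SELECT user_id, username, team FROM user_names WHERE user_id = ?", (rid,)
).fetchall()
assert len(rows) == 1
```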

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

abelwingnut posted:

yea, that makes sense. problem is i have no idea how to do that. :(

would this be something easily done with python?
Why complicate your life? https://www.man7.org/linux/man-pages/man1/split.1.html has existed for decades: "A split command appeared in Version 3 AT&T UNIX", which was February 1973. https://en.m.wikipedia.org/wiki/Split_(Unix)

Mouse Cadet
Mar 19, 2009

All aboard the McEltrain
Next Stop: Atlanta
I had a query that was scheduled to run at 10am est every Monday and return data going back to 12am the prior Tuesday (so any data in a 154 hour window). I was using this code to do that:

where i.dte BETWEEN DATEADD(hour, -154, GETDATE()) AND GETDATE()

That worked fine but now I need the query to run at times other than 10am on Monday, it could be 8am or 2pm but still going back to 12am the prior Tuesday. Is there a way to make the code return data after 11:59:59 the prior Tuesday up to whatever minute the query is run?

Thank you

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Mouse Cadet posted:

I had a query that was scheduled to run at 10am est every Monday and return data going back to 12am the prior Tuesday (so any data in a 154 hour window). I was using this code to do that:

where i.dte BETWEEN DATEADD(hour, -154, GETDATE()) AND GETDATE()

That worked fine but now I need the query to run at times other than 10am on Monday, it could be 8am or 2pm but still going back to 12am the prior Tuesday. Is there a way to make the code return data after 11:59:59 the prior Tuesday up to whatever minute the query is run?

Thank you

the answer is probably "yes" but the way you do it might depend what RDBMS you are using

abelwingnut
Dec 23, 2002


yes. you effectively want to change the reference point from something relative to something absolute, and the best reference point is time 0.

so something like:

BETWEEN dateadd(hour, -108, dateadd(week, datediff(week, 0, getdate()), 0)) AND getdate()

so you're saying: ok, give me how many weeks it has been since time 0, and from that point, subtract 108 hours (i think that's how many hours it would be, can't check now). this gives you the start time. and the end time would just be a regular old getdate().

may not be exact, but shows what i'm getting at enough, i think.

abelwingnut fucked around with this message at 13:53 on Aug 5, 2020

Mouse Cadet
Mar 19, 2009

All aboard the McEltrain
Next Stop: Atlanta

Hammerite posted:

the answer is probably "yes" but the way you do it might depend what RDBMS you are using

SQL Server 2017. Thank you.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Mouse Cadet posted:

SQL Server 2017. Thank you.

You can use DATEPART(dayofweek, GETDATE()) to extract the numeric value of the current day of week. Then use DATEADD() to subtract that many days from GETDATE(), so you go back to day #0 i.e. Sunday, and add two days to go to Tuesday. Finally, cast the result to DATE to go to midnight.
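The same arithmetic, sketched in Python just to pin down the edge cases (weekday numbering differs between T-SQL's DATEPART, which also depends on DATEFIRST, and Python's weekday(), so this is an illustration of the logic rather than a drop-in):

```python
from datetime import datetime, timedelta

def window_start(now: datetime) -> datetime:
    """Midnight on the most recent Tuesday strictly before `now`'s date."""
    days_back = (now.weekday() - 1) % 7   # Python: Mon=0, Tue=1, ...
    if days_back == 0:                    # running ON a Tuesday: go a full week back
        days_back = 7
    tue = now - timedelta(days=days_back)
    return tue.replace(hour=0, minute=0, second=0, microsecond=0)

# Monday 10:00 -> the Tuesday six days earlier at 00:00 (the old 154-hour window)
start = window_start(datetime(2020, 8, 10, 10, 0))  # 2020-08-10 is a Monday
print(start)  # 2020-08-04 00:00:00
```

Run at any other time on any day, it still snaps back to the prior Tuesday midnight, which is the behavior being asked for.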

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



SQL code:
select dateadd(hour, 12, --goto noon
	convert(datetime, convert(date, --strip time off
		dateadd(weekday, 3 - datepart(weekday, getdate()), getdate())))) --find Tuesday

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
Are there any query builder tools for DB2/AS400 that also display table/column remarks during autocompletion and in dropdown/combobox selectors? We have Showcase Strategy Query at work, but that app is from the stone age. It was bought up ages ago and turned into a terrible web app. We're looking for an alternative, but that sort of feature seems to have gone missing, because people eventually started to slowly move to databases where you don't need to mangle names to fit within a few letters.

abelwingnut
Dec 23, 2002


so it seems merge is an extremely buggy command in sql server 2012 and it's best to avoid it. is that the general thought among you all as well? it could be so helpful but, man, it's just not updating like 20k of 200k rows for...i can't find a reason.

e: lol, now i'm doing it the update/insert classic way and it's still not working :psyduck:

e2: the only solution was deleting rows i wanted to update then inserting them. what the hell--never seen that.

abelwingnut fucked around with this message at 20:47 on Aug 6, 2020

Macichne Leainig
Jul 26, 2012

by VG


Latest headache from my offshore team. This is supposed to be an app that can be used on mobile.

Oh yeah, and none of the tables in the subqueries have a FK relationship to the main table. Or indexes. :thunk:

Please give me the sweet release of death.

Macichne Leainig fucked around with this message at 19:03 on Aug 6, 2020

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


abelwingnut posted:

so it seems merge is an extremely buggy command in sql server 2012 and it's best to avoid it. is that the general thought among you all as well? it could be so helpful but, man, it's just not updating like 20k of 200k rows for...i can't find a reason.

e: lol, now i'm doing it the update/insert classic way and it's still not working :psyduck:

e2: the only solution was deleting rows i wanted to update then inserting them. what the hell--never seen that.

You sure you aren’t doing something like an inner join on a nullable column without handling the nulls? Very easy to accidentally do that and discard the records with a null on one side.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Also merge is fine and useful when you need to use the output clause not only for inserted/deleted but also the merge source

But otherwise I find it to be the least sweet syntactic sugar around

TheFluff
Dec 13, 2006

FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE
Merge had some weird locking issues in old versions of SQL Server IIRC (can't remember how old though) but I think it's fine now. It has a bunch of caveats but sometimes it's exactly the right tool for the job.

abelwingnut
Dec 23, 2002


Ruggan posted:

You sure you aren’t doing something like an inner join on a nullable column without handling the nulls? Very easy to accidentally do that and discard the records with a null on one side.

the join was pk to pk, so it couldn't be that. i have no idea how it wasn't working properly. even the classic 'update, then insert' method wasn't working. i had to full on delete the records that needed updating, then just insert them along with the new records.

the query itself is a 4k behemoth with like 50 temp tables and some other craziness going on, so maybe the server's running out of memory or processing power or...i don't know. it takes 30min to run, so it isn't too insane...or so i thought.

it needs to be refactored down to something more manageable, yes. that's phase 2.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
I became aware recently that in SQL Server you can use a VALUES table-expression ("Table Value Constructor") to specify a collection of values without having to create and populate a temporary table.

Suppose I have a similar use case to this but I have to use Oracle. What's the best I can do? I do not want to assume I am free to create temporary tables, either global or private. Assume I am going to left outer join from the pseudo-table to a real table, so a WHERE X IN (...) condition won't do.
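For concreteness, the construct in question looks like the following; as it happens SQLite also accepts a VALUES table constructor in FROM, so this sketch (made-up table `t`) is runnable there, with the caveat that SQLite auto-names the columns `column1`, `column2`, ... where T-SQL lets you write `AS v(x)`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 'a'), (3, 'c')])

# Left outer join FROM an inline VALUES list TO a real table -- the case a
# plain WHERE id IN (...) cannot express, since unmatched values must survive.
rows = conn.execute("""
    SELECT v.column1, t.name
    FROM (VALUES (1), (2), (3)) AS v
    LEFT JOIN t ON t.id = v.column1
    ORDER BY v.column1
""").fetchall()
print(rows)  # [(1, 'a'), (2, None), (3, 'c')]
```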

Moonwolf
Jun 29, 2004

Flee from th' terrifyin' evil of "NHS"!


Hammerite posted:

I became aware recently that in SQL Server you can use a VALUES table-expression ("Table Value Constructor") to specify a collection of values without having to create and populate a temporary table.

Suppose I have a similar use case to this but I have to use Oracle. What's the best I can do? I do not want to assume I am free to create temporary tables, either global or private. Assume I am going to left outer join from the pseudo-table to a real table, so a WHERE X IN (...) condition won't do.

There isn't, as far as I know, anything that nice in Oracle yet. If the set of pseudo-table data is fixed, but you can't store it, you could put it in a WITH clause/CTE by doing UNION ALL and selecting them from Dual. Handily stolen example:
code:
WITH t1 AS
  (SELECT 1 SUBYPE, 123 DATAID, 0 ORIGNALDATAID, '01-Aug-2010' DTE FROM dual
  UNION ALL
  SELECT 1, 145, 10, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 1, 12345, 475, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 1, 789, 0, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 2, 789456, 0, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 2, 789456, 0, '02-Aug-2010' FROM dual)
SELECT dataid,ORIGNALDATAID,decode(ORIGNALDATAID,0,dataid,ORIGNALDATAID)req FROM t1 where subype=1 and dte='01-Aug-2010'

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Moonwolf posted:

There isn't, as far as I know, anything that nice in Oracle yet. If the set of pseudo-table data is fixed, but you can't store it, you could put it in a WITH clause/CTE by doing UNION ALL and selecting them from Dual. Handily stolen example:
code:
WITH t1 AS
  (SELECT 1 SUBYPE, 123 DATAID, 0 ORIGNALDATAID, '01-Aug-2010' DTE FROM dual
  UNION ALL
  SELECT 1, 145, 10, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 1, 12345, 475, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 1, 789, 0, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 2, 789456, 0, '01-Aug-2010' FROM dual
  UNION ALL
  SELECT 2, 789456, 0, '02-Aug-2010' FROM dual)
SELECT dataid,ORIGNALDATAID,decode(ORIGNALDATAID,0,dataid,ORIGNALDATAID)req FROM t1 where subype=1 and dte='01-Aug-2010'

I thought that that's all that was available. Thanks for confirming.

prom candy
Dec 16, 2005

Only I may dance
I'm not sure if this is the right thread for this but I can't think of where else it would go. I'm building an advanced filter type tool, similar to ebay where you can add a bunch of different filters to narrow down your results. I'm being asked if we can do something like this:



where before applying each filter you get a preview of how many results it'll return. The only way I can think to do this is by basically pre-running each of the filter options as a COUNT query every single time the filters change, but that seems like it would be really slow. Is there a better way to do something like that or is this the kind of thing where we'd want a totally separate database layer if we want to access our data in this way?

ChickenWing
Jul 22, 2010

:v:

prom candy posted:

I'm not sure if this is the right thread for this but I can't think of where else it would go. I'm building an advanced filter type tool, similar to ebay where you can add a bunch of different filters to narrow down your results. I'm being asked if we can do something like this:



where before applying each filter you get a preview of how many results it'll return. The only way I can think to do this is by basically pre-running each of the filter options as a COUNT query every single time the filters change, but that seems like it would be really slow. Is there a better way to do something like that or is this the kind of thing where we'd want a totally separate database layer if we want to access our data in this way?

SELECT console, COUNT(*) FROM products GROUP BY console

prom candy
Dec 16, 2005

Only I may dance
Alright I feel kinda stupid now having seen that... I have a lot more categories than just console, but I guess these count queries shouldn't be too slow? Plus it's going to be counting from a subset of the records that have the conditions already applied... not sure if that would make things faster or slower.

prom candy fucked around with this message at 19:57 on Aug 11, 2020

ChickenWing
Jul 22, 2010

:v:

prom candy posted:

Alright I feel kinda stupid now having seen that... I have a lot more categories than just console, but I guess these count queries shouldn't be too slow? Plus it's going to be counting from a subset of the records that have the conditions already applied... not sure if that would make things faster or slower.

I'm pretty sure it's just a general developer thing of not being able to see the simple solutions once you're in the weeds, I do the same thing all the time.

Whatever aggregation stage you're pulling your filter options out at is ideally where you'd do the counts, because the queries can almost always pull double-duty.

e: for example, our app has a similar function to what you're implementing here. We use java+hibernate, so our flow looks like:

{{base filtered query}}

select filter_options_1 , count(*) from {{base filtered query}} group by filter_options_1

select filter_options_2 , count(*) from {{base filtered query}} group by filter_options_2

etc...

return select {{base filtered_query}} + filter options 1 + filter options 2

ChickenWing fucked around with this message at 20:49 on Aug 11, 2020
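The flow described above, sketched end-to-end in SQLite via Python (product and facet names invented): one filtered base query, then one GROUP BY count per facet over that same base, so the preview numbers always agree with what applying the filter would return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (name TEXT, console TEXT, genre TEXT, price REAL);
INSERT INTO products VALUES
  ('Halo',  'Xbox',   'Shooter',    20),
  ('Gears', 'Xbox',   'Shooter',    15),
  ('Mario', 'Switch', 'Platformer', 40),
  ('Zelda', 'Switch', 'Adventure',  60);
""")

# The "base filtered query": whatever filters the user has already applied.
base = "SELECT * FROM products WHERE price < ?"

# One cheap COUNT query per remaining facet, each over the same filtered base.
previews = {}
for facet in ("console", "genre"):
    previews[facet] = dict(conn.execute(
        f"SELECT {facet}, COUNT(*) FROM ({base}) GROUP BY {facet}", (50,)
    ).fetchall())

print(previews)
# {'console': {'Switch': 1, 'Xbox': 2}, 'genre': {'Platformer': 1, 'Shooter': 2}}
```

With an index per facet column these stay fast well past the point where running them on every filter change feels scary.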

Withnail
Feb 11, 2004
Is there any cool new way to write hierarchical queries in the latest versions of SQL server (ala oracle 'connect by')? Or do you still need to use CTE / TSQL etc?

Just-In-Timeberlake
Aug 18, 2003
Can anybody suggest a solution for a query running slowly in a stored procedure but fast in a query window? It takes ~7 seconds in a query window and 1-2 minutes in a stored procedure.

I've tried the following:

  • parameter sniffing: created local variables and assigned the two date values passed in to them, no change in performance
  • OPTION (RECOMPILE) - no change in performance
  • OPTION (OPTIMIZE FOR (@startDate UNKNOWN, @endDate UNKNOWN)) - no change in performance
  • Made sure SET ANSI_NULLS is set to ON

Those are the major things I know to check for and at least one of them has solved the problem in the past, but not this time.

MSSQL 2014 is the version I'm using.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Withnail posted:

Is there any cool new way to write hierarchical queries in the latest versions of SQL server (ala oracle 'connect by')? Or do you still need to use CTE / TSQL etc?

Still need to use a recursive CTE pretty sure

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Just-In-Timeberlake posted:

Can anybody suggest a solution for a query running slowly in a stored procedure but fast in a query window? It takes ~7 seconds in a query window and 1-2 minutes in a stored procedure.

I've tried the following:

  • parameter sniffing: created local variables and assigned the two date values passed in to them, no change in performance
  • OPTION (RECOMPILE) - no change in performance
  • OPTION (OPTIMIZE FOR (@startDate UNKNOWN, @endDate UNKNOWN)) - no change in performance
  • Made sure SET ANSI_NULLS is set to ON

Those are the major things I know to check for and at least one of them has solved the problem in the past, but not this time.

MSSQL 2014 is the version I'm using.

Implicit database? If you’re running queries with your database set to master and master is set to a different compatibility level than your database, then running the query from master and running it from your database could produce different query plans.

Any cross server or cross database joins?

Anything like resource governor installed?

Can you preview the query plans produced using estimate query plan and see a specific difference to narrow down what might be causing the issue?

Just-In-Timeberlake
Aug 18, 2003
Running the actual execution plan for both shows this, which I'm sure is the problem, but I'm not sure how to fix it:

In the query window execution plan, there is an index seek, but in the stored procedure execution plan it becomes an index scan

edit: I found the (FORCESEEK) hint and it works as it should now.

Just-In-Timeberlake fucked around with this message at 15:20 on Aug 18, 2020

Withnail
Feb 11, 2004
I'm a noob with CTEs but if I have the following query on a self-referencing table which returns all the employees for manager id 16

code:
WITH employees
     AS (SELECT *,
                Cast(0 AS VARBINARY(max)) AS level
         FROM   EmployeeTable
         WHERE  EmployeeId = 16
         UNION ALL
         SELECT a.*,
                level + Cast(a.EmployeeId AS VARBINARY(max)) AS level
         FROM   EmployeeTable a
                INNER JOIN employees p
                        ON a.ManagerId = p.EmployeeId)
SELECT *
FROM   employees
ORDER  BY level
but instead of just the hierarchy for manager id 16, I want to union multiple hierarchies from multiple managers (managerId is null) for a given area (area id = 5)

Do I use multiple CTEs?
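You generally don't need multiple CTEs for this: the anchor member can seed all the roots at once, and the recursive member walks every tree in one pass. A runnable sketch in SQLite via Python (hypothetical AreaId column and toy data per the post; SQLite builds the ordering path as text where the T-SQL above uses VARBINARY):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeTable (EmployeeId INTEGER, ManagerId INTEGER, AreaId INTEGER);
INSERT INTO EmployeeTable VALUES
  (1, NULL, 5), (2, NULL, 5),        -- two top-level managers in area 5
  (3, 1, 5), (4, 1, 5), (5, 2, 5),   -- their reports
  (6, NULL, 7), (7, 6, 7);           -- a different area, excluded
""")

rows = conn.execute("""
    WITH RECURSIVE employees AS (
        SELECT EmployeeId, ManagerId,
               printf('%04d', EmployeeId) AS level     -- path used for ordering
        FROM EmployeeTable
        WHERE ManagerId IS NULL AND AreaId = 5         -- seed ALL roots at once
        UNION ALL
        SELECT a.EmployeeId, a.ManagerId,
               p.level || '/' || printf('%04d', a.EmployeeId)
        FROM EmployeeTable a
        JOIN employees p ON a.ManagerId = p.EmployeeId
    )
    SELECT EmployeeId FROM employees ORDER BY level
""").fetchall()
print([r[0] for r in rows])  # [1, 3, 4, 2, 5] -- each manager's subtree grouped
```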



edit. I was able to solve this by changing how the level is calculated, putting the CTE in a while loop and dumping the results in a temp table

I suck at sql, I hate writing tsql

Withnail fucked around with this message at 01:29 on Aug 20, 2020

Chin Strap
Nov 24, 2002

I failed my TFLC Toxx, but I no longer need a double chin strap :buddy:
Pillbug
Not an SQL question so much as a schema question. I'm working with a group that has a bunch of different datasets that go through data prep and end up with roughly similar schemas. For reasons, though, things aren't always equivalent enough that you could just paste the sets together into one table.

Things like column A means one thing for set A and something slightly different everywhere else.

I'm proposing creating a unified schema with additional columns highlighting the exceptions, like a separate column AExceptionReason that would be enum valued and encode the differences.

Is there some name for this kind of thing? All my searches come up with data cleaning references and that's not what I want because the data is about as clean as it is going to get. This is more about allowing the data to live together with all of its differences. Currently the documentation of differences all live in outside documentation that is impossible to deal with programmatically.

abelwingnut
Dec 23, 2002


i'm not really sure how to say this, but hopefully this works.

is there any way to run through a script line-by-line? like in a debug mode? i've seen this in other languages, where you can just say execute this line, and it'll show you how variables change with each line. anything like that without having to manually insert and remove selects with each line?

Sir Bobert Fishbone
Jan 16, 2006

Beebort

abelwingnut posted:

i'm not really sure how to say this, but hopefully this works.

is there any way to run through a script line-by-line? like in a debug mode? i've seen this in other languages, where you can just say execute this line, and it'll show you how variables change with each line. anything like that without having to manually insert and remove selects with each line?

If you're using MSSQL, there's likely a Debug button next to Execute on the top bar of SSMS. I step through almost every stored procedure I write to make sure it's not going to tank something.

Moonwolf
Jun 29, 2004

Flee from th' terrifyin' evil of "NHS"!


abelwingnut posted:

i'm not really sure how to say this, but hopefully this works.

is there any way to run through a script line-by-line? like in a debug mode? i've seen this in other languages, where you can just say execute this line, and it'll show you how variables change with each line. anything like that without having to manually insert and remove selects with each line?

This only makes sense within the context of your database's procedural language extensions (TSQL, PL/SQL, PL/pgSQL, etc). Within those, there should be a debugger like Sir Fishbone says, probably integrated into whatever IDE you use for it, if it's a better-supported one.

SQL proper is declarative; you can't step through it, because the entire statement is parsed and applied as one thing. There you're best off commenting out joins and reading the explain plans while you work out why it's not doing what you expect.

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Yep. What you write is not what is run. It is compiled into code that is run and there is no way to step through SQL line by line.

For example, with a simple select of a + b plus a join, your code might get optimized into a plan with two index seeks, a preemptive sort on one of the tables, a merge join, and a scalar operator for the calculation. There isn’t a way to visualize your sets at each step or correlate each step to a specific line or set of lines of declarative SQL.

If your code contains multiple statements, for example a multi-part SP, you can run each separate statement individually sure.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

I guess a true SQL debug mode would look like EXPLAIN ANALYZE on steroids, with the engine printing the intermediate results between each step of the execution plan, and serialising parallel steps.

It would be pretty cool to have such a thing, but as noted, the execution plan you'd be stepping through would look nothing like your query, to the point that even a "source map" approach wouldn't be feasible.


Star War Sex Parrot
Oct 2, 2003

NihilCredo posted:

I guess a true SQL debug mode would look like EXPLAIN ANALYZE on steroids, with the engine printing the intermediate results between each step of the execution plan, and serialising parallel steps.

It would be pretty cool to have such a thing, but as noted, the execution plan you'd be stepping through would look nothing like your query, to the point that even a "source map" approach wouldn't be feasible.
That's cool because the academic system we're building aims for that. We compile physical plans to our DSL (a la MemSQL) and we can then either compile that using LLVM (a la HyPer 10 years ago, Postgres is adopting it for some queries now) to machine-native code for maximum performance, or we have an interpreter for that DSL for debugging. We're still working on getting the interpreter set up with a GDB stub, but in the meantime having the DSL alone is a nice way to see what operations are actually being performed at runtime. As usual the HyPer guys already built something similar for their new system.

Star War Sex Parrot fucked around with this message at 17:26 on Aug 28, 2020
