door.jar
Mar 17, 2010
Thanks, that explanation of pivots makes things clearer. Your final example would work, except the database is big, badly indexed, and I'm querying against a JOIN nightmare.

I sat down with the DBA today. He initially cringed at my procedure, but relaxed after we discovered that 94% of the work is in the initial join, which can't be avoided. The cost should drop once he gets it indexed properly.

I think the real issue is that this database was never meant to be queried the way we're querying it. Not sure what the point of it was in that case, but there we are.


Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine
So I may have finally gotten the go-ahead to switch from MyISAM to InnoDB (current MySQL version: 5.0.77). I know I now get to play with transactions and foreign key constraints, but is there anything else I should know? I know InnoDB is slower with SELECT COUNT(*); is there a better way to handle queries that just want to see if a row exists, or is the speed difference trivial?

Edit: And, any gotchas that might prevent me from doing an easy conversion? So far as I know we don't use any of the myisam* programs.

Golbez fucked around with this message at 15:48 on Dec 9, 2011

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Golbez posted:

I know InnoDB is slower with SELECT COUNT(*); is there a better way to handle queries that just want to see if a row exists, or is the speed difference trivial?

I'm not sure what you're asking. InnoDB is supposed to be slower if you ask it to COUNT() like all the rows in a table. I don't think it will make a difference if you're asking it to COUNT a relatively small set of rows using an index. If you want to check whether a row exists you could just SELECT the primary key LIMIT 1 or you could SELECT EXISTS(...)

I can't think of any gotchas because it is a long time since I did a big convert to InnoDB, but I'm pretty sure there must have been some.
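A quick sqlite3 sketch of both existence checks (table and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")

# Pattern 1: SELECT the primary key with LIMIT 1 -- stops at the first hit
hit = conn.execute(
    "SELECT id FROM users WHERE email = ? LIMIT 1", ("a@example.com",)
).fetchone()
found = hit is not None

# Pattern 2: SELECT EXISTS(...) -- always returns exactly one 0/1 row
missing = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM users WHERE email = ?)", ("b@example.com",)
).fetchone()[0]

print(found, missing)  # True 0
```

Either way the engine can stop scanning at the first matching row, so neither pays the full-table cost of a blind COUNT(*).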

butt dickus
Jul 7, 2007

top ten juiced up coaches
and the top ten juiced up players

Hammerite posted:

I'm not sure what you're asking. InnoDB is supposed to be slower if you ask it to COUNT() like all the rows in a table. I don't think it will make a difference if you're asking it to COUNT a relatively small set of rows using an index. If you want to check whether a row exists you could just SELECT the primary key LIMIT 1 or you could SELECT EXISTS(...)

I can't think of any gotchas because it is a long time since I did a big convert to InnoDB, but I'm pretty sure there must have been some.

The default InnoDB setup puts everything in one shared tablespace file. You should enable the file-per-table setting so each table gets its own file instead of growing one huge file.

Fnappy
Sep 24, 2009
Hopefully someone can help me here because I have no idea why this isn't working. I'm trying to pull a mailing target list and make it exclude a number of domains. When I pull it, it still includes all the domains I specifically told it I DON'T WANT!

Maybe someone can direct me to what I'm doing wrong.
______________________________________________________________

Select *
FROM Mailing_Table
WHERE EmailStatus='VALID' AND OptStatus= 'N'
and (email not like '%@gmail.com' or email not like '%@googlemail.com' or email not like '%@hotmail.com' or email not like '%@live.com' or email not like '%@msn.com' or email not like '%@hotmail.fr' or email not like '%@hotmail.co.jp' or email not like '%@hotmail.co.uk' or email not like '%@rocketmail.com' or email not like '%@yahoo.ca' or email not like '%@yahoo.com' or email not like '%@yahoo.cl' or email not like '%@yahoo.cn' or email not like '%@yahoo.co.id' or email not like '%@yahoo.co.in' or email not like '%@yahoo.co.jp' or email not like '%@yahoo.co.kr' or email not like '%@yahoo.co.nz' or email not like '%@yahoo.co.th' or email not like '%@yahoo.co.uk' or email not like '%@yahoo.com' or email not like '%@yahoo.com.ar' or email not like '%@yahoo.com.au' or email not like '%@yahoo.com.br' or email not like '%@yahoo.com.cn' or email not like '%@yahoo.com.co' or email not like '%@yahoo.com.dk' or email not like '%@yahoo.com.hk' or email not like '%@yahoo.com.mx' or email not like '%@yahoo.com.my' or email not like '%@yahoo.com.pe' or email not like '%@yahoo.com.ph' or email not like '%@yahoo.com.sg' or email not like '%@yahoo.com.tr' or email not like '%@yahoo.com.tw' or email not like '%@yahoo.com.ve' or email not like '%@yahoo.com.vn' or email not like '%@yahoo.de' or email not like '%@yahoo.dk' or email not like '%@yahoo.es' or email not like '%@yahoo.fr' or email not like '%@yahoo.gr' or email not like '%@yahoo.ie' or email not like '%@yahoo.in' or email not like '%@yahoo.it' or email not like '%@yahoo.no' or email not like '%@yahoo.pl' or email not like '%@yahoo.se' or email not like '%@yahoomail.com' or email not like '%@ymail.com')
____________________________________________________________________

Thanks in advance

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Fnappy posted:

Hopefully someone can help me here because I have no idea why this isn't working. I'm trying to pull a mailing target list and make it exclude a number of domains. When I pull it, it still includes all the domains I specifically told it I DON'T WANT!

Maybe someone can direct me to what I'm doing wrong.
______________________________________________________________

Select *
FROM Mailing_Table
WHERE EmailStatus='VALID' AND OptStatus= 'N'
and (email not like '%@gmail.com' or email not like '%@googlemail.com' or email not like '%@hotmail.com' or email not like '%@live.com' or email not like '%@msn.com' or email not like '%@hotmail.fr' or email not like '%@hotmail.co.jp' or email not like '%@hotmail.co.uk' or email not like '%@rocketmail.com' or email not like '%@yahoo.ca' or email not like '%@yahoo.com' or email not like '%@yahoo.cl' or email not like '%@yahoo.cn' or email not like '%@yahoo.co.id' or email not like '%@yahoo.co.in' or email not like '%@yahoo.co.jp' or email not like '%@yahoo.co.kr' or email not like '%@yahoo.co.nz' or email not like '%@yahoo.co.th' or email not like '%@yahoo.co.uk' or email not like '%@yahoo.com' or email not like '%@yahoo.com.ar' or email not like '%@yahoo.com.au' or email not like '%@yahoo.com.br' or email not like '%@yahoo.com.cn' or email not like '%@yahoo.com.co' or email not like '%@yahoo.com.dk' or email not like '%@yahoo.com.hk' or email not like '%@yahoo.com.mx' or email not like '%@yahoo.com.my' or email not like '%@yahoo.com.pe' or email not like '%@yahoo.com.ph' or email not like '%@yahoo.com.sg' or email not like '%@yahoo.com.tr' or email not like '%@yahoo.com.tw' or email not like '%@yahoo.com.ve' or email not like '%@yahoo.com.vn' or email not like '%@yahoo.de' or email not like '%@yahoo.dk' or email not like '%@yahoo.es' or email not like '%@yahoo.fr' or email not like '%@yahoo.gr' or email not like '%@yahoo.ie' or email not like '%@yahoo.in' or email not like '%@yahoo.it' or email not like '%@yahoo.no' or email not like '%@yahoo.pl' or email not like '%@yahoo.se' or email not like '%@yahoomail.com' or email not like '%@ymail.com')
____________________________________________________________________

Thanks in advance

That condition will always evaluate to true. You are using OR and you intended to use AND. Also there have got to be many better ways of doing it than that, unless this is just a one-off thing.

Golbez
Oct 9, 2002

1 2 3!
If you want to take a shot at me get in line, line
1 2 3!
Baby, I've had all my shots and I'm fine

Fnappy posted:

Hopefully someone can help me here because I have no idea why this isn't working. I'm trying to pull a mailing target list and make it exclude a number of domains. When I pull it, it still includes all the domains I specifically told it I DON'T WANT!

Maybe someone can direct me to what I'm doing wrong.
______________________________________________________________

Select *
FROM Mailing_Table
WHERE EmailStatus='VALID' AND OptStatus= 'N'
and (email not like '%@gmail.com' or email not like '%@googlemail.com' or email not like '%@hotmail.com' or email not like '%@live.com' or email not like '%@msn.com' or email not like '%@hotmail.fr' or email not like '%@hotmail.co.jp' or email not like '%@hotmail.co.uk' or email not like '%@rocketmail.com' or email not like '%@yahoo.ca' or email not like '%@yahoo.com' or email not like '%@yahoo.cl' or email not like '%@yahoo.cn' or email not like '%@yahoo.co.id' or email not like '%@yahoo.co.in' or email not like '%@yahoo.co.jp' or email not like '%@yahoo.co.kr' or email not like '%@yahoo.co.nz' or email not like '%@yahoo.co.th' or email not like '%@yahoo.co.uk' or email not like '%@yahoo.com' or email not like '%@yahoo.com.ar' or email not like '%@yahoo.com.au' or email not like '%@yahoo.com.br' or email not like '%@yahoo.com.cn' or email not like '%@yahoo.com.co' or email not like '%@yahoo.com.dk' or email not like '%@yahoo.com.hk' or email not like '%@yahoo.com.mx' or email not like '%@yahoo.com.my' or email not like '%@yahoo.com.pe' or email not like '%@yahoo.com.ph' or email not like '%@yahoo.com.sg' or email not like '%@yahoo.com.tr' or email not like '%@yahoo.com.tw' or email not like '%@yahoo.com.ve' or email not like '%@yahoo.com.vn' or email not like '%@yahoo.de' or email not like '%@yahoo.dk' or email not like '%@yahoo.es' or email not like '%@yahoo.fr' or email not like '%@yahoo.gr' or email not like '%@yahoo.ie' or email not like '%@yahoo.in' or email not like '%@yahoo.it' or email not like '%@yahoo.no' or email not like '%@yahoo.pl' or email not like '%@yahoo.se' or email not like '%@yahoomail.com' or email not like '%@ymail.com')
____________________________________________________________________

Thanks in advance

I think you may want to replace those ORs with ANDs. For example:
Email address is foo@gmail.com
You have ... AND (email NOT LIKE '%@gmail.com' OR email NOT LIKE '%@yahoo.com')

The first one is false: The email is indeed like '%@gmail.com'. So, Booleanified, this becomes:
... AND (FALSE OR TRUE)

Which becomes:
... AND (TRUE)

If you replaced the ORs with ANDs, you would get:
... AND (FALSE AND TRUE)

Which becomes:
... AND (FALSE)

Generally, when dealing with negatives, you want to use AND instead of OR.

Put in simpler terms: say you want to omit the letters A and B, and you write SELECT * WHERE letter != 'A' OR letter != 'B'. This won't work: no letter can equal both A and B at once, so every row makes at least one of the two conditions true, the OR is always true, and you get every row back.
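To see the flip concretely, a tiny sqlite3 sketch (hypothetical table, two domains standing in for fifty):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mailing (email TEXT)")
conn.executemany("INSERT INTO mailing VALUES (?)",
                 [("foo@gmail.com",), ("bar@yahoo.com",), ("baz@corp.example",)])

# OR: no address can match *both* patterns, so at least one NOT LIKE
# is always true and nothing gets filtered out
with_or = conn.execute(
    "SELECT email FROM mailing WHERE email NOT LIKE '%@gmail.com'"
    " OR email NOT LIKE '%@yahoo.com'").fetchall()

# AND: only addresses matching neither domain survive
with_and = conn.execute(
    "SELECT email FROM mailing WHERE email NOT LIKE '%@gmail.com'"
    " AND email NOT LIKE '%@yahoo.com'").fetchall()

print(len(with_or), len(with_and))  # 3 1
```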

Golbez fucked around with this message at 17:30 on Dec 9, 2011

Fnappy
Sep 24, 2009

Golbez posted:

I think you may want to replace those ORs with ANDs. For example:
Email address is foo@gmail.com
You have ... AND (email NOT LIKE '%@gmail.com' OR email NOT LIKE '%@yahoo.com')

The first one is false: The email is indeed like '%@gmail.com'. So, Booleanified, this becomes:
... AND (FALSE OR TRUE)

Which becomes:
... AND (TRUE)

If you replaced the ORs with ANDs, you would get:
... AND (FALSE AND TRUE)

Which becomes:
... AND (FALSE)

Generally, when dealing with negatives, you want to use AND instead of OR.

Put in simpler terms: say you want to omit the letters A and B, and you write SELECT * WHERE letter != 'A' OR letter != 'B'. This won't work: no letter can equal both A and B at once, so every row makes at least one of the two conditions true, the OR is always true, and you get every row back.

Thanks a bunch, I knew I was overlooking something stupid. I changed the ORs to ANDs and everything is working swimmingly now.

McGlockenshire
Dec 16, 2005

GOLLOCKS!

Doctor rear end in a top hat posted:

The default InnoDB setup puts everything in one shared tablespace file. You should enable the file-per-table setting so each table gets its own file instead of growing one huge file.

Do note that just turning on the setting does nothing for existing tables. You have to rebuild them after turning it on, which implies a complete dump & reload. Also, this'll do nothing to speed up blind COUNT(*). The only advantage of file-per-table is that you only have to worry about tablespace bloat on a per-table basis. (InnoDB cannot shrink .ibd files.)

McGlockenshire fucked around with this message at 19:23 on Dec 9, 2011

Frozen Peach
Aug 25, 2004

garbage man from a garbage can
I don't think the answer to this is going to be a yes, but it's worth a shot anyways.

We have a field in our database that looks like this:

code:
[1F]aSome Text[1F]bEven More Text
[1F]aOther Text[1F]bSome More Text
The [1F] is the hex character 1F and the 'a' and 'b' can be basically anything. I want to be able to select on data between [1F]a and [1F]b in this example. I started out by trying this:

code:
SELECT field FROM database WHERE field LIKE '%[1F]a%Some%'
Unfortunately that selects both rows, when I only want the first row. I then changed it to this:

code:
SELECT field FROM database WHERE field LIKE '%[1F]a%Some%[1F]'
but that doesn't work if I want to find the word "b%More" rather than "a%Some", which got me to change it to this:

code:
SELECT field FROM database WHERE field + '[1F]' LIKE '%[1F]a%Some%[1F]'
Which gets me back to where I started, because LIKE is too greedy and finds the last instance of [1F] rather than the next instance of it. Is there anything I can do to make this query work like I intend? I hope this isn't as confusing as I think it might be. The database I'm trying to search on is awful and there isn't anything I can do to make it better without breaking tons of legacy apps.


Edit: This is all in MSSQL 2000, if that makes a difference.

Frozen Peach fucked around with this message at 23:23 on Dec 9, 2011

Goat Bastard
Oct 20, 2004

In the absence of regular expression functions (which MSSQL 2000 doesn't seem to have), try this abomination.

The substring mess should get the substring of your column between the first instance of [1F]open_char and [1F]close_char where open_char and close_char are your "basically anything" (of any length).

If no close_char is specified then it takes the substring from [1F]open_char to the end of the string.

The substring is then compared with your search parameter.

code:
SELECT field
FROM   database
WHERE  substring(field
                ,charindex(field, chr(1F) || :open_char) + length(chr(1F) || :open_char)
                ,CASE 
                   WHEN :close_char IS NOT NULL THEN 
                     charindex(field,chr(1F) || :close_char) - charindex(field, chr(1F) || :open_char) - length(chr(1F) || :open_char)
                   ELSE
                     length(field)
                 END
                ) LIKE '%' || :search_text || '%'
;
Haven't run it in SQL Server, it seems to work in Oracle with substr and instr instead of substring and charindex but YMMV.

Also whoever is responsible for that column that is a bad person and I hope they are very sad about the life choices that led them to the point where they thought that was a good idea.
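To sanity-check the index arithmetic, here's the same extract-between-markers idea in a short Python sketch (pure stdlib; sample row taken from the post above, helper name invented):

```python
def between(field, open_char, close_char=None, sep="\x1f"):
    """Return the text between sep+open_char and the next sep+close_char.

    If close_char is None, take everything to the end of the string.
    """
    start_marker = sep + open_char
    start = field.index(start_marker) + len(start_marker)
    if close_char is None:
        return field[start:]
    return field[start:field.index(sep + close_char, start)]

row = "\x1faSome Text\x1fbEven More Text"
print(between(row, "a", "b"))  # Some Text
print(between(row, "b"))       # Even More Text
```

The key trick, same as in the SQL: start the search for the closing marker *after* the opening marker's position, which is exactly what a bare LIKE pattern can't express.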

Deep Dish Fuckfest
Sep 6, 2006

Advanced
Computer Touching


Toilet Rascal
Does anyone have a good book on MySQL (or close enough) to suggest? I'm looking for something that's more of a "here's the kind of poo poo you can do with databases" book than a reference work (I can look up the documentation for that). I'd also like a book that deals with practical aspects, like not opening gigantic SQL injection security holes in your app, and good client-side design in general.

For background, I'm asking because I don't have any real training in databases and I've had to write some related code at work this week. I've produced something that works without much trouble, but it's an abomination and most DBAs would probably beat me to death with a fire extinguisher for it. I'd like to prevent that.

Frozen Peach
Aug 25, 2004

garbage man from a garbage can

Goat Bastard posted:

Also whoever is responsible for that column that is a bad person and I hope they are very sad about the life choices that led them to the point where they thought that was a good idea.

The data is actually a MARC record tag (http://www.loc.gov/marc/bibliographic/), which is a horrible flat-file format that uses 1F, 1D, and gently caress if I can remember which other hex codes as delimiters for data. We have a row in the table for each tag, and then a field that contains the raw subfield data for that tag. The original design was meant to search tags for specific text; searching for subfields wasn't intended.

My NEW database structure is a one-to-many-to-many relationship across three separate tables: a row for each book, which points to a row for each of its tags, each of which points to a row for each of its subfields. Sadly, I can't get rid of the old table without breaking tons of legacy code that depends on it. My long-term goal is to dump that table altogether, but that can't happen for a while.

Edit: after a metric fuckton of poking at it, I've finally got it! I'm not used to Oracle, so I first had to realize that || is supposed to be string concatenation, and that charindex and substring take their arguments in different orders in MSSQL (or Oracle is the odd one, hell if I know). Then, since I wasn't clear, I had to modify it to support more than just 2 subfields (so "[1F]aText[1F]bMore Text[1F]cYet more text"). Also, a, b, and c "being anything" meant being any one single character, which made some of the math easier. But I finally got it! Thanks for pointing me in the right direction!

God, is this code a horror. I need to down a bottle after all that :(

code:
SELECT taginfo
FROM   marctags mt
WHERE  tagnumber = '245'
       AND Substring(mt.taginfo,
                     Charindex('[1F]a', mt.taginfo) + 2,
                     Charindex('[1F]',
                               Substring(mt.taginfo,
                                         Charindex('[1F]a', mt.taginfo) + 2,
                                         Len(Cast(mt.taginfo AS VARCHAR(3000)))
                                           - Charindex('[1F]a', mt.taginfo) + 2)
                               + '[1F]') - 1) LIKE '%Harry%'

Frozen Peach fucked around with this message at 03:30 on Dec 10, 2011

Goat Bastard
Oct 20, 2004

Frozen-Solid posted:

Edit: after a metric fuckton of poking at it, I've finally got it! I'm not used to Oracle, so I first had to realize that || is supposed to be string concatenation, and that charindex and substring take their arguments in different orders in MSSQL (or Oracle is the odd one, hell if I know).

Oh, sorry I didn't realise that || was the wrong concatenation operator - I've done most of my work with Oracle and DB2 (which both use it), and I didn't even think to check if it was right for MSSQL. Didn't think about the parameter orders either, I just skimmed the docs and saw that they had a similar number of parameters with similar data types. I guess I should really install an MSSQL instance and actually try running my code if I'm going to give out advice about it :)

Sab669
Sep 24, 2009

I have a spreadsheet with some 400 companies and their website addresses. Then I have a table with about 800 companies, but the table doesn't track websites.

Is there an easy way to add this column to the table AND fill in the websites from the spreadsheet wherever the company names match between the spreadsheet and the table?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Sab669 posted:

I have a spreadsheet with some 400 companies and their website addresses. Then I have a table with about 800 companies, but the table doesn't track websites.

Is there an easy way to add this column to the table AND fill in the websites from the spreadsheet wherever the company names match between the spreadsheet and the table?

Do you have some way of querying the spreadsheet as a table (e.g. MS Access and Excel co-operating somehow)? It's hard to answer this without knowing how your database and spreadsheet can communicate with one another, if at all.

If your spreadsheet and database are totally separate entities, you could export the spreadsheet as a CSV file, load it into the RDBMS as a temporary table, add the new column to the original table, then run an UPDATE query to add the websites where they exist. It would look something like

code:
UPDATE
    MyOriginalTable
    JOIN MyTemporaryTable ON MyOriginalTable.CompanyName = MyTemporaryTable.CompanyName
SET
    MyOriginalTable.Website = MyTemporaryTable.Website
WHERE
    MyTemporaryTable.Website IS NOT NULL AND
    MyTemporaryTable.Website != ''
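If the target RDBMS doesn't support MySQL's multi-table UPDATE syntax, a correlated subquery gets the same effect. A sqlite3 sketch with invented table and company names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (name TEXT PRIMARY KEY, website TEXT);
    CREATE TABLE imported (name TEXT, website TEXT);
    INSERT INTO companies VALUES ('Acme', NULL), ('Globex', NULL);
    INSERT INTO imported VALUES ('Acme', 'acme.example'),
                                ('Initech', 'initech.example');
""")

# Correlated-subquery form of the UPDATE ... JOIN: only touch rows
# that actually have a non-empty match in the imported data
conn.execute("""
    UPDATE companies
    SET website = (SELECT i.website FROM imported i WHERE i.name = companies.name)
    WHERE EXISTS (
        SELECT 1 FROM imported i
        WHERE i.name = companies.name
          AND i.website IS NOT NULL AND i.website != ''
    )
""")

rows = dict(conn.execute("SELECT name, website FROM companies"))
print(rows)  # {'Acme': 'acme.example', 'Globex': None}
```

The WHERE EXISTS guard matters: without it, every unmatched company would have its website overwritten with NULL.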

Sab669
Sep 24, 2009

Yea, I can just dump the spreadsheet into Access as a temporary table as you suggested, thanks.

Sab669 fucked around with this message at 17:27 on Dec 13, 2011

bone app the teeth
May 14, 2008

So our lead dev decided to upgrade to Postgres last night on a whim and now some of my queries are broken. Before with MySQL I was using:

code:
SELECT * 
FROM books
LEFT JOIN imports ON books.import_id = imports.id
ORDER BY import.is_active DESC
GROUP BY books.isbn_13
Which would give me a unique ISBN record per title, with priority given to the active import. Now with Postgres that is failing. I've tried SELECT DISTINCT ON (isbn_13) but I cannot for the life of me get it working with Postgres. Any help?

Zombywuf
Mar 29, 2008

flare posted:

So our lead dev decided to upgrade to Postgres last night on a whim and now some of my queries are broken. Before with MySQL I was using:

code:
SELECT * 
FROM books
LEFT JOIN imports ON books.import_id = imports.id
ORDER BY import.is_active DESC
GROUP BY books.isbn_13
Which would give me a unique ISBN record per title, with priority given to the active import. Now with Postgres that is failing. I've tried SELECT DISTINCT ON (isbn_13) but I cannot for the life of me get it working with Postgres. Any help?

Most sql databases won't let you do that. You'd need to do something like this:
code:
select
   *
from
   books
   cross apply (select * from imports where books.import_id = imports.import_id order by is_active desc limit 1)
I think that will work in Postgres but I've not tried it.

bone app the teeth
May 14, 2008

Zombywuf posted:

Most sql databases won't let you do that. You'd need to do something like this:
code:
select
   *
from
   books
   cross apply (select * from imports where books.import_id = imports.import_id order by is_active desc limit 1)
I think that will work in Postgres but I've not tried it.

I actually achieved this by just using a subquery.

code:
SELECT DISTINCT ON (isbn_13) * from (old_query) old
Works perfectly.
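A portable alternative, since DISTINCT ON is Postgres-only: rank the rows per group with ROW_NUMBER() and keep rank 1. A sqlite3 sketch with invented data (the real query's SELECT * would need explicit columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (isbn_13 TEXT, import_id INTEGER);
    CREATE TABLE imports (id INTEGER, is_active INTEGER);
    INSERT INTO books VALUES ('978-1', 1), ('978-1', 2), ('978-2', 2);
    INSERT INTO imports VALUES (1, 0), (2, 1);
""")

# One row per ISBN, preferring the active import -- the window-function
# equivalent of DISTINCT ON (isbn_13) ... ORDER BY is_active DESC
rows = conn.execute("""
    SELECT isbn_13, import_id FROM (
        SELECT b.isbn_13, b.import_id,
               ROW_NUMBER() OVER (PARTITION BY b.isbn_13
                                  ORDER BY i.is_active DESC) AS rn
        FROM books b
        LEFT JOIN imports i ON b.import_id = i.id
    ) WHERE rn = 1
""").fetchall()

print(sorted(rows))  # [('978-1', 2), ('978-2', 2)]
```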

mobby_6kl
Aug 9, 2009

by Fluffdaddy
Alright so I'm kind of stuck trying to find a nice solution to this in MS-SQL. Hopefully the following and the fake simplified example do a decent job of explaining:

Given a node in a hierarchy, I need to find the first level above it, whose children add up to at least a certain number. The input can be any node in the hierarchy, which is defined by simple ID/Parent_ID fields. There are only about 9 levels max at the moment, but I wouldn't hardcode this in.
code:
Accounting [120+10]
	Accounts Payable[80+40]
		Team1 [30+50]
			Alice - 30
			Bob - 50
		Team2 [20+20]
			Charlie - 20
			Dave - 20
	Accounts Receivable [10]
		Eve - 10
R&D
So, let's say starting with Bob, I want to find the department with at least 100k of expenses, and subsequently everyone who falls below it (so we can fire them :D). In this case, it would be Accounts Payable, and thus Alice, Bob, Charlie, and Dave.

Everything was great until I discovered that it wasn't possible to use aggregate functions in the recursive part of CTEs, so I'm stuck with only building the path [Bob-Team1-Accounts Payable-Accounting] with it, and then summing up only the direct children of those nodes. This sort of works sometimes, but would fail in this example since it wouldn't account for Team2 at all.

Edit: Never mind!
First I hacked up some ugly stuff with a WHILE loop, but then I discovered that I'd been working with a subset of the available data, and each node also has a next-node ID. Therefore, I can just SUM the nodes with IDs between node_id and next_node_id. To roll these sums up the tree, I made a CTE to update the table from the bottom up. The only catch is that I have to hard-code in the total number of nodes, otherwise it won't let me do ORDER BY:
code:
WITH my_cte AS
(
  SELECT TOP 12345 * FROM #data ORDER BY node_id DESC
)
UPDATE my_cte SET total = (SELECT SUM(blah) FROM #data WHERE id BETWEEN q.node_id AND q.next_node_id)
That's not exact code, but it worked for me, and was pretty fast.
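The id/next_node_id trick can be sanity-checked outside the database; a small Python sketch of the rollup and the walk-up-from-Bob search (ids, names, and costs invented to match the fake example):

```python
# IDs are assigned depth-first, so a node's whole subtree is
# exactly the id range [id, next_id).
nodes = {
    # id: (name, parent_id, next_id, own_cost)
    1:  ("Accounting",          None, 11, 0),
    2:  ("Accounts Payable",       1,  9, 0),
    3:  ("Team1",                  2,  6, 0),
    4:  ("Alice",                  3,  5, 30),
    5:  ("Bob",                    3,  6, 50),
    6:  ("Team2",                  2,  9, 0),
    7:  ("Charlie",                6,  8, 20),
    8:  ("Dave",                   6,  9, 20),
    9:  ("Accounts Receivable",    1, 11, 0),
    10: ("Eve",                    9, 11, 10),
}

def subtree_cost(nid):
    """SUM over the id range [nid, next_id) -- the whole subtree."""
    nxt = nodes[nid][2]
    return sum(c for i, (_, _, _, c) in nodes.items() if nid <= i < nxt)

def first_big_ancestor(nid, threshold):
    """Walk up from nid until a subtree sums to at least threshold."""
    while nid is not None and subtree_cost(nid) < threshold:
        nid = nodes[nid][1]
    return nid

dept = first_big_ancestor(5, 100)          # start from Bob (id 5)
members = [nodes[i][0] for i in nodes
           if dept <= i < nodes[dept][2] and nodes[i][3] > 0]
print(nodes[dept][0], members)  # Accounts Payable ['Alice', 'Bob', 'Charlie', 'Dave']
```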

With this kind of solved, are there any ideas about this? My current solution works but each day it stays online I cry in bed: http://forums.somethingawful.com/showthread.php?goto=post&postid=397902155

mobby_6kl fucked around with this message at 21:49 on Dec 14, 2011

Sab669
Sep 24, 2009

I have a problem similar to the one I posted earlier on this page: I have two tables, Contacts and Shared Contacts.

Shared Contacts tracks a few columns Contacts doesn't, and I need to insert those columns/rows into Contacts. Unlike my previous problem, there is no ContactID or anything like that to tie individual users together :stare:

In a brief moment of almost-brilliance, I think "Wait a second- everyone has a unique email! IT'S BRILLIANT!"....but of course, not every row in Shared Contacts has an email entered. So, there's a good 400 people with no email address.

code:
UPDATE [Sales Contacts] 
SET Email = (
SELECT Email 
FROM [Shared Contacts]
WHERE [Sales Contacts].FirstName = [Shared Contacts].FirstName
AND [Sales Contacts].LastName = [Shared Contacts].LastName
AND [Sales Contacts].Email = [Shared Contacts].Email
AND [Shared Contacts].Email IS NOT NULL
AND [Shared Contacts].Email <> ''
);

Subquery returned more than 1 value. 
This is not permitted when the subquery follows =, !=, <, <= , >, >=
 or when the subquery is used as an expression.
The error makes sense, but I can't figure out how to trim it down to 1 result so that it works. And I know logically the query works for other columns/tables where there was unique data in the tables.
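For what it's worth, the usual way out of that error is to force the subquery down to one value per row. MSSQL raises where e.g. sqlite would silently take the first row; collapsing with MIN() makes it deterministic either way. A sqlite3 sketch with made-up tables and data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (first TEXT, last TEXT, email TEXT);
    CREATE TABLE shared (first TEXT, last TEXT, email TEXT);
    INSERT INTO sales VALUES ('Jo', 'Smith', NULL);
    -- two shared rows with the same name: a bare scalar subquery is ambiguous
    INSERT INTO shared VALUES ('Jo', 'Smith', 'jo1@example.com'),
                              ('Jo', 'Smith', 'jo2@example.com');
""")

# MIN() guarantees the subquery yields a single value per name pair
conn.execute("""
    UPDATE sales
    SET email = (SELECT MIN(s.email) FROM shared s
                 WHERE s.first = sales.first AND s.last = sales.last
                   AND s.email IS NOT NULL AND s.email != '')
""")
row = conn.execute("SELECT email FROM sales").fetchone()
print(row[0])  # jo1@example.com
```

Whether "pick an arbitrary matching email" is acceptable is a data question, of course, not a SQL one.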

Sab669 fucked around with this message at 18:20 on Dec 14, 2011

kimbo305
Jun 9, 2007

actually, yeah, I am a little mad
I have a table like this:
code:
t:
    day    | id | metric | d_metric
-----------+----+--------+----------
2011-12-01 |  1 | 10     | 10
2011-12-03 |  1 | 12     | 2
2011-12-04 |  1 | 15     | 3
I'd like to fill in the gaps in days, like this:
code:
t_fill:
    day    | id | metric | d_metric
-----------+----+--------+----------
2011-12-01 |  1 | 10     | 10
2011-12-02 |  1 | 10     | 0 -- a delta of 0
2011-12-03 |  1 | 12     | 2
2011-12-04 |  1 | 15     | 3
I can't think of a one-shot way of doing this. For now, I've come up with a day by day approach:

code:
insert into t_fill
select NVL(c.day, <cur_day>) as day, id, NVL(c.metric, p.metric) as metric, NVL(c.d_metric, 0) as d_metric
from (select distinct id from ids) as a
left join (select * from t_fill where day=<prev_day>) as p
on a.id = a.id
left join (select * from t where day=<cur_day>) as c
on a.id = c.id
where (c.metric is not null or p.metric is not null);
The ids table is used to hold a current, ever-changing snapshot of which objects we care to track over time. Objects can be added or removed. Removals I don't care about, but additions are tricky. If I had an object today that has a 1-year history, this day-by-day approach (which would have to be run each day, obviously) won't patch in past history. I can add in other stuff on top of this approach, but I'd prefer a one-shot query on t and ids that can produce t_fill. Is there some way to do it with LAG() maybe?

Zombywuf
Mar 29, 2008

kimbo305 posted:

The ids table is used to hold a current, ever-changing snapshot of which objects we care to track over time. Objects can be added or removed. Removals I don't care about, but additions are tricky. If I had an object today that has a 1-year history, this day-by-day approach (which would have to be run each day, obviously) won't patch in past history. I can add in other stuff on top of this approach, but I'd prefer a one-shot query on t and ids that can produce t_fill. Is there some way to do it with LAG() maybe?

I would do this with a separate table of dates (static, temporary, or generated with a recursive CTE) and a merge statement with an if-not-matched clause.

kimbo305
Jun 9, 2007

actually, yeah, I am a little mad

Zombywuf posted:

I would do this with a separate table of dates (static, temporary, or generated with a recursive CTE) and a merge statement with an if-not-matched clause.

I could live with a static table. Could you give an example of how to do it?
Someone at work suggested something similar. While 99% of the cases will be a consecutive run of populated dates, followed by silence (which is much easier to generate for), there will be some cases (like that shown in my post) where there will be holes throughout, and I can't figure out how to do that.

Zombywuf
Mar 29, 2008

kimbo305 posted:

I could live with a static table. Could you give an example of how to do it?
Someone at work suggested something similar. While 99% of the cases will be a consecutive run of populated dates, followed by silence (which is much easier to generate for), there will be some cases (like that shown in my post) where there will be holes throughout, and I can't figure out how to do that.

Actually, it doesn't need a merge. If I'm understanding the problem correctly, it should look something like this (SQL Server syntax, YMMV):
code:
insert into
  sourcetable (day, id, metric, d_metric)
select
  c.day,
  m.id,
  m.metric,
  0 -- always 0 when copying the previous day
from
  calendar c
  cross apply (
    select top 1
      id,
      metric
    from
      sourcetable s
    where
      s.day <= c.day
    order by
      s.day desc) as m
where
  not exists (select * from sourcetable s where c.day = s.day);
How well this will optimise depends on many, many things, but I'd expect it to have pretty good perf. The cross-applied subquery selects the row with the latest day that is no later than the day currently being examined.

If you want to adapt this to fill into a new table instead of the original just remove the where clause at the end.

Hope this makes sense.
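A minimal sqlite3 version of that idea, with a calendar table and a latest-earlier-row lookup standing in for the CROSS APPLY ... TOP 1 (data from kimbo305's example, single id):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (day TEXT, id INTEGER, metric INTEGER, d_metric INTEGER);
    INSERT INTO t VALUES ('2011-12-01', 1, 10, 10),
                         ('2011-12-03', 1, 12, 2),
                         ('2011-12-04', 1, 15, 3);
    CREATE TABLE calendar (day TEXT);
    INSERT INTO calendar VALUES ('2011-12-01'), ('2011-12-02'),
                                ('2011-12-03'), ('2011-12-04');
""")

# For each calendar day with no row, copy the latest earlier row
# with a delta of 0
conn.execute("""
    INSERT INTO t (day, id, metric, d_metric)
    SELECT c.day,
           (SELECT id     FROM t WHERE day <= c.day ORDER BY day DESC LIMIT 1),
           (SELECT metric FROM t WHERE day <= c.day ORDER BY day DESC LIMIT 1),
           0
    FROM calendar c
    WHERE NOT EXISTS (SELECT 1 FROM t WHERE t.day = c.day)
""")

rows = conn.execute(
    "SELECT day, metric, d_metric FROM t ORDER BY day").fetchall()
print(rows)
```

Because the lookup is `day <= c.day`, this fills interior holes as well as trailing ones, which was the hard case.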

Sab669
Sep 24, 2009

I'm working on cleaning out all the duplicate data from our database, so I just wrote some long-rear end and possibly improper query that will return about 220 results.

It's a giant Select statement that looks for certain combinations of first and last names. SSMS returns the 220 rows perfectly, but when I try to export those results into Excel it looks horrible, none of the data is in separate columns or anything.

We have an Access front end (not for long, though!) and I copy-pasted the query into Access. "Query is too complex" when I try to run it :stare:

I suppose I can chop it up into a bunch of smaller select statements and export the results individually, but I really don't want to do that. Is there a clean way to export stuff from SSMS, aside from just right click -> export results?

e: Ah, didn't realize I could just click the little top-left corner to select everything in there and copy-paste it like that. Had to type out the column headings, but no big deal.

Sab669 fucked around with this message at 17:45 on Dec 23, 2011

Zombywuf
Mar 29, 2008

Sab669 posted:

I'm working on cleaning out all the duplicate data from our database, so I just wrote some long-rear end and possibly improper query that will return about 220 results.

It's a giant Select statement that looks for certain combinations of first and last names. SSMS returns the 220 rows perfectly, but when I try to export those results into Excel it looks horrible, none of the data is in separate columns or anything.

We have an Access front end (not for long, though!) and I copy-pasted the query into Access. "Query is too complex" when I try to run it :stare:

I suppose I can chop it up into a bunch of smaller select statements and export the results individually, but I really don't want to do that. Is there a clean way to export stuff from SSMS, aside from just right click -> export results?

Copy and paste the results. If it doesn't get split up into columns there's an option somewhere (tools menu?) to convert text to columns. It will be tab separated. Excel seems to decide at random whether to do this automatically.

Fruit Smoothies
Mar 28, 2004

The bat with a ZING
I have a MySQL table like this:

code:
ID	foreignID	quant
--------------------------------
1	1		1
2	1		2
3	2		1
4	2		3
--------------------------------
And I want the SUM(quant) of each, GROUP BY'd by foreignID.

For example, where foreignID = 1 the SUM(quant) should be 3 (1 + 2), and for foreignID = 2 it would be 4 (1 + 3).


Obviously this is a very simplified version of the situation. In the real query (which I can post if necessary), the values for foreignID come from an IN(subquery). In pseudo-query it's sort of like this:

SELECT SUM(quant) from table WHERE foreignID IN (1,2) GROUP BY foreignID

I'd rather not do the calculation in a higher-level language, because the sizes are pretty big (~1,000 rows per foreign key)

Thanks for your help in this busy festive season. I wish I didn't have to do this poo poo right now :(

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Fruit Smoothies posted:

SELECT SUM(quant) from table WHERE foreignID IN (1,2) GROUP BY foreignID

But this should work, shouldn't it? Or does it give results that differ from what you want somehow?

Zombywuf
Mar 29, 2008

Fruit Smoothies posted:

SELECT SUM(quant) from table WHERE foreignID IN (1,2) GROUP BY foreignID

Sounds like you want exactly that, although MySQL does weird stuff with GROUP BY. Try "select foreignID, sum(quant) ...".
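A quick sanity check of that shape (SQLite via Python here rather than MySQL, but GROUP BY behaves the same for this query; the table name `t` is made up, the columns are from the post):

```python
import sqlite3

# In-memory stand-in for the table from the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, foreignID INTEGER, quant INTEGER)")
conn.executemany(
    "INSERT INTO t (ID, foreignID, quant) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 3)],
)

# Selecting the grouping key alongside the aggregate labels each sum.
rows = conn.execute(
    "SELECT foreignID, SUM(quant) FROM t"
    " WHERE foreignID IN (1, 2)"
    " GROUP BY foreignID ORDER BY foreignID"
).fetchall()
print(rows)  # [(1, 3), (2, 4)]
```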

Fruit Smoothies
Mar 28, 2004

The bat with a ZING

Zombywuf posted:

Sounds like you want exactly that, although MySQL does weird stuff with GROUP BY. Try "select foreignID, sum(quant) ...".

Ah.... Well the query has a few JOINs in it. Probably should have mentioned this:

code:
ID	foreignID		diffID	foreignID	quant
------------------------|	--------------------------------|
1	1		|	1	1		1	|
2	1		|	2	1		2	|
3	2		|	3	2		1	|
4	2		|	4	2		3	|
------------------------|	--------------------------------|
Making the query look like

SELECT SUM(t2.quant) FROM t1 JOIN t2 ON (t1.foreignID = t2.foreignID) WHERE t1.ID IN (1,2,3,4)

Sorry, I completely messed up going from the complicated situation to the simplified one. There are another couple of JOINs in there too. :(

Zombywuf
Mar 29, 2008

The group by should still work, just tack it on the end.

Fruit Smoothies
Mar 28, 2004

The bat with a ZING

Zombywuf posted:

The group by should still work, just tack it on the end.

Ahhhh! Fixed it. I grouped by t1.foreignID rather than t2.foreignID!

Thanks.

kimbo305
Jun 9, 2007

actually, yeah, I am a little mad

Zombywuf posted:

Actually, it doesn't need a merge. If I'm understanding the problem correctly, it should look something like this (SQL Server syntax, YMMV):
code:
insert into
  sourcetable (day, id, metric, d_metric)
select
  c.day,
  m.id,
  m.metric,
  0 -- always 0 when copying the previous day
from
  calendar c
  cross apply (
    select top 1
      id,
      metric
    from
      sourcetable s
    where
      s.day <= c.day
    order by
      s.day desc) as m
where
  not exists (select * from sourcetable s where c.day = s.day);
How well this will optimise depends on many, many things, but I'd expect it to have pretty good perf. The cross-applied subquery selects the row with the latest day that is no later than the day currently being examined.

If you want to adapt this to fill a new table instead of the original, just remove the WHERE clause at the end.

Hope this makes sense.
Thanks, I'll give it a try. I'm using Vertica. I don't have a lot of experience, but I've seen NOT EXISTS perform poorly and confuse the query planner. Hopefully the syntax will work and the performance will be adequate.

e: arg, Vertica docs don't show APPLY being a keyword. I'll have to wire this up before I can know if CROSS APPLY is supported.
e2: confirmed -- APPLY doesn't exist in Vertica. Nor does TOP, apparently. I bet I could work around that with analytic functions, but the lack of CROSS APPLY seems to be a dealbreaker. Is there a way to recast it? The CROSS APPLY is such a succinct way to describe it, too :(

kimbo305 fucked around with this message at 20:36 on Dec 23, 2011

Fangs404
Dec 20, 2004

I time bomb.
Got a question about a query for an SQLite database I'm using. The database is described in detail here, but an abbreviated schema which accomplishes the same is here:

code:
CREATE TABLE states (
	id INTEGER NOT NULL,
	state TEXT NOT NULL,
	PRIMARY KEY (id)
);

CREATE TABLE zctas (
	id INTEGER NOT NULL,
	zcta TEXT NOT NULL,
	PRIMARY KEY (id)
);

CREATE TABLE states_zctas (
	state_id INTEGER NOT NULL,
	zcta_id INTEGER NOT NULL,
	PRIMARY KEY (state_id, zcta_id),
	FOREIGN KEY (state_id) REFERENCES states(id),
	FOREIGN KEY (zcta_id) REFERENCES zctas(id)
);
So there's obviously a many-to-many relationship between the states and ZCTAs (you can think of a ZCTA like a ZIP code). In general, most ZCTAs only belong to one state, but there are instances of a few ZCTAs belonging to multiple states (the ZCTA 30165, for example, is located in both Georgia and Alabama).

What I want is a query that gives me only the ZCTAs that belong to multiple states and the states to which they belong. So here's an example of the results I want:

code:
zcta   |  state
------------------
30165  |  Georgia
30165  |  Alabama
98989  |  Texas       --this one is made up
98989  |  Oklahoma
I've done this, and it seems like I'm close, but I'm thinking I'm gonna need nested SELECTs.

code:
SELECT zctas.zcta, states.state
FROM zctas, states
INNER JOIN states_zctas
	ON states_zctas.zcta_id=zctas.id
GROUP BY zctas.zcta;
Any help is greatly appreciated. Thanks!

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Fangs404 posted:

Got a question about a query for an SQLite database I'm using. The database is described in detail here, but an abbreviated schema which accomplishes the same is here:

...

Any help is greatly appreciated. Thanks!

code:
SELECT DISTINCT
    zctas.zcta,
    states.state
FROM
    states_zctas
    JOIN states_zctas AS states_zctas_different ON
        states_zctas.zcta_id = states_zctas_different.zcta_id AND
        states_zctas.state_id <> states_zctas_different.state_id
    JOIN zctas ON states_zctas.zcta_id = zctas.id
    JOIN states ON states_zctas.state_id = states.id
edit: If each ZCTA is guaranteed to belong to at most two states, then you do not need SELECT DISTINCT.
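A quick way to check the self-join behaves as intended (SQLite via Python, with made-up rows; the single-state ZCTA is there to confirm it gets filtered out):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE states (id INTEGER PRIMARY KEY, state TEXT NOT NULL);
CREATE TABLE zctas (id INTEGER PRIMARY KEY, zcta TEXT NOT NULL);
CREATE TABLE states_zctas (state_id INTEGER NOT NULL, zcta_id INTEGER NOT NULL,
                           PRIMARY KEY (state_id, zcta_id));
INSERT INTO states VALUES (1, 'Georgia'), (2, 'Alabama'), (3, 'Texas');
INSERT INTO zctas VALUES (1, '30165'), (2, '75001');
-- 30165 sits in two states; 75001 in only one, so it should not appear.
INSERT INTO states_zctas VALUES (1, 1), (2, 1), (3, 2);
""")

rows = conn.execute("""
SELECT DISTINCT zctas.zcta, states.state
FROM states_zctas
JOIN states_zctas AS states_zctas_different ON
    states_zctas.zcta_id = states_zctas_different.zcta_id AND
    states_zctas.state_id <> states_zctas_different.state_id
JOIN zctas ON states_zctas.zcta_id = zctas.id
JOIN states ON states_zctas.state_id = states.id
ORDER BY zctas.zcta, states.state
""").fetchall()
print(rows)  # [('30165', 'Alabama'), ('30165', 'Georgia')]
```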

Fangs404
Dec 20, 2004

I time bomb.

Hammerite posted:

code:
SELECT DISTINCT
    zctas.zcta,
    states.state
FROM
    states_zctas
    JOIN states_zctas AS states_zctas_different ON
        states_zctas.zcta_id = states_zctas_different.zcta_id AND
        states_zctas.state_id <> states_zctas_different.state_id
    JOIN zctas ON states_zctas.zcta_id = zctas.id
    JOIN states ON states_zctas.state_id = states.id
edit: If each ZCTA is guaranteed to belong to at most two states, then you do not need SELECT DISTINCT.

Very nice! This worked perfectly. This is way more complicated than I figured. Thanks man.

kimbo305
Jun 9, 2007

actually, yeah, I am a little mad

kimbo305 posted:

Thanks, I'll give it a try. I'm using Vertica. I don't have a lot of experience, but I've seen not exists perform poorly, really fooling the query planner. Hopefully the syntax will work, and the performance will be adequate.

e: arg, Vertica docs don't show APPLY being a keyword. I'll have to wire this up before I can know if CROSS APPLY is supported.
e2: confirmed -- APPLY doesn't exist in Vertica. Nor does top, apparently. I bet I could fix that using analytic functions. But the lack of CROSS APPLY seems to be a dealbreaker. Is there a way to recast another way? The CROSS APPLY is such a succinct way to describe it, too :(

I created my static table, but am getting hung up on producing the rows I want with joins against it. I could use LAST_VALUE to pull the last valid value from each id's history, but that won't work if the inner join produces null for id when joining day=day.

How would I produce these rows with a WHEN NOT MATCHED clause?

DankTamagachi
Jan 20, 2005
Tamagachi to your throat!
I have a query that returns the results of three UNION'ed queries. Each query has a lengthy select statement. There are parts of the select statement that look like this:
code:
coalesce(a.fact1,'Fact1'),
coalesce(b.fact1,'Fact2')
...
etc. Each of these blocks is repeated identically across all three SELECT statements.

I was wondering if there was a way to put that piece of code (the part of the select statement here), or other types of blocks (a long list of JOINs, for example), into a place where I could reference them in one line. Kind of like a mini-view/function but simply serving as a text replacement. This way, I can edit these things once and not have to edit the relevant bit of each select statement individually.

This is on MSSQL. Is there any way to do what I'm trying to explain?
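One common way to factor a repeated block like that is a view (or, on MSSQL, an inline table-valued function, or a CTE declared once above the three SELECTs). A minimal sketch of the view approach, using SQLite via Python and made-up tables `a`/`b` standing in for the real ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, fact1 TEXT);
CREATE TABLE b (id INTEGER, fact1 TEXT);
INSERT INTO a VALUES (1, NULL), (2, 'x');
INSERT INTO b VALUES (1, 'y'), (2, NULL);

-- Factor the repeated COALESCE block into one view; each of the three
-- UNION'ed queries can then select from it instead of repeating the list.
CREATE VIEW facts AS
SELECT
  a.id,
  COALESCE(a.fact1, 'Fact1') AS a_fact1,
  COALESCE(b.fact1, 'Fact2') AS b_fact1
FROM a JOIN b ON a.id = b.id;
""")

rows = conn.execute("SELECT * FROM facts ORDER BY id").fetchall()
print(rows)  # [(1, 'Fact1', 'y'), (2, 'x', 'Fact2')]
```

A view is just a stored SELECT, so editing it once updates every branch of the UNION; plain T-SQL has no pure text-substitution macro facility, as far as I know.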
