|
Tinyn posted:I have a situation where I need to request some data by id numbers, but I need to get the data for some N number of id numbers. I could just call the simple stored proc N times, once for each id number, but that performance sucks. Making a stored proc that takes 100 arguments, some of which might be NULL, to allow for however many id numbers I need at a time is obviously retarded. Who is your database vendor? Options:

- Pass an array/stack in.
- Temporary table.
- Batch table: a two-column table holding a Batch_ID and one of the ID numbers per row; pass just the Batch_ID in to the function. A sketch of this one is below.
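The batch-table option in sketch form (made-up table and column names): the caller stages its id numbers under one Batch_ID, and the proc takes that single argument and joins.

code:
-- one row per (batch, id) pair
CREATE TABLE id_batch (
    batch_id  INT NOT NULL,
    target_id INT NOT NULL,
    PRIMARY KEY (batch_id, target_id)
);

-- the caller loads however many ids it needs under a fresh batch_id...
INSERT INTO id_batch (batch_id, target_id) VALUES (42, 1001);
INSERT INTO id_batch (batch_id, target_id) VALUES (42, 1002);
INSERT INTO id_batch (batch_id, target_id) VALUES (42, 1003);

-- ...and the proc joins against the batch instead of taking N arguments
SELECT d.*
FROM   some_data d
JOIN   id_batch b ON b.target_id = d.id
WHERE  b.batch_id = 42;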
|
# ¿ Nov 5, 2007 00:23 |
|
duck monster posted:Copy and paste that poo poo, for instant View love. It's that easy. Also, views in theory are much faster than constantly feeding big queries over and over again into it. Pre-compilation or something. Not entirely sure why, but when we benchmarked it at work, the speed differences were outrageous at times. One of the reasons views can sometimes perform better than ad-hoc query throwing is that the optimizer/planner will have a set plan for them. If you send things in at will, the DB engine may have to hard-parse: the database has to optimize the query and generate a new plan from scratch. Oftentimes, for small, simple joins, the act of figuring out what the query wants takes longer than retrieving the data. If you are careful about how you write your SQL statements, and you always write them in nearly the same way, you get a soft-parse, which doesn't incur so much overhead. This is the mechanism the view exploits to improve performance: since the SQL is executed the exact same way each time, you get soft-parses instead of hard-parses. A sketch of the difference is below.
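To make the hard-parse/soft-parse distinction concrete, a sketch (Oracle-flavored bind syntax; the table name is made up): gluing the literal in makes every distinct id a brand-new statement text, while a bind variable keeps one shared statement.

code:
-- literals pasted in: each distinct id is new SQL text, so a hard parse each time
SELECT name FROM customers WHERE id = 12345;
SELECT name FROM customers WHERE id = 12346;

-- bind variable: one statement text, hard-parsed once, soft-parsed thereafter
SELECT name FROM customers WHERE id = :cust_id;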
|
# ¿ Nov 5, 2007 18:53 |
|
chocojosh posted:So essentially I want to do: Edit: It looks like MSSQL supports it, with similar syntax. Link You could use a temp table:
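A rough sketch of the idea (MSSQL syntax; the table, columns, and filter are all made up): stage the big pull once, then run cheaper selects against the staged copy.

code:
DECLARE @start_date datetime
SET @start_date = '2007-01-01'

-- stage the expensive pull once into a temp table
SELECT d.*
INTO   #results
FROM   some_data d
WHERE  d.created_date >= @start_date

-- then filter the staged rows with a second, cheap select
SELECT *
FROM   #results
WHERE  status = 'OPEN'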
Temp Table Link

Xae fucked around with this message at 19:53 on Nov 6, 2007 |
# ¿ Nov 6, 2007 19:48 |
|
chocojosh posted:Thanks for the link. This part of your first post: quote:After I get all this data, I need to filter it by a bunch of search conditions, made it sound like you wanted to return a large result set and then filter it yourself. The only gotcha I'm aware of with temp tables is a bug in Oracle if you execute too many transactions against one, but I haven't used them much in MSSQL.
|
# ¿ Nov 6, 2007 20:37 |
|
chocojosh posted:Xae: Essentially I need to first get all my data, and then filter it by doing a second select statement with dynamic where conditions (i.e. I build up a where clause for each field the user can possibly search by). By filtering I meant a second select statement that can remove rows from the temp table; not some kind of filtering in my programming language I honestly wish there was an easier solution, but not if I want to use sp_executesql which I need to use (this makes the choco very sad). Is there a reason you can't do it all in one statement? Is this a two-step process, where a user selects one group and then refines it? Whenever possible I try to avoid selecting the same data twice. If you do stay with sp_executesql, see the sketch below.
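For what it's worth, the dynamic-where pattern with sp_executesql can still use bind parameters, so you keep plan reuse and dodge injection. A sketch continuing the #results temp table from above (filter and column names made up):

code:
DECLARE @sql  nvarchar(4000)
DECLARE @name nvarchar(50)

SET @name = N'Smith'   -- value the user typed in, possibly NULL

SET @sql = N'SELECT * FROM #results WHERE 1 = 1'

-- bolt on a predicate only for each filter the user actually supplied
IF @name IS NOT NULL
    SET @sql = @sql + N' AND last_name = @name'

-- the user's value rides along as a parameter, not as pasted-in text
EXEC sp_executesql @sql, N'@name nvarchar(50)', @name = @name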
|
# ¿ Nov 6, 2007 21:56 |
|
chocojosh posted:replace(varchar_variable, '''', '''''') You would need to check for ';' as well, since they might try to submit a compound statement, e.g. "Select * from sys;Drop X;"
|
# ¿ Nov 8, 2007 00:06 |
|
jwnin posted:This is a bit of a different SQL question, but I thought I'd ask it here. Just because the data does not all need to be accessed at the same time doesn't mean it should be in separate databases. They are logically the same entity type, so they should be grouped together. From personal experience: I have worked on systems where, 15 years earlier, someone said "Just clone the DB and put it in a different schema, we will work something out later", and 15 years later the solution was to pay through the rear end for an outsider to come in and unfuck their mess.

Xae fucked around with this message at 06:24 on Nov 9, 2007 |
# ¿ Nov 9, 2007 06:20 |
|
This week at work, reviewing some code, I saw the worst SQL statement in history. It was 350 lines long and had 7 selects in it. It UNION ALL'ed everything and then SELECT DISTINCT'ed off the union. I took the first query, the main one, changed two joins to left outer joins, and deleted the next 300 lines of code. For fuck's sake people, learn to use something besides full/inner/equi-joins. See this or this for more information. A condensed before-and-after is below.
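A toy reconstruction of the shape (not the actual code; two made-up tables stand in for the real mess). The UNION ALL branches each re-run the base query for a different match case and DISTINCT mops up; a left outer join does it in one pass.

code:
-- before: near-identical selects UNION ALL'd together, de-duped at the end
SELECT DISTINCT id, total, paid_at
FROM (
    SELECT o.id, o.total, p.paid_at
    FROM   orders o
    JOIN   payments p ON p.order_id = o.id
    UNION ALL
    SELECT o.id, o.total, NULL AS paid_at
    FROM   orders o
    WHERE  NOT EXISTS (SELECT 1 FROM payments p WHERE p.order_id = o.id)
) u;

-- after: the same result from one left outer join
SELECT o.id, o.total, p.paid_at
FROM   orders o
LEFT OUTER JOIN payments p ON p.order_id = o.id;

Xae fucked around with this message at 02:03 on Dec 31, 2007 |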
# ¿ Dec 31, 2007 02:00 |
|
code:
code:
|
# ¿ Jan 2, 2008 06:04 |
|
Victor posted:Xae, joins will possibly increase the number of fields returned, especially if one is performing a semi-join (joining table A against B, but only selecting columns from A). In fact, incorrectly using semi-joins causes people to "fix" the results by slapping in a DISTINCT, where the proper behavior would be to change from a join to an IN subquery. Hrm, aside from the cock-up of not prefixing the * with the table name, I'm still not sure. I'm installing Oracle on my machine right now to check this out. drat server taking a poo poo Edit: Ran it against the now-default HR schema and I didn't see a difference between the results of these two queries.
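Queries along these lines (Oracle's HR sample schema; the first qualifies the * and de-dupes, the second expresses the same semi-join as an IN subquery):

code:
-- join against departments, selecting only employee columns
SELECT DISTINCT e.*
FROM   employees e
JOIN   departments d ON d.department_id = e.department_id;

-- the same semi-join as an IN subquery
SELECT e.*
FROM   employees e
WHERE  e.department_id IN (SELECT department_id FROM departments);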
Xae fucked around with this message at 00:49 on Jan 3, 2008 |
# ¿ Jan 3, 2008 00:18 |
|
Use a cursor. This is getting pretty sophisticated; you could probably find some SQL hack to do it, but in the long run doing it "the right way" will be best.
|
# ¿ Jan 16, 2008 00:20 |
|
Get your DBAs fired. They are worthless and are providing nothing of use to your organization. They should be fired and replaced immediately. I'm not joking. I'm too lazy to check, but I would wager that query has an implicit cursor in it. If they won't fire them, switch jobs; you're crippling your development as a programmer, unless you really like it there or something...

Xae fucked around with this message at 02:04 on Jan 16, 2008 |
# ¿ Jan 16, 2008 02:02 |
|
If you care about your data, option 1 is best. If not, option 2 works decently enough.
|
# ¿ Jan 16, 2008 05:36 |
|
cletus42o posted:I've always used option 3, but then I never use constraints in my database - I handle all of that in my application. But maybe I should rethink that? Yes, yes you should. There is a reason millions of dollars get put into database R&D. The database will do constraints and transactions faster, more accurately, and just plain better than anything you could possibly write yourself.
|
# ¿ Jan 17, 2008 00:45 |
|
minato posted:Sure, but 99% of the time the FK points to the PK of the target table, and PKs should always have an index on them to ensure uniqueness. Foreign keys, by definition, point to the primary key (or a unique column, which could be the primary key). If it isn't a key, it is just a lovely denormalized field. Having primary keys doesn't hurt performance unless the developers are morons. So only semi-frequently does it hurt performance. Most databases will automatically create an index on a column once it is the primary key anyway. quote:Yes, but constraints are not free. Plus, if you can't be sure your data is accurate, why are you bothering to store it? The check is required; it is only a full scan if you don't bother making an index on the referencing column. Skip it and you breach data integrity. You have orphaned records. You are now hoping that something else doesn't take the old PK and completely gently caress your system. I prefer to rely on something more than hope and dumb luck... Any system in which deletes would cause so much strain as to be unworkable on today's hardware probably isn't really deleting anyway. They simply end-date the record, set a delete flag, or otherwise execute a logical delete. Data has value, and you don't just throw it away.

Xae fucked around with this message at 04:41 on Jan 17, 2008 |
# ¿ Jan 17, 2008 04:35 |
|
Victor posted:I'm afraid I have a hard time seeing how you have the experience to back this up. At least, where I work, sometimes we go without constraints, but have nightly integrity checks that run to ensure there are no orphaned records. This doesn't catch the circumstance where one can have orphan records created, incorrectly, that then get deleted before this integrity check runs. However, from my small amount of experience, I do not see this special case as a bug pattern common enough to worry about. You are still running the checks; is your machine so pegged on CPU that you can't afford to ensure that the data you are bothering to store is stored correctly? Let's face it: hardware is cheap, and the programmer time to build and maintain your own custom integrity system costs more than just buying a better box. Also, if you are that desperate for computing power, why are you deleting in the first place? That isn't even touching the fact that a single failure on an enterprise-level system can cost serious cash.
|
# ¿ Jan 17, 2008 05:06 |
|
Victor posted:You're just assuming too much. The rule that programmer time is more valuable than computer cost is a good one, but it isn't all-covering, all-important, never-inapplicable. To give you an idea, checking a foreign key constraint can keep transactions open a tiny bit longer, which results in locks being held a bit longer, which, if one is near a certain breaking point (think adding 1% to a fully packed highway), can cause the system to go crazy. Moreover, dealing with constraints can be a royal PITA, due to having to delete in the right order, and having major issues with self-referencing tables. If your system is at capacity, why not execute an immediate logical delete, then come back for the "real" delete during a batch window (sketched below)? It will relieve stress on your system and keep data integrity. The database doesn't simply decide to do checks for a random reason. It does them because it needs to, to make sure the integrity of the database is kept. If you do not check for child records you will have problems. Someone running an OLAP system is now getting records that should have been deleted counted in their statistics. You don't need to throw big iron at things to solve performance problems. Even a $5k or $10k server is a hell of a machine these days. If an application takes a month of programmer time, it costs ~$10k between salary, benefits and overhead. This is why throwing hardware at problems is popular: because it works. If you are working in a completely budget-constrained organization things may be different, but in that case you shouldn't delete immediately anyway, because it takes too many resources. Constraints are a tool, and a very useful one. They may force you to change how you normally operate, but they do so for good reasons.
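The logical-delete-then-batch-purge pattern, as a sketch (made-up tables; assumes a deleted_flag column and an order_lines child table):

code:
-- peak hours: a cheap logical delete, no cascading FK work
UPDATE orders
SET    deleted_flag = 'Y'
WHERE  order_id = 1001;

-- batch window: the physical delete, children first so the FK checks pass
DELETE FROM order_lines
WHERE  order_id IN (SELECT order_id FROM orders WHERE deleted_flag = 'Y');

DELETE FROM orders
WHERE  deleted_flag = 'Y';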
|
# ¿ Jan 17, 2008 13:38 |
|
m5 posted:You're not familiar with the real world. The tables I'm talking about have tens of millions of rows in them. We delete stuff because we need to curb infinite growth in the database - I think it's well over 100GB now in backup form. I have worked on systems ranging from a ~1995 Sun box to a 45-node Teradata array. I have worked on databases that were written before "relational" was an option. I am quite familiar with the real world. I mainly do remediation work. I come in when a company decides they want a data warehouse, only to realize that their OLTP side is a mess and can't be processed until they unfuck it. I see this crap about "oh, we can manage it better than the database" every goddamn time. Other popular excuses include "Well, it slows down the database!" and "We didn't want to write joins". quote:You only have orphaned records if the software has bugs in it, mostly. quote:If there are a few orphaned records floating around among millions and millions of fine happy ones, I'm OK with that. I don't need my database to be a perfect shining jewel of integrity. I need it to mostly work, and to support the heavy load it's under with acceptable performance. There is a cost associated with using bad data, and it is almost always far greater than the cost of fixing the problem.
|
# ¿ Jan 17, 2008 22:56 |
|
Part of the thing is that he is asking for design advice. You need to catch things like "will we need to partition?" before you start building the system. If you don't, you end up trapped in a situation you can't fix. If he were running an existing system I wouldn't advise changing it, but since the system isn't built yet it is best to design it correctly, so that when the inevitable happens you are in the best position to deal with it. You design following "the rules", then break them where needed due to money, politics or other "real" constraints.
|
# ¿ Jan 18, 2008 00:24 |
|
quote:Specifically, I have a Java system that uses ResultSet from the JDBC and PreparedStatement for every single query and I think this is causing real performance issues on huge (20G) databases where the code is doing hundreds of sequential queries. I suspect that cursor performance on PostgreSQL is really bad and should switch to using limit/offset, but I need more ideas to test. That is your problem. Figuring out what a query wants, especially for a small query, can take longer than retrieving the data. You need to figure out exactly what queries are being run against the database. There is no RDBMS on the planet that will perform well with bad code running against it. The majority of tuning is done at the application/design level, not on database settings or even individual queries. Your system has to be designed to perform well, and throwing hundreds of sequential queries at a database is designing to kill the DB. See the sketch below for the shape of the fix.
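The usual cure is to replace the query-per-row loop with one set-based statement. A sketch (made-up tables; wanted_ids stands in for however the app stages the keys it needs):

code:
-- the killer pattern: one tiny query per id, hundreds of round trips
--   SELECT name FROM items WHERE id = ?    (executed in a loop)

-- the set-based version: one parse, one plan, one round trip
SELECT i.id, i.name
FROM   items i
JOIN   wanted_ids w ON w.id = i.id;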
|
# ¿ Jan 28, 2008 04:29 |
|
Is there a reason the SQL was being generated instead of hand-written? I know I'm a huge DB nerd, but I always like to write the SQL myself.
|
# ¿ Jan 28, 2008 14:43 |
|
Victor posted:That's actually not surprising -- I have seen very few people add a condition to a left join statement that has nothing to do with the join. It's a sad state of affairs, that SQL is so poorly taught... Sad, but profitable. You can spend a couple of weeks, maybe a month, learning about SQL and how to use it properly, and everyone thinks you are some super genius.
|
# ¿ Feb 20, 2008 02:20 |
|
Anyone have experience with Unit Testing frameworks for (PL)SQL? I have looked at UTPLSQL and SQLUnit, but I figure there has to be more than just two out there. Both of them look to be overly complex for just PL/SQL testing, as they both seem to be hacks to get JUnit to test SQL.
|
# ¿ Feb 24, 2008 05:53 |
|
fletcher posted:PHP/PDO question: If I have a table with an auto-incrementing id field, what is the fastest way to get the id of a new row I just threw in? Just do a SELECT? Or is there some shortcut? Ghetto method:
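The quick-and-dirty idea (hypothetical table name; note it breaks the moment two sessions insert at once):

code:
-- grab the biggest id and hope nobody else inserted in the meantime
SELECT MAX(id) FROM my_table;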
Xae fucked around with this message at 02:02 on Feb 27, 2008 |
# ¿ Feb 27, 2008 01:59 |
|
fletcher posted:I'm running MySQL 5.
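MySQL has a function for exactly this (a minimal sketch; LAST_INSERT_ID is scoped to your connection, so concurrent inserts from other sessions don't interfere):

code:
-- the AUTO_INCREMENT value generated by this connection's last INSERT
SELECT LAST_INSERT_ID();
From PHP, PDO exposes the same thing as $pdo->lastInsertId().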
If it doesn't work, look at your data; it may be that you don't have any, or some other issue exists.
|
# ¿ Feb 27, 2008 03:53 |
|
Toanek posted:Today I was working on my forum project and while adding in the 'Show My Posts' function I got confused as to how I should format my database query. I initially wanted to sort it so it would return an array of topics ordered by the date of your most recent post in the topic. I eventually decided to sort it by the last post from any user in the topic, but I figured I'd still like to learn how to do it the initial way for well... the learning experience. I can't for the life of me seem to figure it out. What RDBMS lets you get away with that? Oracle will throw an exception if you try something like that.
|
# ¿ Feb 29, 2008 17:17 |
|
Victor posted:When in doubt with this question, MySQL. Well, there is another reason to hate the drat thing.
|
# ¿ Feb 29, 2008 17:54 |
|
bitprophet posted:QFT. A while ago I went and thought of the reasons why MySQL is better than Postgres. I couldn't think of a single one. Postgres is more secure, more stable, and faster. What? Faster? Yes, when it matters. MySQL gives you that basic query back a split second faster than Postgres when there is zero load, but when the machine is under heavy load MySQL shits itself.
|
# ¿ Mar 1, 2008 00:09 |
|
brae posted:I don't know. Several companies have scaled in a big way using MySQL (Yahoo) and carefully choosing your engine type per-table in MySQL can help with scaling because you get a little more fine grained control over locking mechanisms. My company uses MySQL and we've found it to be scaleable. Granted, we haven't done a side-by-side with Postgres, but it's certainly fast enough and for certain optimized queries it is very, very fast even under heavy load. The tweakers benchmarks are pretty enlightening. In particular, on page 6 you can see how badly MySQL just curls into a ball and cries for mommy on a Niagara (Sun) chip. You also see how quickly MySQL hits the limit of its engine and the requests per second start to decline as concurrency increases, whereas Postgres holds stable longer. MySQL seems to be popular because it is popular. Things are written for it, and it lets you be sloppy, so people run it.
|
# ¿ Mar 1, 2008 16:11 |
|
fletcher posted:Great info about postgres guys, thanks. If I was going to try something new should I go postgres or oracle 10g? I don't think I will be going beyond their computer limitations any time soon. I would go with Oracle. It definitely has more "quirks" to familiarize yourself with, and it has broader commercial use as well.
|
# ¿ Mar 3, 2008 13:41 |
|
yaoi prophet posted:Unindexed, I get identical plans: This. Trying to game the CBO usually ends up with you failing. There are times when the CBO just plain ol' fucks up, but those are rare. The query as you write it is almost never as it is executed. I see the same poo poo with COUNT(1) vs COUNT(*) constantly, or people claiming that subselects are faster than joins or some other poo poo. The CBO is good, real good.
|
# ¿ Mar 5, 2008 23:49 |
|
Little Brittle posted:I have a table set up like this: Full-text search exists in MS SQL Server, MySQL, Postgres, and Oracle. Just google "Full Text Search" and the name of your DB; looks like yours is MySQL. A sketch for MySQL is below.
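For MySQL 5.0 that looks something like this (made-up table and column names; note that FULLTEXT indexes only work on MyISAM tables in MySQL 5.0):

code:
-- build the full-text index once
ALTER TABLE posts ADD FULLTEXT INDEX ft_body (body);

-- then query it with MATCH ... AGAINST
SELECT id, title
FROM   posts
WHERE  MATCH(body) AGAINST ('search terms');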
|
# ¿ Mar 8, 2008 16:00 |
|
Nibelheim posted:Good day, Using LIKE can kill performance quickly. I would suggest finding a way to extract the PIN from the affected_item column; see the sketch below. Play around with it a bit and see what performance different methods give you on small samples. Separating the PIN out can be done either by truncating the string or with built-in functions like Oracle's TRANSLATE. If this comes up often you might want to consider going to First Normal Form and placing the PIN in its own column.
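If this is Oracle 10g or later, the regex functions are an easy way to pull the PIN out (affected_item from the post; the table name is made up):

code:
-- keep only the digits from the mixed-content column
SELECT REGEXP_REPLACE(affected_item, '[^0-9]', '') AS pin
FROM   incidents;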
|
# ¿ Mar 10, 2008 23:40 |
|
yaoi prophet posted:You should think about whether or not you want the actual PDF's in your database. The alternative is to not store the actual files, but to load them to a more typical file server running a webserver, and then store the path to the file in the database. This is a bit of a religious war where I work, but the long and the short of it is that these two approaches have (as always) advantages and disadvantages:
|
# ¿ Mar 29, 2008 20:20 |
|
yaoi prophet posted:Well, that solves some cases but not all. In our case this wouldn't solve the problem, because you are still left with the critical disadvantages of having all of your file data being served at some point via the database (even with your solution, which just adds another layer, it doesn't really take out the bottleneck), which means pushing a high volume of data through what is commonly an already overstressed component. It also means we'd be storing terabytes of file data as part of the database and it would make managing of table data exponentially more difficult. By storing the files on external SAN nodes we can trivially add more storage without database downtime. You should be able to partition the table vertically if the volume of the files is too much for your database. Split the tablespace and put the blob table on a separate set of disks.
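In Oracle, for example, the LOB segment can live in its own tablespace, away from the row data (a sketch; all names made up):

code:
-- row data goes to one set of disks, the PDF blobs to another
CREATE TABLE documents (
    doc_id    NUMBER PRIMARY KEY,
    file_name VARCHAR2(255),
    pdf_data  BLOB
)
TABLESPACE data_ts
LOB (pdf_data) STORE AS pdf_lobseg (TABLESPACE blob_ts);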
|
# ¿ Mar 29, 2008 23:33 |
|
fletcher posted:So will I end up having columns that are indexed multiple times depending on how I'm looking up the data? Yes; see the sketch below. When in doubt, check out Postgres' docs: http://www.postgresql.org/docs/8.3/static/indexes-bitmap-scans.html http://www.postgresql.org/docs/8.3/static/indexes-multicolumn.html
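For instance (hypothetical table), a column can appear in a composite index and again on its own, and Postgres can also combine single-column indexes with bitmap scans, so you don't need every permutation:

code:
-- serves lookups by (customer_id) and by (customer_id, order_date)
CREATE INDEX orders_cust_date_idx ON orders (customer_id, order_date);

-- serves queries that filter on order_date alone, which the composite
-- index (led by customer_id) handles poorly
CREATE INDEX orders_date_idx ON orders (order_date);

Xae fucked around with this message at 04:19 on Apr 3, 2008 |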
# ¿ Apr 3, 2008 04:17 |
|
chocojosh posted:Can you give me some reading material as to why? At my job the DBAs do not use triggers (I was told we have one in the entire application). I'd really like to know more about any dangers/problems with triggers. Using triggers for business logic = bad. Why? Because it is hidden. And if it is hidden, it is forgotten. And unless you are home to the world's most careful programmers, who always check the documentation, you will get burned. It is far better design to have an explicit function that updates and manages a complex relationship than it is to hide the process in a dozen triggers. One of my previous employers abused triggers. Almost all logic was performed in triggers. It was to the point where no one really knew what was supposed to happen. It just did. Except when it didn't. Then all hell broke loose, because no one knew how anything was supposed to be handled. And you had to jump through 4 triggers per statement, some of which altered other tables, meaning you then had to jump through 4 more triggers, and on and on it goes.
|
# ¿ Apr 6, 2008 17:12 |
|
Stabby McDamage posted:Thank you thank you thank you. I was rapidly losing hope. I'll try that out when I get back to work. Postgres tends to be faster when using constraints or with multiple users, more reliable, and richer in features. MySQL tends to be a bit easier to use and set up.
|
# ¿ Apr 12, 2008 21:34 |
|
StonedogJones posted:I have a text column in a table that has random commas in it. I think Replace() should do the trick: http://doc.ddart.net/mssql/sql70/ra-rz_6.htm
|
# ¿ Apr 15, 2008 02:41 |
|
StonedogJones posted:no i dont think replace will do it. the column has different text in each row, so i cant search for a string, and then replace something in that string, because the string is not constant for each row. Replace() scans each row's own value for the substring you give it, so the surrounding text doesn't need to be constant. Alter for your flavor of SQL:
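A sketch with made-up table and column names:

code:
-- strips every comma from every row, whatever text surrounds it
UPDATE my_table
SET    text_col = REPLACE(text_col, ',', '');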
|
# ¿ Apr 15, 2008 12:47 |