PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Beelzebozo posted:

This is called an anti-join. I'm a postgres DBA so I'm not sure if MySQL has some proprietary syntax that makes this nicer, but in ANSI standard SQL you can always implement an RA anti-join as an SQL OUTER JOIN where one key is NULL.
...
Why is this not just:
code:
SELECT name,
       a.studentid,
       coursecode
FROM   assignment a,
       students s
WHERE  s.studentid = a.studentid
       AND assignnumber = 1
EXCEPT
SELECT name,
       a.studentid,
       coursecode
FROM   assignment a,
       students s
WHERE  s.studentid = a.studentid
       AND assignnumber = 2
ORDER  BY coursecode,
          name;
But if the next question is going to be, "How then to find those that have submitted '3 but not 4'?", and onward, and you're really asking "What is the minimum non-submitted assignment?", perhaps the CTE approach is the way to go, if it is supported:
code:
WITH RECURSIVE lasta(sid,code,anum) AS (
  SELECT StudentID,CourseCode,AssignNumber
    FROM Assignment
    WHERE AssignNumber=1
  UNION
  SELECT l.sid,l.code,a.AssignNumber
    FROM lasta l,Assignment a
  WHERE
    l.sid=a.StudentID AND l.code=a.CourseCode
    AND a.AssignNumber=l.anum+1
 )
SELECT sid,Name,code,max(anum) from lasta,Students s
WHERE
  lasta.sid=s.StudentID
GROUP BY code,sid,Name
ORDER BY code,Name
;
Modulo requirements for nulls and so forth.


PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Scaramouche posted:

This is another kind of an academic question, since I've already done it but I was wondering if there's a better/different way. I basically need to choose a random record id from a pool of non-incremental ids. This means that I can't just go random range, since each ID has to be valid. What I ended up doing was getting a count of all possible ids, generate a random within that range, and then select the ROW_NUMBER within the counted range, like so (if this looks familiar I've jammed together two very popular yet separate tutorials on random numbers and ROW_NUMBER):
code:
DECLARE @Random int
DECLARE @Upper int
DECLARE @Lower int

SELECT @Upper = COUNT(TableID) FROM Table1 WHERE Criteria=@blah
SELECT @Lower = 1
SELECT @Random = ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0)

SELECT * FROM
 (SELECT row_number() OVER (ORDER BY Rand() DESC) AS RecNum,
 * FROM Table WHERE Criteria=@blah) T WHERE T.RecNum = @Random
Is there a better way? I'm introducing a bit of overhead this way but it was the first thing I could think of.

I would expect using the true min/max of the IDs for lower/upper to be faster (generate the random value the same way), coupled with SELECT * FROM T WHERE id >= random ORDER BY id LIMIT 1?
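Something like this sketch, in PostgreSQL-ish syntax (the table and criteria names are just lifted from your query; note that an ID sitting right after a gap gets picked a little more often):
code:
SELECT *
  FROM Table1
 WHERE Criteria = 'blah'
   AND TableID >= (SELECT min(TableID)
                          + floor(random() * (max(TableID) - min(TableID) + 1))
                     FROM Table1
                    WHERE Criteria = 'blah')
 ORDER BY TableID
 LIMIT 1;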

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

RoadCrewWorker posted:

The expression is an aggregate clause, and no, the subquery table can be generated in under 20 seconds. I gave your idea a shot but it behaved similarly. Thanks for the hint though!

In the end i went with a combination of unlogged temporary tables using proper indexing to improve the join (since that's apparently not kept in memory). Still ends up at 45 GB IO writes (which still baffles me a bit) but at least it only takes 15 minutes.

I guess i bought those SSD NAND cells to be used after all.
Creating temporary tables with indexing should give you better insight into this behavior, but what's missing from this story is the EXPLAIN output, the actual aggregate expression, and the relevant table definitions. As Aredna pointed out, this may well all come down to indexes and references, and the data type of B.y/A.x is of the utmost importance, particularly regarding the field's storage format.

I imagine that you were generating a race condition: you fire off single-row updates that turn into writes across multiple pages. The primary data pages may not have room for the updates, so new pages have to be created. It updates so many records so quickly that your logs start filling up with CHECKPOINT warnings because checkpoint_segments is too low for this type of operation. A new checkpoint will be forced, so each buffer must be fully backed up to the WAL the first time it is changed afterward (see full_page_writes). Meanwhile, you may be triggering autovacuum because it's emptying out page storage where the updates couldn't fit, and now the transaction log is filling up with those as well...

Have you checked your logs?

DreadCthulhu posted:

Unwashed newb question here: what are some easily achievable precautions I can take with Postgres to make sure that I don't cause too much havoc on my production data in case I fat finger something, in case my app goes nuts or I accidentally format the machine etc.? I imagine daily (or even more frequent) automated backups are one step. What else?

Define an app going "nuts". To prevent downtime from a formatted machine, server crash, etc., run streaming replication (possibly synchronous) to provide failover. To handle a load spike in the application, design it to permit task separation so read-only queries can be offloaded to standby systems. To prevent data corruption due to a DBA's inability to type ROLLBACK, build your database with audit logging so you can roll back. To permit reversal to some arbitrary date three weeks ago because you don't catch your mistakes for months, use Point-in-Time Recovery and roll forward with your audit log. For backups, consider the size of the data set and the inherent daily transaction volume: run your backups on the master or use WAL shipping, and have a good reason before deciding to run a base backup off a 9.2 standby.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Agrikk posted:

I've completely spaced out on how to subtract one row from another. If I have unordered data:
code:
units	stamp
18892	2013.06.13.AM.11.48
18892	2013.06.13.PM.01.48
18891	2013.06.12.PM.10.48
18887	2013.06.11.PM.08.47
18887	2013.06.11.PM.10.48
18891	2013.06.13.AM.05.48
18888	2013.06.12.AM.02.48
With the timestamped samples taken roughly every hour, how do I retrieve the hourly increase/decrease? This is in SQL Server 2008 R2.

Sadly, this seems to suggest that LEAD/LAG were not made available until 2012, and comments note that you'll be stuck doing a join on rownumber-1. As I was searching for a way to do this, I came up with something like the following in PostgreSQL using window functions:
code:
SELECT stamp,units,units-LAG(units) OVER (ORDER BY stamp) AS diff FROM table ORDER BY stamp DESC;
Is it too much to hope there's a WITH clause so something like this will work?
code:
WITH t2 AS (
  SELECT t.*,ROW_NUMBER() OVER (ORDER BY stamp DESC) AS rnum FROM table t
  )
SELECT
  t1.units,t1.stamp,t1.rnum,t2.rnum,t1.units-t2.units AS diff
FROM t2 t1
  LEFT OUTER JOIN t2 ON (t1.rnum+1=t2.rnum)
ORDER BY stamp DESC;
(Yeah, the costs for the window approach are well below those for the join and, no, I have no way to sanity-test any of this against squirrel server. Maybe this is not the best approach to solving your problem?)

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Sab669 posted:

I have a folder with a bunch of images in it, and a table with a bunch of empty "image_loc" entries. Is there an easy way to populate that image_loc column with the file names from my images folder? The image doesn't need to match up to the right record, it's a test database :downs:
We know you've probably already spent hours doing a bunch of work by hand, as people are apt to do after they ask these types of questions, but you can get around the standard issue of RANDOM() only firing once in a subselect by using an array and a one-liner (for a kinda-short line):
code:
TB=silly ; K=id ; C=word ; \
echo "WITH arr AS (SELECT regexp_split_to_array('"`ls`"',' ') t),
tb AS (SELECT $K,arr.t[1+floor(random()*array_length(t,1))] as res FROM $TB,arr)
UPDATE $TB SET $C=res FROM tb WHERE $C IS NULL AND $TB.$K=tb.$K;" \ 
| psql sillytest
Using a table of values as the inputs is left as an exercise for the reader.


vvvv Thanks for the positive feedback. I expected to get banned for feeding the trolls, but it was actually not the most obvious thing in the world and I felt there was a cute trick or two in there worth sharing. Oh yeah, it suffers from SQL injection problems and filenames with spaces (which are dumb), but that's why I figured I'd leave the Join mah tabels! as an exercise for now, so people can work on how to avoid those issues (with other expenses, of course).

PhantomOfTheCopier fucked around with this message at 22:54 on Jun 20, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
You can tune for years only to have an increase in traffic defeat everything, followed by a complete redesign of how you handle replication, query offloading, failover, backup, and so forth. I can't imagine tuning for a 20MB database to really require anything beyond the defaults these days. At 20GB, you need to tune and be a bit smart about how you do things. At 200GB, you have to be capable of reading the documentation for PG tuning, and you have to integrate and apply knowledge from other sources, including arguments with others; you can still get away with some mistakes. At 2TB, you can no longer live with a crappy application in most cases, so you have to design for the 'horizontal' benefits of replication.

I'd wonder if you're making this decision based on current or expected needs, or some other factors that you haven't identified here. For example, there seem to be some reasonable discussions online comparing cloud providers, and you might save time and/or money by going a different route.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Fruit Smoothies posted:

...I want a permissions table...

// Linking table
code:
-------------------------------
| user_id | perm_id | hasPerm |
-------------------------------
| 1       | 1       | 1       |
| 2       | 1       | 0       |
-------------------------------
Why store the (2,1,0) row, or the hasPerm field at all, when the non-existence of a (2,1) row already implies that the permission is not, in fact, granted?
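In other words, something like this sketch (table and column names assumed), where the mere presence of a row grants the permission and a DELETE revokes it:
code:
CREATE TABLE user_permissions (
  user_id integer NOT NULL REFERENCES users(id),
  perm_id integer NOT NULL REFERENCES permissions(id),
  PRIMARY KEY (user_id, perm_id)
);

-- grant
INSERT INTO user_permissions (user_id, perm_id) VALUES (1, 1);

-- check
SELECT EXISTS (
  SELECT 1 FROM user_permissions WHERE user_id = 2 AND perm_id = 1
) AS has_perm;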

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
Regular expressions, but this type of thing is how software ends up needing a complete rewrite at some point after version 0.10.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
NOT IN is based on a value comparison*, whereas EXCEPT is row based. In all databases that I just checked, NOT IN was actually faster and the execution cost lower. Nevertheless, I believe the arguments that NOT IN is likely only useful when the expected number of exceptions is small. If you have a complicated subquery or a great number of rows, it seems that EXCEPT is likely to win.

They are also not precisely the same: "Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand row yields null, the result of the NOT IN construct will be null, not true. This is in accordance with SQL's normal rules for Boolean combinations of null values."
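A tiny, table-free demonstration of that difference (PostgreSQL syntax):
code:
-- NOT IN yields NULL rather than TRUE once a NULL appears on the right-hand
-- side, so this returns zero rows even though 1 <> 2:
SELECT 1 WHERE 1 NOT IN (SELECT x FROM (VALUES (2), (NULL::int)) t(x));

-- EXCEPT compares whole rows and treats NULLs as equal for duplicate
-- elimination, so the row survives here:
SELECT 1 EXCEPT SELECT x FROM (VALUES (2), (NULL::int)) t(x);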

Whatever you do, stay away from those Rails queries that dump out "IN (1,2,3,4,...,1998,1999,2001,2002,2003...)". In case you missed it, NOT IN behaves differently in different database engines.

* Unless you're using a multi-column construction.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
Definitely the client: http://stackoverflow.com/questions/13626965/how-to-ignore-pkix-path-building-failed-sun-security-provider-certpath-suncertp but check the PG server logs if you can, just to be sure nothing else is amiss.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Scaramouche posted:

A) They usually represent application logic in the data layer
B) Depending what they do they can help contribute to locking and unintended consequences from updates/inserts
C) It feels like every time I've ran across one it's because of either a design flaw or it prevents me from doing something I want to do
What I find amusing about this list is that it could well represent the troubles with crappy code. An application doing data layer fappery is no good, applications cause all sorts of locking problems (particularly if they don't let the database handle locking), and every single application I've seen prevents me from doing what needs to be done (which is why DBAs end up running SQL directly).

I recall looking into the current state of 'application layer' versus 'data layer' versus 'business layer', figuring that the current agile crowd would have made some arguments that resulted in a database being nothing beyond grep. Silly me for thinking there would be any documentation or commentary. What I found was that these things are roughly as ill-defined as they were 15-20 years ago, except now people are ripping things out of the database and wondering why performance is so bad.

For a database with proper trigger support, triggers:

- Enforce transaction isolation very well, compared to an application that may break atomicity by running multiple, independent statements.
- Are likely to be more efficient, as the database can manage any optimizations.
- Are less likely to cause locking issues, as they complete faster (even an application that correctly handles the transaction has to perform multiple steps before a commit, so more time is spent holding the transaction open).
- May be more efficient for page storage on disk, as things like transaction logs are likely to reflect the actual data state sooner.
- Are likely easier on replication, and certainly enforce it.

Certainly I find this complaint about audits to be a bit strange. SQL is notoriously bad with audit support, and triggers are pretty much the only way to push a database upward in the world to support auditing correctly. Indeed, the audit triggers could be really crappy, even to the point of causing server crashes, but proper triggers for auditing seem to be one of the best uses for triggers that exists.
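For the curious, here is a minimal sketch of such an audit trigger in PostgreSQL; the accounts table and the json columns are made up purely for illustration:
code:
CREATE TABLE accounts_audit (
  audit_id   bigserial   PRIMARY KEY,
  changed_at timestamptz NOT NULL DEFAULT now(),
  operation  text        NOT NULL,
  old_row    json,
  new_row    json
);

CREATE OR REPLACE FUNCTION audit_accounts() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    INSERT INTO accounts_audit (operation, old_row)
      VALUES (TG_OP, row_to_json(OLD));
    RETURN OLD;
  ELSIF TG_OP = 'UPDATE' THEN
    INSERT INTO accounts_audit (operation, old_row, new_row)
      VALUES (TG_OP, row_to_json(OLD), row_to_json(NEW));
    RETURN NEW;
  ELSE  -- INSERT
    INSERT INTO accounts_audit (operation, new_row)
      VALUES (TG_OP, row_to_json(NEW));
    RETURN NEW;
  END IF;
END;
$$;

CREATE TRIGGER accounts_audit_trg
  AFTER INSERT OR UPDATE OR DELETE ON accounts
  FOR EACH ROW EXECUTE PROCEDURE audit_accounts();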

This makes me wonder what the NoSQL model is for audits. hash-key=value, then hash-key-audit-prior=value and hash-key-audit-post=valueB, and hash-key-audit-prior-audit-prior=value,... :razz:

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

Awful question of the day... I need to record the # of times each of these uniquely identifiable questions were answered wrong/right and when... YAGNI... raw datapoints for now until I see the performance degrade to justify optimization? Also, is there an accepted way of "compacting" old data that might have little to no value at its original degree of granularity?

Part 2: on an unrelated note, is it ever ok NOT to have a primary id in a table? In the analytics example, I might not really ever need to identify a datapoint uniquely and might want to save 4-8 bytes. Is it cool to never create that column?
I find this question awful only because I somehow equate it with "build this for me", but you've given enough info to warrant a few answers that are of general interest.

For your part two, if you follow the table design you gave yourself at the beginning, the table structure given by Super Delegate does have a primary key, it just doesn't happen to be a single column constructed for that purpose. (This is a valid use of the term 'primary key' in PG, though it might be distinguished from an abstract 'compound key'.) Indeed, (SurveyResponseID,PersonID,QuestionID,AnswerDate,Correct) is your key, and there's nothing particularly wrong with that. While most tables benefit from a separate PK column, it is of little use here.

As others have mentioned, your data set is really not that large. You have used a word that throws crowds into fits of hash stores and NoSQL chaos, but with a proper index or three, you should be able to get the data you need very quickly. Make sure your columns are all plain storage so you don't have to deal with extended column retrieval. That you should never UPDATE this data is very useful, so dinking with table and index storage parameters and considering CLUSTER may prove useful.

For now, stick with what appears to be the most straightforward storage approach. The trouble with YAGNI here is that it is appealing from a runtime point of view, but you cannot 'unsummarize/deaggregate' data if you haven't stored it. There is a great deal of potential algorithm design and optimization hiding here, and until you have a notion of those needs, you'll be wasting time building useless procedures. Any optimizations that work for even a month of data will most likely still help you in future redesigns.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Anaxandrides posted:

I do still highly recommend that a proper primary key and clustered index be created, otherwise you end up with a heap table. This table would be slow indeed for data aggregation if it existed as a heap.
I guess I don't see the benefit of having a dedicated primary key using a sequence in this case. Adding that gives (ID,SurveyResponseID,PersonID,QuestionID,AnswerDate,Correct), a NOT NULL constraint to check, a UNIQUE index on (ID) to check and update (and to use for queries), plus the sequence to nextval() on insertions. In contrast, I propose creating the table with PRIMARY KEY (question,person,survey) (and maybe date?), requiring a multicolumn NOT NULL constraint and forcing UNIQUEness on (q,p,s,d?).
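Concretely, something like this sketch (the table name and column types are guesses, and I've included the date in the key):
code:
CREATE TABLE SurveyAnswers (
  QuestionID       integer     NOT NULL,
  PersonID         integer     NOT NULL,
  SurveyResponseID integer     NOT NULL,
  AnswerDate       timestamptz NOT NULL,
  Correct          boolean     NOT NULL,
  PRIMARY KEY (QuestionID, PersonID, SurveyResponseID, AnswerDate)
);

-- the leading column of the primary key already serves lookups by question;
-- an extra index covers the other likely access path
CREATE INDEX ON SurveyAnswers (PersonID);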

But why create a separate ID field when DreadCthulhu is likely to want to enforce unique+notnull on (s,p,q,d) in any case, and when an index on ID doesn't apply to any of the queries planned for the data? Based on the earlier description, we can surmise needing at least an index on QuestionID, but would an additional ID PRIMARY KEY make those scans any faster? As this StackExchange discussion suggests, PostgreSQL is more likely to make use of leading columns in a multi-column index, so I still feel like a primary key of (q,p,s,d?) is ample for this case. (It seems an index on person may well be required in both cases.)

Of course, what we haven't heard from DreadCthulhu is whether or not a good supply of one million faked data points have been inserted into the relevant database and a few of the approaches profiled for performance based on the needs of their actual algorithm. Assuming no data changes, nor really even any deletes except routine maintenance, it seems like CLUSTER plus proper partial indexes might make the existence or lack of a singular ID field imperceptible or superfluous.

edit: vvvv I misread. Thanks for clarifying, Anaxandrides. We can now :cheers: together.

PhantomOfTheCopier fucked around with this message at 04:57 on Jul 13, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Safe and Secure! posted:

House, Rooms, Chairs, Stuffs :words:
Wait, isn't this equivalent to the species question, page 160 of this thread, depending on what you want to emphasize? In particular, if you're trying to focus on an easy solution to limit Stuff to a single container:
pre:
Container:  ID, Name, TypeID, ParentID
Stuff:  ID, ContainerID, ItemID
Items:  ID, etc.
ContainerTypes:  ID, Name
Or if you're trying to focus on the relationship between the containers:
pre:
House:  ID, Name
Room:  ID, Name, HouseID
Chair:  ID, Name, RoomID
Stuff:  ID, House, Room, Chair, etc
In the first case, you get to deal with Stuff quickly, at the expense of doing more SQL work in support of your Container requirements. In the second case, you get your container structure for free, at the expense of needing a multicolumn uniqueness constraint that treats NULLs as equal (for example).
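For that last bit, one PostgreSQL trick is a unique index over expressions, so NULL Room/Chair values collide with each other instead of always counting as distinct; the column set here (Name included) is only an assumption about what you'd want unique:
code:
CREATE UNIQUE INDEX stuff_location_uniq
  ON Stuff (Name, House, COALESCE(Room, 0), COALESCE(Chair, 0));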

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

- a bunch of admins with potentially overlapping sets of users they oversee (they can read their messages)
- admins to groups join table
- groups of users
- groups to users join table
- users
- messages owned by users
Now if you want to figure out, as an admin, which messages you have access to...
Instead of going off the deep end, allow me to start off with the Professor's/Student's Nightmare/Bliss: "Not enough information". If the first step is to understand the problem, then we friendly goons have an inkling, but only you know what you want. Ergo, we move to step two, the generation of solutions. I'm not saying you haven't thought about any of this, but:

I propose that you think about all ways this could be done. You may already be stuck with the layout you describe, but I'm going to assume you aren't :iamafag: so there are a great many pertinent questions: Are administrators just users with special privileges, or an entirely different collection of people (presumably)? Can users be members of multiple groups? What privileges does group membership provide? Are groups recursive/hierarchical? Do administrators just have the ability to see things, or are there other privileges they do or might have? Is a message posted just "by a user", or is it by a user into a specific group (to which the user belongs)?

Generate ideas based on potential pragmatic models. Is this supposed to be like the Unix filesystem? Is it more like the SQL role system? Should administrators mirror SA moderators? Is it a corporate structure with workers, supervisors, managers, middle managers, middle supervisors, dumbjerkery, chief officers, seventeenth-level presidents, and bigwigs who are so far gone no one knows their title?

It's clear that one can propose all manner of theoretical contrivances --- no, your example is not contrived, just not really narrowed down in my mind --- that will require many joins. We frequently see examples approaching arbitrary tree structures, and those are certainly the simplest examples of this list of joins you've indicated. Separated tree structures are most likely to require joins by their very nature, whereas some of the above models get around that by flattening "administrator privilege".

For example, one could contrive: Employees -- Supervisors -- Managers -- Administrators as a pure hierarchy, and then create a completely separate "Quality Control" organization of overseers of each one of those levels: EmployeeWatchers -- SupervisorWatchers -- ManagerWatchers -- AdministratorWatchers. Getting "All employees under admins that AdminWatcher 552 watches" is clearly not so easy. If supervisors have multiple managers, it's more complicated, not to mention the question about who manages the ManagerWatchers and who watches them. :buddy:

You could likely make lots of progress by flattening a bit. Unix groups, for example, require explicit user membership and there's no recursive lookup to consider. SA permits everyone to see everything (well, as far as I know), but a system of privileges controls other changes. SQL roles, PG privileges, SELinux, and others, are all based on tagging each object so access/privilege control can be enforced, so there's a great deal more operational overhead.

Think about lots of possibilities, narrow it down to two or three models that seem to fit your needs, and then consider the functional and performance side of the issue. One could flatten your tables with triggers or materialized views, but not without first being very clear (with oneself) about the expected data set size, update cycle, and expected optimizations.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

That was awesome... It actually took me a while to process what you said,... Once (and if) we need to add a few more complex scenarios I'll certainly revisit the design and see if it can't be significantly simplified by imitating one of the existing patterns out there, as you suggested.

On an unrelated note, am I correctly understanding that in PG with WAL-E I could safely NOT base backup pg_xlogs (by say moving the folder outside of pgdata and symlinking to it) and still be able to do full recovery in case of meteor strike? If I'm getting this right, the WAL segments and the base backup without pg_xlogs should have everything I need to do full recovery.
You're welcome. If I did one post per thought instead of lumping them all into a single page, I'd have a higher post count. Haha. Whatever you do, remember that things like tree structures or the examples given are the types of things that suffer from design lock-in. You may get stuck with inefficient extensions, and processing the data to model it in a different way might take considerable processing time.

For base backups, note that the documented procedure (24.3.3.5) is to remove pg_xlog anyway. On the other hand, if you're not appropriately using the -X option, you may miss transactions on high-traffic systems. The argument is that a base backup plus all WAL archives is sufficient to roll forward to a specific point in time. Note that point-in-time recovery takes time to execute (though there are some tricks to speed it up) and that's after you've transferred the base backup onto the recovery system in addition to all the archives (whether streamed in or flat copied). If your archived transaction logs exceed the size of a base backup (for >20GB clusters), you're getting to the point where you need to consider the balance between the two. In any case, it seems unwise to have more than a day without a base backup. Bleh, there are lots of exceptions here.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Sab669 posted:

I have a sort of database design question I think... Variables, formulae, functions, oh my!

(((D54*0.30)+(E14 X E11 X 0.3464)+(E14 X E11 X 0.125)+95+35+5+((D54 X 0.95) X 1.11))+600) / E11
You need to ask yourself what is appropriate to store in the database given your needs. For example, if you have a small number of frequently-used formulas that appear in a number of queries, you can get away with using a stored procedure. Once the number of such grows rather large, however, that becomes somewhat non-optimal, particularly if you're expecting to use a query to resolve a function name and magically "dereference" it to execute as a procedure.

What is the nature of these constants, and the expected purpose of database storage for them? Are you expecting parameters to be varied by user, invoice, or billing cycle? Do you need an audit trail for parameter changes? Are you merely storing them because they are used again later?

Regarding the functions, there are security concerns dependent on the type of stored procedure you create, but it does not sound like you have enough function variation to warrant a more complicated approach to storage. I mean, you could create a table of named functions, their TeX representation, and any of a number of valid computational representations, followed by EXECUTE calls and some flappery to get the appropriate parameters, but that's no small amount of work, blows query optimization, and it's quite likely that you have a programming language of choice that would permit you to do all that work at the application layer.

Edit: Based on your response below, it sounds like you just need to save them in a shipping-parameters table so you can select them, load them into a hash, and use them when you're generating output reports. Since you've mentioned there's no need to have all that generation within the database, there will be less pain if you stick with whatever approach you find easiest for those steps. While you could build a reporting view with the appropriate per-item columns for weight and size charges (assuming the appropriate data is already stored with each item), some values likely depend on aggregates, so you're really talking about a two-step (or more) report and a bunch of things that you'll likely want in revision control anyway, unless you really are setting out to hurt yourself.
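If it helps, that parameters table can be as dull as this (the names are placeholders; the values are just pulled from the formula quoted above):
code:
CREATE TABLE shipping_parameters (
  name  text    PRIMARY KEY,
  value numeric NOT NULL
);

INSERT INTO shipping_parameters (name, value) VALUES
  ('weight_rate',    0.3464),
  ('size_rate',      0.125),
  ('handling_fee',   95),
  ('flat_surcharge', 600);

-- pull them all at once and load them into the application-side hash
SELECT name, value FROM shipping_parameters;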

PhantomOfTheCopier fucked around with this message at 02:38 on Aug 20, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

Is there any reasons not to specify ON DELETE CASCADE in PG? I really like the convenience of being able to delete one single row at the top of the relationship chain and not have to bother cleaning the DB bottom-up. I've heard arguments that it protects you against fat fingering things accidentally in production, but seriously? I'd appreciate an expert opinion on this one.
Perhaps SET DEFAULT or SET NULL makes more sense for the columns with references. :razz:

To be more serious, ask yourself an opposing question: Under what circumstances would you want CASCADE? You've mentioned wanting everything else referencing the data to disappear, but in what situation does that make sense? Will a cascading delete lead to data invalidation or inconsistency (with the real-world reality being modeled)? Does it save you from considerable work or only a minimal effort?

Now think of some of the things that could go wrong: Do you truly want it to delete, or is a null or default option more appropriate? What should happen if a cascade starts but encounters a referenced column down the line that doesn't cascade? How do you handle cascading into a many-to-many relationship (when you delete an account, do you just delete the item from the receipt that credited that account, or do you delete the entire receipt)? Do you want it to cascade into trillions of referencing rows, or should that be handled manually during maintenance to prevent database response issues? While auto-vacuum will mark pages for use, does this change your routine vacuum schedule?

If you're building proper triggers and/or procedures for more complicated structures, then you need to set RESTRICT so you can raise and trap errors. You'd be stuck with a two-pass approach in the application otherwise (check for references before deleting), whereas a procedure can ensure that appropriately-referencing rows are removed and raise an exception to return to the application. The distinction is "delete from user where name='moses'" and not really knowing what CASCADE deleted, versus the noise that is likely to be generated when you "strike_from_the_memory_of_man('moses')". :buddy:
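To make the referential actions concrete, a throwaway parent/child sketch (the names are made up):
code:
CREATE TABLE accounts (
  id   serial PRIMARY KEY,
  name text NOT NULL
);

CREATE TABLE receipt_items (
  id         serial  PRIMARY KEY,
  amount     numeric NOT NULL,
  account_id integer REFERENCES accounts(id)
               ON DELETE RESTRICT   -- fail loudly; a procedure traps the error
  -- ... ON DELETE CASCADE          -- silently remove every referencing row
  -- ... ON DELETE SET NULL         -- keep the row, drop the reference
  -- ... ON DELETE SET DEFAULT      -- keep the row, point it at a default account
);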

PhantomOfTheCopier fucked around with this message at 07:22 on Aug 25, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

How do people deal with needing different versions of a very similar parameterized query?
I didn't like the answers because I didn't like the question. :razz: Do you mean a VIEW or a set-returning FUNCTION? If you want something returned, why not always return it and ignore it when you don't need it? Did you mean LIMIT 1 or CURSOR or PREPARE? A database reorg requires appropriate attention be given to all the objects, not just the tables, and while PostgreSQL tries to help you (ERROR: cannot drop table X because other objects depend on it... DETAIL: type Y depends on table X...), it certainly won't catch trouble outside the database until the code passes in the old/faulty queries.

Pardot posted:

I did actually look up a fast fourier transform postgres extension to respond with :v:
Ah, now I see why he edited out the entire post... well, it was a cute question anyway, and now I know someone has already built the functions for FFTs. That might come in handy.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

When trying to optimize queries, is it pretty normal to be fighting with the query planner not using a multiple column index correctly when inner joining against it? As in, say I have a table with user_id and type, or something like that. I have an index on the combo of the two columns, but a query with three inner joins, the latter against that table, result in a seq scan on the 2nd column, completely killing performance. I have to basically rewrite the query to have both columns in the ON clause, including a user_id IN (..subquery..) to make it work correctly.
http://www.postgresql.org/docs/9.2/static/indexes-multicolumn.html

Read the end of the first big paragraph. The planner is deciding that the sequential scan is faster than a full index scan.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

...Ok so here's a paste of the explains: https://www.refheap.com/c32e41c10dc93d83e35cb536d
...I'm sure I derped out somewhere in there, oh well.
The fiddle wouldn't load, so I'll ask a ridiculous question. There are six different orderings of lines 28-30. Do you get the same query plan for all six?

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

roomforthetuna posted:

Thinking more on my aforementioned "incredibly slow InnoDB update" issue,..

Because it wasn't slow at all when the database was started, and I can't imagine 32 rows is enough to make it have a meltdown, but 4500 reusable spaces that are each too small to fit the new data-lump, being examined sequentially and scattered through a file that's always being randomly disc accessed, might do it maybe? Is that what InnoDB would do?
Nope, the standard size "issue" for MySQL is 8000 bytes, so you shouldn't even have to worry about your row being stored "off page", except perhaps for the rules field (depending on its size). After reading your initial post a few times, I came to realize that your update is of a single row. Updating a text field in a single row of a 100kB data set should never take anything close to a second, particularly since it's referenced by a single, primary key field.

Is this the only statement within the transaction, or are you hiding thousands of other things in there? What else is happening on the system? Do you have a bunch of waiting metadata locks contending for update? Check your logging settings and SHOW PROCESSLIST to see what is causing the delay. Assuming you can even use ed/vim/nano/cat/more on this system, and that these query times are so highly variable, it seems most likely that you are locking the row in some fashion and not releasing it.

There are many other choices, "Using an 8MHz processor and an IBM WD-12 5.25" 10MB Internal Hard Drive" not the least among them, but you have some very serious issues if you have to blink more than once to append a nominal string to a text field.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

roomforthetuna posted:

... There may be something insanely slow about the hard drive of the server though - saving a file from vi often takes a second. If MyISAM is caching things in memory and writing out in the background or something then that might explain the ridiculous speed difference. It's a virtual private server so it's hard to say what the hardware is like.
It's likely that the VPS provider will tell you that you are using shared resources and should expect to see periodic and random performance degradation, so you'll want to run some more basic tests before sending them a complaint. In particular, if they have a database or MySQL person, they're likely to check all your performance settings which is a very nice idea but should honestly not matter for a single-row transaction in a 100kB database (unless you've done something silly like lowering the buffer pool size to 10 bytes). If they hand the issue to a systems administrator, they might recognize the SQL, but they'll be more interested in disk performance as a whole.

Find yourself any basic intro to using `dd` as a benchmarking tool (such as this) to get some basic timing information. Test reads and writes. See if you can narrow this misbehavior down to a fixed period of the day or not. Try to get some base CPU values as well (time to calculate the first 100000 primes, etc.). You'll want to demonstrate something specific (unless they really just have lots of spare time to help you), or that the disks are seemingly fine but MySQL isn't (which is a bit non-sensical, though it pains me to say it). If you get to that point, a full dump and restore may help (maybe without your configuration, since there could still be something hiding there).

There are a few other MySQLish possibilities, but I should think you'd have to be hiding lots of triggers and bloated indexes from us for them to be likely. Being on a virtual system, there's more potential trouble there. You could have bad neighbors, a RAID drive that's rebuilding, and a host of other things going on that you can't see.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

roomforthetuna posted:

$ dd bs=1K count=2 if=/dev/zero of=test conv=fdatasync
2+0 records in
2+0 records out
2048 bytes (2.0 kB) copied, 6.19403 s, 0.3 kB/s
Ye freaking :effort:s, it's worse than even I imagined. The forking 1541 drive on a Commodore 64 got 300 bytes per second!

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

...make a DB call before insertion to make sure the row doesn't exist. Obviously this fails in the case where two of these calls are made...(like Rails)...
Yeah, this is called a race condition, and it plagues Rails types because MySQL seems to keep changing how they handle things like auto-incrementing fields, return values, upserts, replication key failures, blardy blar.

Let your real database do its job.

DreadCthulhu posted:

I should just attempt inserts, and then catch and interpret the exception correctly in the web app. This way I get to make only 1 round-trip and don't have to have account for the possibility of uniqueness violation in 2 different chunks of code.
For even the simplest of applications, insert failures tend to result in finite-looped retries, possibly with exponential back-off timing. Applications that are more intelligent would be more concerned with the type of error returned. Note specifically that an INSERT can fail for reasons other than a uniqueness violation. In any case, your application should not be causing uniqueness violations; i.e., it should not be making choices about things like primary, randomly-or-sequentially assigned keys, because the database can handle the uniqueness and check constraints and return the values ultimately inserted to the application.
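In PostgreSQL, for example, the application can just hand over the data and read back whatever key the database assigned (table and column names assumed):
code:
INSERT INTO users (email, name)
VALUES ('goon@example.com', 'Goon')
RETURNING id;  -- the sequence-assigned key, with no second round-trip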

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

Am I correctly understanding that a serializable isolation level transaction should be able to take care of most of the scenarios where I need a bunch of tables to be in sync and follow a certain set of constraints?
While that should be able to "take care" of those scenarios, you have not yet given a use case for it here. You should be able to keep a bunch of tables "in sync", follow constraints, and handle all typical insertions and updates, at the default, read committed isolation level. If your application can't facilitate retries and proper error handling with a normal rate of transactions at that level, there's no way it will be ready to handle serializable.

One of the benefits of a layered design is that you have multiple chances to prevent a mistake, reducing the likelihood of data corruption over time. None of the layers should particularly "trust" what it's being told, nor should it waste much time on things that belong elsewhere. Remember that there are different techniques for different purposes, so while you can easily cache a "group permissions array" in an application, for example, you need to program as if larger things like tables of user data always require a database call.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

Well one situation is, to steal the scenario from SO, I have a table of photographers and a table of photos. I don't want a photographers to have more than 10 photos each...
Your proposals all make me very sad. While it doesn't address some issues in the generic case, on user creation (probably) create ten photos for the user (row=userid,NULL photoid). When the application wants to add a photo, take an update lock on the first userid/NULL row and add the image there. If a second transaction attempts to update the same null row, it will block, then fail, and your retry can handle the scenario of no free buckets. Likewise, if the user wishes to replace a photo, it's in a specific slot/bucket, so you select the row for update.

In other words, the "upload a photo" portion of the application has no functionality nor permissions to add rows to the photo table, only to issue updates to existing rows. This locks a single row, namely a single row per user, for uploads, instead of an entire table (which could mean lots of things, but yeah) or, :ohdear: the entire transaction engine.
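A sketch of that flow under READ COMMITTED (table and column names assumed):
code:
UPDATE user_photo_slots
   SET photo_id = 1234
 WHERE slot_id = (SELECT slot_id
                    FROM user_photo_slots
                   WHERE user_id = 42 AND photo_id IS NULL
                   ORDER BY slot_id
                   LIMIT 1
                   FOR UPDATE);
-- zero rows updated means no free bucket (or a concurrent uploader just took
-- the last one), so the application reports the limit or retries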

(The generic case isn't fixed at "ten", but you could create a "pre-update" routine in the "add pictures" portion of the application that would add a chunk of free buckets up to a limit so there were always a certain number showing as free. You can also do forceful things like always dropping off the last/oldest, but users tend to prefer errors to those types of behaviors.)

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

GrumpyDoctor posted:

...but as there become fewer data points it gets weirder, and by the time the series has a single measure because it's just the total annual heating load I think a timestamp (on when, January 1st?) is downright silly. Of course, I could replace "timestamp" with "series index." Is that weird? That was kind of my original question.
Are you trying to create a round robin database, or a store of data on which you actually need to perform structured queries? Neither invent arbitrary data to associate to values, nor omit values of interest from the data actually collected. Model the actual data.

Sooo... what is the structure of the actual data, what is your intended use of it, and what are your constraints?

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

DreadCthulhu posted:

The problem here is that this is an area of SQL I'm very unfamiliar with. I have done little to no real concurrency-conscious work with performance and medium sized data sets in mind, so a lot of the stuff I propose has the likelihood to not scale in any sane way. Is there anywhere I can learn more about this specific topic? Even the postgres perf book barely grazes this, although it does mention count being a tricky beast.
I'd say the reason you have yet to get an answer in the friendly forum is that this is a combination of a bunch of very simple topics and a few that can become terribly complicated. A performance book is likely to concentrate on server settings and maintenance (perhaps) related to running a smooth database (as a server), but it can't really have a notion of the types of incoming connections expected beyond those supported by its model. For any system, you can always arrange to kill it, but no book is going to belabor the ways in which it can be done.

Presumably you've read the entirety of the Concurrency Control chapter (PG documentation) and the SELECT FOR UPDATE notes and caveats. Otherwise, I myself am having difficulty finding a reference for this issue, beyond the obvious Internet and book searches, merely on the grounds that the extremes are obvious: With no concurrency control, you can get conflicting or multiple writes. With pure serialization, everyone has to wait around for each transaction to conclude (assuming it ever does). You don't ever want either of those --- the former is used for key/value stores where consistency is dropped in favor of availability, and the latter is useful for schedule maintenance events when you must ensure certain changes occur in order.

There's a lot of cool stuff hiding in the PG docs, but first ask yourself how you'd manage a problem with the defaults, and then determine what is gained and lost if you use something from the database that seems to help. It will give you a better idea of costs and benefits.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Vasja posted:

Is it worth it to use a prepared statement when doing a large set of inserts into one table? I have to process and insert around ~140,000 records into my PostgreSQL database, and I am wondering how much of a savings using a prepared statement vs. just straight inserts would give me. I'm working on adding the prepared statement and profiling it, but I have to change quite a bit of code to do so, so I figured I'd ask here first.
Some random test suggests something within reason. COPY will be fastest*, followed by INSERT...VALUES.... Prepared inserts are likely to be about the same or a tad slower than that, as there's very little functional gain to be had, followed by multiple inserts being the likely slowest, particularly if you have them in independent transactions. Of course, if you can't fit everything in one transaction, then you split it for performance reasons.

Regarding the scenario you've described, if you can get to the point of inserting multiple records in the VALUES... statement, you will likely see enough of an improvement that anything further won't be worth the effort. Anything that I can envision complicating your situation would be an argument against COPY/PREPARE.
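In other words, batching rows per statement inside one transaction is usually plenty (names assumed):
code:
BEGIN;
INSERT INTO records (id, payload) VALUES
  (1, 'first'),
  (2, 'second'),
  (3, 'third');   -- batch as many rows per statement as is practical
COMMIT;

-- or, fastest of all for a plain bulk load:
COPY records (id, payload) FROM '/path/to/records.csv' WITH (FORMAT csv);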

* For loading huge amounts of data, there are lots of tricks with server settings, disabling indexes and rebuilding them, adding constraints after the data is loaded, and so forth.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Pardot posted:

Postgres 9.4 is going to so loving own. Nested hstore that's faster than mongo http://obartunov.livejournal.com/175235.html
Have you seen these?

https://commitfest.postgresql.org/action/patch_view?id=1184
http://www.postgresql.org/message-id/CAFcNs+oXq5ZPaKARRn3VZGzX+AnTZCaEjR6VmQT_tvoFiqgaSg@mail.gmail.com
http://www.postgresql.org/message-id/20131031.194251.12577697.horiguchi.kyotaro@lab.ntt.co.jp
http://www.postgresql.org/message-id/CAFj8pRB8=cGxi1XAtjiTD0Sn3ZqYkTomEd96OdzGPUVs7tX9Cg@mail.gmail.com

Oh it's just all good.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Pardot posted:

Surprised you didn't link upsert ;)
I'll take that trailing smiley to signify some manner of sarcasm in the post, as indeed your crystal ball likely knows that my response will be that they can easily be obtained with triggers and are, therefore, as atomic as anything else (they just happen to be more visible to the user). On the other hand, my first reading of the patch is that it only provides limited behavioral changes in a direction that will aid upsert implementation. If 9.4 is slated for mid-to-late 2014, however, that's a good deal of time for the upsert crowd to get all excited about having it ready.

All this is causing me to raise my personal project priority/awareness for "PG development". :buddy:

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Frozen-Solid posted:

Is there any way to do this without doing a long hardcoded select statement for each value?
How many values are of interest here? How much data is there? If it's utterly static, can you just stuff it in a partition so it can be selected from a parent? Should you be doing something even more silly like using a foreign data wrapper onto a secondary server? Can you just process the data once and turn it into the new format? Is it worth it to write a stored procedure to 'do the work' (even if the procedure seems to be kinda lengthy)?

As I was thinking about your question, I started to realize that the scope is a bit broad at present. I have another idea that I may be able to post soonish, but I have to test something first. :}

Veskit posted:

In the employee table I want to have a check box for whether or not the employee is still working in the department.
Isn't that data duplication? An employee is active if their termination date is NULL and, presumably, the list of employees is a table of (userid,start,end). Namely, you've normalized off things like the employee's name and contact information, and not placed payroll-related items like 'actively working', 'on extended leave', and so forth, within that 'employee info' table.

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

roomforthetuna posted:

No, that's not what I was saying at all (not in the bit you quoted anyway).

Table 1: Employees
-- id (a unique column)
-- name
-- job
-- whatever
-- (active?)

Table 2: Statuschanges
-- employeeid
-- date
-- action (hire/fire/vacation/return/?)
This approach would seem to warrant the Active flag, together with a trigger on the audit table (which sounds backwards) to ensure that the flag is consistent with the status as it changes. Indeed, that's not what I was thinking at all, and might be a bit too much for Veskit. I was saying
pre:
EmployeeDetails
  id, name, contact info, etc

Employees
  id -- key field, autoincr, durr
  employeeID
  start -- date NOT NULL
  end -- date
where Employees permits duplication of employeeID but would only permit a single (employeeID, end==NULL) row. This permits the creation of a view of 'active' employees as "end IS NULL", but could also lead to oddities if the data entry mechanisms aren't smart enough (to prevent adding a start date for an already-active employee, etc.). It does avoid the trigger requirements of the previous approach. One could add termination codes or comments to Employees as well, depending on the needs. Things like a vacation day don't really count as "termination", however, so you'd still need to deal with that (for payroll purposes and so forth).
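In PostgreSQL, that 'single (employeeID, end==NULL) row' rule is easy to enforce with a partial unique index (a sketch; "end" needs quoting since it's a keyword):
code:
CREATE UNIQUE INDEX employees_one_open_stint
  ON Employees (employeeID)
  WHERE "end" IS NULL;

CREATE VIEW active_employees AS
  SELECT employeeID, start FROM Employees WHERE "end" IS NULL;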

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

McGlockenshire posted:

... upserts

I think that's the second to last thing I miss from MySQL. The other is user management. To hell with editing an INI for basic connection privs..
I don't see what the big deal is. If your application is so dumm that it needs such functionality, just follow the yellow brick road. And FFS, please do not make PostgreSQL permissions resemble anything like MySQL. PostgreSQL supports GRANT for object permissions and a model for host based access that is typical of the bulk of Unix applications. In PG, I know the precise ordering of authentication attempts, and the exact matching required. In MySQL, we're lucky if 127.0.0.1 actually matches localhost, and most DBAs I've seen end up with dozens of lines per user because the grants don't actually seem to behave (and seem to be invisible at times, whereafter people end up just adding more to ensure they are present).

p.s. Does MySQL even have native LDAP or Kerberos authentication yet? The user mappings in pg_hba are somewhat integral to that functionality. I also have a few circumstances where ROLE inheritance is useful for enforcing security on application connections.

PhantomOfTheCopier fucked around with this message at 04:55 on Nov 11, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

McGlockenshire posted:

I'm not gonna write one stored proc per table, that's an absurd and unscalable maintenance nightmare. My application is "dumm" enough to expect the database to issue new identifiers for objects on an as-needed basis and uses a single code path (not quite ORM) for those saves.
Since that is just a short example in the documentation, they are likely unwilling to provide you with the complete and potentially troublesome version that takes a table as an input parameter, whence the upsert function could be defined to apply to any table. Moreover, stored procedures are bound to a schema, so all objects in that schema have access to functions that you create.

The fact that so many solutions are built on PostgreSQL and Oracle without this functionality suggests to me that its provisional availability may well cause its use, but I do wonder at what expenses. I here envision the process of "finding just the thing" in the documentation leading to poor design, such as lack of error handling, in the application.

McGlockenshire posted:

MySQL grants... Postgres host based authentication...
I'll have to see what I can find regarding some of the original designs and likely arguments, of which I can envision many. Considering the number of different access methods available, it seems something of an issue to handle all possible user mappings, where system and login names don't match, without some manner of mapping file. At the same time, having this placed within the database seems somewhat disingenuous, as I don't want my collection of tables to deal with a host of connection parameters. As this very, very closely follows the TCP Wrappers model, this layered approach to security and authentication seems something of a foregone conclusion given the period in which it was developed.

I can see a great host of :supaburn: resulting from those trying to use collections of Chef and Puppet scripts that are incapable of modifying a configuration file, and therefore require independent files for every configuration variable in use on the system, but this was certainly not an issue when PostgreSQL and Linux were being built. Nothing fundamentally prevents in-database alteration to the host-based access list, which is why a simple reload (without restart) is sufficient to rebuild that table, just as nothing prevents someone from submitting a code patch to issue a reload or write pg_hba.conf based on the in-database access list.

More on this later; busy right now...

PhantomOfTheCopier fucked around with this message at 15:44 on Nov 11, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Bognar posted:

Thanks! From a bit of research, it looks like my best bet is to still use a stored procedure for performance concerns since the only other possibility is a slow recursive CTE.
While I haven't done the same research, I'm curious how you expect there to be a difference in performance, as the algorithm used is almost certain to be recursive simply based on this storage format of the graph data. While I can conceive of an iterative approach to the question, based, for example, on fundamental techniques like "Walking through a binary tree" (see Knuth, The Art of Computer Programming), I don't see such an approach working with this type of data or storage. I'd be happy to see some of your results, though, based on this stored procedure magic that logarithmically reduces big :aaaaa:

Or are you just using a database engine with a crappy CTE implementation?

code:
CREATE TABLE edge (p text NOT NULL,c text NOT NULL);

INSERT INTO edge VALUES
  ('A','B')
 ,('C','D'),('C','E'),('C','F'),('G','F')
 ,('H','I'),('J','I')
-- Cyclic multigraph
 ,('L','M'),('M','N'),('O','N'),('P','O')
 ,('P','Q'),('O','Q'),('O','Q')
;

SAVEPOINT preparer; DEALLOCATE community; ROLLBACK TO preparer;

PREPARE community(text) AS
WITH RECURSIVE neighbors(l,r) AS (
  SELECT p,c FROM edge WHERE p=$1 OR c=$1
  UNION
  SELECT e.p,e.c
    FROM neighbors n,edge e
    WHERE e.p=n.l
      OR e.p=n.r
      OR e.c=n.l
      OR e.c=n.r )
SELECT l FROM neighbors UNION SELECT r FROM neighbors;

EXECUTE community('D');
 l 
---
 C
 D
 E
 F
 G
(5 rows)

EXECUTE community('A');
 l 
---
 A
 B
(2 rows)

EXECUTE community('J');
 l 
---
 H
 I
 J
(3 rows)

EXECUTE community('N');
 l 
---
 L
 M
 N
 O
 P
 Q
(6 rows)

PhantomOfTheCopier fucked around with this message at 06:29 on Nov 16, 2013

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Frozen-Solid posted:

We're doing a pretty massive data conversion...

I have a table that has table names, field names, valid values for fields,... and the legacy value.
code:
SELECT 
	CustomerNo, 
	(SELECT legacyvalue FROM Orders_Prototype1.dbo.FieldValues 
         WHERE TableMap = 'Customers' AND FieldMap = 'InvoiceSequence' 
         AND KeyCode = InvoiceSequence) as CI_INV_SEQ
FROM Orders_Prototype1.dbo.Customers 
Is there any way to do this without doing a long hardcoded select statement for each value?
You'll want to perform your data conversion and migration so you can use sane SELECT statements to get the data, and feed anything that needs "old style results" through VIEWs. Even if you have to partition your data into 'archive tables', perhaps to enforce some manner of permissions on those rows, you'll be better off having real query-planning capabilities than you will be trying to force your 'new database' through an unjoinable 'lookup table' as described above. A great deal of the detail depends on the data size, whether or not old applications will still have write access, and so forth, but actually converting to proper data, even if it takes adding a new column to avoid translating everything at once, will serve you better than something that consistently requires you to write objects as strings:

code:
CREATE TABLE reftbl (tbl text, fieldname text, fieldvalue int, returnvalue text);

CREATE TABLE tablea (id int PRIMARY KEY,flda int,fldb int);

INSERT INTO reftbl VALUES
  ('tablea','flda',0,'Zero')
 ,('tablea','flda',1,'One')
 ,('tablea','flda',2,'Two')
 ,('tablea','fldb',0,'Alpha')
 ,('tablea','fldb',1,'Beta')
 ,('tblA','fldb',2,'Gamma')
 ,('tblB','flda',0,'Dog')
 ,('tblB','flda',1,'Pig')
 ,('tblB','flda',2,'Cat')
;

INSERT INTO tablea VALUES
  (0,0,1)
 ,(1,1,1)
 ,(2,1,1)
 ,(3,2,0)
 ,(4,0,0)
 ,(5,2,2)
;

CREATE OR REPLACE FUNCTION oref(text,text) RETURNS TABLE ("id" int,"reffld" text)
LANGUAGE plpgsql AS $$
DECLARE
  from_table text;
  cond_tname text;
  cond_fname text;
  cond_fval  text;
BEGIN
  from_table=' FROM '||quote_ident($1);
  cond_tname=' tbl='||quote_literal($1);
  cond_fname=' fieldname='||quote_literal($2);
  cond_fval=' fieldvalue='||quote_ident($2);
  RETURN QUERY EXECUTE '
    SELECT id'||
    ',(SELECT returnvalue FROM reftbl WHERE '
      ||cond_tname||' AND '||cond_fname||' AND '||cond_fval||' LIMIT 1)'
    ||from_table
  ;
END;
$$;

SELECT * FROM oref('tablea','flda');

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

YF-23 posted:

In PL/SQL I want to set a trigger that, when a table is updated, it updates an entry for the same id on another table. So for instance if I have a table called ThingsPerYear and a table called TotalThings, when I insert an entry in ThingsPerYear, say I give 'A' 5 things in 2013, the trigger updates A's entry in TotalThings to the sum of things A has across ThingsPerYear. I've tried several things but the only thing I got to work updated all entries of TotalThings instead of the specific one I wanted, and stuff I found online have generally not been very helpful for this sort of thing.
Have you read this? What you're talking about should be 'CREATE VIEW TotalThings...', of course, unless you have fallen into one of the specific scenarios where the type of trigger you define is actually needed.
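For the simple running total you describe, a view sketch (column names assumed from your example) is just:
code:
CREATE VIEW TotalThings AS
  SELECT id, SUM(things) AS total_things
    FROM ThingsPerYear
   GROUP BY id;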


PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

Sockser posted:

... var connectionString = new MySqlConnectionStringBuilder

And I'm following everything I've seen online and my server string is "machine1.domain.com, machine2.domain.com, machine3.domain.com."...
Have you tried building a static string by hand to ensure that it isn't an issue with the builder class? The online documentation is pretty bad, so it's hard to determine exactly how that class behaves. It does say separated by commas, but I see examples with spaces so that's probably not it. Can you simultaneously connect to m1 and m2 using two separate connections?
