Обсуждение: SQL:2003 Window Functions for postgresql 8.3?
I know that in pgsql.hackers they are discussing what to market the upcoming 8.2 release as. They mention updatable views, but realistically, PostgreSQL has had them via rules forever. I consider myself a database novice , and even I've created updatable views using rules quite easily. It would be really great if PostgreSQL supported SQL:2003 Window functions. I know that oracle and sql server have them already, so it would make postgres competitive in that area. I know there is a feature freeze for 8.2, is it doable for 8.3?
Karen Hill wrote: > It would be really great if PostgreSQL supported SQL:2003 Window > functions. I know that oracle and sql server have them already, so it > would make postgres competitive in that area. I know there is a > feature freeze for 8.2, is it doable for 8.3? The sooner you start writing a patch, the sooner you will be done ;-) I agree it would be nice to have them, but currently I don't think there's anyone working on'em. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Aug 24, 2006, at 14:11 , Alvaro Herrera wrote: > Karen Hill wrote: > >> It would be really great if PostgreSQL supported SQL:2003 Window >> functions. I know that oracle and sql server have them already, >> so it >> would make postgres competitive in that area. I know there is a >> feature freeze for 8.2, is it doable for 8.3? > > The sooner you start writing a patch, the sooner you will be done ;-) > > I agree it would be nice to have them, but currently I don't think > there's anyone working on'em. Could someone elaborate on the window functions? This page http:// en.wikipedia.org/wiki/SELECT has some examples but they make it seem like the functions are an overly-verbose LIMIT statement. So what's the benefit? -M
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of AgentM > Sent: Thursday, August 24, 2006 11:27 AM > To: PostgreSQL General ML > Subject: Re: [GENERAL] SQL:2003 Window Functions for postgresql 8.3? > > > On Aug 24, 2006, at 14:11 , Alvaro Herrera wrote: > > > Karen Hill wrote: > > > >> It would be really great if PostgreSQL supported SQL:2003 Window > >> functions. I know that oracle and sql server have them already, > >> so it > >> would make postgres competitive in that area. I know there is a > >> feature freeze for 8.2, is it doable for 8.3? > > > > The sooner you start writing a patch, the sooner you will be done ;-) > > > > I agree it would be nice to have them, but currently I don't think > > there's anyone working on'em. > > Could someone elaborate on the window functions? This page http:// > en.wikipedia.org/wiki/SELECT has some examples but they make it seem > like the functions are an overly-verbose LIMIT statement. So what's > the benefit? > > -M Window functions: SQL 2003 defines aggregates computed over a window with ROW_NUMBER function, rank functions (i.e., RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST), and aggregate functions (e.g., inverse distribution, hypothetical set function) From: http://savage.net.au/SQL/sql-2003-2.bnf.html 6.10 <window function> (p193) <window function> ::= <window function type> OVER <window name or specification> <window function type> ::= <rank function type> <left paren> <right paren> | ROW_NUMBER <left paren> <right paren> | <aggregate function> <rank function type> ::= RANK | DENSE_RANK | PERCENT_RANK | CUME_DIST <window name or specification> ::= <window name> | <in-line window specification> <in-line window specification> ::= <window specification> 7.11 <window clause> (p331) Specify one or more window definitions. <window clause> ::= WINDOW <window definition list> <window definition list> ::= <window definition> [ { <comma> <window definition> }... ] <window definition> ::= <new window name> AS <window specification> <new window name> ::= <window name> <window specification> ::= <left paren> <window specification details> <right paren> <window specification details> ::= [ <existing window name> ] [ <window partition clause> ] [ <window order clause> ] [ <window frame clause> ] <existing window name> ::= <window name> <window partition clause> ::= PARTITION BY <window partition column reference list> <window partition column reference list> ::= <window partition column reference> [ { <comma> <window partition column reference> }... ] <window partition column reference> ::= <column reference> [ <collate clause> ] <window order clause> ::= ORDER BY <sort specification list> <window frame clause> ::= <window frame units> <window frame extent> [ <window frame exclusion> ] <window frame units> ::= ROWS | RANGE <window frame extent> ::= <window frame start> | <window frame between> <window frame start> ::= UNBOUNDED PRECEDING | <window frame preceding> | CURRENT ROW <window frame preceding> ::= <unsigned value specification> PRECEDING <window frame between> ::= BETWEEN <window frame bound 1> AND <window frame bound 2> <window frame bound 1> ::= <window frame bound> <window frame bound 2> ::= <window frame bound> <window frame bound> ::= <window frame start> | UNBOUNDED FOLLOWING | <window frame following> <window frame following> ::= <unsigned value specification> FOLLOWING <window frame exclusion> ::= EXCLUDE CURRENT ROW | EXCLUDE GROUP | EXCLUDE TIES | EXCLUDE NO OTHERS > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq
On Thu, Aug 24, 2006 at 02:26:53PM -0400, AgentM wrote: > Could someone elaborate on the window functions? This page http:// > en.wikipedia.org/wiki/SELECT has some examples but they make it seem > like the functions are an overly-verbose LIMIT statement. So what's > the benefit? Look for more sources, but they're kinda cool. The main thing I want to use them for is for cumulative output. Think of a table with data like this: Foo | 3 Bar | 6 Baz | 5 Blah | 6 What you want is an output that goes down the table and gives a cumulative percentage. First row is 3/20, second 9/20, etc... In normal SQL this is painful, with selfjoins and such. With window functions you define for each row a "window" which is from the beginning of the table to that row and then sum the values, for each row. Then you just divide by the total, nice. A "window" can be specified in a number of ways, such as "two rows back to two rows ahead" or from the beginning or end of output, so you can easily do averages covering the surrounding week (if you had daily data). A window is an ordered set, rather than the usual unordered sets SQL usually has. The standard has much more detail, but this is just a taste. Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
Martijn van Oosterhout <kleptog@svana.org> writes: > The main thing I want to use them for is for cumulative output. > ... > With window functions you define for each row a "window" which is from > the beginning of the table to that row and then sum the values, for > each row. Then you just divide by the total, nice. Egad. Wouldn't that involve O(N) memory and O(N^2) operations? Perhaps an extremely smart optimizer could improve this using knowledge of the specific aggregates' behaviors, but for "black box" aggregates it sounds pretty unworkable. regards, tom lane
Postgres' DISTINCT ON clause is an example of a window function, though as it stands today it seems to be a special-case hack, instead of an example of a more generalized feature. On Thu, 24 Aug 2006, AgentM wrote: > > On Aug 24, 2006, at 14:11 , Alvaro Herrera wrote: > >> Karen Hill wrote: >> >>> It would be really great if PostgreSQL supported SQL:2003 Window >>> functions. I know that oracle and sql server have them already, so it >>> would make postgres competitive in that area. I know there is a >>> feature freeze for 8.2, is it doable for 8.3? >> >> The sooner you start writing a patch, the sooner you will be done ;-) >> >> I agree it would be nice to have them, but currently I don't think >> there's anyone working on'em. > > Could someone elaborate on the window functions? This page > http://en.wikipedia.org/wiki/SELECT has some examples but they make it seem > like the functions are an overly-verbose LIMIT statement. So what's the > benefit? > > -M > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq
On Thu, Aug 24, 2006 at 02:47:20PM -0400, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > The main thing I want to use them for is for cumulative output. > > ... > > With window functions you define for each row a "window" which is from > > the beginning of the table to that row and then sum the values, for > > each row. Then you just divide by the total, nice. > > Egad. Wouldn't that involve O(N) memory and O(N^2) operations? > Perhaps an extremely smart optimizer could improve this using knowledge > of the specific aggregates' behaviors, but for "black box" aggregates > it sounds pretty unworkable. Yeah well, what's more important: speed or the fact you can write it at all. Currently you'd do it with a self join, which is at least as expensive. For windows running from the beginning, it's just a matter of outputting at each step of the aggregate, that's not hard. I beleive the window definitions are clear enough that you can place an upper bound on the number of rows you have to remember. An important point is, once a tuple has left the "window" it never comes back. Thus a tuple is in the "window" for a specific range of source tuples. Tuples leave the window in the same order they entered. The conditions of a range are basically one of: - fixed number of rows from beginning of table So RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is the whole frame. You can use CURRENT ROW - fixed number of rows relative to current row Like 10 PRECEDING includes the previous ten rows. - all rows within a certain "range" relative to current sort key. Like a fix number of days ahead or behind a date type. Ofcourse, if user say they want the last 7 days and you come toa seciton of the table where this happens to match a lot of rows, well, tough. But it's not necessarily a huge amount of data, though you're going to run any aggregate *lots* times... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
Tom Lane wrote on 24.08.2006 20:47: > Perhaps an extremely smart optimizer could improve this using knowledge > of the specific aggregates' behaviors, but for "black box" aggregates > it sounds pretty unworkable. I don't know how they do it, but those functions in Oracle are pretty fast. Usually ways faster than some equivalent combinations of self joins, outer joins and such. Thomas
Clinging to sanity, tgl@sss.pgh.pa.us (Tom Lane) mumbled into her beard: > Martijn van Oosterhout <kleptog@svana.org> writes: >> The main thing I want to use them for is for cumulative output. >> ... >> With window functions you define for each row a "window" which is from >> the beginning of the table to that row and then sum the values, for >> each row. Then you just divide by the total, nice. > > Egad. Wouldn't that involve O(N) memory and O(N^2) operations? > Perhaps an extremely smart optimizer could improve this using knowledge > of the specific aggregates' behaviors, but for "black box" aggregates > it sounds pretty unworkable. Doing this *efficiently* presumably isn't in the cards for 8.2 :-). The way that I'd do this sort of thing right now would be by writing a set-returning stored proc that walks through tuples in some order. Returning, let's say, the sum up to the current row shouldn't require special amounts of memory. sum := 0; select * into rec from my_table order by trans_on loop sum += rec.amount; ret.sum := sum; -- set ret.* to rec.* return next ret; end loop; At worst, that should cost O(N) memory; no need to cost O(N^2) operations... -- let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;; http://linuxdatabases.info/info/emacs.html "I've seen a look in dogs' eyes, a quickly vanishing look of amazed contempt, and I am convinced that basically dogs think humans are nuts." -- John Steinbeck
Tom Lane <tgl@sss.pgh.pa.us> writes: > Martijn van Oosterhout <kleptog@svana.org> writes: > > The main thing I want to use them for is for cumulative output. > > ... > > With window functions you define for each row a "window" which is from > > the beginning of the table to that row and then sum the values, for > > each row. Then you just divide by the total, nice. > > Egad. Wouldn't that involve O(N) memory and O(N^2) operations? > Perhaps an extremely smart optimizer could improve this using knowledge > of the specific aggregates' behaviors, but for "black box" aggregates > it sounds pretty unworkable. Yeah when I looked at it it seemed like it would in general require O(n) or O(n^2) in either time or space. In particular you can have the windows be ordered and ordered in a different order for each window function. So for example you could generate the dense_rank for a list of people according to various metrics both within their group and overall in a single query. I couldn't see how the database could do that other than storing up the whole group and sorting it n different ways and then somehow doing some kind of join before proceeding to the next group. I'm not sure if the spec is designed around the assumption that programmers would be clever about writing things that the database could optimize or if it was designed around the idea that programmers wouldn't care about O(n^2) performance because they would just spend $^2 on hardware. -- greg
Alvaro Herrera wrote: > Karen Hill wrote: > > > It would be really great if PostgreSQL supported SQL:2003 Window > > functions. I know that oracle and sql server have them already, so it > > would make postgres competitive in that area. I know there is a > > feature freeze for 8.2, is it doable for 8.3? > > The sooner you start writing a patch, the sooner you will be done ;-) I looked at the TODO list at http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 Window Functions listed. Is it because they are not desired, or is it because there are more pressing things to accomplish? I noticed that Tom has mentioned that it appears unworkable in this thread.
Karen Hill wrote: > > Alvaro Herrera wrote: > > Karen Hill wrote: > > > > > It would be really great if PostgreSQL supported SQL:2003 Window > > > functions. I know that oracle and sql server have them already, so it > > > would make postgres competitive in that area. I know there is a > > > feature freeze for 8.2, is it doable for 8.3? > > > > The sooner you start writing a patch, the sooner you will be done ;-) > > I looked at the TODO list at > http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 > Window Functions listed. Is it because they are not desired, or is it > because there are more pressing things to accomplish? I noticed that > Tom has mentioned that it appears unworkable in this thread. This is the first time I have seen them mentioned. Do we want them? -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Karen Hill wrote: > > > > Alvaro Herrera wrote: > > > Karen Hill wrote: > > > > > > > It would be really great if PostgreSQL supported SQL:2003 Window > > > > functions. I know that oracle and sql server have them already, so it > > > > would make postgres competitive in that area. I know there is a > > > > feature freeze for 8.2, is it doable for 8.3? > > > > > > The sooner you start writing a patch, the sooner you will be done ;-) > > > > I looked at the TODO list at > > http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 > > Window Functions listed. Is it because they are not desired, or is it > > because there are more pressing things to accomplish? I noticed that > > Tom has mentioned that it appears unworkable in this thread. > > This is the first time I have seen them mentioned. Do we want them? They are in the standard and have been mentioned many times. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Bruce Momjian wrote: > > Karen Hill wrote: > > > > > > Alvaro Herrera wrote: > > > > Karen Hill wrote: > > > > > > > > > It would be really great if PostgreSQL supported SQL:2003 Window > > > > > functions. I know that oracle and sql server have them already, so it > > > > > would make postgres competitive in that area. I know there is a > > > > > feature freeze for 8.2, is it doable for 8.3? > > > > > > > > The sooner you start writing a patch, the sooner you will be done ;-) > > > > > > I looked at the TODO list at > > > http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 > > > Window Functions listed. Is it because they are not desired, or is it > > > because there are more pressing things to accomplish? I noticed that > > > Tom has mentioned that it appears unworkable in this thread. > > > > This is the first time I have seen them mentioned. Do we want them? > > They are in the standard and have been mentioned many times. Mentioned how? Window functions? I have seem people ask for them in the past week, but never before that. What should the TODO be? -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Alvaro Herrera wrote: > > Bruce Momjian wrote: > > > Karen Hill wrote: > > > > > > > > Alvaro Herrera wrote: > > > > > Karen Hill wrote: > > > > > > > > > > > It would be really great if PostgreSQL supported SQL:2003 Window > > > > > > functions. I know that oracle and sql server have them already, so it > > > > > > would make postgres competitive in that area. I know there is a > > > > > > feature freeze for 8.2, is it doable for 8.3? > > > > > > > > > > The sooner you start writing a patch, the sooner you will be done ;-) > > > > > > > > I looked at the TODO list at > > > > http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 > > > > Window Functions listed. Is it because they are not desired, or is it > > > > because there are more pressing things to accomplish? I noticed that > > > > Tom has mentioned that it appears unworkable in this thread. > > > > > > This is the first time I have seen them mentioned. Do we want them? > > > > They are in the standard and have been mentioned many times. > > Mentioned how? Window functions? I have seem people ask for them in > the past week, but never before that. Yeah, window functions. I remember Chris Kings-Lynne mentioning them since at least a year ago ... Maybe the addition to the TODO was stopped by the fact that anything specified by the SQL standard is already a TODO, thus putting each item on the TODO is just redundant. > What should the TODO be? I guess "Implement SQL:2003 window functions" should be enough :-) Not sure if you should append the [Karen Hill] bit to it though ;-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
>>> I looked at the TODO list at >>> http://www.postgresql.org/docs/faqs.TODO.html, and I don't see SQL:2003 >>> Window Functions listed. Is it because they are not desired, or is it >>> because there are more pressing things to accomplish? I noticed that >>> Tom has mentioned that it appears unworkable in this thread. >> This is the first time I have seen them mentioned. Do we want them? > > They are in the standard and have been mentioned many times. They are one of the items holding us back from taking Oracle off its perch. Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Alvaro Herrera wrote: > > > They are in the standard and have been mentioned many times. > > > > Mentioned how? Window functions? I have seem people ask for them in > > the past week, but never before that. > > Yeah, window functions. I remember Chris Kings-Lynne mentioning them > since at least a year ago ... Maybe the addition to the TODO was > stopped by the fact that anything specified by the SQL standard is > already a TODO, thus putting each item on the TODO is just redundant. > > > What should the TODO be? > > I guess "Implement SQL:2003 window functions" should be enough :-) Not > sure if you should append the [Karen Hill] bit to it though ;-) Added to TODO. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +