Обсуждение: [patch] A \pivot command for psql

Поиск
Список
Период
Сортировка

[patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
  Hi,

I want to suggest a client-side \pivot command in psql, implemented
in the attached patch.

\pivot takes the current query in the buffer, execute it and
display it pivoted by interpreting the result as:

column1 => row in pivoted output
column2 => column in pivoted output
column3 => value at (row,column) in pivoted ouput

The advantage over the server-side crosstab() from contrib is in
ease of use, mostly because \pivot doesn't need in advance the
list of columns that result from pivoting.

Typical use cases are queries that produce values
along two axes:
 SELECT a,b,aggr(something) FROM tbl GROUP a,b;
\pivot displays immediately the matrix-like representation

Or displaying "pairing" between columns, as in
 SELECT a,b,'X' FROM tblA [LEFT|RIGHT|FULL] JOIN tblB...
which once pivoted shows in an easily readable way
what "a" is/isn't in relation with any "b".

Columns are sorted with strcmp(). I think a more adequate sort could
be obtained through a separate query with ORDER BY just for these
values (casted to their original type), but the patch doesn't do that
yet.

Also, \pivot could take optionally the query as an argument instead
of getting it only from the query buffer.

Anyway, does it look useful enough to be part of psql?
I guess I should push this to commitfest if that's the case.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Вложения

Re: [patch] A \pivot command for psql

От
Tom Lane
Дата:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> I want to suggest a client-side \pivot command in psql, implemented
> in the attached patch.

> \pivot takes the current query in the buffer, execute it and
> display it pivoted by interpreting the result as:

> column1 => row in pivoted output
> column2 => column in pivoted output
> column3 => value at (row,column) in pivoted ouput

I can see the value of a feature like this, but doing it in psql sure
seems like the wrong place.  It would be unavailable to anything
except interactive use.

Is there a way to implement pivoting as a set-returning function?
        regards, tom lane



Re: [patch] A \pivot command for psql

От
Joe Conway
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/09/2015 10:37 AM, Tom Lane wrote:
> I can see the value of a feature like this, but doing it in psql
> sure seems like the wrong place.  It would be unavailable to
> anything except interactive use.
> 
> Is there a way to implement pivoting as a set-returning function?

That's exactly what crosstab() in the tablefunc contrib module is.
I've heard rumblings of someone looking to build crosstab/pivot into
the core grammar, but have not yet seen any patches...

Joe

- -- 
Joe Conway
Crunchy Data
Enterprise PostgreSQL
http://crunchydata.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJVx6b7AAoJEDfy90M199hlOJYP/3WJTFrB22SlC7pxTFl/L8La
9F0DbNlCyyK/oUkYH+dy5MXBnwHTFAwj2sEik7xNhax7aPIMnXh095f0AaMjouVU
S0J0dGb5quYYK+TUWBBL745nRIl786H2/XbZbP+L7pz2W72ITtTbKqMHpXVyzMiF
DfEN9xnCd05MF0WiaAXtwtz8HXQkFJxD/r5kgmHvwMIBPvvzrAg6CwHh/8kQMlY2
1kvkYnKvFSNp+PrZ0CvANs6ZyqaDJN1w7nsXul7HAWu+KhBkHxAilnBkCjjOT5JD
beF8E1KNo60k+3zjphEN1yKBl5tT7r9mYLhuOLuI0UdpNpdd8u+rSRTSFaIRMGQ7
UDQIXOtVHSlKSSOpXHH6FYvksUQMDVvPeSwKoI2imwAjSStkInHJMj6cs2uKxsZi
5P0sRnulx7Ur4M1mCq7t3+IKiaZ2KzO0WZ5ewJ1mMt5hIqD7QADjvDoPn0qozV5T
lX3pY4PUaL+N8XwlgBb69vGgUeG3N4nEvDSiUyLbFC3au/osczRGloYBq8AgDNuX
Dta9AOkzbNAA82ODUzFoFts8/1Ydu8vhwnpQiNcl772Hf0fpNvdFGtclc4K+2ToX
4k2WmezT5y8d6IdPPDqM92wbWXAOBIy1I+kaYnbnQpxn2QYzkKrPbJ2unTTKTdUa
5/uFA0fg37acwloBMux/
=5lhb
-----END PGP SIGNATURE-----



Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
Tom Lane wrote:

> Is there a way to implement pivoting as a set-returning function?

Not with the same ease of use. We have crosstab functions
in contrib/tablefunc already, but the killer problem with PIVOT
is that truly dynamic columns are never reachable directly.

If we could do this:
 SELECT * FROM crosstab('select a,b,c FROM tbl');

and the result came back pivoted, with a column for each distinct value
of b, there will be no point in a client-side pivot. But postgres (or
I suppose any SQL interpreter) won't execute this, for not knowing
beforehand what structure "*" is going to have.

So what is currently required from the user, with dynamic columns,
is more like:

1st pass: identify the columnsSELECT DISTINCT a FROM tbl;

2nd pass: inject the columns, in a second embedded query
and in a record definition, with careful quoting:
 select * from crosstab(   'SELECT a,b,c FROM tbl ORDER BY 1',    ' VALUES (col1),(col2),(col3)...' -- or 'select
distinct...'again ) AS ct(b type, "col1" type, "col2" type, "col3" type) 


Compared to this, \pivot limited to the psql  interpreter
is a no-brainer, we could just write instead:

=> select a,b,c FROM tbl;
=> \pivot

This simplicity is the whole point. It's the result of doing
the operation client-side, where the record structure can be
pivoted without the target structure being formally declared.

Some engines have a built-in PIVOT syntax (Oracle, SQL server).
I have looked only at their documentation.
Their pivot queries look nicer and are possibly more efficient than
with SET functions, but AFAIK one still needs to programmatically
inject the list of column values into them, when that list
is not static.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [patch] A \pivot command for psql

От
Tom Lane
Дата:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> Tom Lane wrote:
>> Is there a way to implement pivoting as a set-returning function?

> Not with the same ease of use. We have crosstab functions
> in contrib/tablefunc already, but the killer problem with PIVOT
> is that truly dynamic columns are never reachable directly.

I'm not sure how pushing it out to psql makes that better.  There is
no way to do further processing on something that psql has printed,
so you've punted on solving that issue just as much if not more.

I agree that the crosstab solution leaves a lot to be desired
as far as ease of use goes, but I don't want to define fixing its
ease-of-use problem as being "it's easy for manual use in psql".
psql is a minority API, you know.

Besides which, psql is not all that great for looking at very wide tables,
so I'm not sure that this would be very useful for dynamic column sets
anyway.
        regards, tom lane



Re: [patch] A \pivot command for psql

От
David Fetter
Дата:
On Sun, Aug 09, 2015 at 07:29:40PM +0200, Daniel Verite wrote:
>   Hi,
> 
> I want to suggest a client-side \pivot command in psql, implemented
> in the attached patch.
> 
> \pivot takes the current query in the buffer, execute it and
> display it pivoted by interpreting the result as:
> 
> column1 => row in pivoted output
> column2 => column in pivoted output
> column3 => value at (row,column) in pivoted ouput

This is really neat work, and thanks for the patch.

This issue in this one as in the crosstab extension is that it's
available only if you are using some particular piece of extra
software, in this case the psql client.

I'm working up a proposal to add (UN)PIVOT support to the back-end.
Would you like to join in on that?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [patch] A \pivot command for psql

От
Pavel Stehule
Дата:


2015-08-10 6:04 GMT+02:00 David Fetter <david@fetter.org>:
On Sun, Aug 09, 2015 at 07:29:40PM +0200, Daniel Verite wrote:
>   Hi,
>
> I want to suggest a client-side \pivot command in psql, implemented
> in the attached patch.
>
> \pivot takes the current query in the buffer, execute it and
> display it pivoted by interpreting the result as:
>
> column1 => row in pivoted output
> column2 => column in pivoted output
> column3 => value at (row,column) in pivoted ouput

This is really neat work, and thanks for the patch.

This issue in this one as in the crosstab extension is that it's
available only if you are using some particular piece of extra
software, in this case the psql client.

I'm working up a proposal to add (UN)PIVOT support to the back-end.
Would you like to join in on that?

PIVOT, UNPIVOT should be preferred solution

Pavel

I can look on implementations in other db
 

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
    David Fetter wrote:

> I'm working up a proposal to add (UN)PIVOT support to the back-end.

I was under the impression that a server-side PIVOT *with dynamic
columns* was just unworkable as an SQL query, because it couldn't
be prepared if it existed.

I am wrong on that? I feel like you guys are all telling me that \pivot
should happen on the server, but the point that it would not be
realistic to begin with is not considered.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
    Tom Lane wrote:

> I'm not sure how pushing it out to psql makes that better.  There is
> no way to do further processing on something that psql has printed,
> so you've punted on solving that issue just as much if not more.

It's the same spirit as \x : the only thing it achieves is better
readability in certain cases, I don't expect more from it.
But I admit that \pivot would be much more of a "niche use case"
than \x

> Besides which, psql is not all that great for looking at very wide tables,
> so I'm not sure that this would be very useful for dynamic column sets
> anyway.

I use \x all the time with wide tables and \x and \pivot happen to play
out well together (In fact I was pleasantly surprised by this)

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [patch] A \pivot command for psql

От
David Fetter
Дата:
On Mon, Aug 10, 2015 at 07:10:41PM +0200, Daniel Verite wrote:
>     David Fetter wrote:
> 
> > I'm working up a proposal to add (UN)PIVOT support to the back-end.
> 
> I was under the impression that a server-side PIVOT *with dynamic
> columns* was just unworkable as an SQL query, because it couldn't be
> prepared if it existed.

That depends on what you mean by "dynamic columns."  The approach
taken in the tablefunc extension is to use functions which return
SETOF RECORD, which in turn need to be cast at runtime.

At least one other implementation takes two separate approaches, both
interesting at least from my point of view.

The first is to spell out all the columns in the query, which is not
"dynamic columns" in any reasonable sense, as "hand-craft one piece of
SQL whose purpose is to generate the actual pivot SQL" is not
a reasonable level of dynamic.

The second, more on point, is to specify a serialization for the rows
in the "dynamic columns" case.  Their syntax is "PIVOT XML", but I
would rather do something more like "PIVOT (SERIALIZATION XML)".

A third idea I have only roughed out feels a bit like cheating:
creating a function which returns two distinct rowtypes via REFCURSORs
or similar.  The first result, used to create a CAST, spells out the
row type of the second.

> I am wrong on that? I feel like you guys are all telling me that
> \pivot should happen on the server, but the point that it would not
> be realistic to begin with is not considered.

I think that starting the conversation again, especially at this stage
of the 9.6 cycle, is a very good thing, whatever its outcome.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [patch] A \pivot command for psql

От
Joe Conway
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/10/2015 04:03 PM, David Fetter wrote:
>> I am wrong on that? I feel like you guys are all telling me that 
>> \pivot should happen on the server, but the point that it would
>> not be realistic to begin with is not considered.
> 
> I think that starting the conversation again, especially at this
> stage of the 9.6 cycle, is a very good thing, whatever its
> outcome.

Talk to Atri Sharma -- he recently indicated to me that he might be
working on built-in pivot in the near term future (if not already
started).

- -- 
Joe Conway
Crunchy Data
Enterprise PostgreSQL
http://crunchydata.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJVyUVSAAoJEDfy90M199hlhEgP/1mQpZ0JOR2kbaP5Iw6FY4q+
P1SLWbCWMhVDp97OAjbP0T34YmG5Rw0fDh4WLPbNBp6xyPTQQgIVFvQAu6kDSngk
X9kbQHKGE+pNqc9hMc7CyuSse8Pw8VeQGRDU5a+8E3fPdi9rbB2YTDt7SJIXlavc
zJ8kZlUj59Xw9Kdkpon8Jb/Nxn3JV4GWLTe4+nxRZoH/9POjslyM+rVGtrMlHA59
0NWWPnTsLfU1HNk+olf72ihZpmQM4KPog8PdWo0GyFqJhMwmYtE/WTJh4jNDygBe
NXTQE7/I5OA6fYQniq+7Wsyhc5raOH16lVSSo0QquuTtGG2mrYsvd82Zx7J0SDQp
NfVk4qgjJYMNBN9/hvPXZZ2+LReYqliloR2PqLqxgDn3DyGgSpSUs1JyDZRtoJEj
P4jEVGVWnUbjKuNldRJZi1DmMTKVSS2RSnoC0RJ7DzAfUyQ4oH4FmX8jX5rZpoXN
nJLtVPhIRpp0A2Lq849hXB3/LuNzjYmZ3VvGNUwffTD7d3XxQ/Zeb9ZtWZszfJUo
zjMAh0wFwDtdVBYdFStzlByGTSEIqeqdGCDSHhiZlNw/E0xV2YjNzdUP9N34aWml
i+KUgPMxTJ/GAineSXpjt+I6Y+cHA10JpQLOf6fVdrOp/R1Z2EqFfZl2GzbyXJ5n
lz4QjhOfx6YG1Tdi5qYW
=nyuc
-----END PGP SIGNATURE-----



Re: [patch] A \pivot command for psql

От
Robert Haas
Дата:
On Sun, Aug 9, 2015 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> psql is a minority API, you know.

Not for me.  psql has already got a bunch of bells and whistles to
format things in particular ways that people have wanted, and I'm not
really sure why the bar for this proposal should be any higher.  Is
this less useful than \pset linestype unicode, complete with trying to
auto-detect whether to enable the feature?  Than automatically
switching between expanded mode and regular mode based on the width of
the TTY?  I considered those features rather frivolous, but it seemed
to me that their inclusion was widely supported and that a number of
people were really quite happy about them.

I guess you could argue that those are inherently client-side features
and this is not, and I'll grant that point.  But a client-side feature
in 9.5 beats a server-side feature in 10.6 every day of the week.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [patch] A \pivot command for psql

От
David Fetter
Дата:
On Tue, Aug 11, 2015 at 10:13:48AM -0400, Robert Haas wrote:
> On Sun, Aug 9, 2015 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > psql is a minority API, you know.
> 
> Not for me.  psql has already got a bunch of bells and whistles to
> format things in particular ways that people have wanted, and I'm not
> really sure why the bar for this proposal should be any higher.  Is
> this less useful than \pset linestype unicode, complete with trying to
> auto-detect whether to enable the feature?  Than automatically
> switching between expanded mode and regular mode based on the width of
> the TTY?  I considered those features rather frivolous, but it seemed
> to me that their inclusion was widely supported and that a number of
> people were really quite happy about them.
> 
> I guess you could argue that those are inherently client-side features
> and this is not, and I'll grant that point.  But a client-side feature
> in 9.5

9.5?!?  I hope you meant 9.6 because if 9.5 is still open for new
features, we're going to have to explain to all the people who thought
they missed the cut-off why this one is jumping the line in front of
them.

> beats a server-side feature in 10.6 every day of the week.

Having done some of the preliminary research into a server-side
implementation, I see it a different way.

First, this is more of a 9.6 feature than a 10.6, at least in terms of
the first cut functionality, and I don't see good reasons why the
performance would need to be terrible in the first cut.  In
particular, the PIVOT case looks like it could pretty easily use the
FILTER machinery directly, while the UNPIVOT case could use what
LATERAL does.  Doubtless, further optimizations are possible for each,
and for interesting sub-cases, but that's a project for later, as
performance optimizations should be.

Second, if we put this feature as-is in psql, we're stuck supporting
it in psql until the end of time, even if (when, I believe) we have a
fuller and likely not perfectly compatible feature on the back-end.

That said, a thing in psql that could slice serialized output into
columns would be handy as a broad, general part of reporting in psql,
and would mesh quite nicely with a back-end PIVOT (SERIALIZATION FOO)
or whatever syntax we land on.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
    David Fetter wrote:

> That depends on what you mean by "dynamic columns."  The approach
> taken in the tablefunc extension is to use functions which return
> SETOF RECORD, which in turn need to be cast at runtime.

For me, "PIVOT with dynamic columns" would be a pivot query
whose output columns are not enumerated as input in the
SQL query itself, in any form.

> The second, more on point, is to specify a serialization for the rows
> in the "dynamic columns" case.  Their syntax is "PIVOT XML", but I
> would rather do something more like "PIVOT (SERIALIZATION XML)".

The SERIALIZATION  looks interesting, but I believe these days JSON
would make more sense than XML, both as easier for the client-side and
because of all the json_* functions we now have to mix json with
relational structures.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
    David Fetter wrote:

> Second, if we put this feature as-is in psql, we're stuck supporting
> it in psql until the end of time, even if (when, I believe) we have a
> fuller and likely not perfectly compatible feature on the back-end.

To me, doing \pivot in psql vs PIVOT in the backend is a false
dichotomy, both would be desirable to have in any order.

Having PIVOT in SQL is more important because it will help in a
broader range of cases.

But, even if I fail to be convincing on this, it will still be
much easier in a psql context to just type \pivot than turning a
non-pivot query into a pivoted equivalent, even if we achieve the
best possible (in ease of use) server-side SQL implementation.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [patch] A \pivot command for psql

От
David Fetter
Дата:
On Tue, Aug 11, 2015 at 05:13:03PM +0200, Daniel Verite wrote:
>     David Fetter wrote:
> 
> > That depends on what you mean by "dynamic columns."  The approach
> > taken in the tablefunc extension is to use functions which return
> > SETOF RECORD, which in turn need to be cast at runtime.
> 
> For me, "PIVOT with dynamic columns" would be a pivot query
> whose output columns are not enumerated as input in the
> SQL query itself, in any form.

I'm pretty sure people will want to be able to specify them in some
form.  On one implementation, it looks like:
   select * from (      select times_purchased as "Purchase Frequency", state_code      from customers t   )   pivot
xml  (       count(state_code)       for state_code in (select state_code from preferred_states)   )   order by 1
 

Another basically punts by making you responsible for generating the
SQL dynamically, a move I regard as a horrible UX failure.

> > The second, more on point, is to specify a serialization for the rows
> > in the "dynamic columns" case.  Their syntax is "PIVOT XML", but I
> > would rather do something more like "PIVOT (SERIALIZATION XML)".
> 
> The SERIALIZATION  looks interesting, but I believe these days JSON
> would make more sense than XML, both as easier for the client-side and
> because of all the json_* functions we now have to mix json with
> relational structures.

I proposed SERIALIZATION as a parameter precisely so we could use
different ones for different cases.  JSON is certainly popular this
year, as XML was in prior years.  I may be wrong, but I'm certain that
there will be new ones, even popular ones, that haven't yet been
invented.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [patch] A \pivot command for psql

От
"Daniel Verite"
Дата:
David Fetter wrote:

> That said, a thing in psql that could slice serialized output into
> columns would be handy as a broad, general part of reporting in
> psql

To avoid any confusion with server-side PIVOT, I suggest that the
currently proposed command in psql should have a different name
than \pivot.

The general idea is indeed pivoting, but what it precisely does
is project a resultset from a 3-column query onto cells in a
2D grid-like arrangement.

It differs significantly from what existing PIVOT SQL commands
do, looking closely at them it appears in particular that:

- they don't care about a leftmost column to hold "row titles", whereas it's essential to the psql feature.

- as mentioned upthread, the need to enumerate in advance the output columns is a crucial point with the SQL pivot,
whereaspsql doesn't care, as it just displays a resultset that is already obtained. 

- they operate with an aggregate function at their heart, whereas it's also irrelevant to the psql display feature
whetherthere's an aggregate  in the query. 

For a different name, I've thought of these alternatives that belong
to client-side vocabulary:

\rotate
\sheetview
\gridview
\grid3view (to insist that it works on 3 columns).
\matrix
\matview
\matrixview
\crosstab (at the risk of confusion with the contrib feature)

Opinions? I'd go for \rotate personally.

Also I'll try to demonstrate use case with concrete examples.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite