Re: ANY_VALUE aggregate
От | Vik Fearing |
---|---|
Тема | Re: ANY_VALUE aggregate |
Дата | |
Msg-id | e25c63af-bea3-87ab-f93c-f28187dde318@postgresfriends.org обсуждение исходный текст |
Ответ на | Re: ANY_VALUE aggregate ("David G. Johnston" <david.g.johnston@gmail.com>) |
Список | pgsql-hackers |
On 12/8/22 06:48, David G. Johnston wrote: > On Wed, Dec 7, 2022 at 10:00 PM Vik Fearing <vik@postgresfriends.org> wrote: > >> On 12/7/22 04:22, David G. Johnston wrote: >>> On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing <vik@postgresfriends.org> >> wrote: >>> >>>> On 12/6/22 05:57, David G. Johnston wrote: >>>>> On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <vik@postgresfriends.org> >>>> wrote: >>>>> >>>>>> I can imagine an optimization that would remove an ORDER BY clause >>>>>> because it isn't needed for any other aggregate. >>>>> >>>>> >>>>> I'm referring to the query: >>>>> >>>>> select any_value(v order by v) from (values (2),(1),(3)) as vals (v); >>>>> // produces 1, per the documented implementation-defined behavior. >>>> >>>> Implementation-dependent. It is NOT implementation-defined, per spec. >>> >>> I really don't care all that much about the spec here given that ORDER BY >>> in an aggregate call is non-spec. >> >> >> Well, this is demonstrably wrong. >> >> <array aggregate function> ::= >> ARRAY_AGG <left paren> >> <value expression> >> [ ORDER BY <sort specification list> ] >> <right paren> >> > > Demoable only by you and a few others... The standard is publicly available. It is strange that we, being so open, hold ourselves to such a closed standard; but that is what we do. > We should update our documentation - the source of SQL Standard knowledge > for mere mortals. > > https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES > > "Note: The ability to specify both DISTINCT and ORDER BY in an aggregate > function is a PostgreSQL extension." > > Apparently only DISTINCT remains as our extension. Using DISTINCT in an aggregate is also standard. What that note is saying is that the standard does not allow *both* to be used at the same time. The standard defines these things for specific aggregates whereas we are much more generic about it and therefore have to deal with the combinations. I have submitted a doc patch to clarify that. >>> You are de-facto creating a first_value aggregate (which is by definition >>> non-standard) whether you like it or not. >> >> >> I am de jure creating an any_value aggregate (which is by definition >> standard) whether you like it or not. >> > > Yes, both statements seem true. At least until we decide to start ignoring > a user's explicit order by clause. I ran some tests and including an ORDER BY in an aggregate that doesn't care (like COUNT) is devastating for performance. I will be proposing a solution to that soon and I invite you to participate in that conversation when I do. -- Vik Fearing
В списке pgsql-hackers по дате отправления: