Обсуждение: [BUG] pg_stat_statements and extended query protocol

Поиск
Список
Период
Сортировка

[BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:

Doing some work with extended query protocol, I encountered the same
issue that was discussed in [1]. It appears when a client is using
extended query protocol and sends an Execute message to a portal with
max_rows, and a portal is executed multiple times,
pg_stat_statements does not correctly track rows and calls.

Using the attached jdbc script, TEST.java, which can reproduce the issue
with setFetchSize of 100 with autocommit mode set to OFF. We can
see that although pg_class has 414 rows, the total call and
rows returned is 14. the first 4 * 100 fetches did not get accounted for,.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
1 | 14 | select * from pg_class
(1 row)

The execution work flow goes something like this:
ExecutorStart
ExecutorRun – which will be called multiple times to fetch from the
                           portal until the caller Closes the portal or the portal
                           runs out of rows.
ExecutorFinish
ExecutorEnd – portal is closed & pg_stat_statements stores the final rows processed

Where this breaks for pg_stat_statements is during ExecutorRun,
es_processed is reset to 0 every iteration. So by the time the portal
is closed, es_processed will only show the total from the last execute
message.

This appears to be only an issue for portals fetched
through extended query protocol and not explicit cursors
that go through simple query protocol (i.e. FETCH <cursor>)

I attached a JDBC script to repro the issue.

One potential fix I see is to introduce 2 new counters in the
ExecutionState which will track the total rows processed
and the number of calls. These counters can then be used
by pg_stat_statements. Attached is an experimental patch
which shows the correct number of rows and number of
calls.

postgres=# select calls, rows, query from pg_stat_statements
postgres-# where queryid = '-1905758228217333571';
calls | rows | query
---------------------------------
5 | 414 | select * from pg_class
(1 row)

[1] https://www.postgresql.org/message-id/flat/c90890e7-9c89-c34f-d3c5-d5c763a34bd8%40dunslane.net

Thanks


Sami Imseih
Amazon Web Services (AWS)

 

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Wed, Jan 25, 2023 at 11:22:04PM +0000, Imseih (AWS), Sami wrote:
> Doing some work with extended query protocol, I encountered the same
> issue that was discussed in [1]. It appears when a client is using
> extended query protocol and sends an Execute message to a portal with
> max_rows, and a portal is executed multiple times,
> pg_stat_statements does not correctly track rows and calls.

Well, it is one of these areas where it seems to me we have never been
able to put a definition on what should be the correct behavior when
it comes to pg_stat_statements.  Could it be possible to add some
regression tests using the recently-added \bind command and see how
this affects things?  I would suggest splitting these into their own
SQL file, following an effort I have been doing recently for the
regression tests of pg_stat_statements.  It would be good to know the
effects of this change for pg_stat_statements.track = (top|all), as
well.

@@ -657,7 +657,9 @@ typedef struct EState

        List       *es_tupleTable;      /* List of TupleTableSlots */

-       uint64          es_processed;   /* # of tuples processed */
+       uint64          es_processed;   /* # of tuples processed at the top level only */
+       uint64          es_calls;       /* # of calls */
+       uint64          es_total_processed; /* total # of tuples processed */

So the root of the logic is here.  Anything that makes the executor
structures larger freaks me out, FWIW, and that's quite an addition.
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> Well, it is one of these areas where it seems to me we have never been
> able to put a definition on what should be the correct behavior when
> it comes to pg_stat_statements. 

What needs to be defined here is how pgss should account for # of rows
processed when A) a select goes through extended query (EP) protocol, and 
B) it requires multiple executes to complete a portal.

The patch being suggested will treat every 'E' ( execute message ) to the same
portal as a new call ( pgss will increment the calls ) and the number of rows
processed will be accumulated for every 'E' message.

Currently, only the rows fetched in the last 'E' call to the portal is tracked by 
pgss. This is incorrect.

> Could it be possible to add some
> regression tests using the recently-added \bind command and see how
> this affects things? 

\bind alone will not be enough as we also need a way to fetch from
a portal in batches. The code that needs to be exercised
as part of the test is exec_execute_message with max_rows != 0.

\bind will call exec_execute_message with max_rows = 0 to fetch
all the rows.

> I would suggest splitting these into their own
> SQL file, following an effort I have been doing recently for the
> regression tests of pg_stat_statements. It would be good to know the
> effects of this change for pg_stat_statements.track = (top|all), as
> well.

Yes, I agree that proper test coverage is needed here. Will think
about how to accomplish this.

> - uint64 es_processed; /* # of tuples processed */
> + uint64 es_processed; /* # of tuples processed at the top level only */
> + uint64 es_calls; /* # of calls */
> + uint64 es_total_processed; /* total # of tuples processed */


> So the root of the logic is here. Anything that makes the executor
> structures larger freaks me out, FWIW, and that's quite an addition.

I am not sure how to get around the changes to Estate and fixing this
Issue. 

We could potentially only need the es_total_processed field and
continue to track calls in pgss. 

es_total_processed in EState however is still needed.

Regards,

--

Sami Imseih
Amazon Web Services (AWS)




Re: [BUG] pg_stat_statements and extended query protocol

От
David Zhang
Дата:
> Yes, I agree that proper test coverage is needed here. Will think
> about how to accomplish this.

Tried to apply this patch to current master branch and the build was ok, 
however it crashed during initdb with a message like below.

"performing post-bootstrap initialization ... Segmentation fault (core 
dumped)"

If I remove this patch and recompile again, then "initdb -D $PGDATA" works.

Thanks,

David




Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> If I remove this patch and recompile again, then "initdb -D $PGDATA" works.

It appears you must "make clean; make install" to correctly compile after
applying the patch.

Regards,

Sami Imseih
Amazon Web Services (AWS)










Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Sat, Mar 11, 2023 at 11:55:22PM +0000, Imseih (AWS), Sami wrote:
> It appears you must "make clean; make install" to correctly compile after
> applying the patch.

In a git repository, I've learnt to rely on this simple formula, even
if it means extra cycles when running ./configure:
git clean -d -x -f
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Drouvot, Bertrand"
Дата:
Hi,

On 3/2/23 8:27 AM, Michael Paquier wrote:
> On Wed, Jan 25, 2023 at 11:22:04PM +0000, Imseih (AWS), Sami wrote:
>> Doing some work with extended query protocol, I encountered the same
>> issue that was discussed in [1]. It appears when a client is using
>> extended query protocol and sends an Execute message to a portal with
>> max_rows, and a portal is executed multiple times,
>> pg_stat_statements does not correctly track rows and calls.
> 
> Well, it is one of these areas where it seems to me we have never been
> able to put a definition on what should be the correct behavior when
> it comes to pg_stat_statements.  Could it be possible to add some
> regression tests using the recently-added \bind command and see how
> this affects things?  I would suggest splitting these into their own
> SQL file, following an effort I have been doing recently for the
> regression tests of pg_stat_statements.  It would be good to know the
> effects of this change for pg_stat_statements.track = (top|all), as
> well.
> 
> @@ -657,7 +657,9 @@ typedef struct EState
>   
>          List       *es_tupleTable;      /* List of TupleTableSlots */
>   
> -       uint64          es_processed;   /* # of tuples processed */
> +       uint64          es_processed;   /* # of tuples processed at the top level only */
> +       uint64          es_calls;       /* # of calls */
> +       uint64          es_total_processed; /* total # of tuples processed */
> 
> So the root of the logic is here.  Anything that makes the executor
> structures larger freaks me out, FWIW, and that's quite an addition.
> --
> Michael

I wonder if we can't "just" make use of the "count" parameter passed to the
ExecutorRun_hook.

Something like?

- Increment a "es_total_processed" counter in pgss based on the count received in pgss_ExecutorRun()
- In pgss_ExecutorEnd(): substract the last count we received in pgss_ExecutorRun() and add
queryDesc->estate->es_processed?(we'd
 
need to be able to distinguish when we should apply this rule or not).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [BUG] pg_stat_statements and extended query protocol

От
David Zhang
Дата:
>> It appears you must "make clean; make install" to correctly compile after
>> applying the patch.
> In a git repository, I've learnt to rely on this simple formula, even
> if it means extra cycles when running ./configure:
> git clean -d -x -f
>
Thank you all for pointing out that it needs make clean first. After 
make clean followed by recompile with the patch then both make check 
from regression test and pg_stat_statements extension report all test 
passed. So the current existing test cases can't really detect any 
change from this patch, then it would be better to add some test cases 
to cover this.

Best regards,

David





Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
Sorry about the delay in response about this.

I was thinking about this and it seems to me we can avoid
adding new fields to Estate. I think a better place to track
rows and calls is in the Instrumentation struct.

--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -88,6 +88,8 @@ typedef struct Instrumentation
        double          nfiltered2;             /* # of tuples removed by "other" quals */
        BufferUsage bufusage;           /* total buffer usage */
        WalUsage        walusage;               /* total WAL usage */
+       int64       calls;
+       int64       rows_processed;
 } Instrumentation; 


If this is more palatable, I can prepare the patch.

Thanks for your feedback.

Regards.

Sami Imseih
Amazon Web Services (AWS)








Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Mon, Mar 20, 2023 at 09:41:12PM +0000, Imseih (AWS), Sami wrote:
> I was thinking about this and it seems to me we can avoid
> adding new fields to Estate. I think a better place to track
> rows and calls is in the Instrumentation struct.
>
> If this is more palatable, I can prepare the patch.

This indeed feels a bit more natural seen from here, after looking at
the code paths using an Instrumentation in the executor and explain,
for example.  At least, this stresses me much less than adding 16
bytes to EState for something restricted to the extended protocol when
it comes to monitoring capabilities.
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> This indeed feels a bit more natural seen from here, after looking at
> the code paths using an Instrumentation in the executor and explain,
> for example. At least, this stresses me much less than adding 16
> bytes to EState for something restricted to the extended protocol when
> it comes to monitoring capabilities.

Attached is the patch that uses Instrumentation. 

I did not add any new tests, and we do not have anyway now
of setting a row count when going through the Execute message.
I think this may be need to be addressed separately since there
Seems to be. Gap in extended query protocol testing.

For this fix however, The JDBC test does show correct results.


Regards,

Sami Imseih
Amazon Web Services (AWS)




Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Drouvot, Bertrand"
Дата:
Hi,

On 3/21/23 2:16 PM, Imseih (AWS), Sami wrote:
>> This indeed feels a bit more natural seen from here, after looking at
>> the code paths using an Instrumentation in the executor and explain,
>> for example. At least, this stresses me much less than adding 16
>> bytes to EState for something restricted to the extended protocol when
>> it comes to monitoring capabilities.
> 
> Attached is the patch that uses Instrumentation.
> 

Thanks, I think this new approach makes sense.

-                                          const BufferUsage *bufusage,
+                                          int64 calls, const BufferUsage *bufusage,

What about using an uint64 for calls? That seems more appropriate to me (even if
queryDesc->totaltime->calls will be passed (which is int64), but that's already
also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

@@ -88,6 +88,8 @@ typedef struct Instrumentation
         double          nfiltered2;             /* # of tuples removed by "other" quals */
         BufferUsage bufusage;           /* total buffer usage */
         WalUsage        walusage;               /* total WAL usage */
+       int64       calls;                      /* # of total calls to ExecutorRun */
+       int64       rows_processed;     /* # of total rows processed in ExecutorRun */

I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

What about just "total calls" and "total rows processed" (or "total rows", see below)?

Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Those last comments regarding the Instrumentation are done because ISTM that at the end their usage
could vary depending of the use case of the Instrumentation.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> What about using an uint64 for calls? That seems more appropriate to me (even if
> queryDesc->totaltime->calls will be passed (which is int64), but that's already
> also the case for the "rows" argument and queryDesc->totaltime->rows_processed)

That's fair


> I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.

Sure, I suppose these fields could be used outside of ExecutorRun. Good point.


> Also, I wonder if "rows" (and not rows_processed) would not be a better naming.

Agree.

I went with rows_processed initially, since it was accumulating es_processed,
but as the previous point, this instrumentation could be used outside of
ExecutorRun.

v3 addresses the comments.


Regards,


--
Sami Imseih
Amazon Web Services (AWS)




Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Drouvot, Bertrand"
Дата:
Hi,

On 3/22/23 10:35 PM, Imseih (AWS), Sami wrote:
>> What about using an uint64 for calls? That seems more appropriate to me (even if
>> queryDesc->totaltime->calls will be passed (which is int64), but that's already
>> also the case for the "rows" argument and queryDesc->totaltime->rows_processed)
> 
> That's fair
> 
> 
>> I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.
> 
> Sure, I suppose these fields could be used outside of ExecutorRun. Good point.
> 
> 
>> Also, I wonder if "rows" (and not rows_processed) would not be a better naming.
> 
> Agree.
> 
> I went with rows_processed initially, since it was accumulating es_processed,
> but as the previous point, this instrumentation could be used outside of
> ExecutorRun.
> 
> v3 addresses the comments.
> 

Thanks! LGTM and also do confirm that, with the patch, the JDBC test does show the correct results.

That said, not having a test (for the reasons you explained up-thread) associated with the patch worry me a bit.

But, I'm tempted to say that adding new tests could be addressed separately though (as this patch looks pretty
straightforward).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Thu, Mar 23, 2023 at 09:33:16AM +0100, Drouvot, Bertrand wrote:
> Thanks! LGTM and also do confirm that, with the patch, the JDBC test
> does show the correct results.

How does JDBC test that?  Does it have a dependency on
pg_stat_statements?
>
> That said, not having a test (for the reasons you explained
> up-thread) associated with the patch worry me a bit.

Same impression here.

> But, I'm tempted to say that adding new tests could be addressed
> separately though (as this patch looks pretty straightforward).

Even small patches can have gotchas.  I think that this should have
tests in-core rather than just depend on JDBC and hope for the best.
Even if \bind does not allow that, we could use an approach similar to
libpq_pipeline, for example, depending on pg_stat_statements for the
validation with a test module in src/test/modules/?
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> How does JDBC test that? Does it have a dependency on
> pg_stat_statements?

No, at the start of the thread, a sample jdbc script was attached.
But I agree, we need to add test coverage. See below.


>> But, I'm tempted to say that adding new tests could be addressed
>> separately though (as this patch looks pretty straightforward).


> Even small patches can have gotchas. I think that this should have
> tests in-core rather than just depend on JDBC and hope for the best.
> Even if \bind does not allow that, we could use an approach similar to
> libpq_pipeline, for example, depending on pg_stat_statements for the
> validation with a test module in src/test/modules/?

Yes, that is possible but we will need to add a libpq API
that allows the caller to pass in a "fetch size".
PQsendQueryParams does not take in a "fetch size", 
so it returns all rows, through PQsendQueryParams

https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1882

Adding such an API that takes in a "fetch size" will be beneficial 
not just for this test, but I can see it enabling another psql meta command,
similar to \bind but that takes in a "fetch size".

Regards

--
Sami Imseih
Amazon Web Services (AWS)




Re: [BUG] pg_stat_statements and extended query protocol

От
Yugo NAGATA
Дата:
On Thu, 23 Mar 2023 09:33:16 +0100
"Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote:

> Hi,
> 
> On 3/22/23 10:35 PM, Imseih (AWS), Sami wrote:
> >> What about using an uint64 for calls? That seems more appropriate to me (even if
> >> queryDesc->totaltime->calls will be passed (which is int64), but that's already
> >> also the case for the "rows" argument and queryDesc->totaltime->rows_processed)
> > 
> > That's fair
> > 
> > 
> >> I'm not sure it's worth mentioning that the new counters are "currently" used with the ExecutorRun.
> > 
> > Sure, I suppose these fields could be used outside of ExecutorRun. Good point.
> > 
> > 
> >> Also, I wonder if "rows" (and not rows_processed) would not be a better naming.
> > 
> > Agree.
> > 
> > I went with rows_processed initially, since it was accumulating es_processed,
> > but as the previous point, this instrumentation could be used outside of
> > ExecutorRun.
> > 
> > v3 addresses the comments.

I wonder that this patch changes the meaning of "calls" in the pg_stat_statement
view a bit; previously it was "Number of times the statement was executed" as
described in the documentation, but currently this means  "Number of times the
portal was executed". I'm worried that this makes users confused. For example,
a user may think the average numbers of rows returned by a statement is given by
rows/calls, but it is not always correct because some statements could be executed
with multiple portal runs. 

Although it might not big issue to users, I think it is better to add an explanation
to the doc for clarification.

Regards,
Yugo Nagata

> > 
> 
> Thanks! LGTM and also do confirm that, with the patch, the JDBC test does show the correct results.
> 
> That said, not having a test (for the reasons you explained up-thread) associated with the patch worry me a bit.
> 
> But, I'm tempted to say that adding new tests could be addressed separately though (as this patch looks pretty
straightforward).
> 
> Regards,
> 
> -- 
> Bertrand Drouvot
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
> 
> 


-- 
Yugo NAGATA <nagata@sraoss.co.jp>



Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> I wonder that this patch changes the meaning of "calls" in the pg_stat_statement
> view a bit; previously it was "Number of times the statement was executed" as
> described in the documentation, but currently this means "Number of times the
> portal was executed". I'm worried that this makes users confused. For example,
> a user may think the average numbers of rows returned by a statement is given by
> rows/calls, but it is not always correct because some statements could be executed
> with multiple portal runs.

I don't think it changes the meaning of "calls" in pg_stat_statements, since every
time the app fetches X amount of rows from a portal, it's still done in a separate
execution, and thus a separate call.

I agree, the meaning of "calls" should be clarified in docs.


Regards,

Sami Imseih
Amazon Web Services (AWS)


Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Thu, Mar 23, 2023 at 01:54:05PM +0000, Imseih (AWS), Sami wrote:
> Yes, that is possible but we will need to add a libpq API
> that allows the caller to pass in a "fetch size".
> PQsendQueryParams does not take in a "fetch size",
> so it returns all rows, through PQsendQueryParams
>
> https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/fe-exec.c#L1882
>
> Adding such an API that takes in a "fetch size" will be beneficial
> not just for this test, but I can see it enabling another psql meta command,
> similar to \bind but that takes in a "fetch size".

So...  The idea here is to set a custom fetch size so as the number of
calls can be deterministic in the tests, still more than 1 for the
tests we'd have.  And your point is that libpq enforces always 0 when
sending the EXECUTE message causing it to always return all the rows
for any caller of PQsendQueryGuts().

The extended protocol allows that, so you would like a libpq API to
have more control of what we send with EXECUTE:
https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-QUERY-CONCEPTS

The extended query protocol would require multiple 'E' messages, but
we would not need multiple describe or bind messages, meaning that
this cannot just be an extra flavor PQsendQueryParams().  Am I gettig
that right?  The correct API design seems tricky, to say the least.
Perhaps requiring this much extra work in libpq for the purpose of
having some tests in this thread is not a brilliant idea..  Or perhaps
we could just do it and have something a-la-JDBC with two routines?
That would be one libpq routine for describe/bind and one for execute
where the limit can be given by the caller in the latter case, similar
to sendDescribeStatement() and sendExecute() in
QueryExecutorImpl.java.
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> So...  The idea here is to set a custom fetch size so as the number of
> calls can be deterministic in the tests, still more than 1 for the
> tests we'd have.  And your point is that libpq enforces always 0 when
> sending the EXECUTE message causing it to always return all the rows
> for any caller of PQsendQueryGuts().

That is correct.

> The extended protocol allows that, so you would like a libpq API to
> have more control of what we send with EXECUTE:
> https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-QUERY-CONCEPTS


> The extended query protocol would require multiple 'E' messages, but
> we would not need multiple describe or bind messages, meaning that
> this cannot just be an extra flavor PQsendQueryParams().  Am I gettig
> that right?  

Correct, there will need to be separate APIs for Parse/Bind, Execute
and Close of a Portal.


> The correct API design seems tricky, to say the least.
> Perhaps requiring this much extra work in libpq for the purpose of
> having some tests in this thread is not a brilliant idea..  Or perhaps
> we could just do it and have something a-la-JDBC with two routines?
> That would be one libpq routine for describe/bind and one for execute
> where the limit can be given by the caller in the latter case, similar
> to sendDescribeStatement() and sendExecute() in
> QueryExecutorImpl.java.

I am not too clear on your point here. ISTM you are suggesting adding
new libpq APis similar to JDBC, which is what I am also suggesting.

Did I understand correctly?


--
Sami Imseih
Amazon Web Services (AWS)




Re: [BUG] pg_stat_statements and extended query protocol

От
Tom Lane
Дата:
"Imseih (AWS), Sami" <simseih@amazon.com> writes:
>> So...  The idea here is to set a custom fetch size so as the number of
>> calls can be deterministic in the tests, still more than 1 for the
>> tests we'd have.  And your point is that libpq enforces always 0 when
>> sending the EXECUTE message causing it to always return all the rows
>> for any caller of PQsendQueryGuts().

> That is correct.

Hi, I took a quick look through this thread, and I have a couple of
thoughts:

* Yeah, it'd be nice to have an in-core test, but it's folly to insist
on one that works via libpq and psql.  That requires a whole new set
of features that you're apparently designing on-the-fly with no other
use cases in mind.  I don't think that will accomplish much except to
ensure that this bug fix doesn't make it into v16.

* I don't understand why it was thought good to add two new counters
to struct Instrumentation.  In EXPLAIN ANALYZE cases those will be
wasted space *per plan node*, not per Query.

* It also seems quite bizarre to add counters to a core data structure
and then leave it to pg_stat_statements to maintain them.  (BTW, I didn't
much care for putting that maintenance into pgss_ExecutorRun without
updating its header comment.)

I'm inclined to think that adding the counters to struct EState is
fine.  That's 304 bytes already on 64-bit platforms, another 8 or 16
won't matter.

Also, I'm doubtful that counting calls this way is a great idea,
which would mean you only need one new counter field not two.  The
fact that you're having trouble defining what it means certainly
suggests that the implementation is out front of the design.

In short, what I think I'd suggest is adding an es_total_processed
field to EState and having standard_ExecutorRun do "es_total_processed
+= es_processed" near the end, then just change pg_stat_statements to
use es_total_processed not es_processed.

            regards, tom lane



Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> * Yeah, it'd be nice to have an in-core test, but it's folly to insist
> on one that works via libpq and psql. That requires a whole new set
> of features that you're apparently designing on-the-fly with no other
> use cases in mind. I don't think that will accomplish much except to
> ensure that this bug fix doesn't make it into v16.

I agree, Solving the lack of in-core testing for this area belong in a
different discussion. 


> * I don't understand why it was thought good to add two new counters
> to struct Instrumentation. In EXPLAIN ANALYZE cases those will be
> wasted space *per plan node*, not per Query.

Indeed, looking at ExplainNode, Instrumentation struct is allocated
per node and the new fields will be wasted space. Thanks for highlighting
this.

> * It also seems quite bizarre to add counters to a core data structure
> and then leave it to pg_stat_statements to maintain them. 

That is fair point

> I'm inclined to think that adding the counters to struct EState is
> fine. That's 304 bytes already on 64-bit platforms, another 8 or 16
> won't matter.

With the point you raise about Insrumentation per node, Estate 
is the better place for the new counters.


> Also, I'm doubtful that counting calls this way is a great idea,
> which would mean you only need one new counter field not two. The
> fact that you're having trouble defining what it means certainly
> suggests that the implementation is out front of the design.

ISTM you are not in agreement that a call count should be incremented 
after every executorRun, but should only be incremented after 
the portal is closed, at executorEnd. Is that correct?

FWIW, The rationale for incrementing calls in executorRun is that calls refers 
to the number of times a client executes a portal, whether partially or to completion.

Clients can also fetch rows from portals at various rates, so to determine the
"rows per call" accurately from pg_stat_statements, we should track a calls as 
the number of times executorRun was called on a portal.

> In short, what I think I'd suggest is adding an es_total_processed
> field to EState and having standard_ExecutorRun do "es_total_processed
> += es_processed" near the end, then just change pg_stat_statements to
> use es_total_processed not es_processed.

The original proposal in 0001-correct-pg_stat_statements-tracking-of-portals.patch,
was to add a "calls" and "es_total_processed" field to EState.


Regards,

Sami Imseih
Amazon Web Services (AWS)








Re: [BUG] pg_stat_statements and extended query protocol

От
Tom Lane
Дата:
"Imseih (AWS), Sami" <simseih@amazon.com> writes:
>> Also, I'm doubtful that counting calls this way is a great idea,
>> which would mean you only need one new counter field not two. The
>> fact that you're having trouble defining what it means certainly
>> suggests that the implementation is out front of the design.

> ISTM you are not in agreement that a call count should be incremented
> after every executorRun, but should only be incremented after
> the portal is closed, at executorEnd. Is that correct?

Right.  That makes the "call count" equal to the number of times the
query is invoked.

> FWIW, The rationale for incrementing calls in executorRun is that calls refers
> to the number of times a client executes a portal, whether partially or to completion.

Why should that be the definition?  Partial execution of a portal
might be something that is happening at the driver level, behind the
user's back.  You can't make rational calculations of, say, plan
time versus execution time if that's how "calls" is measured.

            regards, tom lane



Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> Why should that be the definition? Partial execution of a portal
> might be something that is happening at the driver level, behind the
> user's back. You can't make rational calculations of, say, plan
> time versus execution time if that's how "calls" is measured.

Correct, and there are also drivers that implement fetch size using
cursor statements, i.e. DECLARE CURSOR, FETCH CURSOR,
and each FETCH gets counted as 1 call.

I wonder if the right answer here is to track fetches as 
a separate counter in pg_stat_statements, in which fetch
refers to the number of times a portal is executed?

Regards,

Sami Imseih
Amazon Web Services (AWS)







Re: [BUG] pg_stat_statements and extended query protocol

От
Tom Lane
Дата:
"Imseih (AWS), Sami" <simseih@amazon.com> writes:
> I wonder if the right answer here is to track fetches as 
> a separate counter in pg_stat_statements, in which fetch
> refers to the number of times a portal is executed?

Maybe, but is there any field demand for that?

IMV, the existing behavior is that we count one "call" per overall
query execution (that is, per ExecutorEnd invocation).  The argument
that that's a bug and we should change it seems unsupportable to me,
and even the argument that we should also count ExecutorRun calls
seems quite lacking in evidence.  We clearly do need to fix the
reported rowcount for cases where ExecutorRun is invoked more than
once per ExecutorEnd call; but I think that's sufficient.

            regards, tom lane



Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> Maybe, but is there any field demand for that?

I don't think there is.

> We clearly do need to fix the
> reported rowcount for cases where ExecutorRun is invoked more than
> once per ExecutorEnd call; but I think that's sufficient.

Sure, the original proposed fix, but with tracking the es_total_processed
only in Estate should be enough for now.

Regards,

Sami Imseih
Amazon Web Services (AWS)




Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Tue, Apr 04, 2023 at 03:29:07AM +0000, Imseih (AWS), Sami wrote:
>> We clearly do need to fix the
>> reported rowcount for cases where ExecutorRun is invoked more than
>> once per ExecutorEnd call; but I think that's sufficient.
>
> Sure, the original proposed fix, but with tracking the es_total_processed
> only in Estate should be enough for now.

I was looking back at this thread, and the suggestion to use one field
in EState sounds fine to me.  Sami, would you like to send a new
version of the patch (simplified version based on v1)?
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> I was looking back at this thread, and the suggestion to use one field
> in EState sounds fine to me. Sami, would you like to send a new
> version of the patch (simplified version based on v1)?

Here is v4.

The "calls" tracking is removed from Estate. Unlike v1 however,
I added a check for the operation type. Inside ExecutorRun,
es_total_processed is incremented when the operation is
a SELECT. This check is done for es_processed as well inside
executorRun -> ExecutePlan.

For non select operations, es_total_processed is set to
es_processed in executorfinish. This is because the modify
plan nodes set es_processed outside of execMain.c


Regards,

Sami Imseih
Amazon Web Services (AWS)




Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Tue, Apr 04, 2023 at 09:48:17PM +0000, Imseih (AWS), Sami wrote:
> The "calls" tracking is removed from Estate. Unlike v1 however,
> I added a check for the operation type. Inside ExecutorRun,
> es_total_processed is incremented when the operation is
> a SELECT. This check is done for es_processed as well inside
> executorRun -> ExecutePlan.

I see.  This seems in line with ExecutePlan() where es_processed is
incremented only for SELECT queries.

> For non select operations, es_total_processed is set to
> es_processed in executorfinish. This is because the modify
> plan nodes set es_processed outside of execMain.c

My java is very rusty but (after some time reminding myself how to do
a CLASSPATH) I can see the row counts being right for things like
SELECT, INSERT RETURNING, WITH queries for both of them and
autocommit, and the proposed fix influences SELECT only depending on
the fetch size.  Doing nothing for calls now is fine by me, though I
agree that this could be improved at some point, as seeing only 1
rather than N for each fetch depending on the size is a bit confusing.

 *      There is no return value, but output tuples (if any) are sent to
 *      the destination receiver specified in the QueryDesc; and the number
 *      of tuples processed at the top level can be found in
 *      estate->es_processed.

Doesn't this comment at the top of ExecutorRun() need an update?  It
seems to me that this comment should mention both es_total_processed
and es_processed, telling that es_processed is the number of rows
processed in a single call, while es_total_processed is the sum of all
tuples processed in the ExecutorRun() calls.

@@ -441,6 +451,13 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
    if (queryDesc->totaltime)
        InstrStopNode(queryDesc->totaltime, 0);

+   /*
+    * For non-SELECT operations, es_total_processed will always be
+    * equal to es_processed.
+    */
+   if (operation != CMD_SELECT)
+       queryDesc->estate->es_total_processed = queryDesc->estate->es_processed;

There is no need for this part in ExecutorFinish(), actually, as long
as we always increment es_total_processed at the end ExecutorRun() for
all the operation types?  If the portal does not have a store, we
would do one ExecutorRun() call per execute fetch.  If the portal has
a store, we'd do only one ExecutorRun().  Both cases call once
ExecutorFinish(), but the finish would happen during the first
execute when filling a portal's tuple store, and during the last
execute fetch if the portal has no store.  This is Tom's point in [1],
from what I can see.  That seems simpler to me, as well.

-   uint64      es_processed;   /* # of tuples processed */
+   uint64      es_processed;   /* # of tuples processed for a single
+                                * execution of a portal */
+   uint64      es_total_processed; /* # of tuples processed for all
+                                    * executions of a portal */

Hmm.  This does not reflect completely the reality for non-SELECT
statements, no?  For SELECT statements, that's correct, because
es_processed is reset in standard_ExecutorFinish() each time the
backend does an execute fetch thanks to PortalRunSelect().  For
non-SELECT statements, the extended query protocol uses the tuples
stored in the portal after one execution, meaning that we run through
the executor once with both es_processed and es_total_processed set to
their final numbers from the start, before any fetches.  I would
suggest something like that to document both fields:
- es_processed: number of tuples processed during one ExecutorRun()
call.
- es_total_processed: total number of tuples aggregated across all
ExecutorRun() calls.

At the end, I'm OK with the proposal after a closer look, but I think
that we should do a much better job at describing es_processed and
es_total_processed in execnodes.h, particularly in the case of a
portal holding a store where es_processed may not reflect the number
of rows for a single portal execution, and it seems to me that the
proposal of incrementing es_total_processed at the end of
ExecutorRun() for all commands is simpler, based on what I have
tested.

[1]: https://www.postgresql.org/message-id/1311773.1680577992@sss.pgh.pa.us
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> Doing nothing for calls now is fine by me, though I
> agree that this could be improved at some point, as seeing only 1
> rather than N for each fetch depending on the size is a bit confusing.

I think we will need to clearly define what "calls" is. Perhaps as mentioned
above, we may need separate counters for "calls" vs "fetches". This is
definitely a separate thread.


> Doesn't this comment at the top of ExecutorRun() need an update? It
> seems to me that this comment should mention both es_total_processed

Yes, updated in v5.


> There is no need for this part in ExecutorFinish(), actually, as long
> as we always increment es_total_processed at the end ExecutorRun() for
> all the operation types? 

Ah, correct. I changed that and tested again.

> - es_processed: number of tuples processed during one ExecutorRun()
> call.
> - es_total_processed: total number of tuples aggregated across all
> ExecutorRun() calls.

I thought hard about this point and for some reason I did not want to
mention ExecutorRun in the comment. But, I agree with what you suggest.
It's more clear as to the intention of the fields.

Attached is v5 addressing the comments.


Regards,

Sami Imseih
Amazon Web Services (AWS)


Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Wed, Apr 05, 2023 at 04:07:21AM +0000, Imseih (AWS), Sami wrote:
>> - es_processed: number of tuples processed during one ExecutorRun()
>> call.
>> - es_total_processed: total number of tuples aggregated across all
>> ExecutorRun() calls.
>
> I thought hard about this point and for some reason I did not want to
> mention ExecutorRun in the comment. But, I agree with what you suggest.
> It's more clear as to the intention of the fields.
>
> Attached is v5 addressing the comments.

Thanks, this should be enough to persist the number of tuples tracked
across multiple ExecutorRun() calls.  This looks pretty good to me.

We should do something about providing more control over that to
libpq in the long run, IMO, and have more test coverage, but let's see
about that in 17~.
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Apr 05, 2023 at 04:07:21AM +0000, Imseih (AWS), Sami wrote:
>> Attached is v5 addressing the comments.

> Thanks, this should be enough to persist the number of tuples tracked
> across multiple ExecutorRun() calls.  This looks pretty good to me.

v5 seems OK to me except I think CreateExecutorState() should explicitly
zero the new es_total_processed field, alongside zeroing es_processed.
(I realize that the makeNode would have done it already, but our
coding conventions generally run towards not relying on that.  This is
mainly for greppability, so you can find where a field is initialized.)

            regards, tom lane



Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Wed, Apr 05, 2023 at 05:39:13PM -0400, Tom Lane wrote:
> v5 seems OK to me except I think CreateExecutorState() should explicitly
> zero the new es_total_processed field, alongside zeroing es_processed.
> (I realize that the makeNode would have done it already, but our
> coding conventions generally run towards not relying on that.  This is
> mainly for greppability, so you can find where a field is initialized.)

Makes sense to me.  I'll look at that again today, potentially apply
the fix on HEAD.
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
"Imseih (AWS), Sami"
Дата:
> Makes sense to me. I'll look at that again today, potentially apply
> the fix on HEAD.

Here is v6. That was my mistake not to zero out the es_total_processed.
I had it in the first version.

--
Regards,

Sami Imseih
Amazon Web Services (AWS)



Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Andres Freund
Дата:
Hi,

On 2023-04-06 07:09:37 +0900, Michael Paquier wrote:
> I'll look at that again today, potentially apply the fix on HEAD.

Seems like a complicated enough facility to benefit from a test or two? Peter
Eisentraut added support for the extended query protocol to psql, so it
shouldn't be too hard...

commit 5b66de3433e
Author: Peter Eisentraut <peter@eisentraut.org>
Date:   2022-11-15 13:50:27 +0100
 
    psql: Add command to use extended query protocol

Greetings,

Andres Freund



Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Wed, Apr 05, 2023 at 10:16:19PM +0000, Imseih (AWS), Sami wrote:
> Here is v6. That was my mistake not to zero out the es_total_processed.
> I had it in the first version.

The update of es_total_processed in standard_ExecutorRun() felt a bit
lonely, so I have added an extra comment, ran an indentation, and
applied the result.  Thanks Sami for the patch, and everyone else for
the feedback!
--
Michael

Вложения

Re: [BUG] pg_stat_statements and extended query protocol

От
Michael Paquier
Дата:
On Wed, Apr 05, 2023 at 05:39:35PM -0700, Andres Freund wrote:
> Seems like a complicated enough facility to benefit from a test or two? Peter
> Eisentraut added support for the extended query protocol to psql, so it
> shouldn't be too hard...

PQsendQueryGuts() does not split yet the bind/describe phase and the
execute phases, so we'd need a couple more libpq APIs to do that, with
more tracking of the state we're currently on when looping across
multiple execute fetches.  My guess is that it is possible to follow a
model similar to JDBC here.  I don't think that's necessarily
complicated, but it is not as straight-forward as it looks.  \bind was
much more straight-forward than that, as it can feed on a single call
of PQsendQueryParams() after saving a set of parameters.  An \exec
would not completely do that.

Attaching one of the scripts I've played with, in a very rusty java
with no classes or such, for future reference.  Just update CLASSPATH
to point to a copy of the JDBC driver, run it with a java command, and
then look at rows, query in pg_stat_statements.
--
Michael

Вложения