Обсуждение: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

Поиск
Список
Период
Сортировка

Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

От
Quan Zongliang
Дата:

Implement TODO item:

Fix log_line_prefix to display the transaction id (%x) for statements 
not in a transaction block
     Currently it displays zero.


Check that the XID has been assigned at the location where the statement 
log is now printed. If not, no statement log is output.
And then before finish_xact_command. If the statement has not been 
output to the log. Here the log can get XID.

DML that does not manipulate any data still does not get XID.

[32718][788] LOG:  statement: insert into t1 values(1,0,'');
[32718][789] LOG:  statement: delete from t1;
[32718][0] LOG:  statement: delete from t1;


--
Quan Zongliang
Вложения
Hi.
+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if (GetTopTransactionIdIfAny() != InvalidTransactionId &&
+ check_log_statement(parsetree_list))

change to

+ /* Log immediately if dictated by log_statement and XID assigned. */
+ if ( check_log_statement(parsetree_list)  &&
+ GetTopTransactionIdIfAny() != InvalidTransactionId)

I think it would reduce GetTopTransactionIdIfAny() calls.

I guess people will have different opinion that
simple query like:
`explain(analyze) select g from generate_series(1,1e6) g, pg_sleep(10);`
The log output will only be generated after 10 seconds.
of course, there is pg_stat_activity and other ways to view the running query.

playing around with the patch.
The patch is better than the current HEAD, in some cases.
both under condition:
alter system set log_line_prefix = '%m [%p] %q%u@%d/%a XID:%x ';
set log_statement = 'all';
select pg_reload_conf();

With Patch:
src3=# create table x1(a int);
2024-01-11 17:11:51.150 CST [115782] jian@src3/psql XID:814 LOG:
statement: create table x1(a int);
CREATE TABLE
src3=#
src3=# insert into x1 select 100;
2024-01-11 17:12:06.953 CST [115782] jian@src3/psql XID:815 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src3=# begin;
2024-01-11 17:12:17.543 CST [115782] jian@src3/psql XID:0 LOG:
statement: begin;
BEGIN
src3=*# insert into x1 select 100;
2024-01-11 17:12:24.779 CST [115782] jian@src3/psql XID:816 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src3=*# commit;
2024-01-11 17:12:29.851 CST [115782] jian@src3/psql XID:816 LOG:
statement: commit;
COMMIT
src3=# select 11;
2024-01-11 17:14:22.909 CST [115782] jian@src3/psql XID:0 LOG:
statement: select 11;
 ?column?
----------
       11
(1 row)
src3=# drop table x1;
2024-01-11 17:15:01.409 CST [115782] jian@src3/psql XID:817 LOG:
statement: drop table x1;
DROP TABLE
src3=# select pg_current_xact_id();
2024-01-11 17:21:55.602 CST [115782] jian@src3/psql XID:818 LOG:
statement: select pg_current_xact_id();
 pg_current_xact_id
--------------------
                818
(1 row)
---------------------------------------------------------------------------------
without patch:

src4=# insert into x1 select 100;
2024-01-11 17:07:13.556 CST [115240] jian@src4/psql XID:0 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src4=# begin;
2024-01-11 17:07:31.345 CST [115240] jian@src4/psql XID:0 LOG:
statement: begin;
BEGIN
src4=*# insert into x1 select 100;
2024-01-11 17:07:35.475 CST [115240] jian@src4/psql XID:0 LOG:
statement: insert into x1 select 100;
INSERT 0 1
src4=*# commit;
2024-01-11 17:07:39.095 CST [115240] jian@src4/psql XID:863 LOG:
statement: commit;
COMMIT
src4=# show logging_collector;
2024-01-11 17:09:59.307 CST [115240] jian@src4/psql XID:0 LOG:
statement: show logging_collector;
 logging_collector
-------------------
 off
(1 row)
src4=# select 11;
2024-01-11 17:14:30.001 CST [115240] jian@src4/psql XID:0 LOG:
statement: select 11;
 ?column?
----------
       11
(1 row)
src4=# drop table x1;
2024-01-11 17:15:08.010 CST [115240] jian@src4/psql XID:0 LOG:
statement: drop table x1;
DROP TABLE
src4=# select pg_current_xact_id();
2024-01-11 17:21:22.085 CST [115240] jian@src4/psql XID:0 LOG:
statement: select pg_current_xact_id();
 pg_current_xact_id
--------------------
                867
(1 row)



Hi
another big  difference compare to HEAD:

select  name,setting
        from    pg_settings
        where name in
        ('auto_explain.log_timing','auto_explain.log_analyze',
'auto_explain.log_min_duration','log_statement','log_line_prefix')
    ;
             name              |          setting
-------------------------------+----------------------------
 auto_explain.log_analyze      | on
 auto_explain.log_min_duration | 0
 auto_explain.log_timing       | on
 log_line_prefix               | %m [%p] %q%u@%d/%a XID:%x
 log_statement                 | all

simple query: explain(analyze, costs off) select 1 from pg_sleep(10);

with patch:
src3=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
duration: 10010.167 ms  plan:
        Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
        Function Scan on pg_sleep  (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.155..10010.159 rows=1 loops=1)
2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Function Scan on pg_sleep (actual time=10010.155..10010.159 rows=1 loops=1)
 Planning Time: 0.115 ms
 Execution Time: 10010.227 ms
(3 rows)

without patch:

src4=#
src4=# explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:13.462 CST [5869] jian@src4/psql XID:0 LOG:
statement: explain(analyze, costs off) select 1 from pg_sleep(10);
2024-01-12 08:43:23.473 CST [5869] jian@src4/psql XID:0 LOG:
duration: 10010.133 ms  plan:
        Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
        Function Scan on pg_sleep  (cost=0.00..0.01 rows=1 width=4)
(actual time=10010.126..10010.128 rows=1 loops=1)
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Function Scan on pg_sleep (actual time=10010.126..10010.128 rows=1 loops=1)
 Planning Time: 0.031 ms
 Execution Time: 10010.172 ms
(3 rows)



A little tweak to the code.
GetTopTransactionIdIfAny() != InvalidTransactionId
changed to
TransactionIdIsValid(GetTopTransactionIdIfAny()


On 2024/1/12 08:51, jian he wrote:
> Hi
...
> with patch:
> src3=# explain(analyze, costs off) select 1 from pg_sleep(10);
> 2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
> duration: 10010.167 ms  plan:
>          Query Text: explain(analyze, costs off) select 1 from pg_sleep(10);
>          Function Scan on pg_sleep  (cost=0.00..0.01 rows=1 width=4)
> (actual time=10010.155..10010.159 rows=1 loops=1)
> 2024-01-12 08:43:14.750 CST [5739] jian@src3/psql XID:0 LOG:
> statement: explain(analyze, costs off) select 1 from pg_sleep(10);
>                                   QUERY PLAN
> -----------------------------------------------------------------------------
>   Function Scan on pg_sleep (actual time=10010.155..10010.159 rows=1 loops=1)
>   Planning Time: 0.115 ms
>   Execution Time: 10010.227 ms
> (3 rows)
This problem does exist in a statement that takes a long time to run.
XID is applied only for the first change tuple. If the user want to see 
it in a single statement log, they have to wait until the statement has 
finished executing. And we don't know how long it will take until the 
statement ends. It is not appropriate to output the log twice because of 
xid. Besides, without parsing log_line_prefix we don't know if the user 
wants to see xid.
Вложения

> On 12 Jan 2024, at 05:51, jian he <jian.universality@gmail.com> wrote:
>
> another big  difference compare to HEAD:

Hi Jian,

thanks for looking into this. Would you be willing to review the next version of the patch?
As long as you are looking into this, you might be interested in registering yourself in a CF entry as a reviewer. [0]

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4606/


On Mon, Mar 4, 2024 at 2:12 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 12 Jan 2024, at 05:51, jian he <jian.universality@gmail.com> wrote:
> >
> > another big  difference compare to HEAD:
>
> Hi Jian,
>
> thanks for looking into this. Would you be willing to review the next version of the patch?

I just applied postgres-v2.patch.
Previously, I just mentioned, there is a big difference between to HEAD.
I didn't mention it explicitly at [1]

postgres-v2.patch still has the problem.
it's not related to XID, it's the misbehavior of the patch.
When we load the 'auto_explain' module, we can log two things, the
query plan and statement.
After applying the patch, the outputs of the query plan and the
statement will be mixed together, which
obviously is far from the desired behavior.
Maybe we can tolerate LOG, first output the query plan then statement.

> As long as you are looking into this, you might be interested in registering yourself in a CF entry as a reviewer.
[0]
sure.

[1] https://www.postgresql.org/message-id/CACJufxHg1oir8sd=xScMP3n+tYcbug=zusG5KiA2KzH5PmOeuQ@mail.gmail.com




On 2024/3/4 15:48, jian he wrote:

> Maybe we can tolerate LOG, first output the query plan then statement.
> 
It is more appropriate to let the extension solve its own problems.
Of course, this change is not easy to implement.
Due to the way XID is assigned, there seems to be no good solution at 
the moment.





On 2024/3/11 09:25, Quan Zongliang wrote:
> 
> 
> On 2024/3/4 15:48, jian he wrote:
> 
>> Maybe we can tolerate LOG, first output the query plan then statement.
>>
> It is more appropriate to let the extension solve its own problems.
> Of course, this change is not easy to implement.
> Due to the way XID is assigned, there seems to be no good solution at 
> the moment.
> 
> 
According to the discussion with Jian He. Use the guc hook to check if 
the xid needs to be output. If needed, the statement log can be delayed 
to be output.
Вложения
On Tue, Apr 16, 2024 at 3:16 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
> According to the discussion with Jian He. Use the guc hook to check if
> the xid needs to be output. If needed, the statement log can be delayed
> to be output.

I appreciate the work that both of you have put into this, but I think
we should reject this patch and remove the TODO item. We currently
have some facilities (like log_statement) that log the statement
before parsing it or executing it, and others (like
log_min_duration_statement) that log it afterward. That is probably
not documented as clearly as it should be, but it's long-established
behavior.

What this patch does is change the behavior of log_statement so that
log_statement sometimes logs the statement before it's executed, and
sometimes after the statement. I think that's going to be confusing
and unhelpful. In particular, right now you can assume that if you set
log_statement=all and there's a statement running, it's already been
logged. With this change, that would sometimes be true and sometimes
false.

For example, suppose that at 9am sharp, I run an UPDATE command that
takes ten seconds to complete. Right now, the log_statement message
will appear at 9am. With this change, it will run at 9am if I do it
inside a transaction block that has an XID already, and at 9:00:10am
if I do it in a transaction block that does not yet have an XID, or if
I do it outside of a transaction. I don't think the benefit of getting
the XID in the log message is nearly enough to justify such a strange
behavior.

--
Robert Haas
EDB: http://www.enterprisedb.com




On 2024/5/16 00:58, Robert Haas wrote:
> On Tue, Apr 16, 2024 at 3:16 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
>> According to the discussion with Jian He. Use the guc hook to check if
>> the xid needs to be output. If needed, the statement log can be delayed
>> to be output.
> 
> I appreciate the work that both of you have put into this, but I think
> we should reject this patch and remove the TODO item. We currently
> have some facilities (like log_statement) that log the statement
> before parsing it or executing it, and others (like
> log_min_duration_statement) that log it afterward. That is probably
> not documented as clearly as it should be, but it's long-established
> behavior.
> 
> What this patch does is change the behavior of log_statement so that
> log_statement sometimes logs the statement before it's executed, and
> sometimes after the statement. I think that's going to be confusing
> and unhelpful. In particular, right now you can assume that if you set
> log_statement=all and there's a statement running, it's already been
> logged. With this change, that would sometimes be true and sometimes
> false.
> 
> For example, suppose that at 9am sharp, I run an UPDATE command that
> takes ten seconds to complete. Right now, the log_statement message
> will appear at 9am. With this change, it will run at 9am if I do it
> inside a transaction block that has an XID already, and at 9:00:10am
> if I do it in a transaction block that does not yet have an XID, or if
> I do it outside of a transaction. I don't think the benefit of getting
> the XID in the log message is nearly enough to justify such a strange
> behavior.
> 
I thought about writing statement log when xid assigned. But it's seemed 
too complicated.
I'm inclined to keep it for a while. Until we find a good way or give 
up. It's a reasonable request, after all.




On Thu, May 16, 2024 at 6:01 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
> I thought about writing statement log when xid assigned. But it's seemed
> too complicated.
> I'm inclined to keep it for a while. Until we find a good way or give
> up. It's a reasonable request, after all.

I don't think it's reasonable at all. We can't log the XID before it's
assigned, and we can't log the statement after the XID is assigned
without completely changing how the parameter works.

--
Robert Haas
EDB: http://www.enterprisedb.com



On Thu, May 16, 2024 at 08:37:43AM -0400, Robert Haas wrote:
> On Thu, May 16, 2024 at 6:01 AM Quan Zongliang <quanzongliang@yeah.net> wrote:
> > I thought about writing statement log when xid assigned. But it's seemed
> > too complicated.
> > I'm inclined to keep it for a while. Until we find a good way or give
> > up. It's a reasonable request, after all.
> 
> I don't think it's reasonable at all. We can't log the XID before it's
> assigned, and we can't log the statement after the XID is assigned
> without completely changing how the parameter works.

I have removed the TODO item.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.