Обсуждение: Re: [HACKERS] make async slave to wait for lsn to be replayed

Поиск
Список
Период
Сортировка

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
On 2018-03-06 14:50, Simon Riggs wrote:
> On 6 March 2018 at 11:24, Dmitry Ivanov <d.ivanov@postgrespro.ru> 
> wrote:
>>> In PG11, I propose the following command, sticking mostly to Ants'
>>> syntax, and allowing to wait for multiple events before it returns. 
>>> It
>>> doesn't hold snapshot and will not get cancelled by Hot Standby.
>>> 
>>> WAIT FOR event [, event ...] options
>>> 
>>> event is
>>> LSN value
>>> TIMESTAMP value
>>> 
>>> options
>>> TIMEOUT delay
>>> UNTIL TIMESTAMP timestamp
>>> (we have both, so people don't need to do math, they can use 
>>> whichever
>>> they have)
>> 
>> 
>> I have a (possibly) dumb question: if we have specified several 
>> events,
>> should WAIT finish if only one of them triggered? It's not immediately
>> obvious if we're waiting for ALL of them to trigger, or just one will
>> suffice (ANY). IMO the syntax could be extended to something like:
>> 
>> WAIT FOR [ANY | ALL] event [, event ...] options,
>> 
>> with ANY being the default variant.
> 
> +1

Here I made new patch of feature, discussed above.

WAIT FOR - wait statement to pause beneath statements
==========

Synopsis
==========
     WAIT FOR [ANY | SOME | ALL] event [, event ...] options
     and event is:
         LSN value
         TIMESTAMP value

     and options is:
         TIMEOUT delay
         UNTIL TIMESTAMP timestamp
Description
==========
WAIT FOR - make to wait statements (that are beneath) on sleep until
event happens (Don’t process new queries until an event happens).

How to use it
==========
WAIT FOR LSN ‘LSN’ [, timeout in ms];

#Wait until LSN 0/303EC60 will be replayed, or 10 second passed.
WAIT FOR ANY LSN ‘0/303EC60’, TIMEOUT 10000;

#Or same without timeout.
WAIT FOR LSN ‘0/303EC60’;

#Or wait for some timestamp.
WAIT FOR TIMESTAMP '2020-01-02 17:20:19.028161+03';

#Wait until ALL events occur: LSN to be replayed and timestamp
passed.
WAIT FOR ALL LSN ‘0/303EC60’, TIMESTAMP '2020-01-28 11:10:39.021341+03';

Notice: WAIT FOR will release on PostmasterDeath or Interruption events
if they come earlier then LSN or timeout.

Testing the implementation
======================
The implementation was tested with src/test/recovery/t/018_waitfor.pl

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kyotaro Horiguchi
Дата:
Hello.

I looked this briefly but not tested.

At Fri, 06 Mar 2020 00:24:01 +0300, Kartyshov Ivan <i.kartyshov@postgrespro.ru> wrote in 
> On 2018-03-06 14:50, Simon Riggs wrote:
> > On 6 March 2018 at 11:24, Dmitry Ivanov <d.ivanov@postgrespro.ru>
> > wrote:
> >>> In PG11, I propose the following command, sticking mostly to Ants'
> >>> syntax, and allowing to wait for multiple events before it returns. It
> >>> doesn't hold snapshot and will not get cancelled by Hot Standby.
> >>> WAIT FOR event [, event ...] options
> >>> event is
> >>> LSN value
> >>> TIMESTAMP value
> >>> options
> >>> TIMEOUT delay
> >>> UNTIL TIMESTAMP timestamp
> >>> (we have both, so people don't need to do math, they can use whichever
> >>> they have)
> >> I have a (possibly) dumb question: if we have specified several
> >> events,
> >> should WAIT finish if only one of them triggered? It's not immediately
> >> obvious if we're waiting for ALL of them to trigger, or just one will
> >> suffice (ANY). IMO the syntax could be extended to something like:
> >> WAIT FOR [ANY | ALL] event [, event ...] options,
> >> with ANY being the default variant.
> > +1
> 
> Here I made new patch of feature, discussed above.
> 
> WAIT FOR - wait statement to pause beneath statements
> ==========
> 
> Synopsis
> ==========
>     WAIT FOR [ANY | SOME | ALL] event [, event ...] options
>     and event is:
>         LSN value
>         TIMESTAMP value
> 
>     and options is:
>         TIMEOUT delay
>         UNTIL TIMESTAMP timestamp

The syntax seems getting confused. What happens if we typed in the
command "WAIT FOR TIMESTAMP '...' UNTIL TIMESTAMP '....'"?  It seems
to me the options is useles.  Couldn't the TIMEOUT option be a part of
event?  I know gram.y doesn't accept that syntax but it is not
apparent from the description above.

As I read through the previous thread, one of the reason for this
feature implemented as a syntax is it was inteded to be combined into
BEGIN statement.  If there is not any use case for the feature midst
of a transaction, why don't you turn it into a part of BEGIN command?

> Description
> ==========
> WAIT FOR - make to wait statements (that are beneath) on sleep until
> event happens (Don’t process new queries until an event happens).
...
> Notice: WAIT FOR will release on PostmasterDeath or Interruption
> events
> if they come earlier then LSN or timeout.

I think interrupts ought to result in ERROR.

wait.c adds a fair amount of code and uses proc-array based
approach. But Thomas suggested queue-based approach and I also think
it is better.  We already have a queue-based mechanism that behaves
almost the same with this feature in the comit code on master-side. It
avoids spurious backend wakeups. Couldn't we extend SyncRepWaitForLSN
or share a part of the code/infrastructures so that this feature can
share the code?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
As it was discussed earlier, I added wait for statement into begin/start 
statement.

Synopsis
==========
BEGIN [ WORK | TRANSACTION ] [ transaction_mode[, ...] ] wait_for_event
       where transaction_mode is one of:
             ISOLATION LEVEL { SERIALIZABLE | REPEATABLE READ | READ 
COMMITTED | READ UNCOMMITTED }
             READ WRITE | READ ONLY
             [ NOT ] DEFERRABLE

       WAIT FOR [ANY | SOME | ALL] event [, event ...]
       and event is:
           LSN value [options]
           TIMESTAMP value

       and options is:
           TIMEOUT delay
           UNTIL TIMESTAMP timestamp



-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-03-21 14:16, Kartyshov Ivan wrote:
> As it was discussed earlier, I added wait for statement into
> begin/start statement.
Thanks! To address the discussion: I like the idea of having WAIT as a 
part of BEGIN statement rather than a separate command, as Thomas Munro 
suggested. That way, the syntax itself enforces that WAIT FOR LSN will 
actually take effect, even for single-snapshot transactions. It seems 
more convenient for the user, who won't have to remember the details 
about how WAIT interacts with isolation levels.


> BEGIN [ WORK | TRANSACTION ] [ transaction_mode[, ...] ] wait_for_event
Not sure about this, but could we add "WAIT FOR .." as another 
transaction_mode rather than a separate thing? That way, user won't have 
to worry about the order. As of now, one should remember to always put 
WAIT FOR as the Last parameter in the BEGIN statement.


>       and event is:
>           LSN value [options]
>           TIMESTAMP value
I would maybe remove WAIT FOR TIMESTAMP. As Robert Haas has pointed out, 
it seems a lot like pg_sleep_until(). Besides, it doesn't necessarily 
need to be connected to transaction start, which makes it different from 
WAIT FOR LSN - so I wouldn't mix them together.


I had another look at the code:


===
In WaitShmemSize() you might want to use functions that check for 
overflow - add_size()/mul_size(). They're used in similar situations, 
for example in BTreeShmemSize().


===
This is how WaitUtility() is called - note that time_val will always be 
 > 0:
+    if (time_val <= 0)
+        time_val = 1;
+...
+    res = WaitUtility(lsn, (int)(time_val * 1000), dest);

(time_val * 1000) is passed to WaitUtility() as the delay argument. And 
inside WaitUtility() we have this:

+if (delay > 0)
+    latch_events = WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH;
+else
+    latch_events = WL_LATCH_SET | WL_POSTMASTER_DEATH;

Since we always pass a delay value greater than 0, we'll never get to 
the "else" clause here and we'll never be ready to wait for LSN forever. 
Perhaps due to that, the current test outputs this after a simple WAIT 
FOR LSN command:
psql:<stdin>:1: NOTICE:  LSN is not reached.


===
Speaking of tests,

When I tried to test BEGIN TRANSACTION WAIT FOR LSN, I got a segfault:
LOG:  statement: BEGIN TRANSACTION WAIT FOR LSN '0/3002808'
LOG:  server process (PID 10385) was terminated by signal 11: 
Segmentation fault
DETAIL:  Failed process was running: COMMIT

Could you add some more tests to the patch when this is fixed? With WAIT 
as part of BEGIN statement + with things such as WAIT FOR ALL ... / WAIT 
FOR ANY ... / WAIT FOR LSN ... UNTIL TIMESTAMP ...


===
In WaitSetLatch() we should probably check backend for NULL before 
calling SetLatch(&backend->procLatch)

We might also need to check wait statement for NULL in these two cases:
+   case T_TransactionStmt:
+   {...
+       result = transformWaitForStmt(pstate, (WaitStmt *) stmt->wait);

case TRANS_STMT_START:
{...
+   WaitStmt   *waitstmt = (WaitStmt *) stmt->wait;
+   res = WaitMain(waitstmt, dest);


===
After we added the "wait" attribute to TransactionStmt struct, do we 
perhaps need to add something to _copyTransactionStmt() / 
_equalTransactionStmt()?

--
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
Anna, thank you for your review.

On 2020-03-25 21:10, Anna Akenteva wrote:
> On 2020-03-21 14:16, Kartyshov Ivan wrote:
>>       and event is:
>>           LSN value [options]
>>           TIMESTAMP value
> I would maybe remove WAIT FOR TIMESTAMP. As Robert Haas has pointed
> out, it seems a lot like pg_sleep_until(). Besides, it doesn't
> necessarily need to be connected to transaction start, which makes it
> different from WAIT FOR LSN - so I wouldn't mix them together.
I don't mind.
But I think we should get one more opinions on this point.

> ===
> This is how WaitUtility() is called - note that time_val will always be 
> > 0:
> +    if (time_val <= 0)
> +        time_val = 1;
> +...
> +    res = WaitUtility(lsn, (int)(time_val * 1000), dest);
> 
> (time_val * 1000) is passed to WaitUtility() as the delay argument.
> And inside WaitUtility() we have this:
> 
> +if (delay > 0)
> +    latch_events = WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH;
> +else
> +    latch_events = WL_LATCH_SET | WL_POSTMASTER_DEATH;
> 
> Since we always pass a delay value greater than 0, we'll never get to
> the "else" clause here and we'll never be ready to wait for LSN
> forever. Perhaps due to that, the current test outputs this after a
> simple WAIT FOR LSN command:
> psql:<stdin>:1: NOTICE:  LSN is not reached.
I fix it, and Interruptions in last patch.

Anna, feel free to work on this patch.

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-03-27 04:15, Kartyshov Ivan wrote:
> Anna, feel free to work on this patch.

Ivan and I worked on this patch a bit more. We fixed the bugs that we 
could find and cleaned up the code. For now, we've kept both options: 
WAIT as a standalone statement and WAIT as a part of BEGIN. The new 
patch is attached.

The syntax looks a bit different now:

- WAIT FOR [ANY | ALL] event [, ...]
- BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ] [ WAIT FOR 
[ANY | ALL] event [, ...]]
where event is one of:
     LSN value
     TIMEOUT number_of_milliseconds
     timestamp

Now, one event cannot contain both an LSN and a TIMEOUT. With such 
syntax, the behaviour seems to make sense. For the (default) WAIT FOR 
ALL strategy, we pick the maximum LSN and maximum allowed timeout, and 
wait for the LSN till the timeout is over. If no timeout is specified, 
we wait forever. If no LSN is specified, we just wait for the time to 
pass. For the WAIT FOR ANY strategy, it's the same but we pick minimum 
LSN and timeout.

There are still some questions left:
1) Should we only keep the BEGIN option, or does the standalone command 
have potential after all?
2) Should we change the grammar so that WAIT can be in any position of 
the BEGIN statement, not necessarily in the end? Ivan and I haven't come 
to a consensus about this, so more opinions would be helpful.
3) Since we added the "wait" attribute to TransactionStmt struct, do we 
need to add something to _copyTransactionStmt() / 
_equalTransactionStmt()?

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexey Kondratov
Дата:
On 2020-04-01 02:26, Anna Akenteva wrote:
> On 2020-03-27 04:15, Kartyshov Ivan wrote:
>> Anna, feel free to work on this patch.
> 
> Ivan and I worked on this patch a bit more. We fixed the bugs that we
> could find and cleaned up the code. For now, we've kept both options:
> WAIT as a standalone statement and WAIT as a part of BEGIN. The new
> patch is attached.
> 
> The syntax looks a bit different now:
> 
> - WAIT FOR [ANY | ALL] event [, ...]
> - BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ] [ WAIT FOR
> [ANY | ALL] event [, ...]]
> where event is one of:
>     LSN value
>     TIMEOUT number_of_milliseconds
>     timestamp
> 
> Now, one event cannot contain both an LSN and a TIMEOUT.
> 

In my understanding the whole idea of having TIMEOUT was to do something 
like 'Do wait for this LSN to be replicated, but no longer than TIMEOUT 
milliseconds'. What is the point of having plain TIMEOUT? It seems to be 
equivalent to pg_sleep, doesn't it?


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
I did some code cleanup and added tests - both for the standalone WAIT 
FOR statement and for WAIT FOR as a part of BEGIN. The new patch is 
attached.

On 2020-04-03 17:29, Alexey Kondratov wrote:
> On 2020-04-01 02:26, Anna Akenteva wrote:
>> 
>> - WAIT FOR [ANY | ALL] event [, ...]
>> - BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ] [ WAIT FOR
>> [ANY | ALL] event [, ...]]
>> where event is one of:
>>     LSN value
>>     TIMEOUT number_of_milliseconds
>>     timestamp
>> 
>> Now, one event cannot contain both an LSN and a TIMEOUT.
>> 
> 
> In my understanding the whole idea of having TIMEOUT was to do
> something like 'Do wait for this LSN to be replicated, but no longer
> than TIMEOUT milliseconds'. What is the point of having plain TIMEOUT?
> It seems to be equivalent to pg_sleep, doesn't it?
> 

In the patch that I reviewed, you could do things like:
     WAIT FOR
         LSN lsn0,
         LSN lsn1 TIMEOUT time1,
         LSN lsn2 TIMEOUT time2;
and such a statement was in practice equivalent to
     WAIT FOR LSN(max(lsn0, lsn1, lsn2)) TIMEOUT (max(time1, time2))

As you can see, even though grammatically lsn1 is grouped with time1 and 
lsn2 is grouped with time2, both timeouts that we specified are not 
connected to their respective LSN-s, and instead they kinda act like 
global timeouts. Therefore, I didn't see a point in keeping TIMEOUT 
necessarily grammatically connected to LSN.

In the new syntax our statement would look like this:
     WAIT FOR LSN lsn0, LSN lsn1, LSN lsn2, TIMEOUT time1, TIMEOUT time2;
TIMEOUT-s are not forced to be grouped with LSN-s anymore, which makes 
it more clear that all specified TIMEOUTs will be global and will apply 
to all LSN-s at once.

The point of having TIMEOUT is still to let us limit the time of waiting 
for LSNs. It's just that with the new syntax, we can also use TIMEOUT 
without an LSN. You are right, such a case is equivalent to pg_sleep. 
One way to avoid that is to prohibit waiting for TIMEOUT without 
specifying an LSN. Do you think we should do that?

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexander Korotkov
Дата:
Hi!

On Fri, Apr 3, 2020 at 9:51 PM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
> In the patch that I reviewed, you could do things like:
>      WAIT FOR
>          LSN lsn0,
>          LSN lsn1 TIMEOUT time1,
>          LSN lsn2 TIMEOUT time2;
> and such a statement was in practice equivalent to
>      WAIT FOR LSN(max(lsn0, lsn1, lsn2)) TIMEOUT (max(time1, time2))
>
> As you can see, even though grammatically lsn1 is grouped with time1 and
> lsn2 is grouped with time2, both timeouts that we specified are not
> connected to their respective LSN-s, and instead they kinda act like
> global timeouts. Therefore, I didn't see a point in keeping TIMEOUT
> necessarily grammatically connected to LSN.
>
> In the new syntax our statement would look like this:
>      WAIT FOR LSN lsn0, LSN lsn1, LSN lsn2, TIMEOUT time1, TIMEOUT time2;
> TIMEOUT-s are not forced to be grouped with LSN-s anymore, which makes
> it more clear that all specified TIMEOUTs will be global and will apply
> to all LSN-s at once.
>
> The point of having TIMEOUT is still to let us limit the time of waiting
> for LSNs. It's just that with the new syntax, we can also use TIMEOUT
> without an LSN. You are right, such a case is equivalent to pg_sleep.
> One way to avoid that is to prohibit waiting for TIMEOUT without
> specifying an LSN. Do you think we should do that?

I think specifying multiple LSNs/TIMEOUTs is kind of ridiculous.  We
can assume that client application is smart enough to calculate
minimum/maximum on its side.  When multiple LSNs/TIMEOUTs are
specified, what should we wait for?  Reaching all the LSNs?  Reaching
any of LSNs?  Are timeouts independent from LSNs or sticked together?
So if we didn't manage to reach LSN1 in TIMEOUT1, then we don't wait
for LSN2 with TIMEOUT2 (or not)?

I think that now we would be fine with single LSN and single TIMEOUT.
In future we may add multiple LSNs/TIMEOUTs or/and support for
expressions as LSNs/TIMEOUTs if we figure out it's necessary.

I also think it's good to couple waiting for lsn with beginning of
transaction is good idea.  Separate WAIT FOR LSN statement called in
the middle of transaction looks problematic for me. Imagine we have RR
isolation and already acquired the snapshot.  Then out snapshot can
block applying wal records, which we are waiting for.  That would be
implicit deadlock.  It would be nice to evade such deadlocks by
design.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
On 2020-04-03 21:51, Anna Akenteva wrote:
> I did some code cleanup and added tests - both for the standalone WAIT
> FOR statement and for WAIT FOR as a part of BEGIN. The new patch is
> attached.

I did more cleanup and code optimization on waiting events on latch.
And rebase patch.

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
On 2020-04-04 03:14, Alexander Korotkov wrote:
> I think that now we would be fine with single LSN and single TIMEOUT.
> In future we may add multiple LSNs/TIMEOUTs or/and support for
> expressions as LSNs/TIMEOUTs if we figure out it's necessary.
> 
> I also think it's good to couple waiting for lsn with beginning of
> transaction is good idea.  Separate WAIT FOR LSN statement called in
> the middle of transaction looks problematic for me. Imagine we have RR
> isolation and already acquired the snapshot.  Then out snapshot can
> block applying wal records, which we are waiting for.  That would be
> implicit deadlock.  It would be nice to evade such deadlocks by
> design.
Ok, here is a new version of patch with single LSN and TIMEOUT.

Synopsis
==========
BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ] [WAIT FOR LSN 
'lsn' [ TIMEOUT 'value']]
and
START TRANSACTION [ transaction_mode [, ...] ] [WAIT FOR LSN 'lsn' [ 
TIMEOUT 'value']]
      where lsn is result of pg_current_wal_flush_lsn on master.
      and value is uint time interval in milliseconds.
Description
==========
BEGIN/START...WAIT FOR - pause the start of transaction until a 
specified LSN has
been replayed. (Don’t open transaction if lsn is not reached on 
timeout).

How to use it
==========
WAIT FOR LSN ‘LSN’ [, timeout in ms];

# Before starting transaction, wait until LSN 0/84832E8 is replayed. 
Wait time is
not limited here because a timeout was not specified
BEGIN WAIT FOR LSN '0/84832E8';

# Before starting transaction, wait until LSN 0/84832E8 is replayed. 
Limit the wait
time with 10 seconds, and if LSN is not reached by then, don't start the 
transaction.
START TRANSACTION WAIT FOR LSN '0/8DFFB88' TIMEOUT 10000;

# Same as previous, but with transaction isolation level = REPEATABLE 
READ
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ WAIT FOR LSN 
'0/815C0F1' TIMEOUT 10000;

Notice: WAIT FOR will release on PostmasterDeath or Interruption events
if they come earlier than LSN or timeout.

Testing the implementation
======================
The implementation was tested with src/test/recovery/t/020_begin_wait.pl

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexander Korotkov
Дата:
On Tue, Apr 7, 2020 at 12:58 AM Kartyshov Ivan
<i.kartyshov@postgrespro.ru> wrote:
> On 2020-04-04 03:14, Alexander Korotkov wrote:
> > I think that now we would be fine with single LSN and single TIMEOUT.
> > In future we may add multiple LSNs/TIMEOUTs or/and support for
> > expressions as LSNs/TIMEOUTs if we figure out it's necessary.
> >
> > I also think it's good to couple waiting for lsn with beginning of
> > transaction is good idea.  Separate WAIT FOR LSN statement called in
> > the middle of transaction looks problematic for me. Imagine we have RR
> > isolation and already acquired the snapshot.  Then out snapshot can
> > block applying wal records, which we are waiting for.  That would be
> > implicit deadlock.  It would be nice to evade such deadlocks by
> > design.
> Ok, here is a new version of patch with single LSN and TIMEOUT.

I think this quite small feature, which already received quite amount
of review.  The last version is very pinched.  But I think it would be
good to commit some very basic version, which is at least some
progress in the area and could be extended in future.  I'm going to
pass trough the code tomorrow and commit this unless I found major
issues or somebody objects.

------
Alexander Korotkov

Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-04-07 00:58, Kartyshov Ivan wrote:
> Ok, here is a new version of patch with single LSN and TIMEOUT.

I had a look at the code and did some more code cleanup, with Ivan's 
permission.
This is what I did:
- Removed "WAIT FOR" command tag from cmdtaglist.h and renamed WaitStmt 
to WaitClause (since there's no standalone WAIT FOR command anymore)
- Added _copyWaitClause() and _equalWaitClause()
- Removed unused #include-s from utility.c
- Adjusted tests and documentation
- Fixed/added some code comments

I have a couple of questions about WaitUtility() though:
- When waiting forever (due to not specifying a timeout), isn't 60 
seconds too long of an interval to check for interrupts?
- If we did specify a timeout, it might be a very long one. In this 
case, shouldn't we also make sure to wake up sometimes to check for 
interrupts?
- Is it OK that specifying timeout = 0 (BEGIN WAIT FOR LSN ... TIMEOUT 
0) is the same as not specifying timeout at all?

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Amit Kapila
Дата:
On Tue, Apr 7, 2020 at 7:56 AM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
>
> On 2020-04-07 00:58, Kartyshov Ivan wrote:
> > Ok, here is a new version of patch with single LSN and TIMEOUT.
>
> I had a look at the code and did some more code cleanup, with Ivan's
> permission.
> This is what I did:
> - Removed "WAIT FOR" command tag from cmdtaglist.h and renamed WaitStmt
> to WaitClause (since there's no standalone WAIT FOR command anymore)
> - Added _copyWaitClause() and _equalWaitClause()
> - Removed unused #include-s from utility.c
> - Adjusted tests and documentation
> - Fixed/added some code comments
>
> I have a couple of questions about WaitUtility() though:
> - When waiting forever (due to not specifying a timeout), isn't 60
> seconds too long of an interval to check for interrupts?
> - If we did specify a timeout, it might be a very long one. In this
> case, shouldn't we also make sure to wake up sometimes to check for
> interrupts?
>

Right, we should probably wait for 100ms before checking the
interrupts.  See the similar logic in pg_promote where we wait for
specified number of seconds.

> - Is it OK that specifying timeout = 0 (BEGIN WAIT FOR LSN ... TIMEOUT
> 0) is the same as not specifying timeout at all?
>

Yes that sounds reasonable to me.

Review comments:
--------------------------
1.
+/*
+ * Delete wait event of the current backend from the shared memory array.
+ *
+ * TODO: Consider state cleanup on backend failure.
+ * Check:
+ * 1) nomal|smart|fast|immediate stop
+ * 2) SIGKILL and SIGTERM
+ */
+static void
+DeleteEvent(void)

I don't see how this is implemented or called to handle any errors.
For example in function WaitUtility if the WaitLatch errors out due to
any error, then the event won't be deleted.  I think we can't assume
WaitLatch or any other code in this code path will never error out.
For ex. WaitLatch---->WaitEventSetWaitBlock() can error out.  Also, in
future we can add more code which can error out.

2.
+ /*
+ * If received an interruption from CHECK_FOR_INTERRUPTS,
+ * then delete the current event from array.
+ */
+ if (InterruptPending)
+ {
+ DeleteEvent();
+ ProcessInterrupts();
+ }

We generally do this type of handling via CHECK_FOR_INTERRUPTS.  One
reason is that it behaves slightly differently in Windows.  I am not
sure why we want to do differently here? This looks quite adhoc to me
and may not be correct.  If we handle this event in error path, then
we might not need to do some special handling.

3.
+/*
+ * On WAIT use a latch to wait till LSN is replayed,
+ * postmaster dies or timeout happens.
+ * Returns 1 if LSN was reached and 0 otherwise.
+ */
+int
+WaitUtility(XLogRecPtr target_lsn, const float8 secs)

Isn't it better to have a return value as bool?  IOW, why this
function need int as its return value?

4.
+#define GetNowFloat() ((float8) GetCurrentTimestamp() / 1000000.0)

This same define is used elsewhere in the code as well, may be we can
define it in some central place and use it.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Amit Kapila
Дата:
On Tue, Apr 7, 2020 at 5:56 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>
> On Tue, Apr 7, 2020 at 12:58 AM Kartyshov Ivan
> <i.kartyshov@postgrespro.ru> wrote:
> > On 2020-04-04 03:14, Alexander Korotkov wrote:
> > > I think that now we would be fine with single LSN and single TIMEOUT.
> > > In future we may add multiple LSNs/TIMEOUTs or/and support for
> > > expressions as LSNs/TIMEOUTs if we figure out it's necessary.
> > >
> > > I also think it's good to couple waiting for lsn with beginning of
> > > transaction is good idea.  Separate WAIT FOR LSN statement called in
> > > the middle of transaction looks problematic for me. Imagine we have RR
> > > isolation and already acquired the snapshot.  Then out snapshot can
> > > block applying wal records, which we are waiting for.  That would be
> > > implicit deadlock.  It would be nice to evade such deadlocks by
> > > design.
> > Ok, here is a new version of patch with single LSN and TIMEOUT.
>
> I think this quite small feature, which already received quite amount
> of review.  The last version is very pinched.  But I think it would be
> good to commit some very basic version, which is at least some
> progress in the area and could be extended in future.  I'm going to
> pass trough the code tomorrow and commit this unless I found major
> issues or somebody objects.
>

I have gone through this thread and skimmed through the patch and I am
not sure if we can say that this patch is ready to go.  First, I don't
think we have a consensus on the syntax being used in the patch
(various people didn't agree to LSN specific syntax).  They wanted a
more generic syntax and I see that we tried to implement it and it
turns out to be a bit complex but that doesn't mean we just give up on
the idea and take the simplest approach and that too without a broader
agreement.  Second, on my quick review, it seems there are a few
things like error handling, interrupt checking which need more work.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-04-07 13:32, Amit Kapila wrote:
> First, I don't
> think we have a consensus on the syntax being used in the patch
> (various people didn't agree to LSN specific syntax).  They wanted a
> more generic syntax and I see that we tried to implement it and it
> turns out to be a bit complex but that doesn't mean we just give up on
> the idea and take the simplest approach and that too without a broader
> agreement.

Thank you for your comments!

Initially, the syntax used to be "WAITLSN", which confined us with only 
waiting for LSN-s and not anything else. So we switched to "WAIT FOR 
LSN", which would allow us to add variations like "WAIT FOR XID" or 
"WAIT FOR COMMIT TOKEN" in the future if we wanted. A few people seemed 
to imply that this kind of syntax is expandable enough:

On 2018-02-01 14:47, Simon Riggs wrote:
> I agree that WAIT LSN is good syntax because this allows us to wait
> for something else in future.

On 2017-10-31 12:42:56, Ants Aasma wrote:
> For lack of a better proposal I would like something along the lines 
> of:
> WAIT FOR state_id[, state_id] [ OPTIONS (..)]

As for giving up waiting for multiple events: we can only wait for LSN-s 
at the moment, and there seems to be no point in waiting for multiple 
LSN-s at once, because it's equivalent to waiting for the biggest LSN. 
So we opted for simpler grammar for now, only letting the user specify 
one LSN and one TIMEOUT. If in the future we allow waiting for something 
else, like XID-s, we can expand the grammar as needed.

What are your own thoughts on the syntax?

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexander Korotkov
Дата:
On Tue, Apr 7, 2020 at 3:07 PM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
> On 2017-10-31 12:42:56, Ants Aasma wrote:
> > For lack of a better proposal I would like something along the lines
> > of:
> > WAIT FOR state_id[, state_id] [ OPTIONS (..)]
>
> As for giving up waiting for multiple events: we can only wait for LSN-s
> at the moment, and there seems to be no point in waiting for multiple
> LSN-s at once, because it's equivalent to waiting for the biggest LSN.
> So we opted for simpler grammar for now, only letting the user specify
> one LSN and one TIMEOUT. If in the future we allow waiting for something
> else, like XID-s, we can expand the grammar as needed.

+1
In the latest version of patch we have very brief and simple syntax
allowing to wait for given LSN with given timeout.  In future we can
expand this syntax in different ways.  I don't see that current syntax
is limiting us from something.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Amit Kapila
Дата:
On Tue, Apr 7, 2020 at 5:37 PM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
>
> On 2020-04-07 13:32, Amit Kapila wrote:
> > First, I don't
> > think we have a consensus on the syntax being used in the patch
> > (various people didn't agree to LSN specific syntax).  They wanted a
> > more generic syntax and I see that we tried to implement it and it
> > turns out to be a bit complex but that doesn't mean we just give up on
> > the idea and take the simplest approach and that too without a broader
> > agreement.
>
> Thank you for your comments!
>
> Initially, the syntax used to be "WAITLSN", which confined us with only
> waiting for LSN-s and not anything else. So we switched to "WAIT FOR
> LSN", which would allow us to add variations like "WAIT FOR XID" or
> "WAIT FOR COMMIT TOKEN" in the future if we wanted. A few people seemed
> to imply that this kind of syntax is expandable enough:
>
> On 2018-02-01 14:47, Simon Riggs wrote:
> > I agree that WAIT LSN is good syntax because this allows us to wait
> > for something else in future.
>
> On 2017-10-31 12:42:56, Ants Aasma wrote:
> > For lack of a better proposal I would like something along the lines
> > of:
> > WAIT FOR state_id[, state_id] [ OPTIONS (..)]
>
> As for giving up waiting for multiple events: we can only wait for LSN-s
> at the moment, and there seems to be no point in waiting for multiple
> LSN-s at once, because it's equivalent to waiting for the biggest LSN.
> So we opted for simpler grammar for now, only letting the user specify
> one LSN and one TIMEOUT. If in the future we allow waiting for something
> else, like XID-s, we can expand the grammar as needed.
>
> What are your own thoughts on the syntax?
>

I don't know how users can specify the LSN value but OTOH I could see
if users can somehow provide the correct value of commit LSN for which
they want to wait, then it could work out.  It is possible that I
misread and we have a consensus on WAIT FOR LSN [option] because I saw
what Simon and Ants have proposed includes multiple state/events and
it might be fine to have just one event for now.

Anyone else wants to share an opinion on syntax?

I think even if we are good with syntax, I could see the code is not
completely ready to go as mentioned in few comments raised by me.  I
am not sure if we want to commit it in the current form and then
improve after feature freeze.  If it is possible to fix the loose ends
quickly and there are no more comments by anyone then probably we
might be able to commit it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-04-07 12:58, Amit Kapila wrote:
> 
> Review comments:
> 1.
> +static void
> +DeleteEvent(void)
> I don't see how this is implemented or called to handle any errors.
> 
> 2.
> + if (InterruptPending)
> + {
> + DeleteEvent();
> + ProcessInterrupts();
> + }
> We generally do this type of handling via CHECK_FOR_INTERRUPTS.
> 
> 3.
> +int
> +WaitUtility(XLogRecPtr target_lsn, const float8 secs)
> Isn't it better to have a return value as bool?
> 
> 4.
> +#define GetNowFloat() ((float8) GetCurrentTimestamp() / 1000000.0)
> This same define is used elsewhere in the code as well, may be we can
> define it in some central place and use it.

Thank you for your review!
Ivan and I have worked on the patch and tried to address your comments:

0. Now we wake up at least every 100ms to check for interrupts.
1. Now we call DeleteWaitedLSN() from 
ProcessInterrupts()=>LockErrorCleanup(). It seems that we can only exit 
the WAIT cycle improperly due to interrupts, so this should be enough 
(?)
2. Now we use CHECK_FOR_INTERRUPTS() instead of ProcessInterrupts()
3. Now WaitUtility() returns bool rather than int
4. Now GetNowFloat() is only defined at one place in the code

What we changed additionally:
- Prohibited using WAIT FOR LSN on master
- Added more tests
- Checked the code with pgindent and adjusted pgindent/typedefs.list
- Changed min_lsn's type to pg_atomic_uint64 + fixed how we work with 
mutex
- Code cleanup in wait.[c|h]: cleaned up #include-s, gave better names 
to functions, changed elog() to ereport()

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexander Korotkov
Дата:
On Tue, Apr 7, 2020 at 10:58 PM Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
> Thank you for your review!
> Ivan and I have worked on the patch and tried to address your comments:

I've pushed this.  I promise to do careful post-commit review and
resolve any issues arising.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kartyshov Ivan
Дата:
On 2020-04-08 00:27, Tom Lane wrote:
> Alexander Korotkov <akorotkov@postgresql.org> writes:
»   WAIT FOR LSN lsn [ TIMEOUT timeout ]
> 
> This seems like a really carelessly chosen syntax —- *three* new
> keywords, when you probably didn't need any.  Are you not aware that
> there is distributed overhead in the grammar for every keyword?
> Plus, each new keyword carries the risk of breaking existing
> applications, since it no longer works as an alias-not-preceded-by-AS.
> 

To avoid creating new keywords, we could change syntax in the following 
way:
WAIT FOR => DEPENDS ON
LSN => EVENT
TIMEOUT => WITH INTERVAL

So
START TRANSACTION WAIT FOR LSN '0/3F07A6B1' TIMEOUT 5000;
would instead look as
START TRANSACTION DEPENDS ON EVENT '0/3F07A6B1' WITH INTERVAL '5 
seconds';

[1] 
https://www.postgresql.org/message-id/28209.1586294824%40sss.pgh.pa.us

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexander Korotkov
Дата:
On Wed, Apr 8, 2020 at 2:14 AM Kartyshov Ivan
<i.kartyshov@postgrespro.ru> wrote:
> On 2020-04-08 00:27, Tom Lane wrote:
> > Alexander Korotkov <akorotkov@postgresql.org> writes:
> »   WAIT FOR LSN lsn [ TIMEOUT timeout ]
> >
> > This seems like a really carelessly chosen syntax —- *three* new
> > keywords, when you probably didn't need any.  Are you not aware that
> > there is distributed overhead in the grammar for every keyword?
> > Plus, each new keyword carries the risk of breaking existing
> > applications, since it no longer works as an alias-not-preceded-by-AS.
> >
>
> To avoid creating new keywords, we could change syntax in the following
> way:
> WAIT FOR => DEPENDS ON

Looks OK for me.

> LSN => EVENT

I think it's too generic.  Not every event is lsn.  TBH, lsn is not
event at all :)

I wonder is we can still use word lsn, but don't use keyword for that.
Can we take arbitrary non-quoted literal there and check it later?

> TIMEOUT => WITH INTERVAL

I'm not yet sure about this.  Probably there are better options.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kyotaro Horiguchi
Дата:
At Wed, 8 Apr 2020 02:52:55 +0300, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote in 
> On Wed, Apr 8, 2020 at 2:14 AM Kartyshov Ivan
> <i.kartyshov@postgrespro.ru> wrote:
> > On 2020-04-08 00:27, Tom Lane wrote:
> > > Alexander Korotkov <akorotkov@postgresql.org> writes:
> > »   WAIT FOR LSN lsn [ TIMEOUT timeout ]
> > >
> > > This seems like a really carelessly chosen syntax —- *three* new
> > > keywords, when you probably didn't need any.  Are you not aware that
> > > there is distributed overhead in the grammar for every keyword?
> > > Plus, each new keyword carries the risk of breaking existing
> > > applications, since it no longer works as an alias-not-preceded-by-AS.
> > >
> >
> > To avoid creating new keywords, we could change syntax in the following
> > way:
> > WAIT FOR => DEPENDS ON
> 
> Looks OK for me.
> 
> > LSN => EVENT
> 
> I think it's too generic.  Not every event is lsn.  TBH, lsn is not
> event at all :)
> 
> I wonder is we can still use word lsn, but don't use keyword for that.
> Can we take arbitrary non-quoted literal there and check it later?
> 
> > TIMEOUT => WITH INTERVAL
> 
> I'm not yet sure about this.  Probably there are better options.

How about something like the follows.

BEGIN AFTER ColId Sconst
BEGIN FOLOWING ColId Sconst

UNTIL <absolute time>;
LIMIT BY <interval>;
WITHIN Iconst;

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-04-08 04:09, Kyotaro Horiguchi wrote:
> How about something like the follows.
> 
> BEGIN AFTER ColId Sconst
> BEGIN FOLOWING ColId Sconst
> 
> UNTIL <absolute time>;
> LIMIT BY <interval>;
> WITHIN Iconst;
> 
> regards.

I like your suggested keywords! I think that "AFTER" + "WITHIN" sound 
the most natural. We could completely give up the LSN keyword for now. 
The final command could look something like:

BEGIN AFTER ‘0/303EC60’ WITHIN '5 seconds';
or
BEGIN AFTER ‘0/303EC60’ WITHIN 5000;

I'd like to hear others' opinions on the syntax as well.

-- 
Anna Akenteva
Postgres Professional:
The Russian Postgres Company
http://www.postgrespro.com



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Tom Lane
Дата:
Anna Akenteva <a.akenteva@postgrespro.ru> writes:
> I'd like to hear others' opinions on the syntax as well.

Pardon me for coming very late to the party, but it seems like there are
other questions that ought to be answered before we worry about any of
this.  Why is this getting grafted onto BEGIN/START TRANSACTION in the
first place?  It seems like it would be just as useful as a separate
command, if not more so.  You could always start a transaction just
after waiting.  Conversely, there might be reasons to want to wait
within an already-started transaction.

If it could survive as a separate command, then I'd humbly suggest
that it requires no grammar work at all.  You could just invent one
or more functions that take suitable parameters.

            regards, tom lane



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kyotaro Horiguchi
Дата:
At Wed, 08 Apr 2020 16:35:46 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Anna Akenteva <a.akenteva@postgrespro.ru> writes:
> > I'd like to hear others' opinions on the syntax as well.
> 
> Pardon me for coming very late to the party, but it seems like there are
> other questions that ought to be answered before we worry about any of
> this.  Why is this getting grafted onto BEGIN/START TRANSACTION in the
> first place?  It seems like it would be just as useful as a separate
> command, if not more so.  You could always start a transaction just
> after waiting.  Conversely, there might be reasons to want to wait
> within an already-started transaction.
> 
> If it could survive as a separate command, then I'd humbly suggest
> that it requires no grammar work at all.  You could just invent one
> or more functions that take suitable parameters.

The rationale for not being a fmgr function is stated in the following
comments.

https://www.postgresql.org/message-id/CAEepm%3D0V74EApmfv%3DMGZa24Ac_pV1vGrp3Ovnv-3rUXwxu9epg%40mail.gmail.com
| because it doesn't work for our 2 higher isolation levels as
| mentioned."

https://www.postgresql.org/message-id/CA%2BTgmob-aG3Lqh6OpvMDYTNR5eyq94VugyEejyk7pLhE9uwnyA%40mail.gmail.com

| IMHO, trying to do this using a function-based interface is a really
| bad idea for exactly the reasons you mention.  I don't see why we'd
| resist the idea of core syntax here; transactions are a core part of
| PostgreSQL.

It seemed to me that they were suggested it to be in a part of BEGIN
command, but the next proposed patch implemented "WAIT FOR" command
for uncertain reasons to me.  I don't object to the isolate command if
it is useful than a part of BEGIN command.

By the way, for example, pg_current_wal_lsn() is a volatile function
and repeated calls within a SERIALIZABLE transaction can return
different values.

If there's no necessity for this feature to be a core command, I think
I would like to be it a function.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Fujii Masao
Дата:

On 2020/04/09 16:11, Kyotaro Horiguchi wrote:
> At Wed, 08 Apr 2020 16:35:46 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
>> Anna Akenteva <a.akenteva@postgrespro.ru> writes:
>>> I'd like to hear others' opinions on the syntax as well.
>>
>> Pardon me for coming very late to the party, but it seems like there are
>> other questions that ought to be answered before we worry about any of
>> this.  Why is this getting grafted onto BEGIN/START TRANSACTION in the
>> first place?  It seems like it would be just as useful as a separate
>> command, if not more so.  You could always start a transaction just
>> after waiting.  Conversely, there might be reasons to want to wait
>> within an already-started transaction.
>>
>> If it could survive as a separate command, then I'd humbly suggest
>> that it requires no grammar work at all.  You could just invent one
>> or more functions that take suitable parameters.
> 
> The rationale for not being a fmgr function is stated in the following
> comments.

This issue happens because the function is executed after BEGIN? If yes,
what about executing the function (i.e., as separate transaction) before BEGIN?
If so, the snapshot taken in the function doesn't affect the subsequent
transaction whatever its isolation level is.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Tom Lane
Дата:
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
> On 2020/04/09 16:11, Kyotaro Horiguchi wrote:
>> At Wed, 08 Apr 2020 16:35:46 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
>>> Why is this getting grafted onto BEGIN/START TRANSACTION in the
>>> first place?

>> The rationale for not being a fmgr function is stated in the following
>> comments. [...]

> This issue happens because the function is executed after BEGIN? If yes,
> what about executing the function (i.e., as separate transaction) before BEGIN?
> If so, the snapshot taken in the function doesn't affect the subsequent
> transaction whatever its isolation level is.

I wonder whether making it a procedure, rather than a plain function,
would help any.

            regards, tom lane



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexey Kondratov
Дата:
On 2020-04-09 16:33, Tom Lane wrote:
> Fujii Masao <masao.fujii@oss.nttdata.com> writes:
>> On 2020/04/09 16:11, Kyotaro Horiguchi wrote:
>>> At Wed, 08 Apr 2020 16:35:46 -0400, Tom Lane <tgl@sss.pgh.pa.us> 
>>> wrote in
>>>> Why is this getting grafted onto BEGIN/START TRANSACTION in the
>>>> first place?
> 
>>> The rationale for not being a fmgr function is stated in the 
>>> following
>>> comments. [...]
> 
>> This issue happens because the function is executed after BEGIN? If 
>> yes,
>> what about executing the function (i.e., as separate transaction) 
>> before BEGIN?
>> If so, the snapshot taken in the function doesn't affect the 
>> subsequent
>> transaction whatever its isolation level is.
> 
> I wonder whether making it a procedure, rather than a plain function,
> would help any.
> 

Just another idea in case if one will still decide to go with a separate 
statement + BEGIN integration instead of a function. We could use 
parenthesized options list here. This is already implemented for VACUUM, 
REINDEX, etc. There was an idea to allow CONCURRENTLY in REINDEX there 
[1] and recently this was proposed again for new options [2], since it 
is much more extensible from the grammar perspective.

That way, the whole feature may look like:

WAIT (LSN '16/B374D848', TIMEOUT 100);

and/or

BEGIN
WAIT (LSN '16/B374D848', WHATEVER_OPTION_YOU_WANT);
...
COMMIT;

It requires only one reserved keyword 'WAIT'. The advantage of this 
approach is that it can be extended to support xid, timestamp, csn or 
anything else, that may be invented in the future, without affecting the 
grammar.

What do you think?

Personally, I find this syntax to be more convenient and human-readable 
compared with function call:

SELECT pg_wait_for_lsn('16/B374D848');
BEGIN;


[1] 
https://www.postgresql.org/message-id/aad2ec49-5142-7356-ffb2-a9b2649cdd1f%402ndquadrant.com

[2] 
https://www.postgresql.org/message-id/20200401060334.GB142683%40paquier.xyz


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Fujii Masao
Дата:

On 2020/04/10 3:16, Alexey Kondratov wrote:
> On 2020-04-09 16:33, Tom Lane wrote:
>> Fujii Masao <masao.fujii@oss.nttdata.com> writes:
>>> On 2020/04/09 16:11, Kyotaro Horiguchi wrote:
>>>> At Wed, 08 Apr 2020 16:35:46 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
>>>>> Why is this getting grafted onto BEGIN/START TRANSACTION in the
>>>>> first place?
>>
>>>> The rationale for not being a fmgr function is stated in the following
>>>> comments. [...]
>>
>>> This issue happens because the function is executed after BEGIN? If yes,
>>> what about executing the function (i.e., as separate transaction) before BEGIN?
>>> If so, the snapshot taken in the function doesn't affect the subsequent
>>> transaction whatever its isolation level is.
>>
>> I wonder whether making it a procedure, rather than a plain function,
>> would help any.
>>
> 
> Just another idea in case if one will still decide to go with a separate statement + BEGIN integration instead of a
function.We could use parenthesized options list here. This is already implemented for VACUUM, REINDEX, etc. There was
anidea to allow CONCURRENTLY in REINDEX there [1] and recently this was proposed again for new options [2], since it is
muchmore extensible from the grammar perspective.
 
> 
> That way, the whole feature may look like:
> 
> WAIT (LSN '16/B374D848', TIMEOUT 100);
> 
> and/or
> 
> BEGIN
> WAIT (LSN '16/B374D848', WHATEVER_OPTION_YOU_WANT);
> ...
> COMMIT;
> 
> It requires only one reserved keyword 'WAIT'. The advantage of this approach is that it can be extended to support
xid,timestamp, csn or anything else, that may be invented in the future, without affecting the grammar.
 
> 
> What do you think?
> 
> Personally, I find this syntax to be more convenient and human-readable compared with function call:
> 
> SELECT pg_wait_for_lsn('16/B374D848');
> BEGIN;

I can imagine that some users want to specify the LSN to wait for,
from the result of another query, for example,
SELECT pg_wait_for_lsn(lsn) FROM xxx. If this is valid use case,
isn't the function better?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Alexey Kondratov
Дата:
On 2020-04-10 05:25, Fujii Masao wrote:
> On 2020/04/10 3:16, Alexey Kondratov wrote:
>> Just another idea in case if one will still decide to go with a 
>> separate statement + BEGIN integration instead of a function. We could 
>> use parenthesized options list here. This is already implemented for 
>> VACUUM, REINDEX, etc. There was an idea to allow CONCURRENTLY in 
>> REINDEX there [1] and recently this was proposed again for new options 
>> [2], since it is much more extensible from the grammar perspective.
>> 
>> That way, the whole feature may look like:
>> 
>> WAIT (LSN '16/B374D848', TIMEOUT 100);
>> 
>> and/or
>> 
>> BEGIN
>> WAIT (LSN '16/B374D848', WHATEVER_OPTION_YOU_WANT);
>> ...
>> COMMIT;
>> 
>> It requires only one reserved keyword 'WAIT'. The advantage of this 
>> approach is that it can be extended to support xid, timestamp, csn or 
>> anything else, that may be invented in the future, without affecting 
>> the grammar.
>> 
>> What do you think?
>> 
>> Personally, I find this syntax to be more convenient and 
>> human-readable compared with function call:
>> 
>> SELECT pg_wait_for_lsn('16/B374D848');
>> BEGIN;
> 
> I can imagine that some users want to specify the LSN to wait for,
> from the result of another query, for example,
> SELECT pg_wait_for_lsn(lsn) FROM xxx. If this is valid use case,
> isn't the function better?
> 

I think that the main purpose of the feature is to achieve 
read-your-writes-consistency, while using async replica for reads. In 
that case lsn of last modification is stored inside application, so 
there is no need to do any query for that. Moreover, you cannot store 
this lsn inside database, since reads are distributed across all 
replicas (+ primary).

Thus, I could imagine that 'xxx' in your example states for some kind of 
stored procedure, that fetches lsn from the off-postgres storage, but it 
looks like very narrow case to count on it, doesn't it?

Anyway, I am not against implementing this as a function. That was just 
another option to consider.

Just realized that the last patch I have seen does not allow usage of 
wait on primary. It may be a problem if reads are pooled not only across 
replicas, but on primary as well, which should be quite usual I guess. 
In that case application does not know either request will be processed 
on replica, or on primary. I think it should be allowed without any 
warning, or just saying some LOG/DEBUG at most, that there was no 
waiting performed.


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Andres Freund
Дата:
Hi,

On 2020-04-10 11:25:02 +0900, Fujii Masao wrote:
> > BEGIN
> > WAIT (LSN '16/B374D848', WHATEVER_OPTION_YOU_WANT);
> > ...
> > COMMIT;
> > 
> > It requires only one reserved keyword 'WAIT'. The advantage of this approach is that it can be extended to support
xid,timestamp, csn or anything else, that may be invented in the future, without affecting the grammar.
 
> > 
> > What do you think?
> > 
> > Personally, I find this syntax to be more convenient and human-readable compared with function call:
> > 
> > SELECT pg_wait_for_lsn('16/B374D848');
> > BEGIN;
> 
> I can imagine that some users want to specify the LSN to wait for,
> from the result of another query, for example,
> SELECT pg_wait_for_lsn(lsn) FROM xxx. If this is valid use case,
> isn't the function better?

I don't think a function is a good idea - it'll cause a snapshot to be
held while waiting. Which in turn will cause hot_standby_feedback to not
be able to report an increased xmin up. And it will possibly hit
snapshot recovery conflicts.

Whereas explicit syntax, especially if a transaction control statement,
won't have that problem.

I'd personally look at 'AFTER' instead of 'WAIT'. Upthread you talked
about a reserved keyword - why does it have to be reserved?


FWIW, I'm not really convinced there needs to be bespoke timeout syntax
for this feature. I can see reasons why you'd not just want to rely on
statement_timeout, but at least that should be discussed.

Greetings,

Andres Freund



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> I don't think a function is a good idea - it'll cause a snapshot to be
> held while waiting. Which in turn will cause hot_standby_feedback to not
> be able to report an increased xmin up. And it will possibly hit
> snapshot recovery conflicts.

Good point, but we could address that by making it a procedure no?

            regards, tom lane



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Andres Freund
Дата:
Hi,

On 2020-04-10 16:29:39 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I don't think a function is a good idea - it'll cause a snapshot to be
> > held while waiting. Which in turn will cause hot_standby_feedback to not
> > be able to report an increased xmin up. And it will possibly hit
> > snapshot recovery conflicts.
>
> Good point, but we could address that by making it a procedure no?

Probably. Don't think we have great infrastructure for builtin
procedures yet though? We'd presumably not want to use plpgsql.

ISTM that we can make it BEGIN AFTER 'xx/xx' or such, which'd not
require any keywords, it'd be easier to use than a procedure.

With a separate procedure, you'd likely need more roundtrips / complex
logic at the client. You either need to check first if the procedure
errored ou, and then send the BEGIN, or send both together and separate
out potential errors.

Greetings,

Andres Freund



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2020-04-10 16:29:39 -0400, Tom Lane wrote:
>> Good point, but we could address that by making it a procedure no?

> Probably. Don't think we have great infrastructure for builtin
> procedures yet though? We'd presumably not want to use plpgsql.

Don't think anyone's tried yet.  It's not instantly clear that the
amount of code needed would be more than comes along with new
syntax, though.

> ISTM that we can make it BEGIN AFTER 'xx/xx' or such, which'd not
> require any keywords, it'd be easier to use than a procedure.

I still don't see a good argument for tying this to BEGIN.  If it
has to be a statement, why not a standalone statement?

(I also have a lurking suspicion that this shouldn't be SQL at all
but part of the replication command set.)

            regards, tom lane



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Andres Freund
Дата:
Hi,

On 2020-04-10 17:17:10 -0400, Tom Lane wrote:
> > ISTM that we can make it BEGIN AFTER 'xx/xx' or such, which'd not
> > require any keywords, it'd be easier to use than a procedure.
> 
> I still don't see a good argument for tying this to BEGIN.  If it
> has to be a statement, why not a standalone statement?

Because the goal is to start a transaction where a certain action from
the primary is visible.

I think there's also some advantages of having it in a single statement
for poolers. If a pooler analyzes BEGIN AFTER 'xx/xx' it could
e.g. redirect the transaction to a node that's caught up far enough,
instead of blocking. But that can't work even close to as easily if it's
something that has to be executed before transaction begin.


> (I also have a lurking suspicion that this shouldn't be SQL at all
> but part of the replication command set.)

Hm? I'm not quite following. The feature is useful to achieve
read-your-own-writes consistency. Consider

Primary: INSERT INTO app_users ...; SELECT pg_current_wal_lsn();
Standby: BEGIN AFTER 'returned/lsn';
Standby: SELECT i_am_a_set_of_very_expensive_queries FROM ..., app_users;

without the AFTER/WAIT whatnot, you cannot rely on the insert having
been replicated to the standby.

Offloading queries from the write node to replicas is a pretty standard
technique for scaling out databases (including PG). We just make it
harder than necessary.

How would this be part of the replication command set? This shouldn't
require replication permissions for the user executing the queries.
While I'm in favor of merging the replication protocol entirely with the
normal protocol, I've so far received very little support for that
proposition...

Greetings,

Andres Freund



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-04-11 00:44, Andres Freund wrote:
> I think there's also some advantages of having it in a single statement
> for poolers. If a pooler analyzes BEGIN AFTER 'xx/xx' it could
> e.g. redirect the transaction to a node that's caught up far enough,
> instead of blocking. But that can't work even close to as easily if 
> it's
> something that has to be executed before transaction begin.
> 

I think that's a good point.

Also, I'm not sure how we'd expect a wait-for-LSN procedure to work 
inside a single-snapshot transaction. Would it throw an error inside a 
RR transaction block? Would it give a warning?

-- 
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Daniel Gustafsson
Дата:
This patch require some rewording of documentation/comments and variable names
after the language change introduced by 229f8c219f8f..a9a4a7ad565b, the thread
below can be used as reference for how to change:

https://www.postgresql.org/message-id/flat/20200615182235.x7lch5n6kcjq4aue%40alap3.anarazel.de

cheers ./daniel





Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Anna Akenteva
Дата:
On 2020-07-13 14:21, Daniel Gustafsson wrote:
> This patch require some rewording of documentation/comments and 
> variable names
> after the language change introduced by 229f8c219f8f..a9a4a7ad565b, the 
> thread
> below can be used as reference for how to change:
> 
> https://www.postgresql.org/message-id/flat/20200615182235.x7lch5n6kcjq4aue%40alap3.anarazel.de
> 

Thank you for the heads up!

I updated the most recent patch and removed the use of "master" from it, 
replacing it with "primary".

-- 
Anna Akenteva
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Michael Paquier
Дата:
On Tue, Aug 18, 2020 at 01:12:51PM +0300, Anna Akenteva wrote:
> I updated the most recent patch and removed the use of "master" from it,
> replacing it with "primary".

This is failing to apply lately, causing the CF bot to complain:
http://cfbot.cputube.org/patch_29_772.log
--
Michael

Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
a.pervushina@postgrespro.ru
Дата:
Anna Akenteva писал 2020-04-08 22:36:
> On 2020-04-08 04:09, Kyotaro Horiguchi wrote:
> 
> I like your suggested keywords! I think that "AFTER" + "WITHIN" sound
> the most natural. We could completely give up the LSN keyword for now.
> The final command could look something like:
> 
> BEGIN AFTER ‘0/303EC60’ WITHIN '5 seconds';
> or
> BEGIN AFTER ‘0/303EC60’ WITHIN 5000;


Hello,

I've changed the syntax of the command from BEGIN [ WAIT FOR LSN value [ 
TIMEOUT delay ]] to BEGIN [ AFTER value [ WITHIN delay ]] and removed 
all the unnecessary keywords.

Best regards,
Alexandra Pervushina.
Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
a.pervushina@postgrespro.ru
Дата:
Hello,

I've changed the BEGIN WAIT FOR LSN statement to core functions 
pg_waitlsn, pg_waitlsn_infinite and pg_waitlsn_no_wait.
Currently the functions work inside repeatable read transactions, but 
waitlsn creates a snapshot if called first in a transaction block, which 
can possibly lead the transaction to working incorrectly, so the 
function gives a warning.

Usage examples
==========
select pg_waitlsn(‘LSN’, timeout);
select pg_waitlsn_infinite(‘LSN’);
select pg_waitlsn_no_wait(‘LSN’);


Вложения

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kyotaro Horiguchi
Дата:
Hello.

At Wed, 18 Nov 2020 15:05:00 +0300, a.pervushina@postgrespro.ru wrote in 
> I've changed the BEGIN WAIT FOR LSN statement to core functions
> pg_waitlsn, pg_waitlsn_infinite and pg_waitlsn_no_wait.
> Currently the functions work inside repeatable read transactions, but
> waitlsn creates a snapshot if called first in a transaction block,
> which can possibly lead the transaction to working incorrectly, so the
> function gives a warning.

According to the discuttion here, implementing as functions is not
optimal. As a Poc, I made it as a procedure. However I'm not sure it
is the correct implement as a native procedure but it seems working as
expected.

> Usage examples
> ==========
> select pg_waitlsn(‘LSN’, timeout);
> select pg_waitlsn_infinite(‘LSN’);
> select pg_waitlsn_no_wait(‘LSN’);

The first and second usage is coverd by a single procedure. The last
function is equivalent to pg_last_wal_replay_lsn(). As the result, the
following procedure is provided in the attached.

pg_waitlsn(wait_lsn pg_lsn, timeout integer DEFAULT -1)

Any opinions mainly compared to implementation as a command?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 470e113b33..4283b98eb4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_database.h"
 #include "commands/progress.h"
 #include "commands/tablespace.h"
+#include "commands/wait.h"
 #include "common/controldata_utils.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
@@ -7463,6 +7464,15 @@ StartupXLOG(void)
                     break;
                 }
 
+                /*
+                 * If we replayed an LSN that someone was waiting for,
+                 * set latches in shared memory array to notify the waiter.
+                 */
+                if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitedLSN())
+                {
+                    WaitSetLatch(XLogCtl->lastReplayedEndRecPtr);
+                }
+
                 /* Else, try to fetch the next WAL record */
                 record = ReadRecord(xlogreader, LOG, false);
             } while (record != NULL);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..c19d49e7a4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1460,6 +1460,10 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE PROCEDURE
+  pg_waitlsn(wait_lsn pg_lsn, timeout integer DEFAULT -1)
+  LANGUAGE internal AS 'pg_waitlsn';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index e8504f0ae4..2c0bd41336 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -60,6 +60,7 @@ OBJS = \
     user.o \
     vacuum.o \
     variable.o \
-    view.o
+    view.o \
+    wait.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index f9bbe97b50..959e96b7e0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -23,6 +23,7 @@
 #include "access/syncscan.h"
 #include "access/twophase.h"
 #include "commands/async.h"
+#include "commands/wait.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -149,6 +150,7 @@ CreateSharedMemoryAndSemaphores(void)
         size = add_size(size, BTreeShmemSize());
         size = add_size(size, SyncScanShmemSize());
         size = add_size(size, AsyncShmemSize());
+        size = add_size(size, WaitShmemSize());
 #ifdef EXEC_BACKEND
         size = add_size(size, ShmemBackendArraySize());
 #endif
@@ -268,6 +270,11 @@ CreateSharedMemoryAndSemaphores(void)
     SyncScanShmemInit();
     AsyncShmemInit();
 
+    /*
+     * Init array of events for the wait clause in shared memory
+     */
+    WaitShmemInit();
+
 #ifdef EXEC_BACKEND
 
     /*
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index c87ffc6549..2b4d73ba2f 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -38,6 +38,7 @@
 #include "access/transam.h"
 #include "access/twophase.h"
 #include "access/xact.h"
+#include "commands/wait.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -713,6 +714,9 @@ LockErrorCleanup(void)
 
     AbortStrongLockAcquire();
 
+    /* If waitlsn was interrupted, then stop waiting for that LSN */
+    DeleteWaitedLSN();
+
     /* Nothing to do if we weren't waiting for a lock */
     if (lockAwaited == NULL)
     {
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 4096faff9a..90876da120 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -373,8 +373,6 @@ pg_sleep(PG_FUNCTION_ARGS)
      * less than the specified time when WaitLatch is terminated early by a
      * non-query-canceling signal such as SIGHUP.
      */
-#define GetNowFloat()    ((float8) GetCurrentTimestamp() / 1000000.0)
-
     endtime = GetNowFloat() + secs;
 
     for (;;)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b5f52d4e4a..918eaedfd5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11375,4 +11375,8 @@
   proname => 'is_normalized', prorettype => 'bool', proargtypes => 'text text',
   prosrc => 'unicode_is_normalized' },
 
+{ oid => '9313', descr => 'wait for LSN to be replayed',
+  proname => 'pg_waitlsn',  prokind => 'p',prorettype => 'void', proargtypes => 'pg_lsn int4',
+  proargnames => '{wait_lsn,timeout}',
+  prosrc => 'pg_waitlsn' }
 ]
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 63bf71ac61..6c4ecd704d 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -113,4 +113,6 @@ extern int    date2isoyearday(int year, int mon, int mday);
 
 extern bool TimestampTimestampTzRequiresRewrite(void);
 
+#define GetNowFloat() ((float8) GetCurrentTimestamp() / 1000000.0)
+
 #endif                            /* TIMESTAMP_H */

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Ibrar Ahmed
Дата:


On Thu, Jan 21, 2021 at 1:30 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Hello.

At Wed, 18 Nov 2020 15:05:00 +0300, a.pervushina@postgrespro.ru wrote in
> I've changed the BEGIN WAIT FOR LSN statement to core functions
> pg_waitlsn, pg_waitlsn_infinite and pg_waitlsn_no_wait.
> Currently the functions work inside repeatable read transactions, but
> waitlsn creates a snapshot if called first in a transaction block,
> which can possibly lead the transaction to working incorrectly, so the
> function gives a warning.

According to the discuttion here, implementing as functions is not
optimal. As a Poc, I made it as a procedure. However I'm not sure it
is the correct implement as a native procedure but it seems working as
expected.

> Usage examples
> ==========
> select pg_waitlsn(‘LSN’, timeout);
> select pg_waitlsn_infinite(‘LSN’);
> select pg_waitlsn_no_wait(‘LSN’);

The first and second usage is coverd by a single procedure. The last
function is equivalent to pg_last_wal_replay_lsn(). As the result, the
following procedure is provided in the attached.

pg_waitlsn(wait_lsn pg_lsn, timeout integer DEFAULT -1)

Any opinions mainly compared to implementation as a command?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

The patch (pg_waitlsn_v10_2_kh.patch) does not compile successfully and has compilation errors. Can you please take a look?


xlog.c:45:10: fatal error: commands/wait.h: No such file or directory
#include "commands/wait.h"
^~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [<builtin>: xlog.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [../../../src/backend/common.mk:39: transam-recursive] Error 2
make[2]: *** [common.mk:39: access-recursive] Error 2
make[1]: *** [Makefile:42: all-backend-recurse] Error 2
make: *** [GNUmakefile:11: all-src-recurse] Error 2

I am changing the status to  "Waiting on Author"




--
Ibrar Ahmed

Re: [HACKERS] make async slave to wait for lsn to be replayed

От
Kyotaro Horiguchi
Дата:
At Thu, 18 Mar 2021 18:57:15 +0500, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote in 
> On Thu, Jan 21, 2021 at 1:30 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
> 
> > Hello.
> >
> > At Wed, 18 Nov 2020 15:05:00 +0300, a.pervushina@postgrespro.ru wrote in
> > > I've changed the BEGIN WAIT FOR LSN statement to core functions
> > > pg_waitlsn, pg_waitlsn_infinite and pg_waitlsn_no_wait.
> > > Currently the functions work inside repeatable read transactions, but
> > > waitlsn creates a snapshot if called first in a transaction block,
> > > which can possibly lead the transaction to working incorrectly, so the
> > > function gives a warning.
> >
> > According to the discuttion here, implementing as functions is not
> > optimal. As a Poc, I made it as a procedure. However I'm not sure it
> > is the correct implement as a native procedure but it seems working as
> > expected.
> >
> > > Usage examples
> > > ==========
> > > select pg_waitlsn(‘LSN’, timeout);
> > > select pg_waitlsn_infinite(‘LSN’);
> > > select pg_waitlsn_no_wait(‘LSN’);
> >
> > The first and second usage is coverd by a single procedure. The last
> > function is equivalent to pg_last_wal_replay_lsn(). As the result, the
> > following procedure is provided in the attached.
> >
> > pg_waitlsn(wait_lsn pg_lsn, timeout integer DEFAULT -1)
> >
> > Any opinions mainly compared to implementation as a command?
> >
> > regards.
> >
> > --
> > Kyotaro Horiguchi
> > NTT Open Source Software Center
> >
> 
> The patch (pg_waitlsn_v10_2_kh.patch) does not compile successfully and has
> compilation errors. Can you please take a look?
> 
> https://cirrus-ci.com/task/6241565996744704
> 
> xlog.c:45:10: fatal error: commands/wait.h: No such file or directory
> #include "commands/wait.h"
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> make[4]: *** [<builtin>: xlog.o] Error 1
> make[4]: *** Waiting for unfinished jobs....
> make[3]: *** [../../../src/backend/common.mk:39: transam-recursive] Error 2
> make[2]: *** [common.mk:39: access-recursive] Error 2
> make[1]: *** [Makefile:42: all-backend-recurse] Error 2
> make: *** [GNUmakefile:11: all-src-recurse] Error 2
> 
> I am changing the status to  "Waiting on Author"

Anna is the autor.  The "patch" was just to show how we can implement
the feature as a procedure. (Sorry for the bad mistake I made.)

The patch still applies to the master. So I resend just rebased
version as v10_2, and attached the "PoC" as *.txt which applies on top
of the patch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f8810e149..3c580083dd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_database.h"
 #include "commands/progress.h"
 #include "commands/tablespace.h"
+#include "commands/wait.h"
 #include "common/controldata_utils.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
@@ -7535,6 +7536,15 @@ StartupXLOG(void)
                     break;
                 }
 
+                /*
+                 * If we replayed an LSN that someone was waiting for,
+                 * set latches in shared memory array to notify the waiter.
+                 */
+                if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitedLSN())
+                {
+                    WaitSetLatch(XLogCtl->lastReplayedEndRecPtr);
+                }
+
                 /* Else, try to fetch the next WAL record */
                 record = ReadRecord(xlogreader, LOG, false);
             } while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index e8504f0ae4..2c0bd41336 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -60,6 +60,7 @@ OBJS = \
     user.o \
     vacuum.o \
     variable.o \
-    view.o
+    view.o \
+    wait.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
new file mode 100644
index 0000000000..1f2483672b
--- /dev/null
+++ b/src/backend/commands/wait.c
@@ -0,0 +1,297 @@
+/*-------------------------------------------------------------------------
+ *
+ * wait.c
+ *      Implements waitlsn, which allows waiting for events such as
+ *      LSN having been replayed on replica.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 2020, Regents of PostgresPro
+ *
+ * IDENTIFICATION
+ *      src/backend/commands/wait.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <math.h>
+
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "access/xlogdefs.h"
+#include "commands/wait.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/backendid.h"
+#include "storage/pmsignal.h"
+#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/sinvaladt.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/timestamp.h"
+
+/* Add to shared memory array */
+static void AddWaitedLSN(XLogRecPtr lsn_to_wait);
+
+/* Shared memory structure */
+typedef struct
+{
+    int            backend_maxid;
+    pg_atomic_uint64 min_lsn; /* XLogRecPtr of minimal waited for LSN */
+    slock_t        mutex;
+    /* LSNs that different backends are waiting */
+    XLogRecPtr    lsn[FLEXIBLE_ARRAY_MEMBER];
+} WaitState;
+
+static WaitState *state;
+
+/*
+ * Add the wait event of the current backend to shared memory array
+ */
+static void
+AddWaitedLSN(XLogRecPtr lsn_to_wait)
+{
+    SpinLockAcquire(&state->mutex);
+    if (state->backend_maxid < MyBackendId)
+        state->backend_maxid = MyBackendId;
+
+    state->lsn[MyBackendId] = lsn_to_wait;
+
+    if (lsn_to_wait < state->min_lsn.value)
+        state->min_lsn.value = lsn_to_wait;
+    SpinLockRelease(&state->mutex);
+}
+
+/*
+ * Delete wait event of the current backend from the shared memory array.
+ */
+void
+DeleteWaitedLSN(void)
+{
+    int            i;
+    XLogRecPtr    lsn_to_delete;
+
+    SpinLockAcquire(&state->mutex);
+
+    lsn_to_delete = state->lsn[MyBackendId];
+    state->lsn[MyBackendId] = InvalidXLogRecPtr;
+
+    /* If we are deleting the minimal LSN, then choose the next min_lsn */
+    if (lsn_to_delete != InvalidXLogRecPtr &&
+        lsn_to_delete == state->min_lsn.value)
+    {
+        state->min_lsn.value = PG_UINT64_MAX;
+        for (i = 2; i <= state->backend_maxid; i++)
+            if (state->lsn[i] != InvalidXLogRecPtr &&
+                state->lsn[i] < state->min_lsn.value)
+                state->min_lsn.value = state->lsn[i];
+    }
+
+    /* If deleting from the end of the array, shorten the array's used part */
+    if (state->backend_maxid == MyBackendId)
+        for (i = (MyBackendId); i >= 2; i--)
+            if (state->lsn[i] != InvalidXLogRecPtr)
+            {
+                state->backend_maxid = i;
+                break;
+            }
+
+    SpinLockRelease(&state->mutex);
+}
+
+/*
+ * Report amount of shared memory space needed for WaitState
+ */
+Size
+WaitShmemSize(void)
+{
+    Size        size;
+
+    size = offsetof(WaitState, lsn);
+    size = add_size(size, mul_size(MaxBackends + 1, sizeof(XLogRecPtr)));
+    return size;
+}
+
+/*
+ * Initialize an array of events to wait for in shared memory
+ */
+void
+WaitShmemInit(void)
+{
+    bool        found;
+    uint32        i;
+
+    state = (WaitState *) ShmemInitStruct("pg_wait_lsn",
+                                          WaitShmemSize(),
+                                          &found);
+    if (!found)
+    {
+        SpinLockInit(&state->mutex);
+
+        for (i = 0; i < (MaxBackends + 1); i++)
+            state->lsn[i] = InvalidXLogRecPtr;
+
+        state->backend_maxid = 0;
+        state->min_lsn.value = PG_UINT64_MAX;
+    }
+}
+
+/*
+ * Set latches in shared memory to signal that new LSN has been replayed
+ */
+void
+WaitSetLatch(XLogRecPtr cur_lsn)
+{
+    uint32        i;
+    int            backend_maxid;
+    PGPROC       *backend;
+
+    SpinLockAcquire(&state->mutex);
+    backend_maxid = state->backend_maxid;
+
+    for (i = 2; i <= backend_maxid; i++)
+    {
+        backend = BackendIdGetProc(i);
+
+        if (backend && state->lsn[i] != 0 &&
+            state->lsn[i] <= cur_lsn)
+        {
+            SetLatch(&backend->procLatch);
+        }
+    }
+    SpinLockRelease(&state->mutex);
+}
+
+/*
+ * Get minimal LSN that someone waits for
+ */
+XLogRecPtr
+GetMinWaitedLSN(void)
+{
+    return state->min_lsn.value;
+}
+
+/*
+ * On WAIT use a latch to wait till LSN is replayed,
+ * postmaster dies or timeout happens. Timeout is specified in milliseconds.
+ * Returns true if LSN was reached and false otherwise.
+ */
+bool
+WaitUtility(XLogRecPtr target_lsn, const int timeout_ms)
+{
+    XLogRecPtr    cur_lsn = GetXLogReplayRecPtr(NULL);
+    int            latch_events;
+    float8        endtime;
+    bool        res = false;
+    bool        wait_forever = (timeout_ms <= 0);
+
+    if (!RecoveryInProgress()) {
+        ereport(ERROR,
+                errmsg("Cannot use waitlsn on primary"));
+        return false;
+    }
+
+    /*
+     * In transactions, that have isolation level repeatable read or higher
+     * waitlsn creates a snapshot if called first in a block, which can
+     * lead the transaction to working incorrectly
+     */
+
+    if (IsTransactionBlock() && XactIsoLevel != XACT_READ_COMMITTED) {
+        ereport(WARNING,
+                errmsg("Waitlsn may work incorrectly in this isolation level"),
+                errhint("Call waitlsn before starting the transaction"));
+    }
+
+    endtime = GetNowFloat() + timeout_ms / 1000.0;
+
+    latch_events = WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH;
+
+    /* Check if we already reached the needed LSN */
+    if (cur_lsn >= target_lsn)
+        return true;
+
+    AddWaitedLSN(target_lsn);
+
+    for (;;)
+    {
+        int            rc;
+        float8        time_left = 0;
+        long        time_left_ms = 0;
+
+        time_left = endtime - GetNowFloat();
+
+        /* Use 100 ms as the default timeout to check for interrupts */
+        if (wait_forever || time_left < 0 || time_left > 0.1)
+            time_left_ms = 100;
+        else
+            time_left_ms = (long) ceil(time_left * 1000.0);
+
+        /* If interrupt, LockErrorCleanup() will do DeleteWaitedLSN() for us */
+        CHECK_FOR_INTERRUPTS();
+
+        /* If postmaster dies, finish immediately */
+        if (!PostmasterIsAlive())
+            break;
+
+        rc = WaitLatch(MyLatch, latch_events, time_left_ms,
+                       WAIT_EVENT_CLIENT_READ);
+
+        ResetLatch(MyLatch);
+
+        if (rc & WL_LATCH_SET)
+            cur_lsn = GetXLogReplayRecPtr(NULL);
+
+        if (rc & WL_TIMEOUT)
+        {
+            cur_lsn = GetXLogReplayRecPtr(NULL);
+            /* If the time specified by user has passed, stop waiting */
+            time_left = endtime - GetNowFloat();
+            if (!wait_forever && time_left <= 0.0)
+                break;
+        }
+
+        /* If LSN has been replayed */
+        if (target_lsn <= cur_lsn)
+            break;
+    }
+
+    DeleteWaitedLSN();
+
+    if (cur_lsn < target_lsn)
+        ereport(WARNING,
+                 errmsg("LSN was not reached"),
+                 errhint("Try to increase wait time."));
+    else
+        res = true;
+
+    return res;
+}
+
+Datum
+pg_waitlsn(PG_FUNCTION_ARGS)
+{
+    XLogRecPtr        trg_lsn = PG_GETARG_LSN(0);
+    uint64_t        delay = PG_GETARG_INT32(1);
+
+    PG_RETURN_BOOL(WaitUtility(trg_lsn, delay));
+}
+
+Datum
+pg_waitlsn_infinite(PG_FUNCTION_ARGS)
+{
+    XLogRecPtr        trg_lsn = PG_GETARG_LSN(0);
+
+    PG_RETURN_BOOL(WaitUtility(trg_lsn, 0));
+}
+
+Datum
+pg_waitlsn_no_wait(PG_FUNCTION_ARGS)
+{
+    XLogRecPtr        trg_lsn = PG_GETARG_LSN(0);
+
+    PG_RETURN_BOOL(WaitUtility(trg_lsn, 1));
+}
\ No newline at end of file
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 3e4ec53a97..fb8f8588a7 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -23,6 +23,7 @@
 #include "access/syncscan.h"
 #include "access/twophase.h"
 #include "commands/async.h"
+#include "commands/wait.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -150,6 +151,7 @@ CreateSharedMemoryAndSemaphores(void)
         size = add_size(size, BTreeShmemSize());
         size = add_size(size, SyncScanShmemSize());
         size = add_size(size, AsyncShmemSize());
+        size = add_size(size, WaitShmemSize());
 #ifdef EXEC_BACKEND
         size = add_size(size, ShmemBackendArraySize());
 #endif
@@ -270,6 +272,11 @@ CreateSharedMemoryAndSemaphores(void)
     SyncScanShmemInit();
     AsyncShmemInit();
 
+    /*
+     * Init array of events for the wait clause in shared memory
+     */
+    WaitShmemInit();
+
 #ifdef EXEC_BACKEND
 
     /*
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 897045ee27..540991146a 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -38,6 +38,7 @@
 #include "access/transam.h"
 #include "access/twophase.h"
 #include "access/xact.h"
+#include "commands/wait.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
@@ -716,6 +717,9 @@ LockErrorCleanup(void)
 
     AbortStrongLockAcquire();
 
+    /* If waitlsn was interrupted, then stop waiting for that LSN */
+    DeleteWaitedLSN();
+
     /* Nothing to do if we weren't waiting for a lock */
     if (lockAwaited == NULL)
     {
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 634f574d7e..50c836fdb7 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -375,8 +375,6 @@ pg_sleep(PG_FUNCTION_ARGS)
      * less than the specified time when WaitLatch is terminated early by a
      * non-query-canceling signal such as SIGHUP.
      */
-#define GetNowFloat()    ((float8) GetCurrentTimestamp() / 1000000.0)
-
     endtime = GetNowFloat() + secs;
 
     for (;;)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e259531f60..c11387961e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11411,4 +11411,19 @@
   proname => 'is_normalized', prorettype => 'bool', proargtypes => 'text text',
   prosrc => 'unicode_is_normalized' },
 
+{ oid => '16387', descr => 'wait for LSN until timeout',
+  proname => 'pg_waitlsn', prorettype => 'bool', proargtypes => 'pg_lsn int8',
+  proargnames => '{trg_lsn,delay}',
+  prosrc => 'pg_waitlsn' },
+
+{ oid => '16388', descr => 'wait for LSN for an infinite time',
+  proname => 'pg_waitlsn_infinite', prorettype => 'bool', proargtypes => 'pg_lsn',
+  proargnames => '{trg_lsn}',
+  prosrc => 'pg_waitlsn_infinite' },
+
+{ oid => '16389', descr => 'wait for LSN with no timeout',
+  proname => 'pg_waitlsn_no_wait', prorettype => 'bool', proargtypes => 'pg_lsn',
+  proargnames => '{trg_lsn}',
+  prosrc => 'pg_waitlsn_no_wait' },
+
 ]
diff --git a/src/include/commands/wait.h b/src/include/commands/wait.h
new file mode 100644
index 0000000000..fd21e43416
--- /dev/null
+++ b/src/include/commands/wait.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * wait.h
+ *      prototypes for commands/wait.c
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 2020, Regents of PostgresPro
+ *
+ * src/include/commands/wait.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef WAIT_H
+#define WAIT_H
+#include "postgres.h"
+#include "tcop/dest.h"
+#include "nodes/parsenodes.h"
+
+extern bool WaitUtility(XLogRecPtr lsn, const int timeout_ms);
+extern Size WaitShmemSize(void);
+extern void WaitShmemInit(void);
+extern void WaitSetLatch(XLogRecPtr cur_lsn);
+extern XLogRecPtr GetMinWaitedLSN(void);
+extern void DeleteWaitedLSN(void);
+
+#endif                            /* WAIT_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 63bf71ac61..6c4ecd704d 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -113,4 +113,6 @@ extern int    date2isoyearday(int year, int mon, int mday);
 
 extern bool TimestampTimestampTzRequiresRewrite(void);
 
+#define GetNowFloat() ((float8) GetCurrentTimestamp() / 1000000.0)
+
 #endif                            /* TIMESTAMP_H */
diff --git a/src/test/recovery/t/021_waitlsn.pl b/src/test/recovery/t/021_waitlsn.pl
new file mode 100644
index 0000000000..81dd70ef96
--- /dev/null
+++ b/src/test/recovery/t/021_waitlsn.pl
@@ -0,0 +1,91 @@
+# Checks waitlsn
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 11;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+
+# And some content and take a backup
+$node_primary->safe_psql('postgres',
+    "CREATE TABLE wait_test AS SELECT generate_series(1,10) AS a");
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Using the backup, create a streaming standby with a 1 second delay
+my $node_standby = get_new_node('standby');
+my $delay        = 1;
+$node_standby->init_from_backup($node_primary, $backup_name,
+    has_streaming => 1);
+$node_standby->append_conf('postgresql.conf', qq[
+    recovery_min_apply_delay = '${delay}s'
+]);
+$node_standby->start;
+
+# Check that timeouts make us wait for the specified time (1s here)
+my $current_time = $node_standby->safe_psql('postgres', "SELECT now()");
+my $two_seconds = 2000; # in milliseconds
+my $start_time = time();
+$node_standby->safe_psql('postgres',
+    "SELECT pg_waitlsn('0/FFFFFFFF', $two_seconds)");
+my $time_waited = (time() - $start_time) * 1000; # convert to milliseconds
+ok($time_waited >= $two_seconds, "waitlsn waits for enough time");
+
+# Check that timeouts let us stop waiting right away, before reaching target LSN
+$node_primary->safe_psql('postgres',
+    "INSERT INTO wait_test VALUES (generate_series(11, 20))");
+my $lsn1 = $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+my ($ret, $out, $err) = $node_standby->psql('postgres',
+    "SELECT pg_waitlsn('$lsn1', 1)");
+
+ok($ret == 0, "zero return value when failed to waitlsn on standby");
+ok($err =~ /WARNING:  LSN was not reached/,
+    "correct error message when failed to waitlsn on standby");
+ok($out eq "f", "if given too little wait time, WAIT doesn't reach target LSN");
+
+
+# Check that waitlsn works fine and reaches target LSN if given no timeout
+
+# Add data on primary, memorize primary's last LSN
+$node_primary->safe_psql('postgres',
+    "INSERT INTO wait_test VALUES (generate_series(21, 30))");
+my $lsn2 = $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+
+# Wait for it to appear on replica, memorize replica's last LSN
+$node_standby->safe_psql('postgres',
+    "SELECT pg_waitlsn_infinite('$lsn2')");
+my $reached_lsn = $node_standby->safe_psql('postgres',
+    "SELECT pg_last_wal_replay_lsn()");
+
+# Make sure that primary's and replica's LSNs are the same after WAIT
+my $compare_lsns = $node_standby->safe_psql('postgres',
+    "SELECT pg_lsn_cmp('$reached_lsn'::pg_lsn, '$lsn2'::pg_lsn)");
+ok($compare_lsns eq 0,
+    "standby reached the same LSN as primary before starting transaction");
+
+
+# Make sure that it's not allowed to use waitlsn on primary
+($ret, $out, $err) = $node_primary->psql('postgres',
+    "SELECT pg_waitlsn_infinite('0/FFFFFFFF')");
+
+ok($ret != 0, "non-zero return value when trying to waitlsn on primary");
+ok($err =~ /ERROR:  Cannot use waitlsn on primary/,
+    "correct error message when trying to waitlsn on primary");
+ok($out eq '', "empty output when trying to waitlsn on primary");
+
+# Make sure that waitlsn gives a warning inside a read commited transaction
+
+($ret, $out, $err) = $node_standby->psql('postgres',
+    "BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT pg_waitlsn_no_wait('0/FFFFFFFF')");
+ok($ret == 0, "zero return value when trying to waitlsn in transaction");
+ok($err =~ /WARNING:  Waitlsn may work incorrectly in this isolation level/,
+    "correct warning message when trying to waitlsn in transaction");
+ok($out eq "f", "non empty output when trying to waitlsn in transaction");
+
+$node_standby->stop;
+$node_primary->stop;
\ No newline at end of file
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1d1d5d2f0e..6075ee5e77 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2730,6 +2730,7 @@ WaitEventIPC
 WaitEventSet
 WaitEventTimeout
 WaitPMResult
+WaitState
 WalCloseMethod
 WalLevel
 WalRcvData
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0dca65dc7b..635508639a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1474,6 +1474,10 @@ LANGUAGE internal
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'unicode_is_normalized';
 
+CREATE OR REPLACE PROCEDURE
+  pg_waitlsn(wait_lsn pg_lsn, timeout integer DEFAULT -1)
+  LANGUAGE internal AS 'pg_waitlsn';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index c11387961e..7f25938cbc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11426,4 +11426,8 @@
   proargnames => '{trg_lsn}',
   prosrc => 'pg_waitlsn_no_wait' },
 
+{ oid => '9313', descr => 'wait for LSN to be replayed',
+  proname => 'pg_waitlsn',  prokind => 'p',prorettype => 'void', proargtypes => 'pg_lsn int4',
+  proargnames => '{wait_lsn,timeout}',
+  prosrc => 'pg_waitlsn' }
 ]