Обсуждение: "38.10.10. Shared Memory and LWLocks" may require a clarification

Поиск
Список
Период
Сортировка

"38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi,

While re-reading 38.10.10. Shared Memory and LWLocks [1] and the
corresponding code in pg_stat_statements.c I noticed that there are
several things that can puzzle the reader.

The documentation and the example suggest that LWLock* should be
stored within a structure in shared memory:

```
typedef struct pgssSharedState
{
    LWLock       *lock;
    /* ... etc ... */
} pgssSharedState;
```

... and initialized like this:

```
    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);

    pgss = ShmemInitStruct("pg_stat_statements",
                           sizeof(pgssSharedState),
                           &found);

    if (!found)
    {
        pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
        /* ... */
    }

    /* ... */

    LWLockRelease(AddinShmemInitLock);
```

It is not clear why placing LWLock* in a local process memory would be a bug.

On top of that the documentation says:

"""
To avoid possible race-conditions, each backend should use the LWLock
AddinShmemInitLock when connecting to and initializing its allocation
of shared memory
"""

However it's not clear when a race-condition may happen. The rest of
the text gives an overall impression that the shmem_startup_hook will
be called by postmaster once (unless an extension places several hooks
in series). Thus there is no real need to ackquire AddinShmemInitLock
and it should be safe to store LWLock* in local process memory. This
memory will be inherited from postmaster by child processes and the
overall memory usage is going to be the same due to copy-on-write.

Perhaps we should clarify this.

Thoughts?

[1]: https://www.postgresql.org/docs/15/xfunc-c.html#XFUNC-SHARED-ADDIN

-- 
Best regards,
Aleksander Alekseev



Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi,

> However it's not clear when a race-condition may happen. The rest of
> the text gives an overall impression that the shmem_startup_hook will
> be called by postmaster once (unless an extension places several hooks
> in series). Thus there is no real need to ackquire AddinShmemInitLock
> and it should be safe to store LWLock* in local process memory. This
> memory will be inherited from postmaster by child processes and the
> overall memory usage is going to be the same due to copy-on-write.

I added some logs and comments to my toy extension [1] to demonstrate
this. Additionally I added a sleep() call to the shmem_startup_hook to
make sure there are no concurrent processes at the moment when the
hook is called (this change is not committed to the GitHub
repository):

```
@@ -35,6 +35,9 @@ experiment_shmem_request(void)
        RequestNamedLWLockTranche("experiment", 1);
 }

+#include <stdio.h>
+#include <unistd.h>
+
 static void
 experiment_shmem_startup(void)
 {
@@ -43,6 +46,8 @@ experiment_shmem_startup(void)
        elog(LOG, "experiment_shmem_startup(): pid = %d, postmaster = %d\n",
                MyProcPid, !IsUnderPostmaster);

+    sleep(30);
+
        if(prev_shmem_startup_hook)
                prev_shmem_startup_hook();
```

If we do `make && make install && make installcheck` and examine
.//tmp_check/log/001_basic_main.log we will see:

```
[6288] LOG:  _PG_init(): pid = 6288, postmaster = 1
[6288] LOG:  experiment_shmem_request(): pid = 6288, postmaster = 1
[6288] LOG:  experiment_shmem_startup(): pid = 6288, postmaster = 1
```

Also we can make sure that there is only one process running when
shmem_startup_hook is called.

So it looks like acquiring AddinShmemInitLock in the hook is redundant
and also placing LWLock* in local process memory instead of shared
memory is safe.

Unless I missed something, I suggest updating the documentation and
pg_stat_statements.c accordingly.

[1]: https://github.com/afiskon/postgresql-extensions/blob/main/006-shared-memory/experiment.c#L38

-- 
Best regards,
Aleksander Alekseev



Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi,

> Unless I missed something, I suggest updating the documentation and
> pg_stat_statements.c accordingly.

Since no one seems to object so far I prepared the patch.

-- 
Best regards,
Aleksander Alekseev

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi,

> Since no one seems to object so far I prepared the patch.

Turned out patch v1 fails on cfbot on Windows due to extra Assert I added [1]:

```
abort() has been calledTRAP: failed Assert("!IsUnderPostmaster"),
File: "../src/backend/storage/ipc/ipci.c", Line: 320, PID: 4040
abort() has been calledTRAP: failed Assert("!IsUnderPostmaster"),
File: "../src/backend/storage/ipc/ipci.c", Line: 320, PID: 3484
```

Which indicates that currently shmem_startup_hook **can** be called by
child processes on Windows. Not 100% sure if this is a desired
behavior considering the fact that it is inconsistent with the current
behavior on *nix systems.

Here is patch v2. Changes comparing to v1:

```
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -311,15 +311,8 @@ CreateSharedMemoryAndSemaphores(void)
        /*
         * Now give loadable modules a chance to set up their shmem allocations
         */
-       if (shmem_startup_hook)
-       {
-               /*
-                * The following assert ensures that
shmem_startup_hook is going to be
-                * called only by the postmaster, as promised in the
documentation.
-                */
-               Assert(!IsUnderPostmaster);
+       if (!IsUnderPostmaster && shmem_startup_hook)
                shmem_startup_hook();
-       }
 }
```

Thoughts?

[1]:
https://api.cirrus-ci.com/v1/artifact/task/4924036300406784/testrun/build/testrun/pg_stat_statements/regress/log/postmaster.log

-- 
Best regards,
Aleksander Alekseev

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi hackers,

That's me still talking to myself :)

> Thoughts?

Evidently this works differently from what I initially thought on
Windows due to lack of fork() on this system.

PFA the patch v3. Your feedback is most welcomed.

--
Best regards,
Aleksander Alekseev

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Michael Paquier
Дата:
On Tue, May 23, 2023 at 01:47:52PM +0300, Aleksander Alekseev wrote:
> That's me still talking to myself :)

Let's be two then.

> Evidently this works differently from what I initially thought on
> Windows due to lack of fork() on this system.

This comes down to the fact that processes executed with EXEC_BACKEND,
because Windows does not know how to do a fork(), need to update their
local variables to point to the shared memory structures already
created, so we have to call CreateSharedMemoryAndSemaphores() in this
case.

> PFA the patch v3. Your feedback is most welcomed.

+    <para>
+    It is convenient to use <literal>shmem_startup_hook</literal> which allows
+    placing all the code responsible for initializing shared memory in one place.
+    When using <literal>shmem_startup_hook</literal> the extension still needs
+    to acquire <function>AddinShmemInitLock</function> in order to work properly
+    on all the supported platforms including Windows.

Yeah, AddinShmemInitLock is useful because extensions have no base
point outside that, and they may want to update their local variables.
Still, this is not completely exact because EXEC_BACKEND on
non-Windows platform would still need it, so this should be mentioned.
Another thing is that extensions may do like autoprewarm.c, where
the shmem area is not initialized in the startup shmem hook.  This is
a bit cheating because this is a scenario where shmem_request_hook is
not requested, so shmem needs overallocation, but you'd also need a
LWLock in this case, even for non-WIN32.

+    on all the supported platforms including Windows. This is also the reason
+    why the return value of <function>GetNamedLWLockTranche</function> is
+    conventionally stored in shared memory instead of local process memory.
+    </para>

Not sure to follow this sentence, the result of GetNamedLWLockTranche
is the lock, so this sentence does not seem really necessary?

While we're on it, why not improving this part of the documentation more
modern?  We don't mention LWLockNewTrancheId() and
LWLockRegisterTranche() at all, so do you think that it would be worth
adding a sample of code with that, mentioning autoprewarm.c as
example?
--
Michael

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Michael,

On Fri, Oct 27, 2023 at 9:52 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, May 23, 2023 at 01:47:52PM +0300, Aleksander Alekseev wrote:
> > That's me still talking to myself :)
>
> Let's be two then.

Many thanks for your feedback.

> +    <para>
> +    It is convenient to use <literal>shmem_startup_hook</literal> which allows
> +    placing all the code responsible for initializing shared memory in one place.
> +    When using <literal>shmem_startup_hook</literal> the extension still needs
> +    to acquire <function>AddinShmemInitLock</function> in order to work properly
> +    on all the supported platforms including Windows.
>
> Yeah, AddinShmemInitLock is useful because extensions have no base
> point outside that, and they may want to update their local variables.
> Still, this is not completely exact because EXEC_BACKEND on
> non-Windows platform would still need it, so this should be mentioned.
> Another thing is that extensions may do like autoprewarm.c, where
> the shmem area is not initialized in the startup shmem hook.  This is
> a bit cheating because this is a scenario where shmem_request_hook is
> not requested, so shmem needs overallocation, but you'd also need a
> LWLock in this case, even for non-WIN32.

Got it. Let's simply remove the "including Windows" part then.

>
> +    on all the supported platforms including Windows. This is also the reason
> +    why the return value of <function>GetNamedLWLockTranche</function> is
> +    conventionally stored in shared memory instead of local process memory.
> +    </para>
>
> Not sure to follow this sentence, the result of GetNamedLWLockTranche
> is the lock, so this sentence does not seem really necessary?

To be honest, by now I don't remember what was meant here, so I
removed the sentence.

> While we're on it, why not improving this part of the documentation more
> modern?  We don't mention LWLockNewTrancheId() and
> LWLockRegisterTranche() at all, so do you think that it would be worth
> adding a sample of code with that, mentioning autoprewarm.c as
> example?

Agree, these functions deserve to be mentioned in this section.

PFA patch v4.

--
Best regards,
Aleksander Alekseev

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
Hi,

> PFA patch v4.

I didn't like the commit message. Here is the corrected patch. Sorry
for the noise.

-- 
Best regards,
Aleksander Alekseev

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Michael Paquier
Дата:
On Mon, Oct 30, 2023 at 04:10:17PM +0300, Aleksander Alekseev wrote:
> I didn't like the commit message. Here is the corrected patch. Sorry
> for the noise.

Sounds pretty much OK to me.  Thanks!
--
Michael

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Michael Paquier
Дата:
On Tue, Oct 31, 2023 at 10:33:25AM +0900, Michael Paquier wrote:
> Sounds pretty much OK to me.  Thanks!

The main thing I have found annoying in the patch was the term
"tranche ID", so I have reworded that to use tranche_id to match with
the surroundings and the routines of lwlock.h.  LWLockInitialize()
should also be mentioned, as it is as important as the two others.

The result has been applied as fe705ef6fc1d.
--
Michael

Вложения

Re: "38.10.10. Shared Memory and LWLocks" may require a clarification

От
Aleksander Alekseev
Дата:
> The result has been applied as fe705ef6fc1d.

Many thanks!

-- 
Best regards,
Aleksander Alekseev