Thread: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Arnd Baranowski
Date:
Hello,

I have a 16-inch MacBook Pro with an M3 Max in the base configuration
(48 GB RAM, 1 TB HD). The operating system is macOS Sonoma 14.3, the
latest release. I use the latest Postgres 14, installed via brew in the
standard configuration. From time to time (every second or third week)
I reboot the Mac, and for several weeks now (the last 4 reboots at
least) I have lost database data when rebooting: I fall back to a state
from several days before the reboot. This affects both structure and
data added. I cover this with backups, and it looks as though data is
kept in memory rather than written to the database. Apart from this,
the Mac and Postgres run fine.

Regards

Arnd Baranowski


Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Tom Lane
Date:
Arnd Baranowski <baranowski@oculeus.com> writes:
> I have a 16-inch MacBook Pro with an M3 Max in the base configuration
> (48 GB RAM, 1 TB HD). The operating system is macOS Sonoma 14.3, the
> latest release. I use the latest Postgres 14, installed via brew in the
> standard configuration. From time to time (every second or third week)
> I reboot the Mac, and for several weeks now (the last 4 reboots at
> least) I have lost database data when rebooting: I fall back to a state
> from several days before the reboot. This affects both structure and
> data added. I cover this with backups, and it looks as though data is
> kept in memory rather than written to the database. Apart from this,
> the Mac and Postgres run fine.

Hmm, what have you got the fsync and wal_sync_method GUCs set to?
What was the last macOS version that was stable for you?
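
One way to collect the first answer is to read the two settings straight out of postgresql.conf. The following is only a minimal Python sketch: the helper name read_gucs is hypothetical, the parser ignores include directives and any ALTER SYSTEM overrides in postgresql.auto.conf, and it would mis-handle a "#" inside a quoted value. Running SHOW fsync; and SHOW wal_sync_method; in psql remains the authoritative check, because it reports the values the server is actually using.

```python
import re

# Matches "name = value" or "name value" (the equals sign is optional).
SETTING = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\s*=?\s*(.*)$")


def read_gucs(conf_text, names=("fsync", "wal_sync_method")):
    """Return the last uncommented value for each requested parameter.

    When a parameter appears more than once in the file, the last
    entry wins, so later lines overwrite earlier ones here as well.
    """
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (naive)
        if not line:
            continue
        match = SETTING.match(line)
        if not match:
            continue
        key = match.group(1).lower()
        value = match.group(2).strip().strip("'")
        if key in names:
            found[key] = value
    return found


# Illustrative file contents, not taken from the reporter's machine.
sample = """
#fsync = on                  # commented-out default
wal_sync_method = 'open_datasync'
fsync = on
"""
print(read_gucs(sample))
```

A parameter that is missing from the result is simply not set in the file, in which case the server falls back to its built-in default.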

            regards, tom lane



Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Arnd Baranowski
Date:
Correction: fsync is "on" and wal_sync_method is set to "open_datasync".

> On 05.02.2024 at 21:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Arnd Baranowski <baranowski@oculeus.com> writes:
>> I have a 16-inch MacBook Pro with an M3 Max in the base configuration
>> (48 GB RAM, 1 TB HD). The operating system is macOS Sonoma 14.3, the
>> latest release. I use the latest Postgres 14, installed via brew in the
>> standard configuration. From time to time (every second or third week)
>> I reboot the Mac, and for several weeks now (the last 4 reboots at
>> least) I have lost database data when rebooting: I fall back to a state
>> from several days before the reboot. This affects both structure and
>> data added. I cover this with backups, and it looks as though data is
>> kept in memory rather than written to the database. Apart from this,
>> the Mac and Postgres run fine.
>
> Hmm, what have you got the fsync and wal_sync_method GUCs set to?
> What was the last macOS version that was stable for you?
>
>             regards, tom lane




Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Arnd Baranowski
Date:
The last stable macOS version seemed to be 14.1, but I am not sure; it
might have been a coincidence. Recently Brew forced me to upgrade
Postgres, which moved from 14.7 to 14.10. That was about the same time
my problem started.

> On 05.02.2024 at 21:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Arnd Baranowski <baranowski@oculeus.com> writes:
>> I have a 16-inch MacBook Pro with an M3 Max in the base configuration
>> (48 GB RAM, 1 TB HD). The operating system is macOS Sonoma 14.3, the
>> latest release. I use the latest Postgres 14, installed via brew in the
>> standard configuration. From time to time (every second or third week)
>> I reboot the Mac, and for several weeks now (the last 4 reboots at
>> least) I have lost database data when rebooting: I fall back to a state
>> from several days before the reboot. This affects both structure and
>> data added. I cover this with backups, and it looks as though data is
>> kept in memory rather than written to the database. Apart from this,
>> the Mac and Postgres run fine.
>
> Hmm, what have you got the fsync and wal_sync_method GUCs set to?
> What was the last macOS version that was stable for you?
>
>             regards, tom lane




Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Tom Lane
Date:
Arnd Baranowski <baranowski@oculeus.com> writes:
> Correction: fsync is "on" and wal_sync_method is set to "open_datasync".

That's what they should be.

I tried to reproduce this by selecting "Restart..." immediately after
creating/populating a table on my own MacBook running Sonoma 14.3.
After the reboot, the table was there with the expected contents.
Now, this test doesn't actually prove a heck of a lot about PG's
crash recovery, because I see in the postmaster log

2024-02-05 21:00:30.322 EST [1148] LOG:  database system was shut down at 2024-02-05 20:58:46 EST
2024-02-05 21:00:30.327 EST [1144] LOG:  database system is ready to accept connections

which indicates that Postgres had time to perform a clean shutdown
before the system rebooted.  (That is the expected scenario for an
OS reboot, assuming that the kernel delivers us SIGTERM as it's
required to do by POSIX and then gives us enough time to nail the
windows shut, which it's not required to do.)

The facts as you've presented them indicate that (1) checkpoints
weren't working, (2) we didn't get SIGTERM at system shutdown, *and*
(3) WAL wasn't written out to disk as it's supposed to be.  It's
a bit hard to credit that so many things are broken and nobody has
noticed.  I'm inclined to wonder if something is wrong with your
disk drive.

It would be interesting to know what appears in the first few lines
of your postmaster log after a data-losing restart.  Also, try
running with log_checkpoints = on for a while, and see if there are
log entries claiming successful checkpoint completion.
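
Scanning the log for these entries can be partly automated. What follows is a minimal Python sketch, not a definitive tool: the function name summarize_log is hypothetical, and it assumes the stock English server messages "database system was shut down at" (clean shutdown), "database system was interrupted" (startup after a crash) and "checkpoint complete" (logged when log_checkpoints is on). The exact wording may differ between PostgreSQL versions or with a translated lc_messages setting.

```python
def summarize_log(lines):
    """Count clean shutdowns, crash startups and completed checkpoints."""
    counts = {"clean_shutdown": 0, "crash_startup": 0, "checkpoint_complete": 0}
    for line in lines:
        if "database system was shut down at" in line:
            # Startup after an orderly shutdown.
            counts["clean_shutdown"] += 1
        elif "database system was interrupted" in line:
            # Startup that found the cluster not cleanly shut down.
            counts["crash_startup"] += 1
        elif "checkpoint complete" in line:
            # A checkpoint the server reports as finished.
            counts["checkpoint_complete"] += 1
    return counts


# The two log lines quoted earlier in this message.
sample = [
    "2024-02-05 21:00:30.322 EST [1148] LOG:  database system was shut down at 2024-02-05 20:58:46 EST",
    "2024-02-05 21:00:30.327 EST [1144] LOG:  database system is ready to accept connections",
]
print(summarize_log(sample))
# {'clean_shutdown': 1, 'crash_startup': 0, 'checkpoint_complete': 0}
```

A data-losing restart that shows only clean shutdowns and regular checkpoint completions would point away from crash recovery and toward something else, such as a different data directory being served.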

A different line of thought is that maybe the corruption is happening
because you have two postmasters started in the same data directory.
We have interlocks that are supposed to defend against that, but it'd
be a lot easier to credit that those aren't working than that all the
rest of this stuff broke.
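
A first check for the two-postmasters theory is to see which process currently claims the data directory. This is a minimal Python sketch under stated assumptions: a POSIX system such as macOS, and a postmaster.pid lock file in the data directory whose first line holds the owning PID. The function names and the placeholder path are hypothetical. The result does not by itself prove or rule out a second postmaster; it still has to be compared with the postgres processes that ps reports.

```python
import os
from pathlib import Path


def lockfile_pid(data_dir):
    """Return the PID recorded in postmaster.pid, or None if unusable."""
    try:
        text = (Path(data_dir) / "postmaster.pid").read_text()
        first_line = text.splitlines()[0]
        # Assumption: a stand-alone backend records a negated PID.
        return abs(int(first_line.strip()))
    except (FileNotFoundError, IndexError, ValueError):
        return None


def pid_is_alive(pid):
    """Check whether a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 tests for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


pid = lockfile_pid("/path/to/data")  # placeholder, not a real path
print(pid, pid is not None and pid_is_alive(pid))
```

If the recorded PID is alive but is not the postmaster that clients connect to, or if more than one postmaster shows up in ps for the same directory, the interlock theory deserves a closer look.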

            regards, tom lane



Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Arnd Baranowski
Date:
Hi Tom,

Thanks for the feedback and insights. I will follow your advice, observe,
and report back if I find something that could explain this behavior.

Regards

Arnd

> On 06.02.2024 at 03:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Arnd Baranowski <baranowski@oculeus.com> writes:
>> Correction: fsync is "on" and wal_sync_method is set to "open_datasync".
>
> That's what they should be.
>
> I tried to reproduce this by selecting "Restart..." immediately after
> creating/populating a table on my own MacBook running Sonoma 14.3.
> After the reboot, the table was there with the expected contents.
> Now, this test doesn't actually prove a heck of a lot about PG's
> crash recovery, because I see in the postmaster log
>
> 2024-02-05 21:00:30.322 EST [1148] LOG:  database system was shut down at 2024-02-05 20:58:46 EST
> 2024-02-05 21:00:30.327 EST [1144] LOG:  database system is ready to accept connections
>
> which indicates that Postgres had time to perform a clean shutdown
> before the system rebooted.  (That is the expected scenario for an
> OS reboot, assuming that the kernel delivers us SIGTERM as it's
> required to do by POSIX and then gives us enough time to nail the
> windows shut, which it's not required to do.)
>
> The facts as you've presented them indicate that (1) checkpoints
> weren't working, (2) we didn't get SIGTERM at system shutdown, *and*
> (3) WAL wasn't written out to disk as it's supposed to be.  It's
> a bit hard to credit that so many things are broken and nobody has
> noticed.  I'm inclined to wonder if something is wrong with your
> disk drive.
>
> It would be interesting to know what appears in the first few lines
> of your postmaster log after a data-losing restart.  Also, try
> running with log_checkpoints = on for a while, and see if there are
> log entries claiming successful checkpoint completion.
>
> A different line of thought is that maybe the corruption is happening
> because you have two postmasters started in the same data directory.
> We have interlocks that are supposed to defend against that, but it'd
> be a lot easier to credit that those aren't working than that all the
> rest of this stuff broke.
>
>             regards, tom lane




Re: PostgreSQL & latest Mac OS Sonoma, a possible bug / configuration issue

From: Arnd Baranowski
Date:
Hi Tom,

I completely removed Postgres and Brew from my Mac, reinstalled
everything from scratch, and moved to PostgreSQL 16. The issue is gone.
It looks like a broken PostgreSQL 14 installation caused the problem.

Regards

Arnd
