Discussion: Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69
Alvaro Herrera wrote:

Heikki, Andres,

> Shortly after this patch was committed, buildfarm member locust (running
> Mac OS X 10.5 apparently) started failing the pg_upgrade check:
>
> command: "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" -o "-p 57632 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade'" start >> "pg_upgrade_server.log" 2>&1
> waiting for server to start....LOG: database system was shut down at 2013-12-19 12:51:16 CET
> LOG: invalid primary checkpoint record
> LOG: invalid secondary checkpoint link in control file
> PANIC: could not locate a valid checkpoint record

Any comment on this problem? Somehow ReadRecord is unable to find a
checkpoint, yet there's no error message to be seen anywhere, whereas
pg_resetxlog does report it:

> command: "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/install//Users/pgbuildfarm/Documents/workdir//HEAD/inst/bin/pg_resetxlog" -l000000010000000000000009 "/Users/pgbuildfarm/Documents/workdir/HEAD/pgsql.82393/contrib/pg_upgrade/tmp_check/data" >> "pg_upgrade_utility.log" 2>&1
> pg_resetxlog: could not read from directory "pg_xlog": Invalid argument

I cannot but think xlogreader is at fault.

Regardless of the solution to the Mac OS X problem, ISTM this should be
fixed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Hi,

On 2013-12-24 12:58:04 -0300, Alvaro Herrera wrote:
> [...]
> Any comment on this problem? Somehow ReadRecord is unable to find a
> checkpoint, yet there's no error message to be seen anywhere, whereas
> pg_resetxlog does report it:
> [...]
> I cannot but think xlogreader is at fault.
>
> Regardless of the solution to the Mac OS X problem, ISTM this should be
> fixed.

I didn't look at any code, and I won't today, but it doesn't look
surprising - the report when starting the server above is presumably the
one in ReadCheckpoint() (or similar) and it probably just reports that
ReadRecord() didn't return a record.

pg_resetxlog (which doesn't use xlogreader!) reports that it couldn't
read from directory "pg_xlog", so there's something wonky independently
from xlogreader.

I'd guess that the xlog.c read_page callback errors out without reporting
an error. IIRC we're logging some failures as DEBUG there, because
they really aren't unexpected, and normally just signal the end of
WAL.

Greetings,

Andres Freund

-- 
Andres Freund                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
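
A message of the form `could not read from directory "pg_xlog": Invalid argument` is what one would expect from a directory scan that inspects `errno` after `readdir()` returns NULL. The following is a minimal sketch of that pattern, assuming plain POSIX calls. `count_dir_entries` is a hypothetical helper written for illustration; it is not the pg_resetxlog source.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries in the directory `path` (including "." and ".." where
 * the platform returns them). Returns the count, or -1 after printing the
 * system error to stderr.
 *
 * Hypothetical helper for illustration only.
 */
int
count_dir_entries(const char *path)
{
    DIR        *dir;
    int         count = 0;
    int         saved_errno;

    assert(path != NULL);

    dir = opendir(path);
    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    for (;;)
    {
        /*
         * readdir() returns NULL both at end-of-directory and on error.
         * Only the error case sets errno, so clear it before every call.
         */
        errno = 0;
        if (readdir(dir) == NULL)
            break;
        count++;
    }

    /* Save errno before closedir() has a chance to overwrite it. */
    saved_errno = errno;
    closedir(dir);

    if (saved_errno != 0)
    {
        fprintf(stderr, "could not read from directory \"%s\": %s\n",
                path, strerror(saved_errno));
        return -1;
    }

    return count;
}
```

With a loop like this, any nonzero `errno` left behind by `readdir()` surfaces as a "could not read from directory" message, independent of any WAL-reading code.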

Andres Freund wrote:

> I didn't look at any code, and I won't today, but it doesn't look
> surprising - the report when starting the server above is presumably the
> one in ReadCheckpoint() (or similar) and it probably just reports that
> ReadRecord() didn't return a record.

How is this not surprising? Surely failing to find a checkpoint record
is not a problem to be taken lightly.

> pg_resetxlog (which doesn't use xlogreader!) reports that it couldn't
> read from directory "pg_xlog", so there's something wonky independently
> from xlogreader.

Yes, most likely there is. My point is that the LOG messages above
should have logged the system error that caused the checkpoint record to
be unfindable.

> I'd guess that the xlog.c read_page callback errors out without reporting
> an error. IIRC we're logging some failures as DEBUG there, because
> they really aren't unexpected, and normally just signal the end of
> WAL.

Hmm? At least, I recall something like an "unexpected pageaddr" message
is sometimes logged when end-of-WAL is found. Why would other error
messages be hidden?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
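
To make the request above concrete: the idea is that whatever layer fails to read the data should hand the underlying system error up to the caller, so that the "invalid checkpoint record" message can include it. Below is a minimal sketch of that shape in plain C. All names (`DemoReader`, `demo_read_page`, `demo_find_checkpoint`) and the output format are hypothetical and chosen for illustration; this is not the xlogreader API and not the actual xlog.c code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DEMO_ERRBUF_SIZE 256
#define DEMO_PAGE_SIZE   8192

/* Hypothetical reader state: just a buffer for the last error text. */
typedef struct DemoReader
{
    char        errbuf[DEMO_ERRBUF_SIZE];
} DemoReader;

/*
 * Read up to `len` bytes from the start of `path` into `buf`.
 * Returns the number of bytes read, or -1 on failure, in which case
 * reader->errbuf describes the system error instead of dropping it.
 */
int
demo_read_page(DemoReader *reader, const char *path, char *buf, size_t len)
{
    int         fd;
    int         saved_errno;
    ssize_t     nread;

    assert(reader != NULL && path != NULL && buf != NULL);
    reader->errbuf[0] = '\0';

    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        snprintf(reader->errbuf, sizeof(reader->errbuf),
                 "could not open file \"%s\": %s", path, strerror(errno));
        return -1;
    }

    nread = read(fd, buf, len);
    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);

    if (nread < 0)
    {
        snprintf(reader->errbuf, sizeof(reader->errbuf),
                 "could not read file \"%s\": %s", path,
                 strerror(saved_errno));
        return -1;
    }

    return (int) nread;
}

/*
 * Caller: when the page cannot be read, include the recorded cause in
 * the message rather than only saying the record is invalid.
 * Returns 0 on success, -1 on failure.
 */
int
demo_find_checkpoint(DemoReader *reader, const char *path)
{
    char        page[DEMO_PAGE_SIZE];

    if (demo_read_page(reader, path, page, sizeof(page)) < 0)
    {
        fprintf(stderr, "LOG: invalid primary checkpoint record: %s\n",
                reader->errbuf);
        return -1;
    }
    return 0;
}
```

Whether such failures should be reported at LOG or only at DEBUG when they merely mark the end of WAL is the open question in this thread; the sketch only shows that the cause can be carried up to whoever decides.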