Discussion: PostgreSQL 9.3 not coming up after restart in CentOS
Hi,

I have installed postgresql-9.3 in my CentOS application. When the application starts for the first time, Postgres starts without any issue most of the time. But when I reboot CentOS, Postgres does not start on the subsequent boot and I get the following error:

LOG: invalid magic number 0000 in log segment 000000010000000000000002, offset 0
LOG: invalid primary checkpoint record
LOG: invalid magic number 0000 in log segment 000000010000000000000002, offset 0
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 2529) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure

In the location "/var/lib/pgsql/9.3/data/pg_xlog", I see 3 entries: 000000010000000000000002, 000000010000000000000003 and an empty directory archive_status.

When I try to restart Postgres, I get the same error. How can I start Postgres now? How can I avoid this problem in the future? Thanks. Let me know in case more information is needed.

--
View this message in context: http://postgresql.nabble.com/Postgresql-9-3-not-coming-up-after-restart-in-centos-tp5880435.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
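A note on the log above: "invalid magic number 0000 ... offset 0" says that the 16-bit magic field at the very start of the segment read back as zero, which is what one would expect if the beginning of the file holds zeros rather than WAL data. The following is only a minimal sketch for looking at that field by hand. The function names are made up for illustration, the 8192-byte length is an arbitrary sample size, and little-endian byte order is an assumption that holds on x86/x86_64 hosts. It only reads the file and repairs nothing.

```python
import struct


def read_page_magic(segment_path):
    """Return the 16-bit value stored at offset 0 of a WAL segment file."""
    with open(segment_path, "rb") as f:
        raw = f.read(2)
    if len(raw) < 2:
        raise ValueError("segment is shorter than 2 bytes")
    # Assumption: the file was written on a little-endian host (x86/x86_64).
    (magic,) = struct.unpack("<H", raw)
    return magic


def starts_with_zeros(segment_path, length=8192):
    """Return True if the first `length` bytes of the file are all zero bytes."""
    with open(segment_path, "rb") as f:
        chunk = f.read(length)
    return len(chunk) > 0 and chunk.count(0) == len(chunk)
```

For example, print(hex(read_page_magic("/var/lib/pgsql/9.3/data/pg_xlog/000000010000000000000002"))) run as a user that can read the data directory would show the value the startup process complained about.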
On 01/05/2016 11:18 AM, balajishanmugam@live.in wrote:
> Hi
>
> I have installed postgresql-9.3 in my centos application. When the
> application starts for first-time Postgres starts without any issue most of
> the time.

Not sure what you are talking about when you say CentOS application. Are you saying when CentOS starts, or an application that you wrote starts?

How did you install Postgres?

Can you show the init script that is starting/stopping Postgres?

If such a thing does not exist, how do you start/stop Postgres?

> But when I reboot the centOs, Postgres is not getting started on
> subsequent boot and I am getting the error,
>
> LOG: invalid magic number 0000 in log segment 000000010000000000000002,
> offset 0
> LOG: invalid primary checkpoint record
> LOG: invalid magic number 0000 in log segment 000000010000000000000002,
> offset 0
> LOG: invalid secondary checkpoint record
> PANIC: could not locate a valid checkpoint record
> LOG: startup process (PID 2529) was terminated by signal 6: Aborted
> LOG: aborting startup due to startup process failure
>
> in the location "/var/lib/pgsql/9.3/data/pg_xlog", I see 3 files
> 000000010000000000000002, 000000010000000000000003 and an empty directoy
> archive_status.
>
> When I tried to restart the Postgres, I am getting the same error. How to
> start the Postgres now? How to avoid this problem without happening in
> future? Thanks. Let me know incase if more information is needed.

--
Adrian Klaver
adrian.klaver@aklaver.com
By application I mean CentOS.

We are starting and stopping Postgres using a systemd service. We have a service file called postgresql9.3.service which is used to start or stop Postgres.

Excerpts of postgresql9.3.service:

ExecStartPre=/usr/pgsql-9.3/bin/postgresql93-check-db-dir ${PGDATA}
ExecStart=/usr/pgsql-9.3/bin/pg_ctl start -D ${PGDATA} -s -w -t 300
ExecStop=/usr/pgsql-9.3/bin/pg_ctl stop -D ${PGDATA} -s -m fast
ExecReload=/usr/pgsql-9.3/bin/pg_ctl reload -D ${PGDATA} -s
On Wed, Jan 6, 2016 at 12:36 PM, balajishanmugam@live.in <balajishanmugam@live.in> wrote:
> By application I mean centOS.
>
> We are starting and stopping Postgres using systemd service. We have a
> service file called postgresql9.3.service which is used to start or stop
> Postgres.
>
> Excerpts of postgresql9.3.service
>
> ExecStartPre=/usr/pgsql-9.3/bin/postgresql93-check-db-dir ${PGDATA}
> ExecStart=/usr/pgsql-9.3/bin/pg_ctl start -D ${PGDATA} -s -w -t 300
> ExecStop=/usr/pgsql-9.3/bin/pg_ctl stop -D ${PGDATA} -s -m fast
> ExecReload=/usr/pgsql-9.3/bin/pg_ctl reload -D ${PGDATA} -s

So how are you restarting CentOS? Orderly shutdown, pulling the power plug, etc.?

I'm wondering if you've got untrustworthy data storage underneath it (i.e. storage that lies about fsync) and maybe CentOS or the method of shutdown isn't allowing the drives to flush the data that they've already said they flushed but actually haven't.
Most of the time I restart CentOS by issuing the reboot command, which does an orderly shutdown of all the services. Sometimes I just pull the plug.

But the issue appears to be random. Is there a way, before Postgres starts, to check whether the data has been flushed and, if not, to flush it manually? Or is there a better way to avoid this issue?

Thank you!
On 01/06/2016 01:08 PM, balajishanmugam@live.in wrote:
> Most of the time I will be restarting centOS by issuing reboot command. Which
> will do the orderly shutdown of all the service and sometimes just pull the
> plug.
>
> But the issue appears to be random. Is there a way that before Postgres
> starts we can check whether data is flushed, if not flush it manually or any
> other better way to avoid this issue.

I would think the damage is done on shutdown, and by the time you start up again it is too late to do anything.

In that vein, what does the Postgres log show at the end of the shutdown sequence?

--
Adrian Klaver
adrian.klaver@aklaver.com
On Wed, Jan 6, 2016 at 2:08 PM, balajishanmugam@live.in <balajishanmugam@live.in> wrote:
> Most of the time I will be restarting centOS by issuing reboot command. Which
> will do the orderly shutdown of all the service and sometimes just pull the
> plug.
>
> But the issue appears to be random. Is there a way that before Postgres
> starts we can check whether data is flushed, if not flush it manually or any
> other better way to avoid this issue.

As Adrian mentioned, by the time you go for a startup of pgsql, the damage is already done during the previous shutdown.

The real issue here is that a properly operating server should be able to have the power plug pulled, and on boot up Postgres should be able to come back up. When pgsql can't come back up, it's usually due to an unreliable storage subsystem.

So what are you using for storage?
Hi,

For storage I am using a 2.5-inch SATA 3 SSD hard disk. It is about 60 GB. I have yet to get the log. I will post the Postgres log once I have it.

Thanks!
On Thu, Jan 7, 2016 at 1:41 PM, balajishanmugam@live.in <balajishanmugam@live.in> wrote:
> Hi,
>
> For storage I am using a 2.5inch SATA 3 SSD hard disk. It is about 60 GB. I
> am yet to get the log. I will post the Postgres log once I have it.
>
> Thanks!

Yeah, a lot of cheaper consumer-grade SSDs don't fsync safely. There are drives that do, and they're usually a bit more expensive. We use the Intel DC S3500 series at work and they do pass the "pull the power cables" test for us.
On 1/7/2016 12:41 PM, balajishanmugam@live.in wrote:
> For storage I am using a 2.5inch SATA 3 SSD hard disk. It is about 60 GB. I
> am yet to get the log. I will post the Postgres log once I have it.

Is this an enterprise-grade SSD with supercap backup? Or is it a consumer desktop/notebook-class SSD without write buffer protection? If the latter, it's lying about data being written, and if the power fails, poof, the last bunch of data that had just been written is likely gone.

--
john r pierce, recycling bits in santa cruz
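The "pull the power cables" test mentioned in this thread can be approximated roughly as follows. This is only a sketch under stated assumptions, not the test anyone in the thread actually ran: the function names and the 12-digit record format are invented for illustration, and for the result to mean anything the acknowledgements must be recorded somewhere that survives the power cut (for example printed over an SSH session to another machine). The idea is to write a numbered record, call fsync, and only then acknowledge the number. After cutting power and rebooting, any acknowledged record that is missing from the file means the drive reported a flush it had not completed.

```python
import os

RECORD = "{:012d}\n"  # fixed-width record: 12 digits plus a newline


def write_acknowledged(path, count, ack):
    """Append `count` numbered records; call ack(n) only after fsync returns."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for n in range(count):
            os.write(fd, RECORD.format(n).encode("ascii"))
            os.fsync(fd)  # ask the OS and the drive to make the record durable
            ack(n)        # in a real test, report n to another machine here
    finally:
        os.close(fd)


def missing_records(path, last_acked):
    """Return acknowledged record numbers that are absent from the file."""
    found = set()
    with open(path, "rb") as f:
        for line in f:
            text = line.strip()
            if len(text) == 12 and text.isdigit():
                found.add(int(text))
    return [n for n in range(last_acked + 1) if n not in found]
```

In use, write_acknowledged would run with a large count while the power is pulled, and after reboot missing_records(path, last_acked) would be called with the highest number the other machine saw. An empty list is the passing result. A single pass proves little, so the test is worth repeating several times.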