Re: pg_basebackup may fail to send feedbacks.

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: pg_basebackup may fail to send feedbacks.
Дата
Msg-id 20150204.165815.245382587.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на pg_basebackup may fail to send feedbacks.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: pg_basebackup may fail to send feedbacks.  (Fujii Masao <masao.fujii@gmail.com>)
Список pgsql-hackers
I'm very sorry for confused report. The problem found in 9.4.0
and the diagnosis was mistakenly done on master.

9.4.0 has no problem of feedback delay caused by slow xlog
receiving on pg_basebackup mentioned in the previous mail. But
the current master still has this problem.

regards,

At Mon, 02 Feb 2015 13:48:34 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20150202.134834.100672846.horiguchi.kyotaro@lab.ntt.co.jp>
> Hello, I received an report that pg_basebackup with "-X stream"
> always exits with the following error.
> 
> > pg_basebackup: could not receive data from WAL stream: server closed the connection unexpectedly
> >         This probably means the server terminated abnormally
> >         before or while processing the request.
> 
> The walsender had been terminated by replication timeout a bit
> before the above message.
> 
> > LOG:  terminating walsender process due to replication timeout
> 
> ====
> 
> I digged into this and found that a accumulation of delay in
> receiving xlog stream can cause such an error. This hardly occurs
> on an ordinary environment but this would be caused by temporary
> (or permanent) heavy load on the receiver-side machine. A virtual
> machine environment could raises the chance, I suppose.
> 
> In HandleCopyStream(), the feedbacks are sent only after breaks
> of the xlog stream, so continuous flow of xlog stream prevents it
> from being sent with expected intervals. walsender sends
> keepalive message for such a case, but it arrives with a long
> delay being caught by the congestion of the stream, but this is
> not a problem because the keepalive is intended to be sent while
> idle.
> 
> I think that the status feedback should be sent whenever
> standby_message_timeout has elapsed just after (or before) an
> incoming message is processed. The seemingly most straightforward
> way to fix this is breaking the innner-loop (while(r != 0)) if
> the time to feedback comes as the attached patch #1
> (0001-Make-sure-to-send-...).
> 
> What do you thing about this?
> 
> regards,
> 
> 
> =====
> - How to reproduce the situation.
> 
> As mentioned before, this hardly occurs on ordinary
> environment. But simulating the heavy load by inserting a delay
> in HandleCopyStream() effectively let the error occur.
> 
> With the attached patch #2(insert_interval.diff), the following
> operation let us see the situation. The walsender reports timeout
> although the stream processing is delayed but running steadily.
> 
> postgresql.conf:
> > wal_level = hot_standby
> > max_wal_senders = 2
> > wal_sender_timeout = 20s  # * 2 of default of standby_message_timeout
> 
> Terminal1$ psql postgres
> postgres=# CREATE TABLE t1 (a text);
> postgres=# ALTER TABLE t1 ALTER COLUMN a SET STORAGE EXTERNAL;
> postgres=# INSERT INTO t1 (SELECT repeat('x', 10000) FROM generate_series(0, 99999)); -- about 1GB
> postgres=# ^D
> 
> Terminal1$ pgbench -i postgres
> Terminal1$ pgbench -T 600 -h localhost postgres
> 
> Terminal2$ pg_basebackupo -r 32k -X stream -D data -h localhost
> 
> ... in about a couple of minutes on my environment..
> 
> Terminal1:
> > LOG:  terminating walsender process due to replication timeout
> 
> ... after another couple of minutes.
> 
> Terminal2:
> > pg_basebackup: could not receive data from WAL stream: server closed...

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: [COMMITTERS] pgsql: Add API functions to libpq to interrogate SSL related stuff.
Следующее
От: Ian Barwick
Дата:
Сообщение: Docs: CREATE TABLESPACE minor markup fix