Re: New replication mode: write

From: Fujii Masao
Subject: Re: New replication mode: write
Msg-id: CAHGQGwE+Zxk_yNw0rw8bo__+YzFOvEw3HtCb+8FQL=fzTaPxJA@mail.gmail.com
In reply to: New replication mode: write  (Fujii Masao <masao.fujii@gmail.com>)
Responses: Re: New replication mode: write
Re: New replication mode: write
List: pgsql-hackers
On Fri, Jan 13, 2012 at 7:30 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, Jan 13, 2012 at 9:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Fri, Jan 13, 2012 at 7:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>>> Thought? Comments?
>>
>> This is almost exactly the same as my patch series
>> "syncrep_queues.v[1,2].patch" earlier this year. Which I know because
>> I was updating that patch myself last night for 9.2. I'm about half
>> way through doing that, since you and I agreed in Ottawa I would do
>> this. Perhaps it is better if we work together?
>
> I think this comment is mostly pointless. We don't have time to work
> together and there's no real reason to. You know what you're doing, so
> I'll leave you to do it.
>
> Please add the Apply mode.

OK, will do.

> In my patch, the reason I avoided doing WRITE mode (which we had
> previously referred to as RECV) was that no fsync of the WAL contents
> takes place. In that case we are applying changes using un-fsynced WAL
> data and in case of crash this would cause a problem.

My patch has not changed the execution order of WAL flush and replay.
WAL records are always replayed after they are flushed by walreceiver.
So, such a problem doesn't happen.

But this means that a transaction might need to wait for a WAL flush caused
by a previous transaction even if WRITE mode is chosen. That limits the
performance gain from WRITE mode, and should be improved later, I think.
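
To illustrate the wait described above, here is a toy timing model. It is only
a sketch under stated assumptions, not the walreceiver code: it assumes a
single loop that handles one chunk of WAL at a time (write, acknowledge the
write, then flush), and the costs WRITE_COST and FLUSH_COST are made-up numbers.

```python
# Toy timing model (assumption): a single walreceiver loop handles one
# WAL chunk at a time: write, acknowledge the write, then flush.
WRITE_COST = 1   # hypothetical time units to write a chunk
FLUSH_COST = 10  # hypothetical time units to fsync a chunk


def write_ack_times(arrivals, flush_every_chunk=True):
    """Return the time at which each chunk's write is acknowledged."""
    now = 0
    acks = []
    for arrival in arrivals:
        # The loop cannot pick up a chunk before it arrives, or before
        # the previous iteration (including its flush) has finished.
        now = max(now, arrival) + WRITE_COST
        acks.append(now)          # a WRITE-mode commit could be released here
        if flush_every_chunk:
            now += FLUSH_COST     # the next chunk is stuck behind this flush
    return acks


# Two transactions whose WAL arrives at t=0 and t=2.
print(write_ack_times([0, 2]))                           # [1, 12]
print(write_ack_times([0, 2], flush_every_chunk=False))  # [1, 3]
```

In this model the second transaction's write is acknowledged at t=12 rather
than t=3, purely because it queues behind the first transaction's flush.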

> I was going to
> make the WalWriter available during recovery to cater for that. Do you
> not think that is no longer necessary?

That's still necessary to improve the performance in sync rep further, I think.
What I'd like to do (maybe in 9.3dev) after supporting WRITE mode is:

* Allow WAL records to be replayed before they are flushed to the disk.
* Add new GUC parameter specifying whether to allow the standby to defer
  WAL flush. If the parameter is false, walreceiver flushes WAL whenever it
  receives WAL (i.e., it's same as the current behavior). If true, walreceiver
  doesn't flush WAL at all. Instead, walwriter, backend or startup process
  does that. Walwriter periodically checks whether there is un-flushed WAL
  file, and flushes it if exists. When the buffer page is written out, backend
  or startup process forces WAL flush up to buffer's LSN.

If the above GUC parameter is set to true (i.e., walreceiver doesn't flush
WAL at all) and WRITE mode is chosen, a transaction doesn't need to wait
for WAL flush on the standby at all. Also, WAL would be flushed less often on
the standby, which should significantly reduce I/O load.
As a result, the performance of sync rep should improve considerably.
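
The deferred-flush idea above can be sketched as a small model. This is a
rough illustration under assumptions, not a patch: the class StandbyWal, the
flag defer_flush (standing in for the proposed GUC, which has no name yet) and
the integer LSNs are all hypothetical.

```python
class StandbyWal:
    """Toy model of standby WAL positions (LSNs are plain integers)."""

    def __init__(self, defer_flush):
        # defer_flush stands in for the proposed GUC; the name is hypothetical.
        self.defer_flush = defer_flush
        self.write_lsn = 0
        self.flush_lsn = 0
        self.flush_count = 0

    def flush_up_to(self, lsn):
        """Flush written WAL up to lsn if it is not already flushed."""
        target = min(lsn, self.write_lsn)
        if target > self.flush_lsn:
            self.flush_lsn = target
            self.flush_count += 1

    def receive(self, end_lsn):
        """walreceiver: write incoming WAL, flush only if not deferred."""
        self.write_lsn = end_lsn
        if not self.defer_flush:
            self.flush_up_to(end_lsn)

    def walwriter_tick(self):
        """walwriter: periodically flush whatever is still unflushed."""
        self.flush_up_to(self.write_lsn)

    def write_buffer_page(self, page_lsn):
        """backend/startup: WAL must be durable before the page is written."""
        self.flush_up_to(page_lsn)
        assert self.flush_lsn >= page_lsn


def run(defer_flush):
    wal = StandbyWal(defer_flush)
    for lsn in (100, 200, 300):
        wal.receive(lsn)
    wal.write_buffer_page(200)   # forces a flush up to the page's LSN
    wal.walwriter_tick()         # picks up the rest
    return wal.flush_lsn, wal.flush_count


print(run(defer_flush=False))  # (300, 3)
print(run(defer_flush=True))   # (300, 2)
```

Both runs end with the same flushed position, but the deferred run issues
fewer flushes, and none of them happen in the receive path.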

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

