Re: Transaction-controlled robustness for replication

From	Simon Riggs
Subject	Re: Transaction-controlled robustness for replication
Date	25 July 2008, 17:02:57
Msg-id	1217005755.3894.954.camel@ebony.2ndQuadrant
In reply to	Re: Transaction-controlled robustness for replication (Jens-Wolfhard Schicke <drahflow@gmx.de>)
Responses	Re: Transaction-controlled robustness for replication (Markus Wanner <markus@bluegap.ch>)
List	pgsql-hackers


On Wed, 2008-07-23 at 10:49 +1000, Jens-Wolfhard Schicke wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Simon Riggs wrote:
> > Asynchronous commit controls whether we go to disk at time of commit, or
> > whether we defer this slightly. We have the same options with
> > replication: do we replicate at time of commit, or do we defer this
> > slightly for performance reasons. DRBD and other replication systems
> > show us that there is actually another difference when talking about
> > synchronous replication: do we go to disk on the standby before
> > acknowledging the primary?
> > 
> > We can generalise this as three closed questions, answered either Yes
> > (Synchronous) or No (Asynchronous)
> > 
> > * Does WAL get forced to disk on primary at commit time?
> > * Does WAL get forced across link to standby at commit time?
> > * Does WAL get forced to disk on standby at commit time?

> * Does WAL get applied [and synced] to disk on standby at commit time?

> This is important if you want to use the standby as a read-only.

That's an assumption - I'm not sure it's a requirement in all cases.

If a standby query needed to see particular data then the *query* would
wait until correct data has been applied. I certainly wouldn't want to
penalise writing transactions on the primary because there *might* be a
statement on the standby that wishes to see an updated view.
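
A minimal sketch of that idea, in which the reading query waits rather than
the writing transaction. Everything here is hypothetical and for illustration
only: wait_for_apply, required_pos and get_applied_pos are invented names, WAL
positions are modelled as plain integers, and none of this is an existing
PostgreSQL API.

```python
import time
from typing import Callable


def wait_for_apply(
    required_pos: int,
    get_applied_pos: Callable[[], int],
    timeout_s: float = 5.0,
    poll_s: float = 0.01,
) -> bool:
    """Block a standby query until the standby has applied WAL up to required_pos.

    Returns True once the applied position reaches required_pos, or False if
    the timeout expires first. The primary's commit path is never involved.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_applied_pos() >= required_pos:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

The cost of waiting falls only on the standby queries that actually need fresh
data; writers on the primary are unaffected.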

> I am slightly confused about what the fsync setting does to all this, hence
> the brackets.

There is no sync() during WAL apply when each individual transaction
hits commit. This is because there is "no WAL", i.e. changes come from
WAL to the database, so we have no need of a second WAL to protect the
changes being made.

> I think that questions 2 and 3 are trivially bundled together. Once the
> user can specify 2, implementing 3 should be trivial and vice versa.
> I am not even convinced that these need to be two different parameters.
> Also please note that an answer of "yes" to 3 means that 2 must also
> be answered "yes".

Yes, they are trivially bundled together, but there is benefit in doing
so. The difference between 2 and 3 is about performance and levels of
robustness.

Waiting for transfer across link to standby (only) is much faster than
waiting for transfer *and* waiting for fsync. Probably twice as fast in
a tightly coupled cluster, i.e. option 3 will make your transactions
somewhat more robust, but at twice the response time and half the
throughput.

> > We could represent this with 3 parameters:
> > synchronous_commit = on | off
> > synchronous_standby_transfer = on | off
> > synchronous_standby_wal_fsync = on | off
> synchronous_standby_apply = on | off    # just to propose a name
> 
> > Changing the parameter setting at transaction-level would be expensive
> > if we had to set three parameters.
> What exactly does "expensive" mean? All three parameters can probably be set
> in one TCP packet from client to server.

Expensive as in we need to parse and handle each statement separately.
With a single parameter the overhead is much lower.

> > Or we could use just a single parameter
> > synchronous_commit = 'AAA', 'SAA', 'SSA', 'SSS' or on |off when no
> > log-based replication is defined
> > 
> > Having the ability to set these at the transaction-level would be very
> > cool. Having it set via a *single* parameter would make it much more
> > viable to switch between AAA for bulk, low importance data and SSS for
> > very low volume, critical data, or somewhere in between on the same
> > server, at the same time.

> The problem with a single parameter is that everything becomes position
> dependent and if whyever a new parameter is introduced, it's not easy to
> upgrade old application code.

True, but what new parameter do you imagine?

> > So proposal in summary is
> > * allow various modes of synchronous replication for perf/robustness
> > * allow modes to be specified per-transaction
> > * allow modes to be specified as a single parameter

> How about creating named modes? 

Good idea

> This would give the user the ability to
> define more fine-grained control especially in larger clusters of fail-over/read-only
> servers without totally clogging the parameter space and application code.
> Whether this should be done SQL-style or in some config file is not so clear to me,
> although I'd prefer SQL-style like
> 
> CREATE SYNCHRONIZING MODE immediate_readonly AS
>   LOCAL        SYNCHRONOUS APPLY
>   192.168.0.10 SYNCHRONOUS APPLY        -- read-only slave
>   192.168.0.11 SYNCHRONOUS APPLY        -- read-only slave
>   192.168.0.20 SYNCHRONOUS SHIP         -- backup-server
>   192.168.0.21 SYNCHRONOUS SHIP         -- backup-server
>   192.168.0.30 SYNCHRONOUS FSYNC        -- backup-server with fast disks
> ;

That's not how we define parameter values, so no.

> and then something like
> 
> synchronize_mode = immediate_readonly;
> 
> Yeah, I know, give patches not pipe-dreams :)

Ah yes. Of course.

The only sensible options are these four:

AAA    
SAA    
SSA    
SSS

plus the existing on & off

So we give them 4 names and set it using a single parameter value.
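
To make the encoding concrete, here is a rough sketch of how a single
three-letter value could be decoded into the three yes/no questions from the
original proposal. It is illustrative only: CommitMode, parse_mode and the
field names are invented for this example, and the existing on/off values are
not handled.

```python
from typing import NamedTuple


class CommitMode(NamedTuple):
    """Answers to the three questions; True means Synchronous."""
    primary_fsync: bool     # WAL forced to disk on primary at commit time?
    standby_transfer: bool  # WAL forced across link to standby at commit time?
    standby_fsync: bool     # WAL forced to disk on standby at commit time?


# Only the four combinations listed above are accepted.
SENSIBLE_MODES = ("AAA", "SAA", "SSA", "SSS")


def parse_mode(value: str) -> CommitMode:
    """Decode a value such as 'SSA' into its three flags."""
    mode = value.strip().upper()
    if mode not in SENSIBLE_MODES:
        raise ValueError(
            f"unsupported mode {value!r}; expected one of {SENSIBLE_MODES}"
        )
    return CommitMode(*(ch == "S" for ch in mode))
```

Switching robustness level for one transaction is then a single assignment of
one value, rather than three separate settings to parse and handle.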

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
