Re: Replication Ideas

From	Dennis Gearon
Subject	Re: Replication Ideas
Date	27 August 2003, 05:57:15
Msg-id	3F4C34B7.3050307@fireserve.net
In reply to	Re: Replication Ideas (Jan Wieck <JanWieck@Yahoo.com>)
List	pgsql-general

Jan Wieck wrote:

> WARNING: This is getting long ...
>
> Postgres-R is a very interesting and inspiring idea. And I've been
> kicking that concept around for a while now. What I don't like about
> it is that it requires fundamental changes in the lock mechanism and
> that it is based on the assumption of very low lock conflict.
>
> <explain-PG-R>
> In Postgres-R a committing transaction sends its workset (WS - a list
> of all updates done in this transaction) to the group communication
> system (GC). The GC guarantees total order, meaning that all nodes
> will receive all WSs in the same order, no matter how they have been
> sent.
>
> If a node receives back its own WS before any error occurred, it goes
> ahead and finalizes the commit. If it receives a foreign WS, it has to
> apply the whole WS and commit it before it can process anything else.
> If now a local transaction, in progress or while waiting for its WS
> to come back, holds a lock that is required to process such a remote WS,
> the local transaction needs to be aborted to unlock its resources ...
> it lost the total order race.
> </explain-PG-R>
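
A minimal sketch of the delivery rule described in the <explain-PG-R>
block above, written in Python. The names WriteSet, Node and deliver, and
the use of sets of row names as stand-ins for real locks, are assumptions
made for illustration; none of this is Postgres-R code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteSet:
    origin: str       # name of the node whose transaction produced this WS
    txid: int         # transaction id (hypothetical, for bookkeeping only)
    rows: frozenset   # rows updated; stands in for the locks the WS needs


@dataclass
class Node:
    name: str
    # Local transactions in progress or waiting for their WS to come back:
    # txid -> set of rows they currently hold locks on.
    local_locks: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def deliver(self, ws):
        """Handle one WS, called strictly in the order the GC delivers them."""
        if ws.origin == self.name:
            # Our own WS came back. Commit, unless the transaction already
            # lost the total order race and was aborted.
            if ws.txid in self.local_locks:
                del self.local_locks[ws.txid]
                self.log.append(("commit", ws.txid))
            return
        # Foreign WS: abort every local transaction holding a lock it needs,
        # then apply and commit the WS before anything else is processed.
        for txid, rows in list(self.local_locks.items()):
            if rows & ws.rows:
                del self.local_locks[txid]
                self.log.append(("abort", txid))
        self.log.append(("apply", ws.txid))
```

Because deliver handles one WS at a time, the sketch also shows the single
serialization point that the following paragraphs complain about.
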
>
> Postgres-R requires that all remote WSs are applied and committed
> before a local transaction can commit. Otherwise it couldn't correctly
> detect a lock conflict. So there will not be any read ahead. And since
> the total order really counts here, it cannot apply any two remote WSs
> in parallel: a race condition could exist in which a later WS in
> the total order runs faster and locks up a previous one. So we have to
> squeeze all remote WSs through one single replication work process.
> And all the locally parallel executed transactions that wait for their
> WSs to come back have to wait until that poor little worker is done
> with the whole pile. Bye bye concurrency. And I don't know how the GC
> will deal with the backlog either. Could well choke on it.
>
> I do not see how this will scale well in a multi-SMP-system cluster.
> At least the serialization of WSs will become a horror if there is
> significant lock contention like in a standard TPC-C on the district
> row containing the order number counter. I don't know for sure, but I
> suspect that with this kind of bottleneck, Postgres-R will have to
> roll back more than 50% of its transactions when there are more than 4
> nodes under heavy load (like in a benchmark run). That will suck ...
>
>
> But ... initially I said that it is an inspiring concept ... soooo ...
>
> I am currently hacking around with some C+PL/TclU+Spread constructs
> that might form a rude kind of prototype creature.
>
> My changes to the Postgres-R concept are that there will be as many
> replicating slave processes as there are masters in total out in the
> cluster ... yes, it will try to utilize all the CPUs in the cluster!
> For failover reliability, a committing transaction will hold before
> finalizing the commit and send its "I'm ready" to the GC. Every
> replicator that reaches the same state sends "I'm ready" too. Spread
> guarantees in SAFE_MESS mode that messages are delivered to all nodes
> in a group or that at least LEAVE/DISCONNECT messages are delivered
> before. So if a node receives more than 50% of "I'm ready", there
> would be a very small gap where multiple nodes have to fail in the
> same split second so that the majority of nodes does NOT commit. A
> node that reported "I'm ready" but lost more than 50% of the cluster
> before committing has to roll back and rejoin or wait for operator
> intervention.
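
The majority rule in the paragraph above can be sketched as a small
decision function. This is a sketch under assumptions only: the function
name ready_decision, its parameters, the "wait" outcome, and the choice to
count the node's own "I'm ready" among the votes are all hypothetical and
not taken from Jan's prototype or from the Spread API.

```python
def ready_decision(cluster_size, ready_votes, nodes_lost):
    """Decide what a node that has sent "I'm ready" should do next.

    cluster_size -- number of nodes in the group when the vote started
    ready_votes  -- "I'm ready" messages received so far (own vote included)
    nodes_lost   -- nodes reported gone via LEAVE/DISCONNECT since then
    """
    # More than 50% of the cluster reported ready: finalize the commit.
    if 2 * ready_votes > cluster_size:
        return "commit"
    # More than 50% of the cluster was lost before we could commit:
    # roll back, then rejoin or wait for operator intervention.
    if 2 * nodes_lost > cluster_size:
        return "rollback"
    # Neither threshold reached yet: keep holding the commit.
    return "wait"
```

For example, in a five-node cluster three "I'm ready" messages are enough
to commit, while an exact half in a four-node cluster is not.
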
>
> Now the idea is to split up the communication into GC distribution
> groups per transaction. So working master backends and associated
> replication backends will join/leave a unique group for every
> transaction in the cluster. This way, the per process communication is
> reduced to the required minimum.
>
>
> As said, I am hacking on some code ...
>
>
> Jan
>
> Chris Travers wrote:
>
>> Tom Lane wrote:
>>
>>> Chris Travers <chris@travelamericas.com> writes:
>>>
>>>
>>>> Yes I have. Postgres-r is not a high-availability solution which is
>>>> capable of transparent failover,
>>>>
>>>
>>>
>>> What makes you say that?  My understanding is it's supposed to survive
>>> loss of individual servers.
>>>
>>>             regards, tom lane
>>>
>>>
>>>
>>>
>> My mistake.  I must have gotten them confused with another
>> (asynchronous) replication project.
>>
>> Best Wishes,
>> Chris Travers
>
As my British friends would say, "Bully for you", and I applaud you
playing, struggling, learning from this for our sakes. Jeez, all I think
about is me, huh?