Обсуждение: BDR - triggers on receiving node?

Поиск
Список
Период
Сортировка

BDR - triggers on receiving node?

От
Peter Mogensen
Дата:
Hi,

Traditionally it hadn't made much sense to fire a trigger on a receiving
replica node (slave) - for many reasons, including it being read-only.

But with BDRs multi parter, partial replication and the possibility that
some tables are either actually or logically local to a single node,
there's scenarios where this makes sense.

Is it possible?

Say ... I have a table in a BDR replicated database with an "ON UPDATE"
trigger. - and that trigger wants to locally find out the local
txid_snapshot_xmin() when a change was applied to the local node.
Not the upstream txid of the transaction which actually made the change
originally (which only makes sense on that node) - but on the local node
receiving the change via BDR LLSR.

Can that be done?

/Peter



Re: BDR - triggers on receiving node?

От
Craig Ringer
Дата:


On 25 March 2015 at 19:15, Peter Mogensen <apm@one.com> wrote:
Hi,

Traditionally it hadn't made much sense to fire a trigger on a receiving replica node (slave) - for many reasons, including it being read-only.

But with BDRs multi parter, partial replication and the possibility that some tables are either actually or logically local to a single node, there's scenarios where this makes sense.

Is it possible?

It might be possible to fire FOR EACH ROW triggers, though this is not currently supported. BDR does its row changes at a lower level than SQL, so it doesn't invoke INSERT, UPDATE or DELETE statements directly. That helps with the efficiently of apply, but also means you can't easily fire triggers on those changes simply by creating ENABLE REPLICA or ENABLE ALWAYS triggers and using session_replication_role = 'replica'.

It is not practical to fire FOR EACH STATEMENT triggers, because BDR has no idea what upstream statement caused which change.

Most likely BDR would need to look for ENABLE REPLICA / ENABLE ALWAYS triggers and fire them directly. Or a new pseudo-trigger callback procedure for row-apply could be added, somewhat similar to the existing functionality for conflict handlers.


Say ... I have a table in a BDR replicated database with an "ON UPDATE" trigger. - and that trigger wants to locally find out the local txid_snapshot_xmin() when a change was applied to the local node.

Why would you want to do that? Just out of interest?
 
Not the upstream txid of the transaction which actually made the change originally (which only makes sense on that node) - but on the local node receiving the change via BDR LLSR.

Can that be done?

At this time I think you'd have to patch the BDR apply worker to do what you wanted.

There are already a few areas where downstream filtering or apply hooks are of interest, so this might be an area that can be enhanced in future. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BDR - triggers on receiving node?

От
Peter Mogensen
Дата:

On 2015-03-25 12:32, Craig Ringer wrote:
> On 25 March 2015 at 19:15, Peter Mogensen <apm@one.com> wrote:
>
>> Say ... I have a table in a BDR replicated database with an "ON UPDATE"
>> trigger. - and that trigger wants to locally find out the local
>> txid_snapshot_xmin() when a change was applied to the local node.
>>
> Why would you want to do that? Just out of interest?
>
>

Because I have a bunch of SELECT queries to the local node, which
includes the txid_current() in their result, and I want to be able to
recognize a result which was obtained prior to a change reaching the
local node.

/Peter




Re: BDR - triggers on receiving node?

От
Craig Ringer
Дата:


On 25 March 2015 at 20:14, Peter Mogensen <apm@one.com> wrote:


On 2015-03-25 12:32, Craig Ringer wrote:
On 25 March 2015 at 19:15, Peter Mogensen <apm@one.com> wrote:

Say ... I have a table in a BDR replicated database with an "ON UPDATE"
trigger. - and that trigger wants to locally find out the local
txid_snapshot_xmin() when a change was applied to the local node.

Why would you want to do that? Just out of interest?



Because I have a bunch of SELECT queries to the local node, which includes the txid_current() in their result, and I want to be able to recognize a result which was obtained prior to a change reaching the local node.

So you're trying to do cache invalidation / cache verification of some form?

I'm interested in what you're looking at doing, but would probably need a more complete description to offer much input.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BDR - triggers on receiving node?

От
Craig Ringer
Дата:
On 26 March 2015 at 15:57, Peter Mogensen <apm@one.com> wrote:

In standard Postgres one-way replication you can get the txid_snapshot_xmin() of the master on the slave. But with BDR, all nodes have their own txids. So the above scheme doesn't work unless you can get the txid which applied the change to the local node. ... I can see that applying BDR replication consumes transactions, so I assumed this value exists ... and if only there was a way to get it. I you could get it in a trigger, then you could maintain the invalidation event queue locally for each node.

I see what you're getting at. You want to prevent stale data from being reinsterted into a cache by a read from an asynchronous replica after it's been invalidated and purged by a write on the master. Your application provides cross-node co-ordination and can set a horizon of validity to prevent such caching events.

There are a few things you could work with in BDR that may be useful.

First, BDR adds commit timestamps for transactions, giving you wall-clock time commit information. It's used for last-update-wins conflict resolution. Commit timestamp information is accessible from SQL using:

 pg_get_transaction_committime(xid)

 pg_get_latest_transaction_committime_data()

(Note that these are the BDR definitions. The versions committed to PostgreSQL 9.5 have different names and some interface changes, so applications using these functions may need changes when updated to run on PostgreSQL 9.5+bdr when released, though BDR will probably add backwards-compat wrappers.)


Second, PostgreSQL keeps track of the node-wide log-sequence number (LSN). The replay position of a node is available on the upstream in pg_catalog.pg_stat_replication. The last-replayed LSN of an upstream is kept track of by the BDR extension and recorded as a replication identifier, but the SQL-level interface to replication identifiers, pg_catalog.pg_replication_identifier_progress. This view is only accessible by the superuser at the moment. You can also access the upstream commit's LSN at the C level from the BDR apply worker during transaction replay, though. The LSN provides strict ordering for a node.


I'd start by looking into whether commit timestamps can meet your needs.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BDR - triggers on receiving node?

От
Craig Ringer
Дата:
(Please reply-to-all to keep the thread on pgsql-general)

On 26 March 2015 at 18:32, Peter Mogensen <apm@one.com> wrote:


On 2015-03-26 10:14, Craig Ringer wrote:
I see what you're getting at. You want to prevent stale data from being
reinsterted into a cache by a read from an asynchronous replica after it's
been invalidated and purged by a write on the master. Your application
provides cross-node co-ordination and can set a horizon of validity to
prevent such caching events.


Ohh... I get why you think of asynchronous replicas.
Yes. the system would also prevent that if the txids are global.
However that's not the goal.
The goal is only to prevent local race conditions wrt. a local node.

If that's the case then BDR shouldn't make any difference.

Earlier you were speaking of (presumably not synchronous) streaming replicas, and writes on the master vs reads from the replica, and a way you avoid caching stale data from the read-replica using the application's cache manager to co-ordinate visibility horizons. It sounded like you were looking for a way to translate that into something that could work with BDR.
 
Did I misunderstand?
 
I've seen in production how stale values can end up in the cache due to race condition like:

1) Client tries to read from the cache and gets a cache miss causing it to query the database. The resulting value should now be cached.
2) The value is changed in the database, causing an invalidation event.
3) The invalidation event is applied to the cache (being a NOP since it was not there).
4) Now the client manages to apply it's read value to the cache, but now it's an old stale value.

One way to handle that is to SELECT ... FOR SHARE in your reads, then update the cache and not commit until the cache has been updated.

A concurrent UPDATE can then not proceed until the cache write has been completed.

That could be tricky to apply in practice though, and prone to deadlocks caused by lock-ordering issues.

If it's purely local you can continue to use your current approach without issues. If not, then commit timestamps may offer some assistance when combined with accurate time sync.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BDR - triggers on receiving node?

От
Peter Mogensen
Дата:

On 2015-03-26 11:57, Craig Ringer wrote:
> If that's the case then BDR shouldn't make any difference.

It does. Because now with BDR you can't compare txid_current() as saved
on the master with txid_snapshot_xmin() as read by the replica.
If however, you could save the txid associated with the application of
the BDR replication on the target, you would again be able to compare.

> Earlier you were speaking of (presumably not synchronous) streaming
> replicas, and writes on the master vs reads from the replica, and a
> way you avoid caching stale data from the read-replica using the
> application's cache manager to co-ordinate visibility horizons. It
> sounded like you were looking for a way to translate that into
> something that could work with BDR. Did I misunderstand?

No. I think you understood, but the goal of using txid from the master
was not to co-ordinate with the master. Only to match changes on the
slave with reads on the slave.
The extra property of txid's being the same on the master was not used.
The problem when using BDR is that the "slave" has it's own txid
sequence and I can't get the txid of all transactions doing changes, -
since some of them are via BDR and doesn't cause triggers.

>
> One way to handle that is to SELECT ... FOR SHARE in your reads, then
> update the cache and not commit until the cache has been updated.
>
> A concurrent UPDATE can then not proceed until the cache write has been
> completed.

Yes, that would serialize the selects...
It's very complicated to integrate into the cache however, since it
doesn't really know about how the values to cache are retrieved.


/Peter



Re: BDR - triggers on receiving node?

От
Craig Ringer
Дата:


On 26 March 2015 at 19:08, Peter Mogensen <apm@one.com> wrote:


On 2015-03-26 11:57, Craig Ringer wrote:
If that's the case then BDR shouldn't make any difference.

It does. Because now with BDR you can't compare txid_current() as saved on the master with txid_snapshot_xmin() as read by the replica.
If however, you could save the txid associated with the application of the BDR replication on the target, you would again be able to compare.

My comment was with regards to it being on the local node. A master and synchronous replica isn't a local-node to local-node scenario.
 
Earlier you were speaking of (presumably not synchronous) streaming replicas, and writes on the master vs reads from the replica, and a way you avoid caching stale data from the read-replica using the application's cache manager to co-ordinate visibility horizons. It sounded like you were looking for a way to translate that into something that could work with BDR. Did I misunderstand?

No. I think you understood, but the goal of using txid from the master was not to co-ordinate with the master. Only to match changes on the slave with reads on the slave.
The extra property of txid's being the same on the master was not used.
The problem when using BDR is that the "slave" has it's own txid sequence and I can't get the txid of all transactions doing changes, - since some of them are via BDR and doesn't cause triggers.

Right. So that's where I think you need to look into commit timestamps as an alternative, as outlined in prior mail.

Alternately you could look at firing a C callback in bdr_apply.c when a row is replayed.

As previously mentioned it's likely to be possible to add per-row apply callbacks that can be written in plpgsql or other function languages, but no such feature currently exists in BDR. 

Firing real "FOR EACH ROW ... ENABLE REPLICA" triggers may also be possible, but I haven't done any significant looking into it, and I'd be concerned about the correctness of doing so for row triggers but not statement triggers.

One issue is that it'd probably have to be able to restrict writes to unlogged tables or tables that aren't part of a replication set. If it could write to replicated tables it'd be doing so in the context of the apply worker, so the writes would not get replicated to other nodes. That'd mean the triggers would be creating rows only on one node - and that's a fast track to data divergence that can cause replication stalls and so on.

At this point I think commit timestamps are likely to be your best bet, and certainly what you should start looking into first.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BDR - triggers on receiving node?

От
Andres Freund
Дата:
On 2015-03-26 19:56:23 +0800, Craig Ringer wrote:
> Right. So that's where I think you need to look into commit timestamps as
> an alternative, as outlined in prior mail.

I've only quickly skimmed this thread, but it sounds to me you just
could create a logical slot in the "standby" and do the cache
invalidation via that.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: BDR - triggers on receiving node?

От
Peter Mogensen
Дата:

On 2015-03-26 12:56, Craig Ringer wrote:
> My comment was with regards to it being on the local node. A master
> and synchronous replica isn't a local-node to local-node scenario.

No. But all I'm exploiting is that change events to the local node see
the same logical clock as SELECT statements to the same node.

>
>
> As previously mentioned it's likely to be possible to add per-row apply
> callbacks that can be written in plpgsql or other function languages, but
> no such feature currently exists in BDR.

I'll put it on my wish list then :)

> Firing real "FOR EACH ROW ... ENABLE REPLICA" triggers may also be
> possible, but I haven't done any significant looking into it, and I'd be
> concerned about the correctness of doing so for row triggers but not
> statement triggers.

That too :)

> One issue is that it'd probably have to be able to restrict writes to
> unlogged tables or tables that aren't part of a replication set. If it
> could write to replicated tables it'd be doing so in the context of the
> apply worker, so the writes would not get replicated to other nodes. That'd
> mean the triggers would be creating rows only on one node - and that's a
> fast track to data divergence that can cause replication stalls and so on.

For the use case at hand, that's not a problem. The trigger would only
need to update a fully local state. - like a table not part of the
replication set and not related to any other tables. - like an FIFO
queue of invalidation events.

/Peter



Re: BDR - triggers on receiving node?

От
Peter Mogensen
Дата:

On 2015-03-26 12:56, Craig Ringer wrote:
> At this point I think commit timestamps are likely to be your best
> bet, and certainly what you should start looking into first.

I've thought about this, but it seems that since these timestamps are
made on the node doing the change and you have no way on knowing if one
node in the cluster is far behind in it's replication to you, then you
could make a select on the local node being much newer in timestamp than
the transaction changing the value - which would only arrive later  (and
making pg_get_latest_transaction_committime_data() go backwards).

But I think I might have found a way to make this work with BDR:

It seems that BDR translates the xmin column of tables from the sending
node to the target node txid values.

So - instead of having a local FIFO table on each node, I could actually
still (like with single-master replication) have a global FIFO table of
invalidation events, being inserted into at the node actually making the
change.

When that invalidation event reaches a node the row gets a local xmin
value which can be compared to txid_snapshot_xmin() kept along every select.

So if I'm correct in observing that the xmin column gives me the a local
txid of the transaction inserting the invalidation event - even if it
was originally inserted on another node, then I think it'll work.

/Peter