Обсуждение: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

Поиск

Список

Период

Сортировка

[HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Thomas Munro

Дата:

01 января 2017 г., 15:38:29

Hi hackers,

Here is a small patch to add a test exercising SERIALIZABLE READ ONLY
DEFERRABLE.  It shows a well known example of a serialisation anomaly
caused by a read-only transaction under REPEATABLE READ (snapshot
isolation), then shows the different ways that SERIALIZABLE and
SERIALIZABLE READ ONLY DEFERRABLE avoid the anomaly.

To be able to do this, the patch modifies the isolation tester so that
it recognises wait_event SafeSnapshot.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

read-only-anomaly-under-si.patch

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Robert Haas

Дата:

03 января 2017 г., 21:48:37

On Sun, Jan 1, 2017 at 4:38 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> To be able to do this, the patch modifies the isolation tester so that
> it recognises wait_event SafeSnapshot.

I'm not going to say that's unacceptable, but it's certainly not beautiful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Michael Paquier

Дата:

04 января 2017 г., 13:17:23

On Wed, Jan 4, 2017 at 12:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jan 1, 2017 at 4:38 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> To be able to do this, the patch modifies the isolation tester so that
>> it recognises wait_event SafeSnapshot.
>
> I'm not going to say that's unacceptable, but it's certainly not beautiful.

Perhaps being able to define in an isolation spec a step called
'wait_event' with a value defined to the wait event to look for would
make more sense?
-- 
Michael

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Robert Haas

Дата:

05 января 2017 г., 00:41:46

On Wed, Jan 4, 2017 at 2:17 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 12:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sun, Jan 1, 2017 at 4:38 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> To be able to do this, the patch modifies the isolation tester so that
>>> it recognises wait_event SafeSnapshot.
>>
>> I'm not going to say that's unacceptable, but it's certainly not beautiful.
>
> Perhaps being able to define in an isolation spec a step called
> 'wait_event' with a value defined to the wait event to look for would
> make more sense?

That'd be a much bigger change, since currently waiting is entirely implicit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Thomas Munro

Дата:

05 января 2017 г., 03:41:01

On Thu, Jan 5, 2017 at 7:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 2:17 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Wed, Jan 4, 2017 at 12:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Sun, Jan 1, 2017 at 4:38 AM, Thomas Munro
>>> <thomas.munro@enterprisedb.com> wrote:
>>>> To be able to do this, the patch modifies the isolation tester so that
>>>> it recognises wait_event SafeSnapshot.
>>>
>>> I'm not going to say that's unacceptable, but it's certainly not beautiful.
>>
>> Perhaps being able to define in an isolation spec a step called
>> 'wait_event' with a value defined to the wait event to look for would
>> make more sense?
>
> That'd be a much bigger change, since currently waiting is entirely implicit.

It's a bit of a strange case: we're indirectly waiting for other
backends' transactions to end, but it's not exactly a lock or even a
predicate lock: it's waiting for a time when we could run safely with
predicate locking disabled.  So it's not at all clear that
pg_blocking_pids should try to get its hands on the appropriate pids
(or if it even could).  Hence my attempt to do this using our
wonderful wait introspection.

I don't think putting the particular wait_event name into the test
spec would be very useful.  It's really there as a whitelist to be
conservative about excluding random waits caused by concurrent
autovacuum etc.  It just happens to have only one thing in the
whitelist for now, and I'm not sure what else would ever go in it...

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Michael Paquier

Дата:

11 января 2017 г., 10:41:24

On Thu, Jan 5, 2017 at 6:41 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> It's a bit of a strange case: we're indirectly waiting for other
> backends' transactions to end, but it's not exactly a lock or even a
> predicate lock: it's waiting for a time when we could run safely with
> predicate locking disabled.  So it's not at all clear that
> pg_blocking_pids should try to get its hands on the appropriate pids
> (or if it even could).  Hence my attempt to do this using our
> wonderful wait introspection.
>
> I don't think putting the particular wait_event name into the test
> spec would be very useful.  It's really there as a whitelist to be
> conservative about excluding random waits caused by concurrent
> autovacuum etc.  It just happens to have only one thing in the
> whitelist for now, and I'm not sure what else would ever go in it...

Do you think that expanding the wait query by default could be
intrusive for the other tests? I am wondering about such a white list
to generate false positives for the existing tests, including
out-of-core extensions, as all the tests now rely only on
pg_blocking_pids().
-- 
Michael

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Robert Haas

Дата:

12 января 2017 г., 01:09:54

On Tue, Jan 10, 2017 at 11:41 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Jan 5, 2017 at 6:41 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> It's a bit of a strange case: we're indirectly waiting for other
>> backends' transactions to end, but it's not exactly a lock or even a
>> predicate lock: it's waiting for a time when we could run safely with
>> predicate locking disabled.  So it's not at all clear that
>> pg_blocking_pids should try to get its hands on the appropriate pids
>> (or if it even could).  Hence my attempt to do this using our
>> wonderful wait introspection.
>>
>> I don't think putting the particular wait_event name into the test
>> spec would be very useful.  It's really there as a whitelist to be
>> conservative about excluding random waits caused by concurrent
>> autovacuum etc.  It just happens to have only one thing in the
>> whitelist for now, and I'm not sure what else would ever go in it...
>
> Do you think that expanding the wait query by default could be
> intrusive for the other tests? I am wondering about such a white list
> to generate false positives for the existing tests, including
> out-of-core extensions, as all the tests now rely only on
> pg_blocking_pids().

It won't affect anything unless running at transaction isolation level
serializable with the "read only deferrable" option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Michael Paquier

Дата:

12 января 2017 г., 07:21:17

On Thu, Jan 12, 2017 at 4:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 10, 2017 at 11:41 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Do you think that expanding the wait query by default could be
>> intrusive for the other tests? I am wondering about such a white list
>> to generate false positives for the existing tests, including
>> out-of-core extensions, as all the tests now rely only on
>> pg_blocking_pids().
>
> It won't affect anything unless running at transaction isolation level
> serializable with the "read only deferrable" option.

Indeed as monitoring.sgml says. And that's now very likely close to
zero. It would be nice to add a comment in the patch to just mention
that. In short, I withdraw my concerns about this patch, the addition
of a test for repeatable read outweights the tweaks done in the
isolation tester. I am marking this as ready for committer, I have not
spotted issues with it.
-- 
Michael

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Thomas Munro

Дата:

16 января 2017 г., 06:40:21

On Thu, Jan 12, 2017 at 2:21 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Jan 12, 2017 at 4:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jan 10, 2017 at 11:41 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> Do you think that expanding the wait query by default could be
>>> intrusive for the other tests? I am wondering about such a white list
>>> to generate false positives for the existing tests, including
>>> out-of-core extensions, as all the tests now rely only on
>>> pg_blocking_pids().
>>
>> It won't affect anything unless running at transaction isolation level
>> serializable with the "read only deferrable" option.
>
> Indeed as monitoring.sgml says. And that's now very likely close to
> zero. It would be nice to add a comment in the patch to just mention
> that. In short, I withdraw my concerns about this patch, the addition
> of a test for repeatable read outweights the tweaks done in the
> isolation tester. I am marking this as ready for committer, I have not
> spotted issues with it.

Thanks!  Aside from exercising the code, which is surely always a good
thing, I think these three tests are quite educational on their own
for those of us trying to learn about transaction isolation.

I also have longer term plans to show the first and third of them
running with the read-only transaction moved to a standby server.
Kevin Grittner gave me the idea of multi-server isolation tests when I
mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,
which prompted me to realise that our existing SSI tests aren't very
interesting for that because they all rely on the behaviour of
SERIALIZABLE, not SERIALIZABLE DEFERRABLE.  So I figured we'd better
have some tests that show show that working on a single node system
first.

Concerning the question of beauty, I thought of several ways for the
isolation tester to detect this type of waiting:

1.  Create a new function pg_waiting_for_safe_snapshot(pid) that would
acquire SerializableXactHashLock, do a linear search of active
SERIALIZABLEXACT structs (using FirstPredXact and NextPredXact)
looking for sact->pid == pid and then test sact->flags &
SXACT_FLAG_DEFERRABLE_WAITING.  Then change the isolation tester's
"waiting" query to add " OR pg_waiting_for_safe_snapshot($1)".

2.  Modify the existing pg_blocking_pids() function to detect this
type of waiting and figure out which other pids are responsible,
adding them to its current results.  That could work by first doing a
linear search to find the SERIALIZABLEXACT struct as above, and if its
SXACT_FLAG_DEFERRABLE_WAITING flag is set, then walking the list of
possibleUnsafeConflicts to find the pids responsible.  Then the
isolation tester wouldn't need to change at all.

3.  Create a new function pg_waiting_for_safe_snapshot_pids(),
separate from pg_blocking_pids(), to return the list of pids as above,
and modify the isolation tester to call this new function too.

4.  The wait_event based approach as proposed, looking at pg_stat_activity.

I couldn't think of any practical uses for the new facilities 1-3
outside this isolation test, which is why I didn't look into those
more intrusive approaches.  I admit that 4 is a bit clunky, but it's a
simple non-intrusive change and I can't think of any reason it would
give incorrect results.  Even though wait_event is updated and read
without memory synchronisation, as far as I know we can assume that
select(2) and WaitLatch contain full memory barriers so the isolation
tester will certainly see the SafeSnapshot value when it repeatedly
polls that query, and the value can't change again until the isolation
tester sees it, recognises it as a kind of waiting it knows about, and
moves on to running the step in the test.

Upon reflection, either 2 or 3 might be considered more beautiful than
4, depending on how others feel about extending the remit of the
existing pg_blocking_pid function.  I'd be happy to post a new patch
using one of those approaches if others feel it'd be better that way.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Michael Paquier

Дата:

16 января 2017 г., 12:08:28

On Mon, Jan 16, 2017 at 9:40 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I also have longer term plans to show the first and third of them
> running with the read-only transaction moved to a standby server.
> Kevin Grittner gave me the idea of multi-server isolation tests when I
> mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,

Being able to do that with the isolation tester has been mentioned a
coupled of times, particularly from 2ndQ folks who worked in BDR. I
think that it has been actually patched in this sense, so perhaps you
could reuse the work that has been done there.

> which prompted me to realise that our existing SSI tests aren't very
> interesting for that because they all rely on the behaviour of
> SERIALIZABLE, not SERIALIZABLE DEFERRABLE.  So I figured we'd better
> have some tests that show show that working on a single node system
> first.

Makes sense.

> Upon reflection, either 2 or 3 might be considered more beautiful than
> 4, depending on how others feel about extending the remit of the
> existing pg_blocking_pid function.  I'd be happy to post a new patch
> using one of those approaches if others feel it'd be better that way.

Well, either 1, 2 or 3 basically duplicate what the wait events
tracked via latch.c for SafeSnapshot are doing when a client scans
pg_stat_activity, so at the end even if that's not the most beautiful
thing I have seen, this does not seem worth making the backend more
complicated for a facility that we already have. What I am wondering
though is if it would be interesting to make the list of potential
wait events backends are waiting for at some transaction point
configurable in the isolation tester. For example say having as step a
way to switch between wait events to wait for and be able to make that
usable as part of a permutation. I am not sure if there are actual use
cases for such a fancy facility, so if things prove to become more
complicated in the future we could consider it. But honestly I am now
of the opinion that this time has not come yet.
-- 
Michael

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Craig Ringer

Дата:

16 января 2017 г., 15:37:09

On 16 Jan. 2017 17:09, "Michael Paquier" <michael.paquier@gmail.com> wrote:

On Mon, Jan 16, 2017 at 9:40 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I also have longer term plans to show the first and third of them
> running with the read-only transaction moved to a standby server.
> Kevin Grittner gave me the idea of multi-server isolation tests when I
> mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,

Being able to do that with the isolation tester has been mentioned a
coupled of times, particularly from 2ndQ folks who worked in BDR. I
think that it has been actually patched in this sense, so perhaps you
could reuse the work that has been done there.

Yep. See the bdr-pg/REL9_4_STABLE branch of the BDR repo, src/test/isolation .

I think Abhijit submitted a patch to the list too.

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Thomas Munro

Дата:

16 января 2017 г., 17:04:43

On Mon, Jan 16, 2017 at 10:37 PM, Craig Ringer
<craig.ringer@2ndquadrant.com> wrote:
> On 16 Jan. 2017 17:09, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>
> On Mon, Jan 16, 2017 at 9:40 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> I also have longer term plans to show the first and third of them
>> running with the read-only transaction moved to a standby server.
>> Kevin Grittner gave me the idea of multi-server isolation tests when I
>> mentioned my WIP SERIALIZABLE DEFERRABLE-on-standbys patch off-list,
>
> Being able to do that with the isolation tester has been mentioned a
> coupled of times, particularly from 2ndQ folks who worked in BDR. I
> think that it has been actually patched in this sense, so perhaps you
> could reuse the work that has been done there.
>
>
> Yep. See the bdr-pg/REL9_4_STABLE branch of the BDR repo, src/test/isolation
> .
>
> I think Abhijit submitted a patch to the list too.

Very nice.  That's almost exactly what I had in mind.  One thing that
jumps out at me from README.multinode is that it's surprising to see
conninfo strings in the actual spec file: I assumed that the isolation
tester would extend what it does today by accepting multiple conninfo
command line arguments, and I imagined that the spec info could say
something like SESSION "s3" CONNECTION 2.  I don't really care about
names vs numbers but I thought it might be nice if connections were
controlled separately from specs so that you could use the same spec
for tests that are run on one node and two node setups, perhaps by
making CONNECTION 2 fall back to the default connection if there is no
connection 2 provided on the command line.  For example, with
read-only-anomaly.spec (from the patch in this thread), the output
should be identical if you point s3 at a standby using syncrep and
remote_apply.  In future, if SERIALIZABLE DEFERRABLE is available on
standbys, then read-only-anomaly-3.spec should also work that way.
It'd be cool to find a way to express that without having to use a
separate spec/expected files for the multi-node version when the goal
is to show that it works the same as the single-node version.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

От

Michael Paquier

Дата:

31 января 2017 г., 10:48:39

On Thu, Jan 12, 2017 at 10:21 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Jan 12, 2017 at 4:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jan 10, 2017 at 11:41 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> Do you think that expanding the wait query by default could be
>>> intrusive for the other tests? I am wondering about such a white list
>>> to generate false positives for the existing tests, including
>>> out-of-core extensions, as all the tests now rely only on
>>> pg_blocking_pids().
>>
>> It won't affect anything unless running at transaction isolation level
>> serializable with the "read only deferrable" option.
>
> Indeed as monitoring.sgml says. And that's now very likely close to
> zero. It would be nice to add a comment in the patch to just mention
> that. In short, I withdraw my concerns about this patch, the addition
> of a test for repeatable read outweights the tweaks done in the
> isolation tester. I am marking this as ready for committer, I have not
> spotted issues with it.

Moved to CF 2017-03.
-- 
Michael

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: [HACKERS] An isolation test for SERIALIZABLE READ ONLY DEFERRABLE

Вложения