Обсуждение: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
Andres Freund <andres@anarazel.de> writes: > Allow Pin/UnpinBuffer to operate in a lockfree manner. This commit has broken buildfarm member gaur, and no doubt pademelon will be equally unhappy once it catches up to HEAD. The reason is that you've caused localbuf.c to perform a whole bunch of atomic operations on its buffer headers; and on machines that don't have native atomic ops, there's a spinlock underlying those; and you did not bother to ensure that appropriate SpinLockInit operations happen for local-buffer headers. (HPPA, possibly alone among supported platforms, does not think that SpinLockInit is equivalent to memset-to-zeroes.) While we could fix this by performing suitable SpinLockInit's on local-buffer headers, I have to think that that's fundamentally the wrong direction. The *entire* *point* of having local buffers is that they are not subject to concurrency overhead. So IMO, sticking atomic-ops calls into localbuf.c is broken on its face. regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-11 23:24:36 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > Allow Pin/UnpinBuffer to operate in a lockfree manner. > > This commit has broken buildfarm member gaur, and no doubt pademelon > will be equally unhappy once it catches up to HEAD. The reason is that > you've caused localbuf.c to perform a whole bunch of atomic operations > on its buffer headers; and on machines that don't have native atomic > ops, there's a spinlock underlying those; and you did not bother to > ensure that appropriate SpinLockInit operations happen for local-buffer > headers. (HPPA, possibly alone among supported platforms, does not > think that SpinLockInit is equivalent to memset-to-zeroes.) That's obviously borked, and need to be fixed. > While we could fix this by performing suitable SpinLockInit's on > local-buffer headers, I have to think that that's fundamentally the > wrong direction. The *entire* *point* of having local buffers is > that they are not subject to concurrency overhead. So IMO, sticking > atomic-ops calls into localbuf.c is broken on its face. Note that localbuf.c tries to be careful to only use pg_atomic_read_u32/pg_atomic_write_u32 - which don't have a concurrency overhead as they don't utilize atomic ops. The issue is likely that either Alexander or I somehow made MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the proper pg_atomic_read_u32()/pg_atomic_write_u32(). Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > The issue is likely that either Alexander or I somehow made > MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the > proper pg_atomic_read_u32()/pg_atomic_write_u32(). The stack trace I'm seeing is #5 0x7279cc in elog_finish (elevel=6, fmt=0x40057cf8 '\177' <repeats 200 times>...) at elog.c:1378 #6 0x5cecd8 in s_lock_stuck (p=0x402995b8, file=0x21bae0 "s_lock.c", line=92) at s_lock.c:81 #7 0x5cedd4 in perform_spin_delay (status=0x7b03b8c8) at s_lock.c:130 #8 0x5ced40 in s_lock (lock=0x6, file=0x20 <Address 0x20 out of bounds>, line=6) at s_lock.c:96 #9 0x53a4b0 in pg_atomic_compare_exchange_u32_impl (ptr=0x402995b8, expected=0x5c, newval=58982400) at atomics.c:122 #10 0x5a280c in MarkLocalBufferDirty (buffer=6) at ../../../../src/include/port/atomics/generic.h:224 #11 0x59bba0 in MarkBufferDirty (buffer=6) at bufmgr.c:1489 #12 0x2c9cd0 in heap_multi_insert (relation=0x401c41d0, tuples=0x40201888, ntuples=561, cid=0, options=0, bistate=0x40153128)at heapam.c:2760 regards, tom lane
Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-11 23:41:10 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > The issue is likely that either Alexander or I somehow made > > MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the > > proper pg_atomic_read_u32()/pg_atomic_write_u32(). > > The stack trace I'm seeing is > > #5 0x7279cc in elog_finish (elevel=6, > fmt=0x40057cf8 '\177' <repeats 200 times>...) at elog.c:1378 > #6 0x5cecd8 in s_lock_stuck (p=0x402995b8, file=0x21bae0 "s_lock.c", line=92) > at s_lock.c:81 > #7 0x5cedd4 in perform_spin_delay (status=0x7b03b8c8) at s_lock.c:130 > #8 0x5ced40 in s_lock (lock=0x6, file=0x20 <Address 0x20 out of bounds>, > line=6) at s_lock.c:96 > #9 0x53a4b0 in pg_atomic_compare_exchange_u32_impl (ptr=0x402995b8, > expected=0x5c, newval=58982400) at atomics.c:122 > #10 0x5a280c in MarkLocalBufferDirty (buffer=6) > at ../../../../src/include/port/atomics/generic.h:224 Ok, so the theory above fits. Will fix (both initialization and use of pg_atomic_fetch_or_u32), and expand the documentation on why only atomic read/write are supposed to be used. Thanks, Andres
Andres Freund <andres@anarazel.de> writes: >>> The issue is likely that either Alexander or I somehow made >>> MarkLocalBufferDirty() use pg_atomic_fetch_or_u32(), instead of the >>> proper pg_atomic_read_u32()/pg_atomic_write_u32(). > Ok, so the theory above fits. Yah, especially in view of localbuf.c:297 ;-) > Will fix (both initialization and use of pg_atomic_fetch_or_u32), and > expand the documentation on why only atomic read/write are supposed to > be used. FWIW, I'd vote against adding a SpinLockInit there. What it would mostly do is prevent noticing future mistakes of the same ilk. It would be better no doubt if we didn't have to rely on a nearly-dead platform to detect this; but having such detection of a performance bug is better than having no detection. regards, tom lane
Re: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-11 23:59:21 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > Will fix (both initialization and use of pg_atomic_fetch_or_u32), and > > expand the documentation on why only atomic read/write are supposed to > > be used. > > FWIW, I'd vote against adding a SpinLockInit there. Well, it'd not be a SpinLockInit, but a pg_atomic_init_u32(), but ... > What it would mostly > do is prevent noticing future mistakes of the same ilk. It would be > better no doubt if we didn't have to rely on a nearly-dead platform > to detect this; but having such detection of a performance bug is better > than having no detection. Ok, works for me as well. I guess it'd be useful to add a "modern" animal that disables spinlocks & atomics... - Andres
I wrote: > This commit has broken buildfarm member gaur, and no doubt pademelon > will be equally unhappy once it catches up to HEAD. Spoke too soon, actually: pademelon did not get as far as initdb. cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization. cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization. Apparently, assigning the result of init_spin_delay() is not as portable as you thought. Maybe convert that into init_spin_delay(SpinDelayStatus *target, whatever-the-other-arg-is) ? regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-12 00:32:13 -0400, Tom Lane wrote: > I wrote: > > This commit has broken buildfarm member gaur, and no doubt pademelon > > will be equally unhappy once it catches up to HEAD. > > Spoke too soon, actually: pademelon did not get as far as initdb. > > cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization. > cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization. > > Apparently, assigning the result of init_spin_delay() is not as portable > as you thought. Hm. I'm not sure why not though? Is it because delayStatus isn't static or global scope? > Maybe convert that into init_spin_delay(SpinDelayStatus *target, > whatever-the-other-arg-is) ? Too bad, my compiler generates a bit nicer code for the current version. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2016-04-12 00:32:13 -0400, Tom Lane wrote: >> Apparently, assigning the result of init_spin_delay() is not as portable >> as you thought. > Hm. I'm not sure why not though? Is it because delayStatus isn't static > or global scope? It looks like that compiler adheres to the C89 restriction that an initializer for an array or struct must contain only link-time-constant expressions, even if the target object is of dynamic scope. The macro works with a link-time-constant pointer argument, but not with one that must be evaluated at runtime. regards, tom lane
I wrote: > It looks like that compiler adheres to the C89 restriction that an > initializer for an array or struct must contain only link-time-constant > expressions, even if the target object is of dynamic scope. > The macro works with a link-time-constant pointer argument, but not > with one that must be evaluated at runtime. It strikes me that that means you could stick with this initialization method if you made the macro argument be a literal constant string name, like "buffer spinlock", and printed that rather than the address in s_lock_stuck. This would be different from what we do now, but not necessarily any less useful. I'm pretty dubious of the usefulness of the addresses you're printing right now, because most of them are indeed not the spinlock itself. regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Piotr Stefaniak
Дата:
On 2016-04-12 07:00, Andres Freund wrote: > On 2016-04-12 00:32:13 -0400, Tom Lane wrote: >> I wrote: >>> This commit has broken buildfarm member gaur, and no doubt pademelon >>> will be equally unhappy once it catches up to HEAD. >> >> Spoke too soon, actually: pademelon did not get as far as initdb. >> >> cc: "bufmgr.c", line 4032: error 1521: Incorrect initialization. >> cc: "bufmgr.c", line 4058: error 1521: Incorrect initialization. >> >> Apparently, assigning the result of init_spin_delay() is not as portable >> as you thought. > > Hm. I'm not sure why not though? Is it because delayStatus isn't static > or global scope? It's because C89 requires initializers for aggregate and union types to be constant expressions (3.5.7p3): > All the expressions in an initializer for an object that has static > storage duration or in an initializer list for an object that has > aggregate or union type shall be constant expressions. C99 removes this constraint (6.7.8p4): > All the expressions in an initializer for an object that has static storage duration shall be > constant expressions or string literals.
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-12 11:52:01 -0400, Tom Lane wrote: > I wrote: > > It looks like that compiler adheres to the C89 restriction that an > > initializer for an array or struct must contain only link-time-constant > > expressions, even if the target object is of dynamic scope. > > The macro works with a link-time-constant pointer argument, but not > > with one that must be evaluated at runtime. > > It strikes me that that means you could stick with this initialization > method if you made the macro argument be a literal constant string name, > like "buffer spinlock", and printed that rather than the address in > s_lock_stuck. This would be different from what we do now, but not > necessarily any less useful. I'm not sure anybody really benefits from those addresses; I guess the idea was that they'd allow you to figure out which exact spinlock got stuck; file + line doesn't necessarily help there. But it doesn't seem super useful, ASLR makes the addesses unpredictable, so you need a core file anyway - in which case you can just walk the stack. So I think I'm on board with replacing the argument; although I'm wondering if we shouldn't just remove it entirely, rather than replacing it with a string that's likely just going to duplicate the file/line information. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2016-04-12 11:52:01 -0400, Tom Lane wrote: >> It strikes me that that means you could stick with this initialization >> method if you made the macro argument be a literal constant string name, >> like "buffer spinlock", and printed that rather than the address in >> s_lock_stuck. This would be different from what we do now, but not >> necessarily any less useful. > I'm not sure anybody really benefits from those addresses; I guess the > idea was that they'd allow you to figure out which exact spinlock got > stuck; file + line doesn't necessarily help there. But it doesn't seem > super useful, ASLR makes the addesses unpredictable, so you need a core > file anyway - in which case you can just walk the stack. True. > So I think I'm on board with replacing the argument; although I'm > wondering if we shouldn't just remove it entirely, rather than replacing > it with a string that's likely just going to duplicate the file/line > information. Good point, it would be absolutely duplicative. What I'd suggest, actually, is that we convert this to the same info as what elog provides (file+line+function name). The line number is often hard to decipher if you're not sure exactly what PG version you're dealing with, so I think having the function name would be a sufficient advance to rebut any claim that we'd gone backwards. regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-13 14:16:15 -0400, Tom Lane wrote: > Good point, it would be absolutely duplicative. What I'd suggest, > actually, is that we convert this to the same info as what elog > provides (file+line+function name). Heh, I was wondering the same aftering sending the last email. Will do that then. Greetings, Andres Freund
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-13 11:20:19 -0700, Andres Freund wrote: > On 2016-04-13 14:16:15 -0400, Tom Lane wrote: > > Good point, it would be absolutely duplicative. What I'd suggest, > > actually, is that we convert this to the same info as what elog > > provides (file+line+function name). > > Heh, I was wondering the same aftering sending the last email. Will do > that then. Pushed. Let's see what pademelon says...
Andres Freund <andres@anarazel.de> writes: > On 2016-04-13 11:20:19 -0700, Andres Freund wrote: >> Heh, I was wondering the same aftering sending the last email. Will do >> that then. > Pushed. Let's see what pademelon says... I lit off a run, but it'll be a few hours till we have results... regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-13 20:09:46 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2016-04-13 11:20:19 -0700, Andres Freund wrote: > >> Heh, I was wondering the same aftering sending the last email. Will do > >> that then. > > > Pushed. Let's see what pademelon says... > > I lit off a run, but it'll be a few hours till we have results... In hindsight obviously, this isn't sufficient because s_lock reports the caller's file/line/function, not __FILE__ etc. So I guess we're left moving initialization to an inline function - will do so tomorrow. I'm also putting up an animal with clang that uses CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions' which actually catches this. I'd previously tested with gcc and "-Wc90-c99-compat -Werror=c90-c99-compat -Wno-long-long -Wno-error=long-long -std=c90", but that doesn't catch that error. (both find a couple trailing commas in enums, will fix) - Andres
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-13 23:31:21 -0700, Andres Freund wrote: > I'm also putting up an animal with clang that uses > CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions' > which actually catches this. Hm. Doing so I found the following in 9.3: /home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a compile-timeconstant [-Werror,-Wc99-extensions] int pipefd[2] = {pipeMW[PIPE_READ],pipeWM[PIPE_WRITE]}; ^~~~~~~~~~~~~~~~~ which is, afaics, the same class of issue we're hitting on master right now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned to put up that animal (mylodon) for all branches, but given that that's been unfixed for years I'm not sure. What do you think? - Andres
Andres Freund <andres@anarazel.de> writes: > On 2016-04-13 23:31:21 -0700, Andres Freund wrote: >> I'm also putting up an animal with clang that uses >> CFLAGS='-std=c89 -Wc99-extensions -Werror=c99-extensions' >> which actually catches this. > Hm. Doing so I found the following in 9.3: > /home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a compile-timeconstant [-Werror,-Wc99-extensions] > int pipefd[2] = {pipeMW[PIPE_READ], pipeWM[PIPE_WRITE]}; > ^~~~~~~~~~~~~~~~~ > which is, afaics, the same class of issue we're hitting on master right > now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned > to put up that animal (mylodon) for all branches, but given that that's > been unfixed for years I'm not sure. What do you think? Huh. I just tried pademelon's compiler on the 9.3 branch, and sure enough it spits up on that: cc: "parallel.c", line 561: error 1521: Incorrect initialization. cc: "parallel.c", line 561: error 1521: Incorrect initialization. make: *** [parallel.o] Error 1 I am not sure how come I missed seeing that at the time, but I certainly would have complained about it if I had chanced to try that compiler later in 9.3 development. I stopped using that machine actively in mid-2013, though, so maybe I just failed to try that compiler for a period of a few months. IIRC, when I got around to setting up pademelon as a buildfarm animal, I didn't have it building anything older than 9.4, so I missed seeing it on that end too. I'd vote for back-patching 59202fae0 and enforcing c89 compatibility all along the line. regards, tom lane
Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
От
Andres Freund
Дата:
On 2016-04-14 22:46:01 -0400, Tom Lane wrote: > > Hm. Doing so I found the following in 9.3: > > > /home/andres/src/postgresql-9.3/src/bin/pg_dump/parallel.c:561:23: error: initializer for aggregate is not a compile-timeconstant [-Werror,-Wc99-extensions] > > int pipefd[2] = {pipeMW[PIPE_READ], pipeWM[PIPE_WRITE]}; > > ^~~~~~~~~~~~~~~~~ > > > which is, afaics, the same class of issue we're hitting on master right > > now. I apparently fixed that, via Robert, back in 59202fae0. I'd planned > > to put up that animal (mylodon) for all branches, but given that that's > > been unfixed for years I'm not sure. What do you think? > > Huh. I just tried pademelon's compiler on the 9.3 branch, and sure > enough it spits up on that: > > cc: "parallel.c", line 561: error 1521: Incorrect initialization. > cc: "parallel.c", line 561: error 1521: Incorrect initialization. > make: *** [parallel.o] Error 1 > > I am not sure how come I missed seeing that at the time, but I certainly > would have complained about it if I had chanced to try that compiler > later in 9.3 development. I stopped using that machine actively in > mid-2013, though, so maybe I just failed to try that compiler for a > period of a few months. IIRC, when I got around to setting up pademelon > as a buildfarm animal, I didn't have it building anything older than 9.4, > so I missed seeing it on that end too. > > I'd vote for back-patching 59202fae0 and enforcing c89 compatibility > all along the line. Done. Will be a bit till all branches have reported (that VM's getting busy, it now runs four animals, one of them skink which is kinda expensive CPU wise). But I checked before that I can do a basic compile with the respective flags all the way to 9.1 (after pushing the enum fixes). Andres