Discussion: [PATCH] heap_insert() and heap_update() optimization
Hello, hackers
I propose the small attached patch, which slightly optimizes heap_insert() and heap_update()
by reducing the number of BufferGetPage(buffer) calls in them.
I measured the call time of each:
heap_insert(): avg origin 13394 ns, avg patched 12685 ns; perf increases +5.59%
heap_update(): avg origin 15728 ns, avg patched 13936 ns; perf increases +11.39%
This can be noticeable when many rows are handled.
--
Regards,
Andrew K.
Hi,
On 2018-10-16 11:28:17 +0300, Andrey Klychkov wrote:
> I suggest the small attached patch that gives a bit of heap_insert() and heap_update() optimization
> by reducing calls of BufferGetPage(buffer) into them.
> I measured call time of these:
> heap_insert(): avg origin 13394 ns, avg patched 12685 ns; perf increases +5.59%
> heap_update(): avg origin 15728 ns, avg patched 13936 ns; perf increases +11.39%
> This can be notable when there are handling many rows.
Interesting. That's with an optimized build, or an assertion build?
Wonder what precisely prevents the optimizer to recognize
BufferGetPage() with a constant argument will always be the same
result. I assume it's that it doesn't recognize that BufferBlocks can't
change across other function calls? Might also be the pointer math, or
the if block...
Wonder if we could force the compiler's hand by making BufferGetPage an
inline function and decorating it with __attribute__((pure)) or such.
I see little reason to not apply what you have here, but there's a lot
of other places that access buffers...
Greetings,
Andres Freund
> Interesting. That's with an optimized build, or an assertion build?
Hello,
That was an optimized build.
However, I've just run some additional timing tests and didn't see as significant a difference as before.
What's more: avg origin 1272, avg patched 1303.
Maybe autovacuum, analyze, a checkpoint, or something else influenced yesterday's tests.
Thanks a lot for the explanation!
--
Regards,
Andrey Klychkov
On Tuesday, 16 October 2018, 22:57 +03:00, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-10-16 11:28:17 +0300, Andrey Klychkov wrote:
> I suggest the small attached patch that gives a bit of heap_insert() and heap_update() optimization
> by reducing calls of BufferGetPage(buffer) into them.
> I measured call time of these:
> heap_insert(): avg origin 13394 ns, avg patched 12685 ns; perf increases +5.59%
> heap_update(): avg origin 15728 ns, avg patched 13936 ns; perf increases +11.39%
> This can be notable when there are handling many rows.
Interesting. That's with an optimized build, or an assertion build?
Wonder what precisely prevents the optimizer to recognize
BufferGetPage() with a constant argument will always be the same
result. I assume it's that it doesn't recognize that BufferBlocks can't
change across other function calls? Might also be the pointer math, or
the if block...
Wonder if we could force the compiler's hand by making BufferGetPage an
inline function and decorating it with __attribute__((pure)) or such.
I see little reason to not apply what you have here, but there's a lot
of other places that access buffers...
Greetings,
Andres Freund
Hi,
On 2018-10-17 09:48:19 +0300, Andrey Klychkov wrote:
> > Interesting. That's with an optimized build, or an assertion build?
>
> Hello,
> That was an optimized build.
>
> However I've just done some extra time tests and didn't notice so significant difference as early.
> Even more - avg origin 1272, avg patched 1303.
>
> Maybe there was the autovacuum / analyze / checkpoint or something else that could influence on the yesterday tests.
Probably worth looking at the generated code. I can see some difference,
but what you measured seemed pretty large.
Greetings,
Andres Freund