Discussion: Improving the heapgetpage function improves performance in common scenarios
Hi,
In the function heapgetpage, if a table is not updated very frequently, many actions in the tuple loop are superfluous. For all_visible pages, loctup does not need to be assigned, nor does the "valid" variable. The CheckForSerializableConflictOutNeeded test inside the HeapCheckForSerializableConflictOut function only needs to be evaluated once, at the beginning of the loop.
Using vtune you can clearly see the result (attached heapgetpage.jpg).
So by splitting the loop logic into two parts, the vtune results show significant improvement (attached heapgetpage-allvis.jpg).
The test data uses TPC-H's table "orders" with scale=20, 30 million rows.
Quan Zongliang
Attachments
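A minimal sketch of the loop split proposed in the message above. It is not the attached patch: ToyPage, ToyTuple, toy_tuple_visible and both collect functions are hypothetical stand-ins, and the real heapgetpage works on buffers, snapshots and line pointers. It only illustrates that the per-tuple visibility work can be skipped when the page is flagged all-visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these are PostgreSQL structures. */
typedef struct ToyTuple
{
    int         xmin_committed; /* nonzero means "visible" in this toy model */
} ToyTuple;

typedef struct ToyPage
{
    bool        all_visible;    /* analogous in spirit to an all-visible page flag */
    size_t      ntuples;
    ToyTuple    tuples[8];
} ToyPage;

/* Stand-in for a per-tuple visibility test. */
static bool
toy_tuple_visible(const ToyTuple *tup)
{
    return tup->xmin_committed != 0;
}

/* One loop: the all_visible branch is re-evaluated for every tuple. */
static size_t
collect_single_loop(const ToyPage *page, size_t *out)
{
    size_t      n = 0;

    assert(page->ntuples <= 8);
    for (size_t i = 0; i < page->ntuples; i++)
    {
        bool        valid;

        if (page->all_visible)
            valid = true;
        else
            valid = toy_tuple_visible(&page->tuples[i]);
        if (valid)
            out[n++] = i;
    }
    return n;
}

/* Split loops: the all-visible path does no per-tuple visibility work. */
static size_t
collect_split_loops(const ToyPage *page, size_t *out)
{
    size_t      n = 0;

    assert(page->ntuples <= 8);
    if (page->all_visible)
    {
        for (size_t i = 0; i < page->ntuples; i++)
            out[n++] = i;
    }
    else
    {
        for (size_t i = 0; i < page->ntuples; i++)
        {
            if (toy_tuple_visible(&page->tuples[i]))
                out[n++] = i;
        }
    }
    return n;
}
```

Both functions return the same offsets for any page; the split version simply moves the all_visible decision out of the loop, at the cost of writing the loop twice.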
Re: Improving the heapgetpage function improves performance in common scenarios
From
John Naylor
Date:
On Thu, Aug 24, 2023 at 5:55 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> In the function heapgetpage. If a table is not updated very frequently.
> Many actions in tuple loops are superfluous. For all_visible pages,
> loctup does not need to be assigned, nor does the "valid" variable.
> CheckForSerializableConflictOutNeeded from
> HeapCheckForSerializableConflictOut function, it only need to inspect at
Thanks for submitting! A few weeks before this, there was another proposal, which specializes code for all paths, not just one. That patch also does so without duplicating the loop:
https://www.postgresql.org/message-id/20230716015656.xjvemfbp5fysjiea@awork3.anarazel.de
> the beginning of the cycle only once. Using vtune you can clearly see
> the result (attached heapgetpage.jpg).
>
> So by splitting the loop logic into two parts, the vtune results show
> significant improvement (attached heapgetpage-allvis.jpg).
For future reference, it's not clear at all from the screenshots what the improvement will be for the user. In the above thread, the author shares testing methodology as well as timing measurements. This is useful for reproducibility, as well as convincing others that the change is important.
--
John Naylor
EDB: http://www.enterprisedb.com
Re: Improving the heapgetpage function improves performance in common scenarios
From
Quan Zongliang
Date:
On 2023/9/5 16:15, John Naylor wrote:
> On Thu, Aug 24, 2023 at 5:55 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> > In the function heapgetpage. If a table is not updated very frequently.
> > Many actions in tuple loops are superfluous. For all_visible pages,
> > loctup does not need to be assigned, nor does the "valid" variable.
> > CheckForSerializableConflictOutNeeded from
> > HeapCheckForSerializableConflictOut function, it only need to inspect at
>
> Thanks for submitting! A few weeks before this, there was another
> proposal, which specializes code for all paths, not just one. That patch
> also does so without duplicating the loop:
>
> https://www.postgresql.org/message-id/20230716015656.xjvemfbp5fysjiea@awork3.anarazel.de
Nice patch. I'm sorry I didn't notice it before.
> > the beginning of the cycle only once. Using vtune you can clearly see
> > the result (attached heapgetpage.jpg).
> >
> > So by splitting the loop logic into two parts, the vtune results show
> > significant improvement (attached heapgetpage-allvis.jpg).
>
> For future reference, it's not clear at all from the screenshots what
> the improvement will be for the user. In the above thread, the author
> shares testing methodology as well as timing measurements. This is
> useful for reproducibility, as well as convincing others that the change
> is important.
Here's how I test it:
EXPLAIN ANALYZE SELECT * FROM orders;
Maybe the test wasn't good enough. The best result with the patch looks good, but the timings fluctuate a lot, so they are hard to compare. That is why I used the vtune results instead.
My patch mainly eliminates:
1. Assignment of the "loctup" struct variable (in vtune you can see that these 4 lines have a significant overhead: 0.4, 1.0, 0.2, 0.4).
2. Assignment of the "valid" variable (overhead 0.6).
3. The HeapCheckForSerializableConflictOut function call (overhead 0.6).
These overheads are not the same from test to test, but all are too obvious to ignore. The screenshots are mainly meant to show the three improvements mentioned above.
I'll also try Andres Freund's test method next.
> --
> John Naylor
> EDB: http://www.enterprisedb.com
Re: Improving the heapgetpage function improves performance in common scenarios
From
John Naylor
Date:
On Tue, Sep 5, 2023 at 4:27 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> Here's how I test it
> EXPLAIN ANALYZE SELECT * FROM orders;
Note that EXPLAIN ANALYZE has quite a bit of overhead, so it's not good for these kinds of tests.
> I'll also try Andres Freund's test method next.
Commit f691f5b80a85 from today removes another source of overhead in this function, so I suggest testing against that, if you wish to test again.
--
John Naylor
EDB: http://www.enterprisedb.com
Re: Improving the heapgetpage function improves performance in common scenarios
From
Quan Zongliang
Date:
On 2023/9/5 18:46, John Naylor wrote:
> On Tue, Sep 5, 2023 at 4:27 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> > Here's how I test it
> > EXPLAIN ANALYZE SELECT * FROM orders;
>
> Note that EXPLAIN ANALYZE has quite a bit of overhead, so it's not good
> for these kinds of tests.
>
> > I'll also try Andres Freund's test method next.
>
> Commit f691f5b80a85 from today removes another source of overhead in
> this function, so I suggest testing against that, if you wish to test again.
I tested with the latest code of the master branch; see the attached results.
Without optimization (--enable-debug CFLAGS='-O0'), there is a clear difference. When the compiler optimizes, the performance is similar. I think the compiler does a good enough job with "pg_attribute_always_inline" and the last two constant parameters when calling heapgetpage_collect.
> --
> John Naylor
> EDB: http://www.enterprisedb.com
Attachments
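A rough sketch of the specialization technique mentioned in the message above: one always-inlined helper called with compile-time-constant flags, so an optimizing compiler can emit a separate, branch-free loop per call site without the loop being duplicated in the source. This is not the code from the referenced patch. TOY_ALWAYS_INLINE, toy_collect, toy_collect_page and toy_conflict_checks are hypothetical names, and the flag names all_visible and check_serializable are assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's pg_attribute_always_inline. */
#if defined(__GNUC__) || defined(__clang__)
#define TOY_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define TOY_ALWAYS_INLINE inline
#endif

/* Counts how often the toy "serializable conflict" check ran. */
static size_t toy_conflict_checks = 0;

/*
 * The loop is written once. When the two flags are literal constants at the
 * call site and the function is inlined, the compiler can fold the branches
 * on them out of each inlined copy.
 */
static TOY_ALWAYS_INLINE size_t
toy_collect(const int *committed, size_t ntuples, size_t *out,
            bool all_visible, bool check_serializable)
{
    size_t      n = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        bool        valid = all_visible ? true : (committed[i] != 0);

        if (check_serializable)
            toy_conflict_checks++;
        if (valid)
            out[n++] = i;
    }
    return n;
}

/* Dispatch once per page, passing the flags as literal constants. */
static size_t
toy_collect_page(const int *committed, size_t ntuples, size_t *out,
                 bool all_visible, bool check_serializable)
{
    assert(out != NULL);
    if (all_visible)
    {
        if (check_serializable)
            return toy_collect(committed, ntuples, out, true, true);
        return toy_collect(committed, ntuples, out, true, false);
    }
    if (check_serializable)
        return toy_collect(committed, ntuples, out, false, true);
    return toy_collect(committed, ntuples, out, false, false);
}
```

Whether the branches are actually folded away depends on the compiler and optimization level, which is consistent with the observation above that the difference is visible at -O0 but not in an optimized build.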
Re: Improving the heapgetpage function improves performance in common scenarios
From
Quan Zongliang
Date:
On 2023/9/6 15:50, Quan Zongliang wrote:
> On 2023/9/5 18:46, John Naylor wrote:
>> On Tue, Sep 5, 2023 at 4:27 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
>> > Here's how I test it
>> > EXPLAIN ANALYZE SELECT * FROM orders;
>>
>> Note that EXPLAIN ANALYZE has quite a bit of overhead, so it's not
>> good for these kinds of tests.
>>
>> > I'll also try Andres Freund's test method next.
>>
>> Commit f691f5b80a85 from today removes another source of overhead in
>> this function, so I suggest testing against that, if you wish to test
>> again.
>
> Test with the latest code of the master branch, see the attached results.
>
> If not optimized(--enable-debug CFLAGS='-O0'), there is a clear
> difference. When the compiler does the optimization, the performance is
> similar. I think the compiler does a good enough optimization with
> "pg_attribute_always_inline" and the last two constant parameters when
> calling heapgetpage_collect.
One more note: the first execution time in the attachment is not included in the average.
>> --
>> John Naylor
>> EDB: http://www.enterprisedb.com
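The averaging rule in the note above (drop the first run, average the rest) can be sketched as follows. The function name mean_excluding_first is hypothetical, and treating the first run as a warm-up that is always discarded is an assumption about how the attached results were computed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mean of the run times, skipping the first (warm-up) run.
 * Returns 0.0 when there are fewer than two runs, since nothing
 * remains to average after the first run is discarded.
 */
static double
mean_excluding_first(const double *run_ms, size_t nruns)
{
    double      sum = 0.0;

    assert(run_ms != NULL || nruns == 0);
    if (nruns < 2)
        return 0.0;
    for (size_t i = 1; i < nruns; i++)
        sum += run_ms[i];
    return sum / (double) (nruns - 1);
}
```

Discarding the first run keeps a cold cache from skewing the comparison between builds; reporting the spread of the remaining runs alongside the mean would also show how much the timings fluctuate.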
Re: Improving the heapgetpage function improves performance in common scenarios
From
John Naylor
Date:
On Wed, Sep 6, 2023 at 2:50 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> If not optimized(--enable-debug CFLAGS='-O0'), there is a clear
> difference. When the compiler does the optimization, the performance is
> similar. I think the compiler does a good enough optimization with
> "pg_attribute_always_inline" and the last two constant parameters when
> calling heapgetpage_collect.
So as we might expect, more specialization (Andres' patch) has no apparent downsides in this workload. (While I'm not sure of the point of testing at -O0, I think we can conclude that less-bright compilers will show some improvement with either patch.)
If you agree, do you want to withdraw your patch from the commit fest?
Re: Improving the heapgetpage function improves performance in common scenarios
From
Quan Zongliang
Date:
On 2023/9/6 17:07, John Naylor wrote:
> On Wed, Sep 6, 2023 at 2:50 PM Quan Zongliang <quanzongliang@yeah.net> wrote:
> > If not optimized(--enable-debug CFLAGS='-O0'), there is a clear
> > difference. When the compiler does the optimization, the performance is
> > similar. I think the compiler does a good enough optimization with
> > "pg_attribute_always_inline" and the last two constant parameters when
> > calling heapgetpage_collect.
>
> So as we might expect, more specialization (Andres' patch) has no
> apparent downsides in this workload. (While I'm not sure of the point of
> testing at -O0, I think we can conclude that less-bright compilers will
> show some improvement with either patch.)
>
> If you agree, do you want to withdraw your patch from the commit fest?
Ok.
> --
> John Naylor
> EDB: http://www.enterprisedb.com