Обсуждение: An implementation of multi-key sort

Поиск
Список
Период
Сортировка

An implementation of multi-key sort

От
Wang Yao
Дата:
Hi hackers,

I'd submit an implementation of multi-key sort for review. Please see the
code as attachment. Thanks for your reponse in advance.


Overview
--------

MKsort (multi-key sort) is an alternative of standard qsort algorithm,
which has better performance for particular sort scenarios, i.e. the data
set has multiple keys to be sorted.

The implementation is based on the paper:
Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
Searching Strings", Jan 1997 [1]

MKsort is applied only for tuple sort by the patch. Theoretically it can
be applied for general-purpose sort scenario when there are multiple sort
keys available, but it is relatively difficult in practice because kind of
unique interface is needed to manipulate the keys. So I limit the usage of
mksort to sort SortTuple.

Comparing to classic quick sort, it can get significant performance
improvement once multiple keys are available. A rough test shows it got
~129% improvement than qsort for ORDER BY on 6 keys, and ~52% for CREATE
INDEX on the same data set. (See more details in section "Performance
Test")

Author: Yao Wang <yaowangm@outlook.com>
Co-author: Hongxu Ma <interma@outlook.com>

Scope
-----

The interface of mksort is pretty simple: in tuplesort_sort_memtuples(),
mksort_tuple() is invoked instead of qsort_tuple() if mksort is applicable.
The major logic in mksort_tuple() is to apply mksort algorithm on
SortTuple, and kind of callback mechanism is used to handle
sort-variant-specific issue, e.g. comparing different datums, like
qsort_tuple() does. It also handles the complexity of "abbreviated keys".

A small difference from classic mksort algorithm is: for IndexTuple, when
all the columns are equal, an additional comparing based on ItemPointer
is performed to determine the order. It is to make the result consistent
to existing qsort.

I did consider about implementing mksort by the approach of kind of
template mechanism like qsort (see sort_template.h), but it seems
unnecessary because all concrete tuple types need to be handled are
derived from SortTuple. Use callback to isolate type specific features
is good enough.

Note that not all tuple types are supported by mksort. Please see the
comments inside tuplesort_sort_memtuples().

Test Cases
----------

The changes of test cases include:

* Generally, mksort should generate result exactly same to qsort. However
some test cases don't. The reason is that SQL doesn't specify order on
all possible columns, e.g. "select c1, c2 from t1 order by c1" will
generate different results between mksort/qsort when c1 values are equal,
and the solution is to order c2 as well ("select c1, c2 from t1 order by
c1, c2"). (e.g. geometry)
* Some cases need to be updated to display the new sort method "multi-key
sort" in explain result. (e.g. incremental_sort)
* regress/tuplesort was updated with new cases to cover some scenarios of
mksort.

Performance Test
----------------

The script I used to configure the build:

CFLAGS="-O3 -fargument-noalias-global -fno-omit-frame-pointer -g"
./configure --prefix=$PGHOME --with-pgport=5432 --with-perl --with-openssl
--with-python --with-pam --with-blocksize=16 --with-wal-blocksize=16
--with-perl --enable-tap-tests --with-gssapi --with-ldap

I used the script for a rough test for ORDER BY:

\timing on
create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
  'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;

-- Use a large work mem to ensure the entire sort happens in memory
set work_mem='1GB';

-- switch between qsort/mksort
set enable_mk_sort=off;

explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;

Results:

mksort:
1341.283 ms (00:01.341)
1379.098 ms (00:01.379)
1369.868 ms (00:01.370)

qsort:
3137.277 ms (00:03.137)
3147.771 ms (00:03.148)
3131.887 ms (00:03.132)

The perf improvement is ~129%.

Another perf test for CREATE INDEX:

create index idx_t1_mk on t3 (c6, c5, c4, c3, c2, c1);

Results:

mksort:
1147.207 ms (00:01.147)
1200.501 ms (00:01.201)
1235.657 ms (00:01.236)

Qsort:
1852.957 ms (00:01.853)
1824.209 ms (00:01.824)
1808.781 ms (00:01.809)

The perf improvement is ~52%.

Another test is to use one of queries of TPC-H:

set work_mem='1GB';

-- query rewritten from TPCH-Q1, and there are 6001215 rows in lineitem
explain analyze select
 l_returnflag,l_linestatus,l_quantity,l_shipmode
from
 lineitem
where
 l_shipdate <= date'1998-12-01' - interval '65 days'
order by
 l_returnflag,l_linestatus,l_quantity,l_shipmode;

Result:

Qsort:
14582.626 ms
14524.188 ms
14524.111 ms

mksort:
11390.891 ms
11647.065 ms
11546.791 ms

The perf improvement is ~25.8%.

[1] https://www.cs.tufts.edu/~nr/cs257/archive/bob-sedgewick/fast-strings.pdf
[2] https://www.tpc.org/tpch/


Thanks,

Yao Wang
Вложения

Re: An implementation of multi-key sort

От
Heikki Linnakangas
Дата:
On 22/05/2024 15:48, Wang Yao wrote:
> Comparing to classic quick sort, it can get significant performance
> improvement once multiple keys are available. A rough test shows it got
> ~129% improvement than qsort for ORDER BY on 6 keys, and ~52% for CREATE
> INDEX on the same data set. (See more details in section "Performance
> Test")

Impressive. Did you test the performance of the cases where MK-sort 
doesn't help, to check if there is a performance regression?

-- 
Heikki Linnakangas
Neon (https://neon.tech)




回复: An implementation of multi-key sort

От
Wang Yao
Дата:
No obvious perf regression is expected because PG will follow original
qsort code path when mksort is disabled. For the case, the only extra
cost is the check in tuplesort_sort_memtuples() to enter mksort code path.

It's also proved by the experiment today:

Mksort disabled:
2949.287 ms
2955.258 ms
2947.262 ms

No mksort code:
2947.094 ms
2946.419 ms
2953.215 ms

Almost the same.

I also updated code with small enhancements. Please see the latest code
as attachment.


Thanks,

Yao Wang

发件人: Heikki Linnakangas <hlinnaka@iki.fi>
发送时间: 2024年5月22日 23:29
收件人: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
抄送: interma@outlook.com <interma@outlook.com>
主题: Re: An implementation of multi-key sort
 
On 22/05/2024 15:48, Wang Yao wrote:
> Comparing to classic quick sort, it can get significant performance
> improvement once multiple keys are available. A rough test shows it got
> ~129% improvement than qsort for ORDER BY on 6 keys, and ~52% for CREATE
> INDEX on the same data set. (See more details in section "Performance
> Test")

Impressive. Did you test the performance of the cases where MK-sort
doesn't help, to check if there is a performance regression?

--
Heikki Linnakangas
Neon (https://neon.tech)

Вложения

Re: 回复: An implementation of multi-key sort

От
Heikki Linnakangas
Дата:
On 23/05/2024 15:39, Wang Yao wrote:
> No obvious perf regression is expected because PG will follow original
> qsort code path when mksort is disabled. For the case, the only extra
> cost is the check in tuplesort_sort_memtuples() to enter mksort code path.

And what about the case the mksort is enabled, but it's not effective 
because all leading keys are different?

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: 回复: An implementation of multi-key sort

От
Yao Wang
Дата:
When all leading keys are different, mksort will finish the entire sort at the
first sort key and never touch other keys. For the case, mksort falls back to
kind of qsort actually.

I created another data set with distinct values in all sort keys:

create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
  || (999994 - c1)::text;
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;

Results:

MKsort:
12374.427 ms
12528.068 ms
12554.718 ms

qsort:
12251.422 ms
12279.938 ms
12280.254 ms

MKsort is a bit slower than qsort, which can be explained by extra
checks of MKsort.

Yao Wang

On Fri, May 24, 2024 at 8:36 PM Wang Yao <yaowangm@outlook.com> wrote:
>
>
>
> 获取Outlook for Android
> ________________________________
> From: Heikki Linnakangas <hlinnaka@iki.fi>
> Sent: Thursday, May 23, 2024 8:47:29 PM
> To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
> Cc: interma@outlook.com <interma@outlook.com>
> Subject: Re: 回复: An implementation of multi-key sort
>
> On 23/05/2024 15:39, Wang Yao wrote:
> > No obvious perf regression is expected because PG will follow original
> > qsort code path when mksort is disabled. For the case, the only extra
> > cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
>
> And what about the case the mksort is enabled, but it's not effective
> because all leading keys are different?
>
> --
> Heikki Linnakangas
> Neon (https://neon.tech)
>

--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.



Re: 回复: An implementation of multi-key sort

От
Yao Wang
Дата:
I added two optimizations to mksort which exist on qsort_tuple():

1. When selecting pivot, always pick the item in the middle of array but
not by random. Theoretically it has the same effect to old approach, but
it can eliminate some unstable perf test results, plus a bit perf benefit by
removing random value generator.
2. Always check whether the array is ordered already, and return
immediately if it is. The pre-ordered check requires extra cost and
impacts perf numbers on some data sets, but can improve perf
significantly on other data sets.

By now, mksort has perf results equal or better than qsort on all data
sets I ever used.

I also updated test case. Please see v3 code as attachment.

Perf test results:

Data set 1 (with mass duplicate values):
-----------------------------------------

create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;

Query 1:

explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;

Disable Mksort

3021.636 ms
3014.669 ms
3033.588 ms

Enable Mksort

1688.590 ms
1686.956 ms
1688.567 ms

The improvement is 78.9%, which is reduced from the previous version
(129%). The most cost should be the pre-ordered check.

Query 2:

create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);

Disable Mksort

1674.648 ms
1680.608 ms
1681.373 ms

Enable Mksort

1143.341 ms
1143.462 ms
1143.894 ms

The improvement is ~47%, which is also reduced a bit (52%).

Data set 2 (with distinct values):
----------------------------------

create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;

Query 1:

explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;

Disable Mksort

12199.963 ms
12197.068 ms
12191.657 ms

Enable Mksort

9538.219 ms
9571.681 ms
9536.335 ms

The improvement is 27.9%, which is much better than the old approach (-6.2%).

Query 2 (the data is pre-ordered):

explain analyze select c1 from t2 order by c6 desc, c5, c4, c3, c2, c1;

Enable Mksort

768.191 ms
768.079 ms
767.026 ms

Disable Mksort

768.757 ms
766.166 ms
766.149 ms

They are almost the same since no actual sort was performed, and much
better than the old approach (-1198.1%).


Thanks,

Yao Wang

On Fri, May 24, 2024 at 8:50 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
>
> When all leading keys are different, mksort will finish the entire sort at the
> first sort key and never touch other keys. For the case, mksort falls back to
> kind of qsort actually.
>
> I created another data set with distinct values in all sort keys:
>
> create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
> insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
> update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
> = 999993 - c1;
> update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
>   || (999994 - c1)::text;
> explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
>
> Results:
>
> MKsort:
> 12374.427 ms
> 12528.068 ms
> 12554.718 ms
>
> qsort:
> 12251.422 ms
> 12279.938 ms
> 12280.254 ms
>
> MKsort is a bit slower than qsort, which can be explained by extra
> checks of MKsort.
>
> Yao Wang
>
> On Fri, May 24, 2024 at 8:36 PM Wang Yao <yaowangm@outlook.com> wrote:
> >
> >
> >
> > 获取Outlook for Android
> > ________________________________
> > From: Heikki Linnakangas <hlinnaka@iki.fi>
> > Sent: Thursday, May 23, 2024 8:47:29 PM
> > To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
> > Cc: interma@outlook.com <interma@outlook.com>
> > Subject: Re: 回复: An implementation of multi-key sort
> >
> > On 23/05/2024 15:39, Wang Yao wrote:
> > > No obvious perf regression is expected because PG will follow original
> > > qsort code path when mksort is disabled. For the case, the only extra
> > > cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
> >
> > And what about the case the mksort is enabled, but it's not effective
> > because all leading keys are different?
> >
> > --
> > Heikki Linnakangas
> > Neon (https://neon.tech)
> >

--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.

Вложения

Re: 回复: An implementation of multi-key sort

От
Yao Wang
Дата:
To be accurate, "multi-key sort" includes both "multi-key quick sort"
and "multi-key heap sort". This patch includes code change related to
only "multi-key quick sort" which is used to replace standard quick
sort for tuplesort. The "multi-key heap sort" is about an implementation
of multi-key heap and should be treated as a separated task. We need
to clarify the naming to avoid confusion.

I updated code which is related to only function/var renaming and
relevant comments, plus some minor assertions changes. Please see the
attachment.


Thanks,

Yao Wang

On Fri, May 31, 2024 at 8:09 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
>
> I added two optimizations to mksort which exist on qsort_tuple():
>
> 1. When selecting pivot, always pick the item in the middle of array but
> not by random. Theoretically it has the same effect to old approach, but
> it can eliminate some unstable perf test results, plus a bit perf benefit by
> removing random value generator.
> 2. Always check whether the array is ordered already, and return
> immediately if it is. The pre-ordered check requires extra cost and
> impacts perf numbers on some data sets, but can improve perf
> significantly on other data sets.
>
> By now, mksort has perf results equal or better than qsort on all data
> sets I ever used.
>
> I also updated test case. Please see v3 code as attachment.
>
> Perf test results:
>
> Data set 1 (with mass duplicate values):
> -----------------------------------------
>
> create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
> insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
> 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
> update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
> update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
> || (c1 % 5)::text;
>
> Query 1:
>
> explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;
>
> Disable Mksort
>
> 3021.636 ms
> 3014.669 ms
> 3033.588 ms
>
> Enable Mksort
>
> 1688.590 ms
> 1686.956 ms
> 1688.567 ms
>
> The improvement is 78.9%, which is reduced from the previous version
> (129%). The most cost should be the pre-ordered check.
>
> Query 2:
>
> create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);
>
> Disable Mksort
>
> 1674.648 ms
> 1680.608 ms
> 1681.373 ms
>
> Enable Mksort
>
> 1143.341 ms
> 1143.462 ms
> 1143.894 ms
>
> The improvement is ~47%, which is also reduced a bit (52%).
>
> Data set 2 (with distinct values):
> ----------------------------------
>
> create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
> insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
> update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
> = 999993 - c1;
> update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
> || (999994 - c1)::text;
>
> Query 1:
>
> explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
>
> Disable Mksort
>
> 12199.963 ms
> 12197.068 ms
> 12191.657 ms
>
> Enable Mksort
>
> 9538.219 ms
> 9571.681 ms
> 9536.335 ms
>
> The improvement is 27.9%, which is much better than the old approach (-6.2%).
>
> Query 2 (the data is pre-ordered):
>
> explain analyze select c1 from t2 order by c6 desc, c5, c4, c3, c2, c1;
>
> Enable Mksort
>
> 768.191 ms
> 768.079 ms
> 767.026 ms
>
> Disable Mksort
>
> 768.757 ms
> 766.166 ms
> 766.149 ms
>
> They are almost the same since no actual sort was performed, and much
> better than the old approach (-1198.1%).
>
>
> Thanks,
>
> Yao Wang
>
> On Fri, May 24, 2024 at 8:50 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
> >
> > When all leading keys are different, mksort will finish the entire sort at the
> > first sort key and never touch other keys. For the case, mksort falls back to
> > kind of qsort actually.
> >
> > I created another data set with distinct values in all sort keys:
> >
> > create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
> > insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
> > update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
> > = 999993 - c1;
> > update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
> >   || (999994 - c1)::text;
> > explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
> >
> > Results:
> >
> > MKsort:
> > 12374.427 ms
> > 12528.068 ms
> > 12554.718 ms
> >
> > qsort:
> > 12251.422 ms
> > 12279.938 ms
> > 12280.254 ms
> >
> > MKsort is a bit slower than qsort, which can be explained by extra
> > checks of MKsort.
> >
> > Yao Wang
> >
> > On Fri, May 24, 2024 at 8:36 PM Wang Yao <yaowangm@outlook.com> wrote:
> > >
> > >
> > >
> > > 获取Outlook for Android
> > > ________________________________
> > > From: Heikki Linnakangas <hlinnaka@iki.fi>
> > > Sent: Thursday, May 23, 2024 8:47:29 PM
> > > To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
> > > Cc: interma@outlook.com <interma@outlook.com>
> > > Subject: Re: 回复: An implementation of multi-key sort
> > >
> > > On 23/05/2024 15:39, Wang Yao wrote:
> > > > No obvious perf regression is expected because PG will follow original
> > > > qsort code path when mksort is disabled. For the case, the only extra
> > > > cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
> > >
> > > And what about the case the mksort is enabled, but it's not effective
> > > because all leading keys are different?
> > >
> > > --
> > > Heikki Linnakangas
> > > Neon (https://neon.tech)
> > >

--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.

Вложения

Re: 回复: An implementation of multi-key sort

От
Tomas Vondra
Дата:
Hello Yao,

I was interested in the patch, considering the promise of significant
speedups of sorting, so I took a quick look and did some basic perf
testing today. Unfortunately, my benchmarks don't really confirm any
peformance benefits, so I haven't looked at the code very much and only
have some very basic feedback:

1) The new GUC is missing from the .sample config, triggering a failure
of "make check-world". Fixed by 0002.

2) There's a place mixing tabs/spaces in indentation. Fixed by 0003.

3) I tried running pgindent, mostly to see how that would affect the
comments, and for most it's probably fine, but a couple are mangled
(usually those with a numbered list of items). Might needs some changes
to use formatting that's not reformatted like this. The changes from
pgindent are in 0004, but this is not a fix - it just shows the changes
after running pgindent.

Now, regarding the performance tests - I decided to do the usual black
box testing, i.e. generate tables with varying numbers of columns, data
types, different data distribution (random, correlated, ...) and so on.
And then run simple ORDER BY queries on that, measuring timing with and
without mk-sort, and checking the effect.

So I wrote a simple bash script (attached) that does exactly that - it
generates a table with 1k - 10M rows, fills with with data (with some
basic simple data distributions), and then runs the queries.

The raw results are too large to attach, I'm only attaching a PDF
showing the summary with a "speedup heatmap" - it's a pivot with the
parameters on the left, and then the GUC and number on columns on top.
So the first group of columns is with enable_mk_sort=off, the second
group with enable_mk_sort=on, and finally the heatmap with relative
timing (enable_mk_sort=on / enable_mk_sort=off).

So values <100% mean it got faster (green color - good), and values
>100% mean it got slower (red - bad). And the thing is - pretty much
everything is red, often in the 200%-300% range, meaning it got 2x-3x
slower. There's only very few combinations where it got faster. That
does not seem very promising ... but maybe I did something wrong?

After seeing this, I took a look at your example again, which showed
some nice speedups. But it seems very dependent on the order of keys in
the ORDER BY clause. For example consider this:

set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;

                         QUERY PLAN
-------------------------------------------------------------------
 Sort  (cost=72328.81..73578.81 rows=499999 width=76)
       (actual rows=499999 loops=1)
   Sort Key: c6, c5, c4, c3, c2, c1
   Sort Method: quicksort  Memory: 59163kB
   ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
                       (actual rows=499999 loops=1)
 Planning Time: 0.054 ms
 Execution Time: 1095.183 ms
(6 rows)

set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;

                         QUERY PLAN
-------------------------------------------------------------------
 Sort  (cost=72328.81..73578.81 rows=499999 width=76)
       (actual rows=499999 loops=1)
   Sort Key: c6, c5, c4, c3, c2, c1
   Sort Method: multi-key quick sort  Memory: 59163kB
   ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
                       (actual rows=499999 loops=1)
 Planning Time: 0.130 ms
 Execution Time: 633.635 ms
(6 rows)

Which seems great, but let's reverse the sort keys:

set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;

                         QUERY PLAN
-------------------------------------------------------------------

 Sort  (cost=72328.81..73578.81 rows=499999 width=76)
       (actual rows=499999 loops=1)
   Sort Key: c1, c2, c3, c4, c5, c6
   Sort Method: quicksort  Memory: 59163kB
   ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
                       (actual rows=499999 loops=1)
 Planning Time: 0.146 ms
 Execution Time: 170.085 ms
(6 rows)

set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;

                         QUERY PLAN
-------------------------------------------------------------------
 Sort  (cost=72328.81..73578.81 rows=499999 width=76)
       (actual rows=499999 loops=1)
   Sort Key: c1, c2, c3, c4, c5, c6
   Sort Method: multi-key quick sort  Memory: 59163kB
   ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
                       (actual rows=499999 loops=1)
 Planning Time: 0.127 ms
 Execution Time: 367.263 ms
(6 rows)

I believe this is the case Heikki was asking about. I see the response
was that it's OK and the overhead is very low, but without too much
detail so I don't know what case you measured.

Anyway, I think it seems to be very sensitive to the exact data set.
Which is not entirely surprising, I guess - most optimizations have a
mix of improved/regressed cases, yielding a heatmap with a mix of green
and red areas, and we have to either optimize the code (or heuristics to
enable the feature), or convince ourselves the "red" cases are less
important / unlikely etc.

But here the results are almost universally "red", so it's going to be
very hard to convince ourselves this is a good trade off. Of course, you
may argue the cases I've tested are wrong and not representative. I
don't think that's the case, though.

It's also interesting (and perhaps a little bit bizarre) that almost all
the cases that got better are for a single-column sort. Which is exactly
the case the patch should not affect. But it seems pretty consistent, so
maybe this is something worth investigating.

FWIW I'm not familiar with the various quicksort variants, but I noticed
that the Bentley & Sedgewick paper mentioned as the basis for the patch
is from 1997, and apparently implements stuff originally proposed by
Hoare in 1961. So maybe this is just an example of an algorithm that was
good for a hardware at that time, but the changes (e.g. the growing
important of on-CPU caches) made it less relevant?

Another thing I noticed while skimming [1] is this:

    The algorithm is designed to exploit the property that in many
    problems, strings tend to have shared prefixes.

If that's the case, isn't it wrong to apply this to all sorts, including
sorts with non-string keys? It might explain why your example works OK,
as it involves key c6 which is string with all values sharing the same
(fairly long) prefix. But then maybe we should be careful and restrict
this to only such those cases?

regards

[1] https://en.wikipedia.org/wiki/Multi-key_quicksort

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: 回复: An implementation of multi-key sort

От
Yao Wang
Дата:
hi Tomas,

So many thanks for your kind response and detailed report. I am working
on locating issues based on your report/script and optimizing code, and
will update later.

Could you please also send me the script to generate report pdf
from the test results (explain*.log)? I can try to make one by myself,
but I'd like to get a report exactly the same as yours. It's really
helpful.

Thanks in advance.


Yao Wang

On Mon, Jun 10, 2024 at 5:09 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Hello Yao,
>
> I was interested in the patch, considering the promise of significant
> speedups of sorting, so I took a quick look and did some basic perf
> testing today. Unfortunately, my benchmarks don't really confirm any
> peformance benefits, so I haven't looked at the code very much and only
> have some very basic feedback:
>
> 1) The new GUC is missing from the .sample config, triggering a failure
> of "make check-world". Fixed by 0002.
>
> 2) There's a place mixing tabs/spaces in indentation. Fixed by 0003.
>
> 3) I tried running pgindent, mostly to see how that would affect the
> comments, and for most it's probably fine, but a couple are mangled
> (usually those with a numbered list of items). Might needs some changes
> to use formatting that's not reformatted like this. The changes from
> pgindent are in 0004, but this is not a fix - it just shows the changes
> after running pgindent.
>
> Now, regarding the performance tests - I decided to do the usual black
> box testing, i.e. generate tables with varying numbers of columns, data
> types, different data distribution (random, correlated, ...) and so on.
> And then run simple ORDER BY queries on that, measuring timing with and
> without mk-sort, and checking the effect.
>
> So I wrote a simple bash script (attached) that does exactly that - it
> generates a table with 1k - 10M rows, fills with with data (with some
> basic simple data distributions), and then runs the queries.
>
> The raw results are too large to attach, I'm only attaching a PDF
> showing the summary with a "speedup heatmap" - it's a pivot with the
> parameters on the left, and then the GUC and number on columns on top.
> So the first group of columns is with enable_mk_sort=off, the second
> group with enable_mk_sort=on, and finally the heatmap with relative
> timing (enable_mk_sort=on / enable_mk_sort=off).
>
> So values <100% mean it got faster (green color - good), and values
> >100% mean it got slower (red - bad). And the thing is - pretty much
> everything is red, often in the 200%-300% range, meaning it got 2x-3x
> slower. There's only very few combinations where it got faster. That
> does not seem very promising ... but maybe I did something wrong?
>
> After seeing this, I took a look at your example again, which showed
> some nice speedups. But it seems very dependent on the order of keys in
> the ORDER BY clause. For example consider this:
>
> set enable_mk_sort = on;
> explain (analyze, timing off)
> select * from t1 order by c6, c5, c4, c3, c2, c1;
>
>                          QUERY PLAN
> -------------------------------------------------------------------
>  Sort  (cost=72328.81..73578.81 rows=499999 width=76)
>        (actual rows=499999 loops=1)
>    Sort Key: c6, c5, c4, c3, c2, c1
>    Sort Method: quicksort  Memory: 59163kB
>    ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
>                        (actual rows=499999 loops=1)
>  Planning Time: 0.054 ms
>  Execution Time: 1095.183 ms
> (6 rows)
>
> set enable_mk_sort = on;
> explain (analyze, timing off)
> select * from t1 order by c6, c5, c4, c3, c2, c1;
>
>                          QUERY PLAN
> -------------------------------------------------------------------
>  Sort  (cost=72328.81..73578.81 rows=499999 width=76)
>        (actual rows=499999 loops=1)
>    Sort Key: c6, c5, c4, c3, c2, c1
>    Sort Method: multi-key quick sort  Memory: 59163kB
>    ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
>                        (actual rows=499999 loops=1)
>  Planning Time: 0.130 ms
>  Execution Time: 633.635 ms
> (6 rows)
>
> Which seems great, but let's reverse the sort keys:
>
> set enable_mk_sort = off;
> explain (analyze, timing off)
> select * from t1 order by c1, c2, c3, c4, c5, c6;
>
>                          QUERY PLAN
> -------------------------------------------------------------------
>
>  Sort  (cost=72328.81..73578.81 rows=499999 width=76)
>        (actual rows=499999 loops=1)
>    Sort Key: c1, c2, c3, c4, c5, c6
>    Sort Method: quicksort  Memory: 59163kB
>    ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
>                        (actual rows=499999 loops=1)
>  Planning Time: 0.146 ms
>  Execution Time: 170.085 ms
> (6 rows)
>
> set enable_mk_sort = off;
> explain (analyze, timing off)
> select * from t1 order by c1, c2, c3, c4, c5, c6;
>
>                          QUERY PLAN
> -------------------------------------------------------------------
>  Sort  (cost=72328.81..73578.81 rows=499999 width=76)
>        (actual rows=499999 loops=1)
>    Sort Key: c1, c2, c3, c4, c5, c6
>    Sort Method: multi-key quick sort  Memory: 59163kB
>    ->  Seq Scan on t1  (cost=0.00..24999.99 rows=499999 width=76)
>                        (actual rows=499999 loops=1)
>  Planning Time: 0.127 ms
>  Execution Time: 367.263 ms
> (6 rows)
>
> I believe this is the case Heikki was asking about. I see the response
> was that it's OK and the overhead is very low, but without too much
> detail so I don't know what case you measured.
>
> Anyway, I think it seems to be very sensitive to the exact data set.
> Which is not entirely surprising, I guess - most optimizations have a
> mix of improved/regressed cases, yielding a heatmap with a mix of green
> and red areas, and we have to either optimize the code (or heuristics to
> enable the feature), or convince ourselves the "red" cases are less
> important / unlikely etc.
>
> But here the results are almost universally "red", so it's going to be
> very hard to convince ourselves this is a good trade off. Of course, you
> may argue the cases I've tested are wrong and not representative. I
> don't think that's the case, though.
>
> It's also interesting (and perhaps a little bit bizarre) that almost all
> the cases that got better are for a single-column sort. Which is exactly
> the case the patch should not affect. But it seems pretty consistent, so
> maybe this is something worth investigating.
>
> FWIW I'm not familiar with the various quicksort variants, but I noticed
> that the Bentley & Sedgewick paper mentioned as the basis for the patch
> is from 1997, and apparently implements stuff originally proposed by
> Hoare in 1961. So maybe this is just an example of an algorithm that was
> good for a hardware at that time, but the changes (e.g. the growing
> important of on-CPU caches) made it less relevant?
>
> Another thing I noticed while skimming [1] is this:
>
>     The algorithm is designed to exploit the property that in many
>     problems, strings tend to have shared prefixes.
>
> If that's the case, isn't it wrong to apply this to all sorts, including
> sorts with non-string keys? It might explain why your example works OK,
> as it involves key c6 which is string with all values sharing the same
> (fairly long) prefix. But then maybe we should be careful and restrict
> this to only such those cases?
>
> regards
>
> [1] https://en.wikipedia.org/wiki/Multi-key_quicksort
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.



Re: 回复: An implementation of multi-key sort

От
Tomas Vondra
Дата:

On 6/14/24 13:20, Yao Wang wrote:
> hi Tomas,
> 
> So many thanks for your kind response and detailed report. I am working
> on locating issues based on your report/script and optimizing code, and
> will update later.
> 
> Could you please also send me the script to generate report pdf
> from the test results (explain*.log)? I can try to make one by myself,
> but I'd like to get a report exactly the same as yours. It's really
> helpful.
> 

I don't have a script for that. I simply load the results into a
spreadsheet, do a pivot table to "aggregate and reshuffle" it a bit, and
then add a heatmap. I use google sheets for this, but any other
spreadsheet should handle this too, I think.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: 回复: An implementation of multi-key sort

От
John Naylor
Дата:
On Fri, Jun 14, 2024 at 6:20 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
>
> hi Tomas,
>
> So many thanks for your kind response and detailed report. I am working
> on locating issues based on your report/script and optimizing code, and
> will update later.

Hi,
This is an interesting proof-of-concept!

Given the above, I've set this CF entry to "waiting on author".

Also, I see you've added Heikki as a reviewer. I'm not sure how others
think, but I consider a "reviewer" in the CF app to be someone who has
volunteered to be responsible to help move this patch forward. If
there is a name in the reviewer column, it may discourage others from
doing review. It also can happened that people ping reviewers to ask
"There's been no review for X months -- are you planning on looking at
this?", and it's not great if that message is a surprise.

Note that we prefer not to top-post in emails since it makes our web
archive more difficult to read.

Thanks,
John