On Tue, Jun 5, 2018 at 7:15 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Hi!
>
> Currently _bt_search() always locks leaf buffer in shared mode (aka BT_READ),
> while caller can relock it later. However, I don't see what prevents
> _bt_search()
> from locking leaf immediately in exclusive mode (aka BT_WRITE) when required.
> When traversing downlink from non-leaf page of level 1 (lowest level of non-leaf
> pages just prior to leaf pages), we know that next page is going to be leaf. In
> this case, we can immediately lock next page in exclusive mode if required.
> For sure, it might happen that we didn't manage to exclusively lock leaf in this
> way when _bt_getroot() points us to leaf page. But this case could be handled
> in _bt_search() by relock. Please, find implementation of this approach in the
> attached patch.
>
> I've run following simple test of this patch on 72-cores machine.
>
> # postgresql.conf
> max_connections = 300
> shared_buffers = 32GB
> fsync = off
> synchronous_commit = off
>
> # DDL:
> CREATE UNLOGGED TABLE ordered (id serial primary key, value text not null);
> CREATE UNLOGGED TABLE unordered (i integer not null, value text not null);
>
> # script_ordered.sql
> INSERT INTO ordered (value) VALUES ('abcdefghijklmnoprsqtuvwxyz');
>
> # script_unordered.sql
> \set i random(1, 1000000)
> INSERT INTO unordered VALUES (:i, 'abcdefghijklmnoprsqtuvwxyz');
>
> # commands
> pgbench -T 60 -P 1 -M prepared -f script_ordered.sql -c 150 -j 150 postgres
> pgbench -T 60 -P 1 -M prepared -f script_unordered.sql -c 150 -j 150 postgres
>
> # results
> ordered, master: 157473 TPS
> ordered, patched 231374 TPS
> unordered, master: 232372 TPS
> unordered, patched: 232535 TPS
>
> As you can see, difference in unordered case is negligible Due to random
> inserts, concurrency for particular leafs is low. But ordered
> insertion is almost
> 50% faster on patched version.
>
> I wonder how could we miss such a simple optimization till now, but I also don't
> see this patch to brake anything.
>
> In patched version, it might appear that we have to traverse
> rightlinks in exclusive
> mode due to splits concurrent to downlink traversal. However, the same might
> happen in current master due to splits concurrent to relocks. So, I
> don't expect
> performance regression to be caused by this patch.
>
I had a look at this patch and it looks good to me.
With the improvement in performance you mentioned, this optimisation
looks worthy.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/