pgsql: hio: Use ExtendBufferedRelBy() to extend tables more efficiently

From: Andres Freund
Subject: pgsql: hio: Use ExtendBufferedRelBy() to extend tables more efficiently
Msg-id: E1pkZXw-001lQQ-EJ@gemulon.postgresql.org
List: pgsql-committers
hio: Use ExtendBufferedRelBy() to extend tables more efficiently

While we already had some form of bulk extension for relations, it was fairly
limited. It only amortized the cost of acquiring the extension lock; the
relation itself was still extended one block at a time. Bulk extension was
also triggered solely by contention, not by the amount of data inserted.

To address this, use ExtendBufferedRelBy(), introduced in 31966b151e6, to
extend the relation. We try to extend the relation by multiple blocks in two
situations:

1) The caller tells RelationGetBufferForTuple() that it will need multiple
   pages. For now that's only used by heap_multi_insert(), see commit FIXME.

2) If there is contention on the extension lock, use the number of waiters for
   the lock as a multiplier for the number of blocks to extend by. This is
   similar to what we already did. Previously we additionally multiplied the
   number of waiters by 20, but with the new relation extension
   infrastructure I could not see a benefit in doing so.
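
The sizing decision in the two cases above can be sketched as a small
standalone C function. This is an illustrative model, not the code in hio.c:
the function name, the exact way the waiter count is combined with the
caller's request, and the cap of 64 blocks are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound on one extension; not taken from the commit. */
#define MAX_EXTEND_BY 64u

/*
 * Sketch: decide how many blocks to extend the relation by.
 *
 * pages_requested: pages the caller says it will need (case 1), at least 1.
 * lock_waiters:    backends waiting on the extension lock (case 2).
 */
uint32_t
blocks_to_extend(uint32_t pages_requested, uint32_t lock_waiters)
{
    uint64_t extend_by;

    assert(pages_requested >= 1);

    /* Assumption: each waiter wants about as much space as we do. */
    extend_by = (uint64_t) pages_requested * (1u + (uint64_t) lock_waiters);

    /* Assumption: bound the result so one backend cannot over-extend. */
    if (extend_by > MAX_EXTEND_BY)
        extend_by = MAX_EXTEND_BY;

    return (uint32_t) extend_by;
}
```

With no waiters the function returns just what the caller asked for, so an
uncontended single-row insert still extends by one block.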

Using the free space map (FSM) to provide empty pages can cause significant
contention, and adds measurable overhead even if there is no contention. To
reduce that, remember the blocks the relation was extended by in the
BulkInsertState, in the extending backend. In case 1) from above, the blocks
the extending backend needs are not entered into the FSM, as we know that we
will need those blocks.
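
A minimal sketch of the "remember the extended blocks" idea follows. It is a
standalone model rather than the actual BulkInsertState definition: the
struct, its field names, the sentinel value, and the helper functions are
hypothetical and exist only to illustrate handing out remembered blocks
without consulting the FSM.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_BLOCK UINT32_MAX    /* hypothetical "no block" sentinel */

/* Hypothetical stand-in for the range remembered in the bulk insert state. */
typedef struct
{
    uint32_t next_free;     /* next unused remembered block, or INVALID_BLOCK */
    uint32_t last_free;     /* last block of the remembered range */
} ExtendedRange;

void
range_init(ExtendedRange *r)
{
    r->next_free = INVALID_BLOCK;
    r->last_free = INVALID_BLOCK;
}

/*
 * Record that blocks [first, first + count) were just added to the relation.
 * Returns the block to use right away and remembers the rest.
 */
uint32_t
range_remember(ExtendedRange *r, uint32_t first, uint32_t count)
{
    assert(count >= 1);

    if (count > 1)
    {
        r->next_free = first + 1;
        r->last_free = first + count - 1;
    }
    else
        range_init(r);

    return first;
}

/* Hand out the next remembered block; INVALID_BLOCK once the range is used up. */
uint32_t
range_take(ExtendedRange *r)
{
    uint32_t blk = r->next_free;

    if (blk == INVALID_BLOCK)
        return INVALID_BLOCK;

    if (blk == r->last_free)
        range_init(r);          /* range exhausted */
    else
        r->next_free = blk + 1;

    return blk;
}
```

Because the range lives in backend-local state, taking the next block needs
no shared lookup; only when range_take() returns INVALID_BLOCK would the
backend have to fall back to the FSM or extend again.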

One complication with using the FSM to record empty pages is that we need to
insert blocks into the FSM when we already hold a buffer content lock. To
avoid doing IO while holding a content lock, release the content lock before
recording free space. Currently that opens a small window in which another
backend could fill the block, if a concurrent VACUUM records the free
space. If that happens, we retry, similar to the already existing case when
otherBuffer is provided. In the future it might be worth closing the race by
preventing VACUUM from recording the space in newly extended pages.
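
The unlock, record, re-lock, recheck sequence described above can be modelled
in a few lines of standalone C. Everything here is hypothetical scaffolding
(the Page struct, the hook type, and the function names); the hook merely
stands in for whatever another backend might do during the unlocked window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a buffer: free space plus a content-lock flag. */
typedef struct
{
    size_t free_bytes;
    bool   locked;
} Page;

/* Hypothetical hook: activity by other backends while the lock is released. */
typedef void (*window_hook) (Page *page);

/* Example hook: a concurrent backend fills the page completely. */
void
fill_page(Page *page)
{
    page->free_bytes = 0;
}

/*
 * Release the content lock, do the FSM update without it, re-acquire the
 * lock, and report whether the page still has room for the tuple. On a
 * false result the caller would retry with a different page.
 */
bool
record_then_recheck(Page *page, size_t needed, window_hook concurrent)
{
    assert(page->locked);       /* caller must hold the content lock */

    page->locked = false;       /* release before any possible I/O */
    /* ... the FSM update would happen here, without the lock held ... */
    if (concurrent != NULL)
        concurrent(page);       /* the race window */
    page->locked = true;        /* re-acquire */

    return page->free_bytes >= needed;
}
```

The recheck after re-locking is what makes the window harmless in this model:
a lost race costs a retry, never an insert into a page that has no room.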

This change provides very significant wins (3x at 16 clients, on my
workstation) for concurrent COPY into a single relation. Even single-threaded
COPY is measurably faster, primarily due to not dirtying pages while
extending, if supported by the operating system (see commit 4d330a61bb1). Even
single-row INSERTs benefit, although to a much smaller degree, as the relation
extension lock is rarely the primary bottleneck.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/00d1e02be24987180115e371abaeb84738257ae2

Modified Files
--------------
src/backend/access/heap/heapam.c |   2 +
src/backend/access/heap/hio.c    | 364 +++++++++++++++++++++++----------------
src/include/access/hio.h         |  11 ++
3 files changed, 233 insertions(+), 144 deletions(-)

