pgsql: Don't rely on estimates for amcheck Bloom filters.

From: Peter Geoghegan
Subject: pgsql: Don't rely on estimates for amcheck Bloom filters.
Msg-id: E1hotq1-0003A3-ID@gemulon.postgresql.org
List: pgsql-committers
Don't rely on estimates for amcheck Bloom filters.

Solely relying on a relation's reltuples/relpages estimate to size the
Bloom filters used by amcheck verification makes verification less
effective when the estimates are very stale.  In extreme cases,
verification options that use Bloom filters internally could be totally
ineffective, without users receiving any clear indication that certain
types of corruption might easily be missed.

To fix, use RelationGetNumberOfBlocks() instead of relpages to size the
downlink block Bloom filter.  Use the same RelationGetNumberOfBlocks()
value to derive a minimum size for the heapallindexed Bloom filter,
rather than completely trusting reltuples.  Verification will still be
reasonably effective when the projected/estimated number of Bloom filter
elements is at least 1/5 of the final number of elements, which is
assured by the new sizing logic.
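
The sizing rule described above can be sketched roughly as follows. This is
an illustration only, not the code from verify_nbtree.c: the function names
and the SKETCH_MAX_TUPLES_PER_PAGE constant are hypothetical, and the sketch
assumes the minimum is one fifth of a per-page tuple limit multiplied by the
block count, which is one way to guarantee the 1/5 bound mentioned above.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the maximum number of index tuples that fit on
 * one page.  The real limit depends on the block size and is not stated in
 * the commit message.
 */
#define SKETCH_MAX_TUPLES_PER_PAGE 400

/*
 * Downlink block filter: one element per block, sized from the actual
 * block count (what RelationGetNumberOfBlocks() would report) rather than
 * from the relpages estimate.
 */
int64_t
sketch_downlink_filter_elems(int64_t total_pages)
{
    return total_pages;
}

/*
 * heapallindexed filter: take the larger of the reltuples estimate and a
 * floor derived from the actual block count.  Because no page can hold
 * more than SKETCH_MAX_TUPLES_PER_PAGE tuples, the floor is never less
 * than 1/5 of the true element count, however stale reltuples is.
 */
int64_t
sketch_heapallindexed_filter_elems(int64_t total_pages, double reltuples)
{
    int64_t floor_elems = total_pages * (SKETCH_MAX_TUPLES_PER_PAGE / 5);
    /* Treat a missing or negative estimate as zero. */
    int64_t estimate = reltuples > 0 ? (int64_t) reltuples : 0;

    return estimate > floor_elems ? estimate : floor_elems;
}
```

With these assumed numbers, a 1000-block index whose reltuples is still 0
gets a filter sized for 80000 elements instead of 0, while a fresh estimate
larger than the floor is used unchanged.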

Reported-By: Alexander Korotkov
Discussion: https://postgr.es/m/CAH2-Wzk0ke2J42KrNYBKu0Xovjy-sU5ub7PWjgpbsKdAQcL4OA@mail.gmail.com
Backpatch: 11-, where downlink/heapallindexed verification were added.

Branch
------
REL_12_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/980224b4a23056de76d902b539980868d33d1b8d

Modified Files
--------------
contrib/amcheck/verify_nbtree.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

