Discussion: bumping HASH_VERSION to 3

bumping HASH_VERSION to 3

From
Robert Haas
Date:
Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this
release, we have thus far not made any change to the on-disk format
that is not nominally backward-compatible.  Commit
293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
hash indexes, but included backward-compatibility code so that old
indexes would continue to work.  However, I'd like to also commit
Mithun Cy's patch to expand hash indexes more gradually -- latest
version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
-- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format
to the new metapage format introduced by that patch, and it wouldn't
be very hard, but I think it would be better to NOT do that, and
instead force everybody upgrading to v10 to rebuild all of their hash
indexes.   If we don't do that, then we'll never know whether
instances of hash index corruption reported against v10 or higher are
caused by defects in the new code, because there's always the chance
that the hash index could have been built on a pre-v10 version, got
corrupted because of the lack of WAL-logging, and then been brought up
to v10+ via pg_upgrade.  Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
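
The effect of the proposed bump can be illustrated with a minimal sketch. This is not the PostgreSQL source: `MetaPage`, `check_metapage`, `needs_reindex`, and `WrongHashVersion` are hypothetical names, and the metapage is reduced to a name and a version number. The assumption is simply that the server compares the version stored in an index's metapage against the compiled-in `HASH_VERSION` and refuses to use the index on a mismatch, so any index built before the bump has to be rebuilt.

```python
from dataclasses import dataclass

# Proposed value from the thread subject. Indexes built by earlier
# releases are assumed to carry a lower number in their metapage.
HASH_VERSION = 3


class WrongHashVersion(Exception):
    """Raised when an index metapage carries an unexpected version."""


@dataclass
class MetaPage:
    # Hypothetical, simplified stand-in for the on-disk hash metapage.
    index_name: str
    version: int


def check_metapage(meta: MetaPage) -> None:
    """Reject any index whose stored version differs from HASH_VERSION."""
    if meta.version != HASH_VERSION:
        raise WrongHashVersion(
            f'index "{meta.index_name}" has wrong hash version; please REINDEX it'
        )


def needs_reindex(meta: MetaPage) -> bool:
    """Return True when the index cannot be used until it is rebuilt."""
    try:
        check_metapage(meta)
    except WrongHashVersion:
        return True
    return False
```

With a strict equality check like this, no conversion code and no backward-compatibility branch is needed: an old index is never read, only rejected.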



Re: bumping HASH_VERSION to 3

From
Magnus Hagander
Date:
On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Starting a new thread about this to get more visibility.
>
> Despite the extensive work that has been done on hash indexes this
> release, we have thus far not made any change to the on-disk format
> that is not nominally backward-compatible.  Commit
> 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
> hash indexes, but included backward-compatibility code so that old
> indexes would continue to work.  However, I'd like to also commit
> Mithun Cy's patch to expand hash indexes more gradually -- latest
> version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
> -- and that's not backward-compatible.
>
> It would be possible to write code to convert the old metapage format
> to the new metapage format introduced by that patch, and it wouldn't
> be very hard, but I think it would be better to NOT do that, and
> instead force everybody upgrading to v10 to rebuild all of their hash
> indexes.   If we don't do that, then we'll never know whether
> instances of hash index corruption reported against v10 or higher are
> caused by defects in the new code, because there's always the chance
> that the hash index could have been built on a pre-v10 version, got
> corrupted because of the lack of WAL-logging, and then been brought up
> to v10+ via pg_upgrade.  Forcing a reindex in v10 kills three birds
> with one stone:
>
> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.
>
> Thoughts?

Given the state of hash indexes in <= 9.6, I think this is a reasonable tradeoff. Most people won't be using them at all today. Those that do will have to "pay" with a REINDEX on upgrade. I think the benefits definitely outweigh the cost.

So +1 for doing it. 


Re: bumping HASH_VERSION to 3

From
Jesper Pedersen
Date:
On 03/31/2017 02:17 PM, Robert Haas wrote:
> Starting a new thread about this to get more visibility.
>
> Despite the extensive work that has been done on hash indexes this
> release, we have thus far not made any change to the on-disk format
> that is not nominally backward-compatible.  Commit
> 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
> hash indexes, but included backward-compatibility code so that old
> indexes would continue to work.  However, I'd like to also commit
> Mithun Cy's patch to expand hash indexes more gradually -- latest
> version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
> -- and that's not backward-compatible.
>
> It would be possible to write code to convert the old metapage format
> to the new metapage format introduced by that patch, and it wouldn't
> be very hard, but I think it would be better to NOT do that, and
> instead force everybody upgrading to v10 to rebuild all of their hash
> indexes.   If we don't do that, then we'll never know whether
> instances of hash index corruption reported against v10 or higher are
> caused by defects in the new code, because there's always the chance
> that the hash index could have been built on a pre-v10 version, got
> corrupted because of the lack of WAL-logging, and then been brought up
> to v10+ via pg_upgrade.  Forcing a reindex in v10 kills three birds
> with one stone:
>
> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.
>
> Thoughts?
>

+1

Best regards, Jesper




Re: bumping HASH_VERSION to 3

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Forcing a reindex in v10 kills three birds
> with one stone:

> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.

> Thoughts?

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes.  I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards.  Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands.  (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed.  I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state.  (If not, it's
probably not worth the trouble.)

A variant on that might just be to not transfer over hash indexes,
leaving the user to CREATE them rather than REINDEX them.
        regards, tom lane
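
The first bonus item above, a generated script of REINDEX commands, could look roughly like the following sketch. It is not pg_upgrade code: `FIND_HASH_INDEXES`, `quote_ident`, and `reindex_script` are hypothetical names, and the catalog query is an assumption about how one might list hash indexes (joining `pg_class` to `pg_am` and `pg_namespace`). The function only formats statements from (schema, index) pairs that the caller has already fetched.

```python
# Illustrative catalog query (an assumption, not taken from pg_upgrade):
# list every index whose access method is "hash".
FIND_HASH_INDEXES = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE a.amname = 'hash' AND c.relkind = 'i'
"""


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_script(hash_indexes) -> str:
    """Return one REINDEX INDEX statement per (schema, index) pair."""
    lines = [
        f"REINDEX INDEX {quote_ident(schema)}.{quote_ident(index)};"
        for schema, index in hash_indexes
    ]
    # End with a newline only when there is something to run.
    return "\n".join(lines) + ("\n" if lines else "")
```

Fed the rows returned by the query in each database, the output could be saved to a file and run with psql after the upgraded cluster starts, which matches the minimum requirement stated above: the upgrade succeeds, and the user rebuilds the indexes afterwards.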



Re: bumping HASH_VERSION to 3

From
Joe Conway
Date:
On 03/31/2017 11:19 AM, Magnus Hagander wrote:
> On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>     Starting a new thread about this to get more visibility.
>
>     Despite the extensive work that has been done on hash indexes this
>     release, we have thus far not made any change to the on-disk format
>     that is not nominally backward-compatible.  Commit
>     293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
>     hash indexes, but included backward-compatibility code so that old
>     indexes would continue to work.  However, I'd like to also commit
>     Mithun Cy's patch to expand hash indexes more gradually -- latest
>     version in
>     http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
>     -- and that's not backward-compatible.
>
>     It would be possible to write code to convert the old metapage format
>     to the new metapage format introduced by that patch, and it wouldn't
>     be very hard, but I think it would be better to NOT do that, and
>     instead force everybody upgrading to v10 to rebuild all of their hash
>     indexes.   If we don't do that, then we'll never know whether
>     instances of hash index corruption reported against v10 or higher are
>     caused by defects in the new code, because there's always the chance
>     that the hash index could have been built on a pre-v10 version, got
>     corrupted because of the lack of WAL-logging, and then been brought up
>     to v10+ via pg_upgrade.  Forcing a reindex in v10 kills three birds
>     with one stone:
>
>     - No old, not logged, possibly corrupt hash indexes floating around
>     after an upgrade to v10.
>     - Can remove the backward-compatibility code added by
>     293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
>     forever.
>     - No need to worry about doing an in-place upgrade of the metapage for
>     the above-mentioned patch.
>
>     Thoughts?
>
>
> Given the state of hash indexes in <= 9.6, I think this is a reasonable
> tradeoff. Most people won't be using them at all today. Those that do
> will have to "pay" with a REINDEX on upgrade. I think the benefits
> definitely outweigh the cost.
>
> So +1 for doing it.

+1


--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development