Discussion: bumping HASH_VERSION to 3
Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this release, we have thus far not made any change to the on-disk format that is not nominally backward-compatible. Commit 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new hash indexes, but included backward-compatibility code so that old indexes would continue to work. However, I'd like to also commit Mithun Cy's patch to expand hash indexes more gradually -- latest version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com -- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format to the new metapage format introduced by that patch, and it wouldn't be very hard, but I think it would be better to NOT do that, and instead force everybody upgrading to v10 to rebuild all of their hash indexes. If we don't do that, then we'll never know whether instances of hash index corruption reported against v10 or higher are caused by defects in the new code, because there's always the chance that the hash index could have been built on a pre-v10 version, got corrupted because of the lack of WAL-logging, and then been brought up to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds with one stone:

- No old, not logged, possibly corrupt hash indexes floating around after an upgrade to v10.
- Can remove the backward-compatibility code added by 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around forever.
- No need to worry about doing an in-place upgrade of the metapage for the above-mentioned patch.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Starting a new thread about this to get more visibility.
>
> Despite the extensive work that has been done on hash indexes this
> release, we have thus far not made any change to the on-disk format
> that is not nominally backward-compatible. Commit
> 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
> hash indexes, but included backward-compatibility code so that old
> indexes would continue to work. However, I'd like to also commit
> Mithun Cy's patch to expand hash indexes more gradually -- latest
> version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
> -- and that's not backward-compatible.
>
> It would be possible to write code to convert the old metapage format
> to the new metapage format introduced by that patch, and it wouldn't
> be very hard, but I think it would be better to NOT do that, and
> instead force everybody upgrading to v10 to rebuild all of their hash
> indexes. If we don't do that, then we'll never know whether
> instances of hash index corruption reported against v10 or higher are
> caused by defects in the new code, because there's always the chance
> that the hash index could have been built on a pre-v10 version, got
> corrupted because of the lack of WAL-logging, and then been brought up
> to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
> with one stone:
>
> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.
>
> Thoughts?

Given the state of hash indexes in <= 9.6, I think this is a reasonable tradeoff. Most people won't be using them at all today. Those that do will have to "pay" with a REINDEX on upgrade. I think the benefits definitely outweigh the cost.
So +1 for doing it.

On 03/31/2017 02:17 PM, Robert Haas wrote:
> Starting a new thread about this to get more visibility.
>
> Despite the extensive work that has been done on hash indexes this
> release, we have thus far not made any change to the on-disk format
> that is not nominally backward-compatible. Commit
> 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
> hash indexes, but included backward-compatibility code so that old
> indexes would continue to work. However, I'd like to also commit
> Mithun Cy's patch to expand hash indexes more gradually -- latest
> version in http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
> -- and that's not backward-compatible.
>
> It would be possible to write code to convert the old metapage format
> to the new metapage format introduced by that patch, and it wouldn't
> be very hard, but I think it would be better to NOT do that, and
> instead force everybody upgrading to v10 to rebuild all of their hash
> indexes. If we don't do that, then we'll never know whether
> instances of hash index corruption reported against v10 or higher are
> caused by defects in the new code, because there's always the chance
> that the hash index could have been built on a pre-v10 version, got
> corrupted because of the lack of WAL-logging, and then been brought up
> to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
> with one stone:
>
> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.
>
> Thoughts?
>

+1

Best regards,
Jesper

Robert Haas <robertmhaas@gmail.com> writes:
> Forcing a reindex in v10 kills three birds
> with one stone:
> - No old, not logged, possibly corrupt hash indexes floating around
> after an upgrade to v10.
> - Can remove the backward-compatibility code added by
> 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> forever.
> - No need to worry about doing an in-place upgrade of the metapage for
> the above-mentioned patch.
> Thoughts?

+1, as long as we're clear on what will happen when pg_upgrade'ing an installation containing hash indexes. I think a minimum requirement is that it succeed and be able to start up, and allow the user to manually REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX commands. (I think it's produced scripts for similar requirements in the past.)

2. marking the index invalid so that the system would silently ignore it until it's been reindexed. I think there might be adequate infrastructure for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter of getting pg_upgrade to hack the indexes' catalog state. (If not, it's probably not worth the trouble.)

A variant on that might just be to not transfer over hash indexes, leaving the user to CREATE them rather than REINDEX them.

regards, tom lane

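[Editorial illustration, not part of the original message.] The first bonus item above, a generated script of REINDEX commands, could look roughly like the following. This is only a sketch under stated assumptions: it is written in Python rather than as part of pg_upgrade, the names `HASH_INDEX_QUERY`, `quote_ident`, and `build_reindex_script` are hypothetical, and it assumes the caller has already run the catalog query in each database and collected the (schema, index) rows.

```python
# Hypothetical helper: none of these names come from pg_upgrade itself.

# Catalog query that lists hash indexes. It is meant to be run in each
# database of the upgraded cluster; the resulting (schema, index) rows are
# what build_reindex_script() expects.
HASH_INDEX_QUERY = """
SELECT n.nspname, c.relname
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_am a ON a.oid = c.relam
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'i' AND a.amname = 'hash'
ORDER BY 1, 2;
"""


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_reindex_script(indexes) -> str:
    """Return one REINDEX INDEX statement per (schema, index) pair."""
    lines = [
        f"REINDEX INDEX {quote_ident(schema)}.{quote_ident(index)};"
        for schema, index in indexes
    ]
    # End with a newline only when there is something to run.
    return "\n".join(lines) + ("\n" if lines else "")


if __name__ == "__main__":
    # Example rows, as the catalog query might return them.
    print(build_reindex_script([("public", "users_email_hash")]), end="")
```

The output would be saved to a file and fed to psql after the upgrade. Every identifier is quoted unconditionally so that mixed-case or otherwise unusual index names still round-trip.
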
On 03/31/2017 11:19 AM, Magnus Hagander wrote:
> On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> > Starting a new thread about this to get more visibility.
> >
> > Despite the extensive work that has been done on hash indexes this
> > release, we have thus far not made any change to the on-disk format
> > that is not nominally backward-compatible. Commit
> > 293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
> > hash indexes, but included backward-compatibility code so that old
> > indexes would continue to work. However, I'd like to also commit
> > Mithun Cy's patch to expand hash indexes more gradually -- latest
> > version in
> > http://postgr.es/m/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
> > -- and that's not backward-compatible.
> >
> > It would be possible to write code to convert the old metapage format
> > to the new metapage format introduced by that patch, and it wouldn't
> > be very hard, but I think it would be better to NOT do that, and
> > instead force everybody upgrading to v10 to rebuild all of their hash
> > indexes. If we don't do that, then we'll never know whether
> > instances of hash index corruption reported against v10 or higher are
> > caused by defects in the new code, because there's always the chance
> > that the hash index could have been built on a pre-v10 version, got
> > corrupted because of the lack of WAL-logging, and then been brought up
> > to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
> > with one stone:
> >
> > - No old, not logged, possibly corrupt hash indexes floating around
> > after an upgrade to v10.
> > - Can remove the backward-compatibility code added by
> > 293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
> > forever.
> > - No need to worry about doing an in-place upgrade of the metapage for
> > the above-mentioned patch.
> >
> > Thoughts?
>
> Given the state of hash indexes in <= 9.6, I think this is a reasonable
> tradeoff. Most people won't be using them at all today. Those that do
> will have to "pay" with a REINDEX on upgrade. I think the benefits
> definitely outweigh the cost.
>
> So +1 for doing it.

+1

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development