Обсуждение: Finding database for pg_upgrade missing library

Поиск
Список
Период
Сортировка

Finding database for pg_upgrade missing library

От
Bruce Momjian
Дата:
I received a private pg_upgrade feature request to report the database
name for missing loadable libraries.  Currently we report "could not
load library" and the library file name, e.g. $libdir/pgpool-regclass.

The request is that we report the _database_ name that contained the
loadable library reference.  However, that isn't easy to do because we
gather all loadable library file names, sort them, and remove
duplicates, for reasons of efficiency and so we check libraries in a
predictable alphabetical order.

Is it worth modifying pg_upgrade to report the first or all databases
that contain references to missing loadable libraries?  I don't think
so, but I wanted to ask here.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Finding database for pg_upgrade missing library

От
Daniel Gustafsson
Дата:
> On 13 Jul 2018, at 18:28, Bruce Momjian <bruce@momjian.us> wrote:
>
> I received a private pg_upgrade feature request to report the database
> name for missing loadable libraries.  Currently we report "could not
> load library" and the library file name, e.g. $libdir/pgpool-regclass.
>
> The request is that we report the _database_ name that contained the
> loadable library reference.  However, that isn't easy to do because we
> gather all loadable library file names, sort them, and remove
> duplicates, for reasons of efficiency and so we check libraries in a
> predictable alphabetical order.
>
> Is it worth modifying pg_upgrade to report the first or all databases
> that contain references to missing loadable libraries?  I don't think
> so, but I wanted to ask here.

I ran into that very problem when upgrading 50+ clusters during a long night a
while back, and patched pg_upgrade to report the database which helped a lot
(or at least helped me be a bit lazy).  So, +1 on the feature from me.  If
there is interest I can see if I can dig out the patch and polish it up.

cheers ./daniel

Re: Finding database for pg_upgrade missing library

От
Justin T Pryzby
Дата:
On Fri, Jul 13, 2018 at 12:28:15PM -0400, Bruce Momjian wrote:
> I received a private pg_upgrade feature request to report the database
> name for missing loadable libraries.  Currently we report "could not
> load library" and the library file name, e.g. $libdir/pgpool-regclass.
> 
> The request is that we report the _database_ name that contained the
> loadable library reference.  However, that isn't easy to do because we
> gather all loadable library file names, sort them, and remove
> duplicates, for reasons of efficiency and so we check libraries in a
> predictable alphabetical order.
> 
> Is it worth modifying pg_upgrade to report the first or all databases
> that contain references to missing loadable libraries?  I don't think
> so, but I wanted to ask here.

Yes please, with a preference for the "all databases" option.

We typically have only 4 DBs, including postgres and template1,2.  It's
annoying enough when an upgrade process breaks because pg_repack or
pg_stat_buffercache installed into postgres DB.  But it's a veritable pain when
you discover in the middle of an upgrade that postgis had been somehow loaded
into template1, needs to be uninstalled (or upgraded from 22 to 23 to allow
upgrade), old postgis package was already removed..  Maybe you find that one
library was installed one place, fix it and restart the upgrade process.  Then
it fails because the old library was also installed some other place..

When I've had to figure this out in the past, I ended up grepping the dumps to
figure out what old library was where.

I have these comments to myself from the last time I had to figure out what
[(database, [missing library,...]), ...] were involved, probably last
September.

# time /usr/pgsql-9.6/bin/pg_dump --schema-only --quote-all-identifiers --binary-upgrade --format=custom dbname=ts
|grep-a postgis- |grep -wv postgis-2.2
 
# [pryzbyj@database ~]$ time sudo sh -c 'for f in /var/lib/pgsql/pg_upgrade_dump_*.custom; do x=`pg_restore "$f" |grep
"postgis-2.[^4]"`&& echo "$f $x"; done'
 

Thanks for considering,
Justin


Re: Finding database for pg_upgrade missing library

От
Bruce Momjian
Дата:
On Sat, Jul 14, 2018 at 12:14:51AM +0200, Daniel Gustafsson wrote:
> > On 13 Jul 2018, at 18:28, Bruce Momjian <bruce@momjian.us> wrote:
> > 
> > I received a private pg_upgrade feature request to report the database
> > name for missing loadable libraries.  Currently we report "could not
> > load library" and the library file name, e.g. $libdir/pgpool-regclass.
> > 
> > The request is that we report the _database_ name that contained the
> > loadable library reference.  However, that isn't easy to do because we
> > gather all loadable library file names, sort them, and remove
> > duplicates, for reasons of efficiency and so we check libraries in a
> > predictable alphabetical order.
> > 
> > Is it worth modifying pg_upgrade to report the first or all databases
> > that contain references to missing loadable libraries?  I don't think
> > so, but I wanted to ask here.
> 
> I ran into that very problem when upgrading 50+ clusters during a long night a
> while back, and patched pg_upgrade to report the database which helped a lot
> (or at least helped me be a bit lazy).  So, +1 on the feature from me.  If
> there is interest I can see if I can dig out the patch and polish it up.

Yes, please post the patch.  Seems we now have three people who want
this.  Even though it is related to reporting errors, I think this is a
new feature so will only be in PG 12.

Looking at the code, we do a qsort(), then detect (since they are all
now adjacent) and remove the duplicate references to the library.  What
I think should be done is to move the duplicate detection down to the
place where we check for the library, then print all the database names
of the duplicates if we don't find the library.  I guess we either need
a separate array for the database name, or a 'struct' that holds the
library name and database name.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Finding database for pg_upgrade missing library

От
Bruce Momjian
Дата:
On Fri, Jul 13, 2018 at 10:15:34PM -0500, Justin T Pryzby wrote:
> On Fri, Jul 13, 2018 at 12:28:15PM -0400, Bruce Momjian wrote:
> > I received a private pg_upgrade feature request to report the database
> > name for missing loadable libraries.  Currently we report "could not
> > load library" and the library file name, e.g. $libdir/pgpool-regclass.
> > 
> > The request is that we report the _database_ name that contained the
> > loadable library reference.  However, that isn't easy to do because we
> > gather all loadable library file names, sort them, and remove
> > duplicates, for reasons of efficiency and so we check libraries in a
> > predictable alphabetical order.
> > 
> > Is it worth modifying pg_upgrade to report the first or all databases
> > that contain references to missing loadable libraries?  I don't think
> > so, but I wanted to ask here.
> 
> Yes please, with a preference for the "all databases" option.
> 
> We typically have only 4 DBs, including postgres and template1,2.  It's
> annoying enough when an upgrade process breaks because pg_repack or
> pg_stat_buffercache installed into postgres DB.  But it's a veritable pain when
> you discover in the middle of an upgrade that postgis had been somehow loaded
> into template1, needs to be uninstalled (or upgraded from 22 to 23 to allow
> upgrade), old postgis package was already removed..  Maybe you find that one
> library was installed one place, fix it and restart the upgrade process.  Then
> it fails because the old library was also installed some other place..
> 
> When I've had to figure this out in the past, I ended up grepping the dumps to
> figure out what old library was where.

OK, we now have three people who want this so we will get it into PG 12.
Thanks.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Finding database for pg_upgrade missing library

От
Daniel Gustafsson
Дата:
> On 16 Jul 2018, at 14:13, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Sat, Jul 14, 2018 at 12:14:51AM +0200, Daniel Gustafsson wrote:
>>> On 13 Jul 2018, at 18:28, Bruce Momjian <bruce@momjian.us> wrote:
>>>
>>> I received a private pg_upgrade feature request to report the database
>>> name for missing loadable libraries.  Currently we report "could not
>>> load library" and the library file name, e.g. $libdir/pgpool-regclass.
>>>
>>> The request is that we report the _database_ name that contained the
>>> loadable library reference.  However, that isn't easy to do because we
>>> gather all loadable library file names, sort them, and remove
>>> duplicates, for reasons of efficiency and so we check libraries in a
>>> predictable alphabetical order.
>>>
>>> Is it worth modifying pg_upgrade to report the first or all databases
>>> that contain references to missing loadable libraries?  I don't think
>>> so, but I wanted to ask here.
>>
>> I ran into that very problem when upgrading 50+ clusters during a long night a
>> while back, and patched pg_upgrade to report the database which helped a lot
>> (or at least helped me be a bit lazy).  So, +1 on the feature from me.  If
>> there is interest I can see if I can dig out the patch and polish it up.
>
> Yes, please post the patch.  Seems we now have three people who want
> this.  Even though it is related to reporting errors, I think this is a
> new feature so will only be in PG 12.

Completely agree.

Sorry for the delay in posting the this.  I dug out the code I used when
upgrading my clusters, rebased and adapted the patch slightly and it seems to
work as expected.

> Looking at the code, we do a qsort(), then detect (since they are all
> now adjacent) and remove the duplicate references to the library.  What
> I think should be done is to move the duplicate detection down to the
> place where we check for the library, then print all the database names
> of the duplicates if we don't find the library.  I guess we either need
> a separate array for the database name, or a 'struct' that holds the
> library name and database name.

I went for the solution of not removing the duplicates in the first pass, and
instead dedup during reporting.  It’s sort of unelegant in that it consumes
more memory than strictly limited, but the simplicity of the code weighs up
IMO.  An additional step to free the libraries struct member before continuing
could perhaps be worth it?

cheers ./daniel


Вложения

Re: Finding database for pg_upgrade missing library

От
Bruce Momjian
Дата:
On Tue, Jul 24, 2018 at 10:33:48AM +0200, Daniel Gustafsson wrote:
> > Looking at the code, we do a qsort(), then detect (since they are all
> > now adjacent) and remove the duplicate references to the library.  What
> > I think should be done is to move the duplicate detection down to the
> > place where we check for the library, then print all the database names
> > of the duplicates if we don't find the library.  I guess we either need
> > a separate array for the database name, or a 'struct' that holds the
> > library name and database name.
> 
> I went for the solution of not removing the duplicates in the first pass, and
> instead dedup during reporting.  It’s sort of unelegant in that it consumes
> more memory than strictly limited, but the simplicity of the code weighs up
> IMO.  An additional step to free the libraries struct member before continuing
> could perhaps be worth it?

I have applied a patch to implement this for PG 12:

       https://git.postgresql.org/pg/commitdiff/60e3bd1d7f92430b24b710ecf0559656eb8ed499

I looked at your patch and made some adjustments:

        sort by library name, then database id (this last part was
        missing)
        your 'continue' block repeatedly probed the same library name if
                the library was found
        restructured the loop to report the database name at the end of
                the loop, not in a 'continue'
        report the database name in a format like other pg_upgrade error
                text files

Thanks.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +