Discussion: Upgrade from PG12 to PG


Upgrade from PG12 to PG

From
Jef Mortelle
Date:
Hello,

I want to upgrade from PG12 to PG15.

I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432

It takes a lot of time, more than 6 hours.

The log file has a lot of lines like: pg_restore: executing BLOB 1

Looking at the dump file: many lines like SELECT
pg_catalog.lo_unlink('100000');


I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p 
5431 -P 5432 -k


What's going on?


Kind regards




Re: Upgrade from PG12 to PG

From
Ilya Kosmodemiansky
Date:
Hi Jef,


On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
> Looking at the dump file: many lines like SELECT
> pg_catalog.lo_unlink('100000');
>
>
> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
> 5431 -P 5432 -k
>
>
> What's going on?

pg_upgrade is known to be problematic with large objects.
Please take a look here to start with:
https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
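
As a first check, it can help to count how many large objects the old cluster actually holds; a rough sketch, with the database name and port as placeholders:

# Each row in pg_largeobject_metadata is one large object that the
# binary-upgrade dump has to recreate in the new cluster.
psql -p 5431 -d yourdb -c "SELECT count(*) FROM pg_largeobject_metadata;"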

>
>
> Kind regards
>
>
>



Re: Upgrade from PG12 to PG

From
"Edward J. Sabol"
Date:
On Jul 20, 2023, at 7:23 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
> I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432
>
> It takes a lot of time, more than 6 hours.

I'm in the middle of a similar database migration myself, but I'm coming from 10.23.

Have you considered adding the --link and --jobs=NN (where NN is the number of CPU cores on your server minus 2 or something like that) to your pg_upgrade command? I wonder if that would speed things up significantly.

Also, if your file system supports reflinks (Linux kernel 4.5 and XFS with reflink support activated or Btrfs or macOS APFS), then use --clone instead of --link.
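
A rough sketch of that combination, assuming the binaries and ports mentioned earlier in the thread (the data directories are placeholders):

# Hard-link data files instead of copying them, and parallelize per-database steps.
/usr/lib/postgresql15/bin/pg_upgrade \
    --old-bindir=/usr/lib/postgresql12/bin \
    --new-bindir=/usr/lib/postgresql15/bin \
    --old-datadir=/path/to/old/data \
    --new-datadir=/path/to/new/data \
    --old-port=5431 --new-port=5432 \
    --link --jobs=8 --verbose

# On a reflink-capable file system, swap --link for --clone.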

Good luck,
Ed




Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:
Hi,

Already tried to use --link and --jobs, but you cannot omit the
"SELECT lo_unlink ..." statements that the pg_* programs create in the
export/dump file for every row containing datatype text in your database.
These SELECTs take a lot of time, hours ...

With pg_dumpall, the size of the dump file is much larger than the size of
the exported database.

Kind regards


On 20/07/2023 15:29, Edward J. Sabol wrote:
> On Jul 20, 2023, at 7:23 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>> I use /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432
>>
>> It takes a lot of time, more than 6 hours.
> I'm in the middle of a similar database migration myself, but I'm coming from 10.23.
>
> Have you considered adding the --link and --jobs=NN (where NN is the number of CPU cores on your server minus 2 or
> something like that) to your pg_upgrade command? I wonder if that would speed things up significantly.
>
> Also, if your file system supports reflinks (Linux kernel 4.5 and XFS with reflink support activated or Btrfs or
> macOS APFS), then use --clone instead of --link.
>
> Good luck,
> Ed
>



Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:
Hi,

Many thanks for your answer.

So: not possible to have very little downtime if you have a database
with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours
for 24 million rows in pg_largeobject.

Testing now with pg_dumpall and pg_restore ...


I think PostgreSQL should treat resolving this problem as a high priority.

I have to make a choice in the near future: Postgres or Oracle, and that
database would have a lot of the text datatype.
The database would be about 1 TB.
It seems to me a little bit tricky/dangerous to use Postgres, just in terms of
being able to upgrade to a newer version.

Kind regards.

On 20/07/2023 13:43, Ilya Kosmodemiansky wrote:
> Hi Jef,
>
>
> On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
>> Looking at the dump file: many lines like SELECT
>> pg_catalog.lo_unlink('100000');
>>
>>
>> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
>> 5431 -P 5432 -k
>>
>>
>> What's going on?
> pg_upgrade is known to be problematic with large objects.
> Please take a look here to start with:
> https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
>
>>
>> Kind regards
>>
>>
>>



Re: Upgrade from PG12 to PG

From
Laurenz Albe
Date:
On Thu, 2023-07-20 at 15:46 +0200, Jef Mortelle wrote:
> So: not possible to have very little downtime if you have a database
> with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours
> for 24 million rows in pg_largeobject.

To clarify: "text" is no problem at all.  Large objects are.
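
One way to see where large objects actually come from is to list user-table columns of type oid, which is how they are normally referenced; a sketch (the lo domain from the contrib lo module would need to be added to the filter if that is in use):

# List user-table columns declared as oid -- these usually hold references
# into pg_largeobject, unlike text/varchar columns, which are stored inline.
psql -d yourdb <<'SQL'
SELECT n.nspname, c.relname, a.attname
FROM pg_attribute a
JOIN pg_class c     ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE a.atttypid = 'oid'::regtype
  AND a.attnum > 0
  AND NOT a.attisdropped
  AND c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');
SQL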

Yours,
Laurenz Albe



Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:

Yes,
But the text datatype allows you to store more data,


8.3. Character Types

Table 8.4. Character Types

Name                               Description
character varying(n), varchar(n)   variable-length with limit
character(n), char(n)              fixed-length, blank padded
text                               variable unlimited length

On 20/07/2023 16:46, Laurenz Albe wrote:
> On Thu, 2023-07-20 at 15:46 +0200, Jef Mortelle wrote:
>> So: not possible to have very little downtime if you have a database
>> with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours
>> for 24 million rows in pg_largeobject.
> To clarify: "text" is no problem at all.  Large objects are.
>
> Yours,
> Laurenz Albe

Re: Upgrade from PG12 to PG

From
Scott Ribe
Date:
> On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>
> So: not possible to have very little downtime if you have a database with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours for 24 million rows in pg_largeobject.

We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and text are not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQL calls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partial reads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with references to large objects all deleted, but the orphaned large objects still in the database.

First thing you should do: run vacuumlo -n to find out if you have orphaned large objects. If so, start cleaning those up, then see how long pg_upgrade takes.
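
A rough sketch of that check (connection options and database name are placeholders):

# Dry run: report how many orphaned large objects would be removed,
# without deleting anything.
vacuumlo -n -v yourdb

# If the count is large, remove them for real; the space inside
# pg_largeobject is reclaimed by a later (auto)vacuum.
vacuumlo -v yourdb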

Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours.

> Already tried to use --link and --jobs, but you cannot omit the "SELECT lo_unlink ..." for every row containing datatype text in your database that the pg_* programs create in the export/dump file.

Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.


Re: Upgrade from PG12 to PG

From
Scott Ribe
Date:
> But the text datatype allows you to store more data,

not true

varchar without length specified and text both have an upper limit around 1GB




Re: Upgrade from PG12 to PG

From
Ron
Date:
Don't use pg_dumpall.  Use this instead:

pg_dump --format=directory --jobs=X --verbose
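
A minimal sketch of that approach (database name, paths, ports and job count are placeholders; cluster-wide objects such as roles still need pg_dumpall --globals-only):

# Parallel dump of one database into a directory-format archive ...
pg_dump --port=5431 --format=directory --jobs=8 --verbose \
    --file=/pg/dump/yourdb yourdb

# ... plus roles/tablespaces, which pg_dump does not include ...
pg_dumpall --port=5431 --globals-only --file=/pg/dump/globals.sql

# ... then restore into the new cluster, also in parallel.
psql --port=5432 -f /pg/dump/globals.sql postgres
pg_restore --port=5432 --jobs=8 --verbose --create --dbname=postgres /pg/dump/yourdb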

On 7/20/23 08:46, Jef Mortelle wrote:
> Hi,
>
> Many thanks for your answer.
>
> So: not possible to have very little downtime if you have a database with
> a lot of rows containing text as datatype, as pg_upgrade needs 12 hours for 24
> million rows in pg_largeobject.
>
> Testing now with pg_dumpall and pg_restore ...
>
>
> I think PostgreSQL should treat resolving this problem as a high priority.
>
> I have to make a choice in the near future: Postgres or Oracle, and that
> database would have a lot of the text datatype.
> The database would be about 1 TB.
> It seems to me a little bit tricky/dangerous to use Postgres, just in terms of
> being able to upgrade to a newer version.
>
> Kind regards.
>
> On 20/07/2023 13:43, Ilya Kosmodemiansky wrote:
>> Hi Jef,
>>
>>
>> On Thu, Jul 20, 2023 at 1:23 PM Jef Mortelle <jefmortelle@gmail.com> wrote:
>>> Looking at the dump file: many lines like SELECT
>>> pg_catalog.lo_unlink('100000');
>>>
>>>
>>> I have the same issue with /usr/lib/postgresql15/bin/pg_upgrade -v -p
>>> 5431 -P 5432 -k
>>>
>>>
>>> What's going on?
>> pg_upgrade is known to be problematic with large objects.
>> Please take a look here to start with:
>> https://www.postgresql.org/message-id/20210309200819.GO2021%40telsasoft.com
>>
>>>
>>> Kind regards
>>>
>>>
>>>
>
>

-- 
Born in Arizona, moved to Babylonia.



Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:
running /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 -k

gives you in the output:

...
"/usr/lib/postgresql15/bin/pg_dump" --host /pg/PG15 --port 5431 
--username postgres --schema-only --quote-all-identifiers 
--binary-upgrade --format=custom --verbose 
--file="/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" 
'dbname=dbname' >> 
"/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 
2>&1
...


so, yes, pg_upgrade starts a pg_dump session,


and also a pg_restore at the end, which runs for about 7 hours:

"/usr/lib/postgresql15/bin/pg_restore" --host /pg/PG15 --port 5432 
--username postgres --create --exit-on-error --verbose --dbname 
template1 
"/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" 
 >> 
"/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 
2>&1

with a lot of lines in the log files like:  pg_restore: executing BLOB 11108809
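
To confirm that the restore time really goes into large objects, the BLOB entries in that log can be counted and compared with the number of large objects in the old cluster; a sketch reusing the paths from the output above ('dbname' is the placeholder pg_upgrade printed):

# Count the per-large-object entries pg_restore wrote to the log ...
grep -c "executing BLOB" \
    /pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log

# ... and compare with the large-object count in the old cluster.
psql -p 5431 -d dbname -c "SELECT count(*) FROM pg_largeobject_metadata;"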


The server is a VM: it has 64 GB RAM, runs SUSE SLES, and is attached to a SAN
with SSD disks (HP 3PAR).

2) vacuumlo

vacuumlo -n => I have vacuumlo; running this:

vacuumlo dbname -v

postgres@dbfluxd02:/pg/data> vacuumlo  dflux -v
Connected to database "dflux"
Checking message_oid in fluxnaf.flux_message_content
Checking payload_oid in fluxnaf.bridge_payload_content
Successfully removed 0 large objects from database "dflux".

and again the same issues



On 20/07/2023 16:51, Scott Ribe wrote:
>> On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>>
>> So: not possible to have very little downtime if you have a database with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours for 24 million rows in pg_largeobject.
> We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and text are not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQL calls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partial reads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with references to large objects all deleted, but the orphaned large objects still in the database.
>
> First thing you should do: run vacuumlo -n to find out if you have orphaned large objects. If so, start cleaning those up, then see how long pg_upgrade takes.
>
> Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours.
>
>> Already tried to use --link and --jobs, but you cannot omit the "SELECT lo_unlink ..." for every row containing datatype text in your database that the pg_* programs create in the export/dump file.
> Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.



Re: Upgrade from PG12 to PG

From
Ron
Date:
Jef,

It does a "--schema-only" dump.  The bigger the schema, the longer that "pg_dump --schema-only" takes.

Note also that there's a known issue with pg_upgrade and millions of Large Objects (not bytea or text, but lo_* columns).

On 7/20/23 12:05, Jef Mortelle wrote:
running /usr/lib/postgresql15/bin/pg_upgrade -v -p 5431 -P 5432 -k

gives you in the output :

...
"/usr/lib/postgresql15/bin/pg_dump" --host /pg/PG15 --port 5431 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom --verbose --file="/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" 'dbname=dbname' >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1
...


so, yes, pg_upgrade starts a pg_dump session,


and also a pg_restore at the end, which runs for about 7 hours:

"/usr/lib/postgresql15/bin/pg_restore" --host /pg/PG15 --port 5432 --username postgres --create --exit-on-error --verbose --dbname template1 "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/dump/pg_upgrade_dump_16385.custom" >> "/pg/PG15/system/pg_upgrade_output.d/20230720T184333.610/log/pg_upgrade_dump_16385.log" 2>&1

with a lot of lines in the log files like:  pg_restore: executing BLOB 11108809


The server is a VM: it has 64 GB RAM, runs SUSE SLES, and is attached to a SAN with SSD disks (HP 3PAR).

2) vacuumlo

vacuumlo -n => I have vacuumlo; running this:

vacuumlo  dbname  -v

postgres@dbfluxd02:/pg/data> vacuumlo  dflux -v
Connected to database "dflux"
Checking message_oid in fluxnaf.flux_message_content
Checking payload_oid in fluxnaf.bridge_payload_content
Successfully removed 0 large objects from database "dflux".

and again the same issues



On 20/07/2023 16:51, Scott Ribe wrote:
On Jul 20, 2023, at 7:46 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:

So: not possible to have very little downtime if you have a database with a lot of rows containing text as datatype, as pg_upgrade needs 12 hours for 24 million rows in pg_largeobject.
We need to get terminology straight, as at the moment your posts are very confusing. In PostgreSQL large objects and text are not the same. Text is basically varchar without a specified length limit. Large object is a blob (but not what SQL calls a BLOB)--it is kind of like a file stored outside the normal table mechanism, and provides facilities for partial reads, etc: https://www.postgresql.org/docs/15/largeobjects.html. There are a number of ways to wind up with references to large objects all deleted, but the orphaned large objects still in the database.

First thing you should do: run vacuumlo -n to find out if you have orphaned large objects. If so, start cleaning those up, then see how long pg_upgrade takes.

Second, what's your hardware? I really don't see dump & restore of a 1TB database taking 6 hours.

Already tried to use --link and --jobs, but you cannot omit the "SELECT lo_unlink ..." for every row containing datatype text in your database that the pg_* programs create in the export/dump file.
Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.



--
Born in Arizona, moved to Babylonia.

Re: Upgrade from PG12 to PG

From
Scott Ribe
Date:
> On Jul 20, 2023, at 11:05 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>
> so, yes, pg_upgrade starts a pg_dump session,

Only for the schema, which you can see in the output you posted.

> The server is a VM: it has 64 GB RAM, runs SUSE SLES, and is attached to a SAN with SSD disks (HP 3PAR).

VM + SAN can perform well, or introduce all sorts of issues: busy neighbor, poor VM drivers, SAN only fast for large sequential writes, etc.

> On Jul 20, 2023, at 11:22 AM, Ron <ronljohnsonjr@gmail.com> wrote:
>
> Note also that there's a known issue with pg_upgrade and millions of Large Objects (not bytea or text, but lo_* columns).


Good to know, but it would be weird to have millions of large objects in a 1TB database. (Then again, I found an old post about 3M large objects taking 5.5GB...)

Try:
  time a run of that pg_dump command, then time a run of pg_restore of the schema-only dump
  time a file copy of the db to a location on the SAN -- the purpose is not to produce a usable backup, but rather to check IO throughput
  use the link option on pg_upgrade
(see the sketch below)
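
For instance, a rough sketch of those three timings (simplified: the schema-only dump here omits the --binary-upgrade flag, which is meant for pg_upgrade's internal use, and the paths are placeholders):

# 1. Time a schema-only dump of the old cluster ...
time pg_dump --port=5431 --username=postgres --schema-only \
    --format=custom --file=/tmp/schema.custom dbname

# 2. ... and a schema-only restore into the new cluster.
time pg_restore --port=5432 --username=postgres --create \
    --dbname=template1 /tmp/schema.custom

# 3. Time a raw copy of the data directory onto the SAN, purely to gauge
#    I/O throughput (remove the copy afterwards).
time cp -a /pg/data /pg/io_test_copy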

Searching on this subject turns up some posts about slow restore of large objects under much older versions of PG--not sure if any of it still applies.

Finally, given the earlier confusion between text and large objects, your apparent belief that text columns correlated to large objects, and that text could hold more data than varchar, it's worth asking: do you actually need large objects at all? (Is this even under your control?)


Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:


On 20/07/2023 16:51, Scott Ribe wrote:
>> Already tried to use --link and --jobs, but you cannot omit the "SELECT lo_unlink ..." for every row containing datatype text in your database that the pg_* programs create in the export/dump file.
> Terminology again, or are you conflating two different issues? pg_upgrade --link does not create a dump file.

Why does pg_dump/pg_restore do a "SELECT lo_unlink ..." for every row in pg_largeobject? This takes a lot of time.

Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:


On 20/07/2023 20:34, Scott Ribe wrote:
>> On Jul 20, 2023, at 11:05 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>>
>> so, yes, pg_upgrade starts a pg_dump session,
> Only for the schema, which you can see in the output you posted.

=> the pg_restore of this pg_dump takes about 7 hours ... which is 99% spent executing queries like:  SELECT pg_catalog.lo_unlink('oid');


> Good to know, but it would be weird to have millions of large objects in a 1TB database. (Then again, I found an old post about 3M large objects taking 5.5GB...)

> Try:  time a run of that pg_dump command, then time a run of pg_restore of the schema-only dump
=> pg_dump --schema-only, after a RAM upgrade from 8 GB to 64 GB (otherwise the query against pg_largeobject ends in an out-of-memory error), runs in about 3-4 minutes
=> pg_restore takes 7 hours, which is 99% spent executing queries like:  SELECT pg_catalog.lo_unlink('oid');
>   use the link option on pg_upgrade
I used the link option in all my tests, and it still takes that long.

For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database in my PG cluster), also when using the link option.
So after some upgrades, it ends up in a real mess of directories?

> Searching on this subject turns up some posts about slow restore of large objects under much older versions of PG--not sure if any of it still applies.

> Finally, given the earlier confusion between text and large objects, your apparent belief that text columns correlated to large objects, and that text could hold more data than varchar, it's worth asking: do you actually need large objects at all? (Is this even under your control?)
The use of OID (large objects) depends on the vendor of the software. I can ask the vendor to change to another type .... but honestly I don't believe it will be changed in the near future.
The database is 95 GB, so not that big ;-) but it has ~25 million large objects in it.
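
If the vendor were willing, a hedged sketch of what moving off large objects could look like, using the first table vacuumlo reported above (the new bytea column name message_data is made up for the example, and 25 million rows would realistically be converted in batches rather than one transaction):

# Convert a large-object (oid) column to an inline bytea column.
psql -d dflux <<'SQL'
BEGIN;
ALTER TABLE fluxnaf.flux_message_content ADD COLUMN message_data bytea;

-- lo_get() reads an entire large object as bytea.
UPDATE fluxnaf.flux_message_content
   SET message_data = lo_get(message_oid)
 WHERE message_oid IS NOT NULL;

-- Unlink the now-redundant large objects and drop the old reference column.
SELECT lo_unlink(message_oid)
  FROM fluxnaf.flux_message_content
 WHERE message_oid IS NOT NULL;
ALTER TABLE fluxnaf.flux_message_content DROP COLUMN message_oid;
COMMIT;
SQL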

Re: Upgrade from PG12 to PG

From
Scott Ribe
Date:
> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:

> For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database
> in my PG cluster), also when using the link option.
> So after some upgrades, it ends up in a real mess of directories?

At the end of pg_upgrade, you can start up the old version against the old directory, or the new version against the
new directory. (With --link, only until writing into the db, then you are committed to the running version.) Once you
are comfortable that everything is good with the new version, you should delete the old data. Alternatively, if there is
a problem forcing you back to the old version, you delete the new data.

> => pg_dump --schema-only, after a RAM upgrade from 8 GB to 64 GB (otherwise the query against pg_largeobject ends in
> an out-of-memory error), runs in about 3-4 minutes
> => pg_restore takes 7 hours, which is 99% spent executing queries like:  SELECT pg_catalog.lo_unlink('oid');

Given the tests you've run, it seems to me that it is doing something which it ought not when using --link.

> The database is 95 GB, so not that big ;-) but it has ~25 million large objects in it.

I suppose the use of large objects here is an artifact of support for other databases which have much lower limits on varchar column length.





Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:
Hello

This is the syntax:

export PGDATA=/pg/PG15/system
export PATH=/usr/lib/postgresql15/bin:/bin:/usr/bin:/usr/local/bin

export PGDATAOLD=/pg/data
export PGDATANEW=/pg/PG15/system
export PGBINOLD=/usr/lib/postgresql12/bin
export PGBINNEW=/usr/lib/postgresql15/bin

/usr/lib/postgresql15/bin/pg_upgrade -r -v -p 5431 -P 5432 -k -j 8

  -r =--link

Kind regards

On 24/07/2023 14:52, Scott Ribe wrote:
>> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> wrote:
>> For some reason Postgres creates a new subdirectory for each PG version (I make use of tablespaces for each database
>> in my PG cluster), also when using the link option.
>> So after some upgrades, it ends up in a real mess of directories?
> At the end of pg_upgrade, you can start up the old version against the old directory, or the new version against the
> new directory. (With --link, only until writing into the db, then you are committed to the running version.) Once you
> are comfortable that everything is good with the new version, you should delete the old data. Alternatively, if there is
> a problem forcing you back to the old version, you delete the new data.
>
>> => pg_dump --schema-only, after a RAM upgrade from 8 GB to 64 GB (otherwise the query against pg_largeobject ends in
>> an out-of-memory error), runs in about 3-4 minutes
>> => pg_restore takes 7 hours, which is 99% spent executing queries like:  SELECT pg_catalog.lo_unlink('oid');
> Given the tests you've run, it seems to me that it is doing something which it ought not when using --link.
>
>> The database is 95 GB, so not that big ;-) but it has ~25 million large objects in it.
> I suppose the use of large objects here is an artifact of support for other databases which have much lower limits on
> varchar column length.
>
>



Re: Upgrade from PG12 to PG

From
Jef Mortelle
Date:
correction: -k = --link


On 24/07/2023 14:59, Jef Mortelle wrote:
> Hello
>
> This is the syntax:
>
> export PGDATA=/pg/PG15/system
> export PATH=/usr/lib/postgresql15/bin:/bin:/usr/bin:/usr/local/bin
>
> export PGDATAOLD=/pg/data
> export PGDATANEW=/pg/PG15/system
> export PGBINOLD=/usr/lib/postgresql12/bin
> export PGBINNEW=/usr/lib/postgresql15/bin
>
> /usr/lib/postgresql15/bin/pg_upgrade -r -v -p 5431 -P 5432 -k -j 8
>
>  -r =--link
>
> Kind regards
>
> On 24/07/2023 14:52, Scott Ribe wrote:
>>> On Jul 24, 2023, at 12:38 AM, Jef Mortelle <jefmortelle@gmail.com> 
>>> wrote:
>>> For some reason Postgres creates a new subdirectory for each PG
>>> version (I make use of tablespaces for each database in my PG
>>> cluster), also when using the link option.
>>> So after some upgrades, it ends up in a real mess of directories?
>> At the end of pg_upgrade, you can start up the old version against 
>> the old directory, or the new version against the new directory. 
>> (With --link, only until writing into the db, then you are committed 
>> to the running version.) Once you are comfortable that everything is 
>> good with the new version, you should delete the old data. 
>> Alternatively, if there is a problem forcing you back to the old 
>> version, you delete the new data.
>>
>>> => pg_dump --schema-only, after a RAM upgrade from 8 GB to 64 GB
>>> (otherwise the query against pg_largeobject ends in an out-of-memory
>>> error), runs in about 3-4 minutes
>>> => pg_restore takes 7 hours, which is 99% spent executing the
>>> query like:  SELECT pg_catalog.lo_unlink('oid');
>> Given the tests you've run, it seems to me that it is doing something 
>> which it ought not when using --link.
>>
>>> The database is 95 GB, so not that big ;-) but it has ~25 million large
>>> objects in it.
>> I suppose the use of large objects here is an artifact of support for 
>> other databases which have much lower limits on varchar column length.
>>
>>