Обсуждение: pg_upgradecluster transfering only a portion of the data

Поиск
Список
Период
Сортировка

pg_upgradecluster transfering only a portion of the data

От
Dávid Suchan
Дата:
Hello, I tried upgrading pg db from version 9.6 to 14 by using pg_upgradecluster command. I freshly installed pg 14 -> ran pg_dropcluster 14 main --stop -> and then upgraded using pg_upgradecluster 9.6 main.
After a successful prompt finished, I checked the database and the size went from originally 20gb (in 9.6) to 700~ mb (in 14) while the disk space available shrank by about 2gb meaning that there is still the 20gb of data. I tried the entire process twice (since I had created an AWS EC2 snapshot for this) and the result was the same. 
Is my solution to migrating old pg version to the new one wrong? Before this I tried the same process with around 300mb of data and all of that transferred successfully. If I did not understand the pg_upgradecluster command, what would be the best practice when upgrading pg version with huge amounts of data(could be a terabyte)?

Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 07:44, Dávid Suchan wrote:
> Hello, I tried upgrading pg db from version 9.6 to 14 by using 
> pg_upgradecluster command. I freshly installed pg 14 -> ran 
> pg_dropcluster 14 main --stop -> and then upgraded using 
> pg_upgradecluster 9.6 main.
> After a successful prompt finished, I checked the database and the size 
> went from originally 20gb (in 9.6) to 700~ mb (in 14) while the disk 
> space available shrank by about 2gb meaning that there is still the 20gb 
> of data. I tried the entire process twice (since I had created an AWS 

1) How did you measure the size of the database clusters?

2) pg_upgrade will not remove the old cluster automatically so it not 
surprising the overall disk usage increased.

3) Did you see any messages at end of upgrade mentioning issues?

4) Have you connected to new cluster to see if everything is there?

5) Does the Postgres log provide any relevant information?

> EC2 snapshot for this) and the result was the same.
> Is my solution to migrating old pg version to the new one wrong? Before 
> this I tried the same process with around 300mb of data and all of that 
> transferred successfully. If I did not understand the pg_upgradecluster 
> command, what would be the best practice when upgrading pg version with 
> huge amounts of data(could be a terabyte)?

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Tom Lane
Дата:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> On 2/27/23 07:44, Dávid Suchan wrote:
>> After a successful prompt finished, I checked the database and the size 
>> went from originally 20gb (in 9.6) to 700~ mb (in 14) while the disk 
>> space available shrank by about 2gb meaning that there is still the 20gb 
>> of data. I tried the entire process twice (since I had created an AWS 

> 1) How did you measure the size of the database clusters?

If it was based on something like "du", perhaps the measurement
was fooled by the fact that most of the data files will be hard-linked
between the old and new clusters.

            regards, tom lane



Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 08:36, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
>> On 2/27/23 07:44, Dávid Suchan wrote:
>>> After a successful prompt finished, I checked the database and the size
>>> went from originally 20gb (in 9.6) to 700~ mb (in 14) while the disk
>>> space available shrank by about 2gb meaning that there is still the 20gb
>>> of data. I tried the entire process twice (since I had created an AWS
> 
>> 1) How did you measure the size of the database clusters?
> 
> If it was based on something like "du", perhaps the measurement
> was fooled by the fact that most of the data files will be hard-linked
> between the old and new clusters.

Does that happen without the --link option?

> 
>             regards, tom lane

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Tom Lane
Дата:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> On 2/27/23 08:36, Tom Lane wrote:
>> If it was based on something like "du", perhaps the measurement
>> was fooled by the fact that most of the data files will be hard-linked
>> between the old and new clusters.

> Does that happen without the --link option?

No, but the OP didn't mention whether he used that, and even if he
didn't say it explicitly the pg_upgradecluster wrapper might've
supplied it.

            regards, tom lane



Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 08:48, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
>> On 2/27/23 08:36, Tom Lane wrote:
>>> If it was based on something like "du", perhaps the measurement
>>> was fooled by the fact that most of the data files will be hard-linked
>>> between the old and new clusters.
> 
>> Does that happen without the --link option?
> 
> No, but the OP didn't mention whether he used that, and even if he
> didn't say it explicitly the pg_upgradecluster wrapper might've
> supplied it.

Alright the OP's command was:

pg_upgradecluster 9.6 main

pg_upgradecluster needs the --link(-k) option to use link instead of copy.

> 
>             regards, tom lane

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 08:49, Dávid Suchan wrote:

Reply to list
Ccing list for real this time.

> 1) I used \l+ in psql and then counted rows- millions were missing

\l lists databases.

Are you saying there are millions of database?

Otherwise what rows where you counting?

> 3) nothing at all, everything was "success"
> 4) I did not, I presume it is there, the question is why only 700 mb was 
> transferred

If you have not connected how could you do the \l and row count?

> 5) would it be inside main pg log? Or some special one?
> 
> Dňa po 27. 2. 2023, 17:14 Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> napísal(a):
> 
>     On 2/27/23 07:44, Dávid Suchan wrote:
>      > Hello, I tried upgrading pg db from version 9.6 to 14 by using
>      > pg_upgradecluster command. I freshly installed pg 14 -> ran
>      > pg_dropcluster 14 main --stop -> and then upgraded using
>      > pg_upgradecluster 9.6 main.
>      > After a successful prompt finished, I checked the database and
>     the size
>      > went from originally 20gb (in 9.6) to 700~ mb (in 14) while the disk
>      > space available shrank by about 2gb meaning that there is still
>     the 20gb
>      > of data. I tried the entire process twice (since I had created an
>     AWS
> 
>     1) How did you measure the size of the database clusters?
> 
>     2) pg_upgrade will not remove the old cluster automatically so it not
>     surprising the overall disk usage increased.
> 
>     3) Did you see any messages at end of upgrade mentioning issues?
> 
>     4) Have you connected to new cluster to see if everything is there?
> 
>     5) Does the Postgres log provide any relevant information?
> 
>      > EC2 snapshot for this) and the result was the same.
>      > Is my solution to migrating old pg version to the new one wrong?
>     Before
>      > this I tried the same process with around 300mb of data and all
>     of that
>      > transferred successfully. If I did not understand the
>     pg_upgradecluster
>      > command, what would be the best practice when upgrading pg
>     version with
>      > huge amounts of data(could be a terabyte)?
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 09:05, Dávid Suchan wrote:

Please use Reply All
Ccing list

> My bad,
> \l+ lists databases and their respective sizes- I used that and also 
> pg_size_pretty(), the result size was the same - before it was 20gb for 
> the biggest db, after it was 700mb.
> I counted rows before the upgrade in one of the biggest and most 
> important table that I was watching and comparing - before there were 
> hundreds of thousands or millions of rows(not sure about the exact 
> number, just a lot), after the upgrade only like 15, but the first 15 
> rows matched after the upgrade when I checked with select of that table.
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Dávid Suchan
Дата:
I did not use the -k --link argument while upgrading as that I presume does not copy the data 

Dňa po 27. 2. 2023, 18:10 Adrian Klaver <adrian.klaver@aklaver.com> napísal(a):
On 2/27/23 09:05, Dávid Suchan wrote:

Please use Reply All
Ccing list

> My bad,
> \l+ lists databases and their respective sizes- I used that and also
> pg_size_pretty(), the result size was the same - before it was 20gb for
> the biggest db, after it was 700mb.
> I counted rows before the upgrade in one of the biggest and most
> important table that I was watching and comparing - before there were
> hundreds of thousands or millions of rows(not sure about the exact
> number, just a lot), after the upgrade only like 15, but the first 15
> rows matched after the upgrade when I checked with select of that table.
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: pg_upgradecluster transfering only a portion of the data

От
Laurenz Albe
Дата:
On Mon, 2023-02-27 at 18:18 +0100, Dávid Suchan wrote:
> I did not use the -k --link argument while upgrading as that I presume does not copy the data 

It would be great if you shared the exact command line you used.

The man page of "pg_upgradecluster" says:

 -m, --method=dump|upgrade|link|clone
     Specify the upgrade method.  dump uses pg_dump(1) and pg_restore(1), upgrade uses pg_upgrade(1).  The default is
dump.

Yours,
Laurenz Albe



Re: pg_upgradecluster transfering only a portion of the data

От
Adrian Klaver
Дата:
On 2/27/23 09:10, Adrian Klaver wrote:
> On 2/27/23 09:05, Dávid Suchan wrote:
> 
> Please use Reply All
> Ccing list
> 
>> My bad,
>> \l+ lists databases and their respective sizes- I used that and also 
>> pg_size_pretty(), the result size was the same - before it was 20gb 
>> for the biggest db, after it was 700mb.
>> I counted rows before the upgrade in one of the biggest and most 
>> important table that I was watching and comparing - before there were 
>> hundreds of thousands or millions of rows(not sure about the exact 
>> number, just a lot), after the upgrade only like 15, but the first 15 
>> rows matched after the upgrade when I checked with select of that table.

Hmm, I can't see how you got that state without there being some sort of 
error messages.

Just to be clear:

1) The 9.6 and 14 instances where installed from the same source?

2) Are you using tablespaces other then the default?


Scan the Postgres log for the 14 instance at /var/log/postgresql for 
error messages.



-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: pg_upgradecluster transfering only a portion of the data

От
Dávid Suchan
Дата:
I used "pg_upgradecluster 9.6 main", all commands are in my first message: I freshly installed pg 14 -> ran pg_dropcluster 14 main --stop -> and then upgraded using pg_upgradecluster 9.6 main.

po 27. 2. 2023 o 18:40 Laurenz Albe <laurenz.albe@cybertec.at> napísal(a):
On Mon, 2023-02-27 at 18:18 +0100, Dávid Suchan wrote:
> I did not use the -k --link argument while upgrading as that I presume does not copy the data 

It would be great if you shared the exact command line you used.

The man page of "pg_upgradecluster" says:

 -m, --method=dump|upgrade|link|clone
     Specify the upgrade method.  dump uses pg_dump(1) and pg_restore(1), upgrade uses pg_upgrade(1).  The default is dump.

Yours,
Laurenz Albe

Re: pg_upgradecluster transfering only a portion of the data

От
Dávid Suchan
Дата:
1) i downloaded both versions using apt-get install postgres
2) i will check the tablespace and log files tomorrow, i don't have access to the workstation right now.

po 27. 2. 2023 o 18:44 Adrian Klaver <adrian.klaver@aklaver.com> napísal(a):
On 2/27/23 09:10, Adrian Klaver wrote:
> On 2/27/23 09:05, Dávid Suchan wrote:
>
> Please use Reply All
> Ccing list
>
>> My bad,
>> \l+ lists databases and their respective sizes- I used that and also
>> pg_size_pretty(), the result size was the same - before it was 20gb
>> for the biggest db, after it was 700mb.
>> I counted rows before the upgrade in one of the biggest and most
>> important table that I was watching and comparing - before there were
>> hundreds of thousands or millions of rows(not sure about the exact
>> number, just a lot), after the upgrade only like 15, but the first 15
>> rows matched after the upgrade when I checked with select of that table.

Hmm, I can't see how you got that state without there being some sort of
error messages.

Just to be clear:

1) The 9.6 and 14 instances where installed from the same source?

2) Are you using tablespaces other then the default?


Scan the Postgres log for the 14 instance at /var/log/postgresql for
error messages.



--
Adrian Klaver
adrian.klaver@aklaver.com