Discussion: Assistance Needed: Issue with pg_upgrade and --link option
Dear Postgres Hackers,
I hope this email finds you well. I am currently facing an issue while performing an upgrade using the pg_upgrade utility with the --link option. I was under the impression that the --link option would create hard links between the old and new cluster's data files, but it appears that the entire old cluster data was copied to the new cluster, resulting in a significant increase in the new cluster's size.
Here are the details of my scenario:
- PostgreSQL version: [Old Version: Postgres 11.4 | New Version: Postgres 14.0]
- Command used for pg_upgrade: [~/pg_upgrade_testing/postgres_14/bin/pg_upgrade -b ~/pg_upgrade_testing/postgres_11.4/bin -B ~/pg_upgrade_testing/postgres_14/bin -d ~/pg_upgrade_testing/postgres_11.4/replica_db2 -D ~/pg_upgrade_testing/postgres_14/new_pg -r -k]
- Paths to the old and new data directories: [~/pg_upgrade_testing/postgres_11.4/replica_db2] [~/pg_upgrade_testing/postgres_14/new_pg]
- OS information: [Ubuntu 22.04.2 Linux]
However, after executing the pg_upgrade command with the --link option, I observed that the size of the new cluster is much larger than expected. I expected the --link option to create hard links instead of duplicating the data files.
I am seeking assistance to understand the following:
1. Is my understanding of the --link option correct?
2. Is there any additional configuration or step required to properly utilize the --link option?
3. Are there any limitations or considerations specific to my PostgreSQL version or file system that I should be aware of?
Any guidance, clarification, or troubleshooting steps you can provide would be greatly appreciated. I want to ensure that I am using the --link option correctly and optimizing the upgrade process.
Best regards,
Pradeep Kumar
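One way to check the expectation described above (that linked files share storage rather than being copied) is to compare inode numbers. The following is a minimal Python sketch, not part of the original message: the file names are hypothetical stand-ins for relation files, and it only illustrates how a hard link differs from a copy at the filesystem level, not what pg_upgrade does internally.

```python
import os
import tempfile


def is_hard_link_pair(path_a, path_b):
    """Return True if both paths refer to the same inode on the same device."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


def demo():
    """Create one file, one hard link to it, and one copy; compare them."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical file names; real clusters use numeric relfilenode names.
        old_file = os.path.join(root, "old_relfile")
        linked = os.path.join(root, "new_relfile_linked")
        copied = os.path.join(root, "new_relfile_copied")

        with open(old_file, "wb") as f:
            f.write(b"x" * 8192)

        # A hard link adds a second directory entry for the same inode.
        os.link(old_file, linked)

        # A copy allocates a new inode with its own data blocks.
        with open(old_file, "rb") as src, open(copied, "wb") as dst:
            dst.write(src.read())

        return (
            is_hard_link_pair(old_file, linked),
            is_hard_link_pair(old_file, copied),
            os.stat(old_file).st_nlink,  # link count of the original inode
        )


if __name__ == "__main__":
    print(demo())
```

A link count greater than 1 on a data file in the old cluster would suggest that the new cluster references the same inode. Hard links require both directories to be on the same filesystem.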
On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
> I was under the impression that the --link option would create hard links between the
> old and new cluster's data files, but it appears that the entire old cluster data was
> copied to the new cluster, resulting in a significant increase in the new cluster's size.
Please provide some numbers, ideally
du -sk <old_data_directory> <new_data_directory>
Yours,
Laurenz Albe
On 28.06.23 08:24, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
>> I was under the impression that the --link option would create hard links between the
>> old and new cluster's data files, but it appears that the entire old cluster data was
>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
>
> Please provide some numbers, ideally
>
> du -sk <old_data_directory> <new_data_directory>
I don't think you can observe the effects of the --link option this way.
It would just give you the full size count for both directories, even
though they point to the same underlying inodes.
To see the effect, you could perhaps use `df` to see how much overall
disk space the upgrade step eats up.
Sure,
du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
On Wed, Jun 28, 2023 at 11:54 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
> I was under the impression that the --link option would create hard links between the
> old and new cluster's data files, but it appears that the entire old cluster data was
> copied to the new cluster, resulting in a significant increase in the new cluster's size.
Please provide some numbers, ideally
du -sk <old_data_directory> <new_data_directory>
Yours,
Laurenz Albe
These are my numbers.
df ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/nvme0n1p4_crypt 375161856 102253040 270335920 28% /home
/dev/mapper/nvme0n1p4_crypt 375161856 102253040 270335920 28% /home
On Wed, Jun 28, 2023 at 3:14 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 28.06.23 08:24, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
>> I was under the impression that the --link option would create hard links between the
>> old and new cluster's data files, but it appears that the entire old cluster data was
>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
>
> Please provide some numbers, ideally
>
> du -sk <old_data_directory> <new_data_directory>
I don't think you can observe the effects of the --link option this way.
It would just give you the full size count for both directories, even
though they point to the same underlying inodes.
To see the effect, you could perhaps use `df` to see how much overall
disk space the upgrade step eats up.
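A single `df` reading only shows the current state of the filesystem, so the suggestion above amounts to comparing usage before and after the upgrade step. A minimal Python sketch of that comparison follows; `measure_growth` and its `action` callback are hypothetical names introduced for illustration, and the result can be skewed by any other process writing to the same filesystem at the same time.

```python
import shutil


def used_bytes(path):
    """Return the bytes currently in use on the filesystem containing path."""
    return shutil.disk_usage(path).used


def measure_growth(path, action):
    """Run action() and return how much filesystem usage changed, in bytes.

    The value is approximate: unrelated activity on the same filesystem
    is included in the difference.
    """
    before = used_bytes(path)
    action()
    after = used_bytes(path)
    return after - before


if __name__ == "__main__":
    # Placeholder action; in practice this would be the upgrade step itself.
    print(measure_growth(".", lambda: None))
```

With hard links in place, the growth should be close to the size of the files that exist only in the new data directory rather than the size of the whole old cluster.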
On Wed, 2023-06-28 at 15:40 +0530, Pradeep Kumar wrote:
> > > I was under the impression that the --link option would create hard links between the
> > > old and new cluster's data files, but it appears that the entire old cluster data was
> > > copied to the new cluster, resulting in a significant increase in the new cluster's size.
> >
> > Please provide some numbers, ideally
> >
> > du -sk <old_data_directory> <new_data_directory>
>
> du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
> 11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
> 41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
That looks fine. The files exist only once, and the 41MB that only exist in
the new data directory are catalog data and other stuff that is different
on the new cluster.
Yours,
Laurenz Albe
On 28.06.23 12:46, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 15:40 +0530, Pradeep Kumar wrote:
>>>> I was under the impression that the --link option would create hard links between the
>>>> old and new cluster's data files, but it appears that the entire old cluster data was
>>>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
>>>
>>> Please provide some numbers, ideally
>>>
>>> du -sk <old_data_directory> <new_data_directory>
>>
>> du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
>> 11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
>> 41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
>
> That looks fine. The files exist only once, and the 41MB that only exist in
> the new data directory are catalog data and other stuff that is different
> on the new cluster.
Interesting, so it actually does count files with multiple hardlinks only once.
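The behaviour noted above, where a hard-linked file is counted once per `du` invocation and attributed to the first directory in which it is seen, can be imitated by tracking (device, inode) pairs. The sketch below is an illustrative Python approximation, not how `du` is implemented: it sums `st_size` rather than allocated disk blocks, and the directory and file names in `demo` are hypothetical.

```python
import os
import tempfile


def apparent_and_unique_bytes(*directories):
    """Return (sum of all file sizes, sum counting each inode only once)."""
    apparent = 0
    unique = 0
    seen = set()
    for directory in directories:
        for dirpath, _dirnames, filenames in os.walk(directory):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                apparent += st.st_size
                key = (st.st_dev, st.st_ino)
                # Count a given inode only the first time it is encountered.
                if key not in seen:
                    seen.add(key)
                    unique += st.st_size
    return apparent, unique


def demo():
    """Build a tiny 'old' and 'new' directory that share one hard-linked file."""
    with tempfile.TemporaryDirectory() as root:
        old_dir = os.path.join(root, "old")
        new_dir = os.path.join(root, "new")
        os.mkdir(old_dir)
        os.mkdir(new_dir)

        # One 8192-byte data file in the old directory.
        old_file = os.path.join(old_dir, "datafile")
        with open(old_file, "wb") as f:
            f.write(b"x" * 8192)

        # The new directory hard-links the data file and adds 100 bytes of its own.
        os.link(old_file, os.path.join(new_dir, "datafile"))
        with open(os.path.join(new_dir, "catalog"), "wb") as f:
            f.write(b"c" * 100)

        return apparent_and_unique_bytes(old_dir, new_dir)


if __name__ == "__main__":
    print(demo())  # (16484, 8292)
```

The gap between the two totals corresponds to the data shared through hard links, which matches the pattern in the quoted output: a large figure for the old directory and a small one for the new directory.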