Re: fdatasync performance problem with large number of DB files

From: Thomas Munro
Subject: Re: fdatasync performance problem with large number of DB files
Msg-id: CA+hUKGJpKUMRqurMCkf+zy1WrH9WMZTWiMPu-JOmpsbsT9UhFQ@mail.gmail.com
In reply to: Re: fdatasync performance problem with large number of DB files  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
On Wed, Mar 17, 2021 at 11:42 PM Paul Guo <paulguo@gmail.com> wrote:
> I just quickly reviewed the patch (the code part). It looks good. Only
> one concern or question is do_syncfs() for symlink opened fd for
> syncfs() - I'm not 100% sure.

Alright, let me try to prove that it works the way we want with an experiment.

I'll make a directory with a file in it, and create a symlink to it in
another filesystem:

tmunro@x1:~/junk$ mkdir my_wal_dir
tmunro@x1:~/junk$ touch my_wal_dir/foo
tmunro@x1:~/junk$ ln -s /home/tmunro/junk/my_wal_dir /dev/shm/my_wal_dir_symlink
tmunro@x1:~/junk$ ls /dev/shm/my_wal_dir_symlink/
foo

Now I'll write a program that repeatedly dirties the first block of
foo, and calls syncfs() on the containing directory that it opened
using the symlink:

tmunro@x1:~/junk$ cat test.c
#define _GNU_SOURCE

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int symlink_fd, file_fd;

    symlink_fd = open("/dev/shm/my_wal_dir_symlink", O_RDONLY);
    if (symlink_fd < 0) {
        perror("open1");
        return EXIT_FAILURE;
    }

    file_fd = open("/home/tmunro/junk/my_wal_dir/foo", O_RDWR);
    if (file_fd < 0) {
        perror("open2");
        return EXIT_FAILURE;
    }

    for (int i = 0; i < 4; ++i) {
        if (pwrite(file_fd, "hello world", 11, 0) != 11) {
            perror("pwrite");
            return EXIT_FAILURE;
        }
        if (syncfs(symlink_fd) < 0) {
            perror("syncfs");
            return EXIT_FAILURE;
        }
        sleep(1);
    }
    return EXIT_SUCCESS;
}
tmunro@x1:~/junk$ cc test.c
tmunro@x1:~/junk$ ./a.out

While that's running, to prove that it does what we want it to do,
I'll first find out where foo lives on the disk:

tmunro@x1:~/junk$ /sbin/xfs_bmap my_wal_dir/foo
my_wal_dir/foo:
    0: [0..7]: 242968520..242968527

Now I'll trace the writes going to block 242968520, and start the program again:

tmunro@x1:~/junk$ sudo btrace /dev/nvme0n1p2 | grep 242968520
259,0    4       93     4.157000669 724924  A   W 244019144 + 8 <- (259,2) 242968520
259,0    2      155     5.158446989 718635  A   W 244019144 + 8 <- (259,2) 242968520
259,0    7       23     6.163765728 724924  A   W 244019144 + 8 <- (259,2) 242968520
259,0    7       30     7.169112683 724924  A   W 244019144 + 8 <- (259,2) 242968520
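
The trace shows a write to foo's block roughly once per second, matching the loop in test.c. A complementary check that needs no block tracing is to confirm that the descriptor obtained through the symlink really is a descriptor for the target directory: open() without O_NOFOLLOW resolves the symlink, and syncfs() acts on the filesystem containing the file that the descriptor refers to. The following is only a minimal sketch of that check. The helper name opened_fd_is_target() is hypothetical and is not part of the patch under discussion.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch under discussion.
 * Opens link_path the same way test.c does and reports whether the
 * resulting descriptor refers to target_path.
 * Returns 1 if both are the same file, 0 if not, -1 on error.
 */
int
opened_fd_is_target(const char *link_path, const char *target_path)
{
    struct stat fd_st;
    struct stat target_st;
    int fd;
    int rc;

    assert(link_path != NULL && target_path != NULL);

    /* No O_NOFOLLOW, so a symlink in the final component is resolved. */
    fd = open(link_path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Same device and inode number means the same file. */
    if (fstat(fd, &fd_st) < 0 || stat(target_path, &target_st) < 0)
        rc = -1;
    else
        rc = (fd_st.st_dev == target_st.st_dev &&
              fd_st.st_ino == target_st.st_ino) ? 1 : 0;

    close(fd);
    return rc;
}
```

Called as opened_fd_is_target("/dev/shm/my_wal_dir_symlink", "/home/tmunro/junk/my_wal_dir"), a return value of 1 would indicate that the descriptor belongs to the directory on the filesystem holding foo, not to /dev/shm, so that is the filesystem syncfs() would flush.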


