Re: Storing files: 2.3TBytes, 17M file count

From: Daniel Verite
Subject: Re: Storing files: 2.3TBytes, 17M file count
Msg-id: 9f29ff97-e4aa-4bdf-b4cb-f2c4b296e886@manitou-mail.org
In reply to: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
Responses: We reached the limit of inotify. Was: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
List: pgsql-general

Thomas Güttler wrote:

> Up to now we use rsync (via rsnapshot) to back up our data.
>
> But it takes longer and longer for rsync to detect
> the changes. Rsync checks many files. But daily only
> very few files really change. More than 99.9% don't.

lsyncd+rsync has worked nicely for me on Linux in such cases,
as opposed to rsync alone which is indeed very slow with large
trees. Check out https://github.com/axkibe/lsyncd

If you think of using Postgres large objects, be aware that they
are stored in a single table (pg_largeobject), sliced
as rows of 1/4 block in size each (typically 2048 bytes).
2.3 TB in a single database would mean more than 1.2 billion
rows in that table, and as a system table it can't be partitioned
or moved to another tablespace.
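
As a rough check of that row count, here is a small sketch of the
arithmetic. It assumes the default 8 kB block size (hence 2048-byte
chunks) and reads "2.3 TB" as binary terabytes; the function name is
made up for illustration. The result is a lower bound, since the last
chunk of each file is usually partial.

```python
# Rough estimate of the pg_largeobject row count (assumptions noted inline).
BLOCK_SIZE = 8192               # assumed default PostgreSQL block size
CHUNK_SIZE = BLOCK_SIZE // 4    # large objects are sliced into 1/4-block rows


def estimate_lo_rows(total_bytes: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Lower bound on the row count: ceil(total_bytes / chunk_size).

    Ignores per-file rounding (each file's last chunk is usually partial).
    """
    return -(-total_bytes // chunk_size)


total = int(2.3 * 2**40)        # 2.3 TB read as binary terabytes (assumption)
rows = estimate_lo_rows(total)
print(f"at least {rows:,} rows")
```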

OTOH with large objects, files can be stored and retrieved easily
between client and server with efficient built-in functions at both ends.
In particular, they don't need the binary<->text conversions or
large memory allocations mentioned by Chris Travers upthread,
that may happen when writing your own methods with bytea columns.
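
For illustration, a minimal sketch of that client-side path, assuming
Python with the psycopg2 driver (my choice for the example, nothing
upthread implies it). Its connection.lobject() wraps libpq's
lo_import()/lo_export(), so the file is transferred in chunks rather
than as one large bytea value. The helper names, paths and DSN are
hypothetical.

```python
# Sketch only: helper names are hypothetical. `conn` is expected to be a
# psycopg2 connection (any object with a compatible lobject() also works).

def store_file(conn, path):
    """Import a client-side file as a new large object; return its OID."""
    # oid=0 and new_oid=0 let the server choose the OID; passing new_file
    # makes the driver run a client-side lo_import() of `path`.
    lob = conn.lobject(0, "wb", 0, path)
    oid = lob.oid
    lob.close()
    conn.commit()   # large object operations must run inside a transaction
    return oid


def fetch_file(conn, oid, dest_path):
    """Export large object `oid` to a client-side file via lo_export()."""
    lob = conn.lobject(oid, "rb")
    lob.export(dest_path)
    lob.close()
    conn.commit()


# Usage (needs a reachable server; the DSN and paths are placeholders):
#   import psycopg2
#   conn = psycopg2.connect("dbname=files")
#   oid = store_file(conn, "/tmp/report.pdf")
#   fetch_file(conn, oid, "/tmp/report-copy.pdf")
```

The returned OID would then typically be kept in an ordinary table next
to the file's metadata.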

But for the amount of data you have, the monolithic pg_largeobject
would likely be problematic.

Ideally there should be an extension implementing something like
DATALINK (SQL99), with external storage. I wonder if an extension
could provide custom WAL records replicating content changes to the
external storage of a standby. That would be awesome.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

