Re: Large files for relations

From: Stephen Frost
Subject: Re: Large files for relations
Date:
Msg-id: ZF5E3E79OQHM4jAx@tamriel.snowman.net
In reply to: Re: Large files for relations  (Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>)
Responses: Re: Large files for relations  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Greetings,

* Dagfinn Ilmari Mannsåker (ilmari@ilmari.org) wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski <jimmy76@gmail.com> wrote:
> >> On Mon, May 1, 2023 at 9:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> >>> I am not aware of any modern/non-historic filesystem[2] that can't do
> >>> large files with ease.  Anyone know of anything to worry about on that
> >>> front?
> >>
> >> There is some trouble in the ambiguity of what we mean by "modern" and
> >> "large files". There are still a large number of users of ext4 where
> >> the max file size is 16TB. Switching to a single large file per
> >> relation would effectively cut the max table size in half for those
> >> users. How would a user with say a 20TB table running on ext4 be
> >> impacted by this change?
> […]
> > A less aggressive version of the plan would be that we just keep the
> > segment code for the foreseeable future with no planned cut off, and
> > we make all of those "piggy back" transformations that I showed in the
> > patch set optional.  For example, I had it so that CLUSTER would
> > quietly convert your relation to large format, if it was still in
> > segmented format (might as well if you're writing all the data out
> > anyway, right?), but perhaps that could depend on a GUC.  Likewise for
> > base backup.  Etc.  Then someone concerned about hitting the 16TB
> > limit on ext4 could opt out.  Or something like that.  It seems funny
> > though, that's exactly the user who should want this feature (they
> > have 16,000 relation segment files).
>
> If we're going to have to keep the segment code for the foreseeable
> future anyway, could we not get most of the benefit by increasing the
> segment size to something like 1TB?  The vast majority of tables would
> fit in one file, and there would be less risk of hitting filesystem
> limits.

While I tend to agree that 1GB is too small, 1TB seems like it's
possibly going to end up on the too big side of things, or at least,
if we aren't getting rid of the segment code then it's possibly throwing
away the benefits we have from the smaller segments without really
giving us all that much.  Going from 1GB to 10GB would reduce the number
of open file descriptors by quite a lot without having much of a net
change on other things.  50GB or 100GB would reduce the FD count further
but starts to make us lose out a bit more on some of the nice parts of
having multiple segments.
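
To put rough numbers on that trade-off, here is a minimal sketch of the segment-count arithmetic for the 20TB table mentioned upthread.  It only counts segment files for a single relation fork, which is an upper bound on the descriptors that fork could need rather than a measurement of what a backend actually keeps open.  The function name and the use of binary units (1GB = 2^30 bytes) are assumptions made for illustration, not anything taken from the patch set.

```python
GIB = 1024 ** 3  # assumption: binary units, 1GB = 2^30 bytes
TIB = 1024 ** 4


def segment_count(table_bytes: int, segment_bytes: int) -> int:
    """Segment files needed to hold table_bytes (hypothetical helper).

    Uses integer ceiling division; an empty relation is assumed to
    still occupy one (zero-length) file.
    """
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    return max(1, -(-table_bytes // segment_bytes))


if __name__ == "__main__":
    table = 20 * TIB  # the 20TB example table from the quoted text
    candidates = [
        ("1GB", GIB),
        ("10GB", 10 * GIB),
        ("100GB", 100 * GIB),
        ("1TB", TIB),
    ]
    for label, seg in candidates:
        print(f"{label:>5} segments: {segment_count(table, seg):>6} files")
```

Under those assumptions, most of the reduction comes from the first step: 1GB to 10GB cuts the file count by a factor of ten, and each further increase saves fewer files in absolute terms.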

Just some thoughts.

Thanks,

Stephen
