Re: [HACKERS] HASH_CHUNK_SIZE vs malloc rounding

From Thomas Munro
Subject Re: [HACKERS] HASH_CHUNK_SIZE vs malloc rounding
Date
Msg-id CAEepm=2Q1LxZiV1NkPgZ0Cx1xuX1U5hTk46=yJt-VMwydiVbqQ@mail.gmail.com
In response to Re: HASH_CHUNK_SIZE vs malloc rounding  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] HASH_CHUNK_SIZE vs malloc rounding  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Nov 29, 2016 at 6:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> I bet other allocators also do badly with "32KB plus a smidgen".  To
>> minimise overhead we'd probably need to try to arrange for exactly
>> 32KB (or some other power of 2 or at least factor of common page/chunk
>> size?) to arrive into malloc, which means accounting for both
>> nodeHash.c's header and aset.c's headers in nodeHash.c, which seems a
>> bit horrible.  It may not be worth doing anything about.
>
> Yeah, the other problem is that without a lot more knowledge of the
> specific allocator, we shouldn't really assume that it's good or bad with
> an exact-power-of-2 request --- it might well have its own overhead.
> It is an issue though, and not only in nodeHash.c.  I'm pretty sure that
> StringInfo also makes exact-power-of-2 requests for no essential reason,
> and there are probably many other places.
>
> We could imagine providing an mmgr API function along the lines of "adjust
> this request size to the nearest thing that can be allocated efficiently".
> That would avoid the need for callers to know about aset.c overhead
> explicitly.  I'm not sure how it could deal with platform-specific malloc
> vagaries though :-(

Someone pointed out to me off-list that jemalloc's next size class
after 32KB is in fact 40KB by default[1].  So PostgreSQL uses 25% more
memory for hash joins than it thinks it does on some platforms.  Ouch.
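
To put rough numbers on that, here is a small sketch of the rounding.
It assumes a simplified model of jemalloc-style size classes (four
evenly spaced classes per power-of-two doubling), which matches the
32KB -> 40KB step above but is not jemalloc's actual code; the function
names are made up for illustration, and the 16 bytes of header used in
the note below is a placeholder rather than the real nodeHash.c plus
aset.c overhead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rough model of a jemalloc-style size class table: four evenly spaced
 * classes per power-of-two doubling.  ASSUMPTION: only meant for requests
 * of a few KB and up; real jemalloc treats tiny sizes differently and its
 * classes can vary by version and configuration.
 */
size_t
model_size_class(size_t request)
{
    size_t      group = 1;
    size_t      spacing;

    assert(request > 4);

    /* Find the smallest power of two that is >= request. */
    while (group < request)
        group <<= 1;

    /* Classes within (group/2, group] are spaced group/8 apart. */
    spacing = group / 8;

    /* Round the request up to the next multiple of the spacing. */
    return (request + spacing - 1) / spacing * spacing;
}

/* Extra memory, as a percentage of what the caller asked for. */
double
model_waste_percent(size_t request)
{
    size_t      actual = model_size_class(request);

    return 100.0 * (double) (actual - request) / (double) request;
}
```

Under this model an exact 32768-byte request stays at 32768 bytes,
while 32768 plus a 16-byte placeholder header lands in the 40960-byte
class, which is just under 25% more than was asked for.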

It doesn't seem that crazy to expose aset.c's overhead size to client
code, does it?  Most client code wouldn't care, but code that does
something closer to memory allocator work itself, like dense_alloc,
could.  It could account for its own header itself and subtract
aset.c's overhead using a macro.
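
Something along these lines, perhaps.  This is only a sketch of the
idea: HYPOTHETICAL_ASET_OVERHEAD does not exist in PostgreSQL and its
value of 32 is a placeholder, DemoChunk is a stand-in for nodeHash.c's
chunk header rather than the real struct, and all the DEMO_* names are
invented here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * HYPOTHETICAL: per-allocation overhead that aset.c would export.  The
 * value 32 is a placeholder, not the real header size.
 */
#define HYPOTHETICAL_ASET_OVERHEAD  ((size_t) 32)

/* Stand-in for nodeHash.c's chunk header; the fields are illustrative. */
typedef struct DemoChunk
{
    size_t      ntuples;
    size_t      maxlen;
    size_t      used;
    struct DemoChunk *next;
    char        data[];         /* payload follows the header */
} DemoChunk;

#define DEMO_CHUNK_HEADER_SIZE  offsetof(DemoChunk, data)

/* Total number of bytes we want malloc to see for one chunk. */
#define DEMO_CHUNK_TARGET       ((size_t) 32 * 1024)

/* Payload sized so that header + payload + allocator overhead == target. */
#define DEMO_CHUNK_PAYLOAD \
    (DEMO_CHUNK_TARGET - DEMO_CHUNK_HEADER_SIZE - HYPOTHETICAL_ASET_OVERHEAD)

/* Size that would arrive at malloc for one chunk under these assumptions. */
size_t
demo_malloc_request(void)
{
    size_t      request = DEMO_CHUNK_HEADER_SIZE + DEMO_CHUNK_PAYLOAD
        + HYPOTHETICAL_ASET_OVERHEAD;

    assert(request == DEMO_CHUNK_TARGET);
    return request;
}
```

The caller handles its own header with offsetof and only needs the one
exported constant from the allocator, so it never has to know how
aset.c lays out its headers.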

[1] https://www.freebsd.org/cgi/man.cgi?jemalloc(3)

-- 
Thomas Munro
http://www.enterprisedb.com

