Re: Using quicksort for every external sort run

From: Peter Geoghegan
Subject: Re: Using quicksort for every external sort run
Msg-id: CAM3SWZR4D0Uey-rfEJTNEiEJd0kzeQ8RxjUg7tqd65qNAS9t1A@mail.gmail.com
In reply to: Re: Using quicksort for every external sort run (Greg Stark <stark@mit.edu>)
List: pgsql-hackers

On Wed, Mar 30, 2016 at 4:22 AM, Greg Stark <stark@mit.edu> wrote:
> I'm sorry I was intending to run those benchmarks again this past week
> but haven't gotten around to it. But my plan was to run them on a good
> server I borrowed, an i7 with 8MB cache. I can still go ahead with
> that but I can also try running it on the home server again too if you
> want (an AMD N36L with 1MB cache).

I don't want to suggest that people not test the very low end on very
high-end hardware. That's fine, as long as it's put in context.
Considerations about the economics of cache sizes and work_mem
settings are crucial to testing the patch objectively. If everything
fits in cache anyway, then you almost eliminate the advantages
quicksort has, but in that case you should be using an internal sort
anyway. I think that this is just common sense.

I would like to see a low-end benchmark for low-end work_mem settings
too, though. Maybe you could repeat the benchmark I linked to, but
with a recent version of the patch, including commit 0011c0091e886b.
Compare that to the master branch just before 0011c0091e886b went in.
I'm curious about how the more recent memory context resetting stuff
that made it into 0011c0091e886b left us regression-wise.  Tomas
tested that, of course, but I have some concerns about how
representative his numbers are at the low end.

> But even for the smaller machines I don't think we should really be
> caring about regressions in the 4-8MB work_mem range. Earlier in the
> fuzzer work I was surprised to find out it can take tens of megabytes
> to compile a single regular expression (iirc it was about 30MB for a
> 64-bit machine) before you get errors. It seems surprising to me that
> a single operator would consume more memory than an ORDER BY clause. I
> was leaning towards suggesting we just bump up the default work_mem to
> 8MB or 16MB.

Today, it costs less than USD $40 for a new Raspberry Pi 2, which has
1GB of memory. I couldn't figure out exactly how much CPU cache that
model has, but I'm pretty sure it's no more than 256KB. Memory just
isn't that expensive; memory bandwidth is expensive. I agree that we
could easily justify increasing work_mem to 8MB, or even 16MB.

It seems almost silly to point it out, but: Increasing sort
performance has the effect of decreasing the duration of sorts, which
could effectively decrease memory use on the system. Increasing the
memory available to sorts could decrease the overall use of memory.
Being really frugal with memory is expensive, maybe even if your
primary concern is the expense of memory usage, which it probably
isn't these days.
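
To make the arithmetic behind that point concrete, here is a minimal sketch. It treats the cost of a sort as memory held multiplied by the time it is held (MB-seconds). The function name and every number below are hypothetical and chosen only for illustration; they are not measurements from any benchmark in this thread.

```python
def memory_seconds(work_mem_mb, sort_seconds):
    """Memory held by one sort, integrated over its duration (MB-seconds)."""
    return work_mem_mb * sort_seconds


# Hypothetical figures, NOT benchmark results: a frugal setting that
# produces a slow sort versus a larger setting that finishes sooner.
frugal = memory_seconds(work_mem_mb=4, sort_seconds=60.0)
generous = memory_seconds(work_mem_mb=16, sort_seconds=12.0)

print(f"frugal:   {frugal:.0f} MB-seconds")
print(f"generous: {generous:.0f} MB-seconds")
print("generous setting holds less memory over time:", generous < frugal)
```

Under these assumed numbers the larger setting comes out ahead only because the sort is more than four times faster with four times the memory. Whether that holds in practice depends on the workload, which is exactly what the benchmarks are meant to show.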

-- 
Peter Geoghegan


