Re: optimize lookups in snapshot [sub]xip arrays

Поиск
Список
Период
Сортировка
От John Naylor
Тема Re: optimize lookups in snapshot [sub]xip arrays
Дата
Msg-id CAFBsxsGmNefB-LAo8hNXLDAwWN0vHXc6H2A+NfFUQrj=rn50DQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: optimize lookups in snapshot [sub]xip arrays  (Nathan Bossart <nathandbossart@gmail.com>)
Ответы Re: optimize lookups in snapshot [sub]xip arrays  (Nathan Bossart <nathandbossart@gmail.com>)
Список pgsql-hackers
On Fri, Aug 5, 2022 at 5:15 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Aug 04, 2022 at 02:58:14PM +0700, John Naylor wrote:
> > Were you considering adding the new function to simd.h now that that's
> > committed? It's a bit up in the air what should go in there, but this new
> > function is low-level and generic enough to be a candidate...
>
> I don't have a strong opinion.  I went with a separate file because I
> envisioned a variety of possible linear search functions (e.g., char,
> uint16, uint32), and some might not use SIMD instructions.  Futhermore, it
> seemed less obvious to look in simd.h for linear search functions.

That is a good point. Maybe potential helpers in simd.h should only deal specifically with vector registers, with it's users providing C fallbacks. I don't have any good ideas of where to put the new function, though.

> > I wonder if the "pg_" prefix is appropriate here, as that is most often
> > used for things that hide specific details *and* where the base name would
> > clash, like OS calls or C library functions. I'm not quite sure where the
> > line is drawn, but I mean that "linearsearch" is a generic algorithm and
> > not a specific API we are implementing, if that makes sense.
>
> Yeah, I was concerned about clashing with lsearch() and lfind().  I will
> drop the prefix.

Hmm, I didn't know about those. lfind() is similar enough that it would make sense to have pg_lfind32() etc in src/include/port/pg_lsearch.h, at least for the v4 version that returns the pointer. We already declare bsearch_arg() in src/include/port.h and that's another kind of array search. Returning bool is different enough to have a different name. pg_lfind32_ispresent()?  *_noptr()? Meh.

Having said all that, the man page under BUGS [1] says "The naming is unfortunate."

> > Out of curiosity do we know how much we get by loading four registers
> > rather than two?
>
> The small program I've been using for testing takes about 40% more time
> with the two register approach.  The majority of this test involves
> searching for elements that either don't exist in the array or that live
> near the end of the array, so this is probably close to the worst case.

Ok, sounds good.

[1] https://man7.org/linux/man-pages/man3/lsearch.3.html#BUGS

--
John Naylor
EDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: BTMaxItemSize seems to be subtly incorrect
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints