Hi Zoltán.
I was reading through your posix_fadvise patch, and I wanted to ask
about this change in particular:
> --- a/src/backend/executor/nodeIndexscan.c
> +++ b/src/backend/executor/nodeIndexscan.c
> @@ -290,7 +290,7 @@ ExecIndexEvalArrayKeys(ExprContext *econtext,
> /* We want to keep the arrays in per-tuple memory */
> oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
>
> - for (j = 0; j < numArrayKeys; j++)
> + for (j = numArrayKeys-1; j >= 0; j--)
> {
> ScanKey scan_key = arrayKeys[j].scan_key;
> ExprState *array_expr = arrayKeys[j].array_expr;
Why is this loop reversed? (I could have missed some discussion about
this, I just wanted to make sure it was intentional.)
I can confirm that the patch applies to HEAD, that the configure test
correctly #defines USE_POSIX_FADVISE, that it compiles cleanly, and it
looks basically sensible to me. The nodeBitmapHeapscan.c and iterator
changes need a second look by someone who understands the code better
than I do.
The documentation patch could use a little tweaking:
> + <productname>PostgreSQL</productname> can give hints to POSIX compatible
> + operating systems using posix_fadvise(2) to pre-read pages that will
> + be needed in the near future. Reading the pages into the OS kernel's
> + disk buffer will be done asynchronically while <productname>PostgreSQL</productname>
> + works on pages which are already in memory. This may speed up bitmap scans
> + considerably. This setting only applies to bitmap scans.
> + Hinting for sequential scans is also used but no GUC is needed in this case.
I would suggest something like this instead:
<productname>PostgreSQL</productname> can use posix_fadvise(2) to tell the operating system about pages that it
knowswill be needed in the near future. The performance of bitmap scans is considerably improved when the kernel
pre-readssuch pages into its disk buffers. This variable controls how many pages are marked for pre-reading at a
timeduring a bitmap scan.
But I'm not convinced that this GUC is well-advised; at least, it needs
some advice about how to determine a sensible size for the parameter
(and maybe a better name). But is it really necessary at all?
Thanks.
-- ams