> At least on the systems I am intimately familiar with, the prefetch that the
> OS does (assuming a modern OS like Linux) is pretty hard to beat. If you have
> a table that was bulk loaded in key order, a sequential scan is going to
> result in a sequential access pattern to the underlying file and the OS
> prefetch does the right thing. If you have an unindexed table with rows
> inserted at the end, the OS prefetch still works. If you are using a secondary
> index on some sort of chopped up table with rows inserted willy-nilly, then
> it may be worth doing async reads in a burst and letting the disk request
> sort make the best of it.
>
> As far as I am aware, Postgres does not do async I/O. Perhaps it should.
I am adding this to the TODO list:
* Do async I/O to do better read-ahead of data
Because we are not threaded, we really can't do anything else while we
are waiting for I/O, but we can pre-request data we know we will need.
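
A minimal sketch of what pre-requesting could look like, assuming a
platform that offers posix_fadvise(). This is not Postgres code: the
function name prefetch_blocks and the 8192-byte BLOCK_SIZE are made up
for illustration. The call only hints to the kernel that the range will
be needed soon, so the process can keep working and read the blocks later.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() on strict compilers */
#endif
#include <fcntl.h>
#include <sys/types.h>

#define BLOCK_SIZE 8192     /* assumed block size, for illustration only */

/*
 * Hint that nblocks blocks starting at first_block will be read soon.
 * Returns 0 on success, or an error number (not -1) on failure,
 * which is how posix_fadvise() itself reports errors.
 */
int prefetch_blocks(int fd, long first_block, int nblocks)
{
    off_t offset = (off_t) first_block * BLOCK_SIZE;
    off_t length = (off_t) nblocks * BLOCK_SIZE;

    return posix_fadvise(fd, offset, length, POSIX_FADV_WILLNEED);
}
```

An index scan could call this for the next few heap blocks it already
knows it will visit, then fetch them with ordinary reads.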
>
> > Also nice so you can control what gets written to disk/fsync'ed and what doesn't
> > get fsync'ed.
>
> This is really the big win.
Yep, and this is what we are trying to work around in our buffered
pg_log change. Because we have the transaction ids all compact in one
place, this seems like a workable solution to our lack of write-to-disk
control. We just control the pg_log writes.
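
Roughly, the periodic flush could be sketched like this: copy the active
pg_log pages, sync() so the data pages go out first, then write the copy
to pg_log and fsync() it. The LogBuffer struct, flush_log name, and page
counts are hypothetical, not actual Postgres code. Also note that POSIX
allows sync() to return before the writes have finished, so this only
shows the ordering of the steps.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose sync() and pwrite() on strict compilers */
#endif
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LOG_PAGE_SIZE    8192   /* assumed page size */
#define LOG_ACTIVE_PAGES 4      /* assumed number of active pages */

/* Hypothetical image of the active pg_log pages kept in shared memory. */
typedef struct
{
    off_t         file_offset;  /* where the first active page lives in the file */
    unsigned char pages[LOG_ACTIVE_PAGES][LOG_PAGE_SIZE];
} LogBuffer;

/* Returns 0 on success, -1 on failure. */
int flush_log(int log_fd, const LogBuffer *shared)
{
    static LogBuffer snapshot;  /* static: too large for the stack */
    const unsigned char *p = (const unsigned char *) snapshot.pages;
    size_t total = sizeof(snapshot.pages);
    size_t done = 0;

    /* 1. Memory copy of the current active pages. */
    memcpy(&snapshot, shared, sizeof(snapshot));

    /* 2. Push previously written data pages toward disk. */
    sync();

    /* 3. Update the log file with the saved copy. */
    while (done < total)
    {
        ssize_t n = pwrite(log_fd, p + done, total - done,
                           snapshot.file_offset + (off_t) done);
        if (n < 0)
            return -1;
        done += (size_t) n;
    }

    /* 4. Force the log pages themselves to disk. */
    return fsync(log_fd);
}
```

Commits recorded in shared memory after step 1 are not in the snapshot,
so they only reach disk on the next pass, after their data has been synced.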
>
> > Our idea is to control when pg_log gets written to disk. We keep active
> > pg_log pages in shared memory, and every 30-60 seconds, we make a memory
> > copy of the current pg_log active pages, do a system sync() (which
> > happens anyway at that interval), update the pg_log file with the saved
> > changes, and fsync() the pg_log pages to disk. That way, after a crash,
> > the current database only shows transactions as committed where we are
> > sure all the data has made it to disk.
>
> OK as far as it goes, but probably bad for concurrency if I have understood
> you.
Interested in hearing your comments.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)