Thread: Re: [ADMIN] ERROR: could not read block

Re: [ADMIN] ERROR: could not read block

From
"Magnus Hagander"
Date:
> >>> Tom Lane <tgl@sss.pgh.pa.us>  >>>
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> > None of this seems material, however.  It's pretty clear that the
> > problem was exhaustion of the Windows page pool.
> > ...
> > If we don't want to tell Windows users to make highly technical
> > changes to the Windows registry in order to use PostgreSQL, it does
> > seem wise to use retries, as has already been discussed on this
> > thread.
>
> Would a simple retry loop actually help?  It's not clear to
> me how persistent such a failure would be.

(Not sure why I didn't get Tom's mail - lists acting up again? Anyway, I
got Kevin's response, but am responding primarily to Tom)

The way I read it, a delay should help. It's basically running out of
kernel buffers, and if we just delay, somebody else (another process, an
IRQ handler, or whatever) should finish their I/O, free up the buffer,
and let us have it. Looking around a bit, I see several references saying
that you should retry on this error, but nothing in the API docs.
I do think it's probably a good idea to do a short delay before retrying
- at least to yield the CPU for one slice. That would greatly increase
the probability of someone else finishing their I/O...

That's how I read it, but I'm not 100% sure.
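
The retry-with-delay idea described above could be sketched roughly as
follows. This is only an illustration in Python, not PostgreSQL's C code:
the names TransientResourceError and read_with_retry, the attempt limit,
and the 10 ms delay are all assumptions made for the example.

```python
import time


class TransientResourceError(OSError):
    """Hypothetical stand-in for a read that failed because the OS ran out of buffers."""


def read_with_retry(read_block, max_attempts=5, delay_seconds=0.01, sleep=time.sleep):
    """Call read_block(); on a transient failure, pause briefly and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return read_block()
        except TransientResourceError:
            if attempt == max_attempts:
                # Give up: the failure was more persistent than assumed.
                raise
            # Yield the CPU so that other processes can finish their I/O
            # and release the kernel buffers we are waiting for.
            sleep(delay_seconds)
```

Bounding the number of attempts matters here because, as Tom asks, it is
not clear how persistent the failure is; an unbounded loop could hang.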

//Magnus

Re: [ADMIN] ERROR: could not read block

From
"Jim C. Nasby"
Date:
On Thu, Nov 17, 2005 at 07:56:21PM +0100, Magnus Hagander wrote:
> The way I read it, a delay should help. It's basically running out of
> kernel buffers, and we just delay, somebody else (another process, or an
> IRQ handler, or whatever) should get finished with their I/O, free up
> the buffer, and let us have it. Looking around a bit I see several
> references that you should retry on it, but nothing in the API docs.
> I do think it's probably a good idea to do a short delay before retrying
> - at least to yield the CPU for one slice. That would greatly increase
> the probability of someone else finishing their I/O...

If that makes it into code, ISTM it would be good if it also threw a
NOTICE so that users could see if this was happening; kinda like the
notice about log files being recycled frequently.
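
A minimal sketch of that suggestion, in Python rather than the server's C:
each retry emits a warning so that an administrator can see the condition
is occurring. The logger name, the function retry_with_notice, and the
is_transient callback are hypothetical; the delay between attempts is
omitted to keep the focus on the notice.

```python
import logging

logger = logging.getLogger("read_retry")  # hypothetical logger name


def retry_with_notice(operation, is_transient, max_attempts=3):
    """Retry operation(), logging a warning each time a transient error forces a retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError as exc:
            # Errors that are not transient, or a final failed attempt,
            # are passed on to the caller unchanged.
            if not is_transient(exc) or attempt == max_attempts:
                raise
            logger.warning(
                "read failed (%s); retrying (attempt %d of %d)",
                exc, attempt, max_attempts,
            )
```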
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: [ADMIN] ERROR: could not read block

From
"Qingqing Zhou"
Date:
"Magnus Hagander" <mha@sollentuna.net> wrote
>
> The way I read it, a delay should help. It's basically running out of
> kernel buffers, and we just delay, somebody else (another process, or an
> IRQ handler, or whatever) should get finished with their I/O, free up
> the buffer, and let us have it. Looking around a bit I see several
> references that you should retry on it, but nothing in the API docs.
> I do think it's probably a good idea to do a short delay before retrying
> - at least to yield the CPU for one slice. That would greatly increase
> the probability of someone else finishing their I/O...
>

Here is more that I read in the second thread:

" NTBackupread and NTBackupwrite both use buffered I/O. This means that 
Windows NT caches the I/O that is performed against the stream. It is also 
the only API that will back up the metadata of a file. This cache is pulled 
from limited resources: namely, pool and nonpaged pool. Because of this, 
extremely large numbers of files or files that are very large may cause the 
pool resources to run low. "

So does this imply that using unbuffered I/O on Windows would
eliminate the problem? If so, just adding FILE_FLAG_NO_BUFFERING when we
open data files would solve it -- but this change is in fact very invasive,
because it would make the server's I/O optimization strategy totally
different from *nix.
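
One concrete consequence of FILE_FLAG_NO_BUFFERING, beyond the caching
strategy: Windows documents that reads and writes on such a handle must
start at offsets, and have sizes, that are multiples of the volume sector
size. The sketch below only illustrates that arithmetic. The function name
align_to_sector and the 512-byte default are assumptions; real code would
query the volume for its sector size.

```python
def align_to_sector(offset, length, sector_size=512):
    """Expand (offset, length) to the smallest sector-aligned range that contains it."""
    if sector_size <= 0 or offset < 0 or length < 0:
        raise ValueError("sector_size must be positive; offset and length non-negative")
    # Round the start down and the end up to sector boundaries.
    start = (offset // sector_size) * sector_size
    end = -(-(offset + length) // sector_size) * sector_size
    return start, end - start
```

A request for 8192 bytes at offset 8192 comes back unchanged, while a
50-byte request at offset 100 grows to the whole first sector, (0, 512).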

Regards,
Qingqing