Re: PostgreSQL 6.2.5 Visual Studio Build does not pass the regression tests.
От | Noah Misch |
---|---|
Тема | Re: PostgreSQL 6.2.5 Visual Studio Build does not pass the regression tests. |
Дата | |
Msg-id | 20140124042742.GA1975540@tornado.leadboat.com обсуждение исходный текст |
Ответ на | PostgreSQL 6.2.5 Visual Studio Build does not pass the regression tests. ("Shiv Shivaraju Gowda (shivshi)" <shivshi@cisco.com>) |
Ответы |
Re: Re: PostgreSQL 6.2.5 Visual Studio Build does not pass the regression tests.
Re: PostgreSQL 6.2.5 Visual Studio Build does not pass the regression tests. |
Список | pgsql-bugs |
On Fri, Nov 22, 2013 at 01:59:58AM +0000, Shiv Shivaraju Gowda (shivshi) wrote: > There seems to be a has a dormant bug in PostgreSQL source code build using VisualStudio(VS) which shows up with thenewer OS( not sure if it is the OS or some other thing in the environment ). Thanks for this report, and apologies for the delay. Please tell us as much as possible about the OS configurations that have exhibited the problem and the OS configurations that have *not* exhibited the problem with the same binaries. > MinGW build doesn't access this code path and thus doesn't hit this bug. A MinGW build certainly includes win32_sema.c, and I don't see why it wouldn't use that code path. How did you determine that? > SYMPTOM: > The following symptom is encountered in PostgreSQL build using Visual Studio: PostgreSQL regression tests fail with servercrashing repeatedly with this message in the log file: "PANIC: could not lock semaphore: error code 0". The issueis encountered for VS 2005, VS 2008( 32bit and 64bit executables), VS 2010 and VS 2012 built executables. The issuewas reproducible with PostgreSQL 6.2.3 and 6.2.5. We didn't encounter this issue in MinGW build or EnterpriseDB Packagedexecutables (which seems to have been built using VisualStudio 2010). I assume you mean 9.2.3 and 9.2.5? > CAUSE: > The PGSemaphoreLock function in postgresql-9.2.5\src\backend\port\win32_sema.c (https://github.com/postgres/postgres/blob/master/src/backend/port/win32_sema.c)uses "Ex" version of "WaitForMultipleObjects"Windows function (http://msdn.microsoft.com/en-us/library/windows/desktop/ms687028%28v=vs.85%29.aspx)however doesn't handle the additionalawake calls from the "bAlertable" state. Specifically, it doesn't handle the WAIT_IO_COMPLETION return code whenwoken up by a User-mode Asynchronous Procedure Call(APC) or Async IO completion. > > The part I do NOT understand is why do the User-Mode APC or Async IO completion triggers get fired only in the executableswe built and not in the ones built by EnterpriseDB and bits built by other users since no one has complained aboutit. Irrespective if it is triggered or not, the code should have handled all the return codes of the WaitForMultipleObjectsEXAPI and that is the reason I think it is a bug. > > I checked the source code for calls which will trigger the Async IO or user-mode APC(ReadFileEx<http://msdn.microsoft.com/en-us/library/windows/desktop/aa365468(v=vs.85).aspx>, WriteFileEx<http://msdn.microsoft.com/en-us/library/windows/desktop/aa365748(v=vs.85).aspx>, QueueUserAPC<http://msdn.microsoft.com/en-us/library/windows/desktop/ms684954(v=vs.85).aspx>)and could not find any. I amnot sure what triggers the WAIT_IO_COMPLETION return call. I could not find a way to figure out that information in thedebug environment. Yes, that's weird. I could believe anti-malware software doing this. If any successful test has taken place on the same machine and user account as the unsuccessful test, that explanation is unlikely. It's also possible that you're building against an unusual version of a 3rd-party library that, unlike versions used in popular builds, does use APCs or async I/O. All that being said, it's clearly wrong to call WaitForMultipleObjectsEx(..., TRUE) and then treat WAIT_IO_COMPLETION like WAIT_FAILED. > (Posible) FIX: > Either of the following changes to the PGSemaphoreLock function work fine and pass the regression tests. > > 1) Replace the call to WaitForMultipleObjectsEx with WaitForMultipleObjects. I don't know how we decided which waits to make alertable, so I lean against changing this wait from alertable to normal. > 2) Handle the WAIT_IO_COMPLETION return code same as WAIT_OBJECT_0. There is a similar code like this in socket.c, so thischange should be safe too. Sounds good. I looked through the other WaitFor*Object* callers for similar problems. pgwin32_select() also mishandles WAIT_IO_COMPLETION, returning zero as though the full timeout had elapsed. It should return -1 and set EINTR. This could affect CheckRADIUSAuth() and PgstatCollectorMain(), both of which distinguish full timeout expiration from EINTR. Would you verify that the attached patch fixes your builds? nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Вложения
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Bruce MomjianДата:
Сообщение: Re: BUG #8218: Error when querying an JSON data, 9.3beta
Следующее
От: Andrew DunstanДата:
Сообщение: Re: BUG #8218: Error when querying an JSON data, 9.3beta