Обсуждение: Fix fseek() detection of unseekable files on WIN32
Hello all,
As highlighted in [1] fseek() might fail to error even when accessing unseekable streams.
PFA a patch that checks the file type before the actual fseek(), so only supported calls are made.
Regards,
Juan José Santamaría Flecha
Вложения
On Tue, Mar 14, 2023 at 01:26:27PM +0100, Juan José Santamaría Flecha wrote: > As highlighted in [1] fseek() might fail to error even when accessing > unseekable streams. > > PFA a patch that checks the file type before the actual fseek(), so only > supported calls are made. + * streams, so harden that funcion with our version. s/funcion/function/. +extern int pgfseek64(FILE *stream, pgoff_t offset, int origin); +extern pgoff_t pgftell64(FILE *stream); +#define fseeko(stream, offset, origin) pgfseek64(stream, offset, origin) +#define ftello(stream) pgftell64(stream) What about naming the internal wrappers _pgfseeko64() and _pgftello64(), located in a new file named win32fseek.c? It may be possible that we would need a similar treatment for fseek(), in the future, though I don't see an issue why this would be needed now. + if (GetFileType((HANDLE) _get_osfhandle(_fileno(stream))) != FILE_TYPE_DISK) + { + errno = ESPIPE; + return -1; + } Shouldn't there be cases where we should return EINVAL for some of the other types, like FILE_TYPE_REMOTE or FILE_TYPE_UNKNOWN? We should return ESPIPE only for FILE_TYPE_PIPE and FILE_TYPE_CHAR, then? -- Michael
Вложения
On Wed, Mar 15, 2023 at 5:57 AM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Mar 14, 2023 at 01:26:27PM +0100, Juan José Santamaría Flecha wrote:
> As highlighted in [1] fseek() might fail to error even when accessing
> unseekable streams.
>
> PFA a patch that checks the file type before the actual fseek(), so only
> supported calls are made.
+ * streams, so harden that funcion with our version.
s/funcion/function/.
Done.
+extern int pgfseek64(FILE *stream, pgoff_t offset, int origin);
+extern pgoff_t pgftell64(FILE *stream);
+#define fseeko(stream, offset, origin) pgfseek64(stream, offset, origin)
+#define ftello(stream) pgftell64(stream)
What about naming the internal wrappers _pgfseeko64() and
_pgftello64(), located in a new file named win32fseek.c? It may be
possible that we would need a similar treatment for fseek(), in the
future, though I don't see an issue why this would be needed now.
Done.
+ if (GetFileType((HANDLE) _get_osfhandle(_fileno(stream))) != FILE_TYPE_DISK)
+ {
+ errno = ESPIPE;
+ return -1;
+ }
Shouldn't there be cases where we should return EINVAL for some of the
other types, like FILE_TYPE_REMOTE or FILE_TYPE_UNKNOWN? We should
return ESPIPE only for FILE_TYPE_PIPE and FILE_TYPE_CHAR, then?
Done.
PFA a new version of the patch.
Regards,
Juan José Santamaría Flecha
Вложения
On Wed, Mar 15, 2023 at 12:18:25PM +0100, Juan José Santamaría Flecha wrote: > PFA a new version of the patch. +_pgftello64(FILE *stream) +{ + DWORD fileType; + + fileType = GetFileType((HANDLE) _get_osfhandle(_fileno(stream))); Hmm. I am a bit surprised here.. It seems to me that we should make sure that: - We exist quickly if _get_osfhandle() returns -2 or INVALID_HANDLE_VALUE, returning EINVAL? - After GetFileType(), check for GetLastError() and the FILE_TYPE_UNKNOWN case? Do you think that these would be improvements? -- Michael
Вложения
On Thu, Mar 16, 2023 at 2:05 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Mar 15, 2023 at 12:18:25PM +0100, Juan José Santamaría Flecha wrote:
> PFA a new version of the patch.
+_pgftello64(FILE *stream)
+{
+ DWORD fileType;
+
+ fileType = GetFileType((HANDLE) _get_osfhandle(_fileno(stream)));
Hmm. I am a bit surprised here.. It seems to me that we should make
sure that:
- We exist quickly if _get_osfhandle() returns -2 or
INVALID_HANDLE_VALUE, returning EINVAL?
- After GetFileType(), check for GetLastError() and the
FILE_TYPE_UNKNOWN case?
Do you think that these would be improvements?
IDK, this is just looking for the good case, anything else we'll fail with ESPIPE or EINVAL anyway. If we want to get the proper file type we can call fstat(), which has the full logic.
Regards,
Juan José Santamaría Flecha
On Thu, Mar 16, 2023 at 10:08:44AM +0100, Juan José Santamaría Flecha wrote: > IDK, this is just looking for the good case, anything else we'll fail with > ESPIPE or EINVAL anyway. If we want to get the proper file type we can call > fstat(), which has the full logic. I am not sure, TBH. As presented, the two GetFileType() calls in _pgftello64() and _pgfseeko64() ignore the case where it returns FILE_TYPE_UNKNOWN and GetLastError() has something else than NO_ERROR. The code would return EINVAL for all the errors happening. Perhaps that's fine, though I am wondering if we should report something more exact, based on _dosmaperr(GetLastError())? -- Michael
Вложения
On Sun, Mar 19, 2023 at 08:20:27PM +0900, Michael Paquier wrote: > I am not sure, TBH. As presented, the two GetFileType() calls in > _pgftello64() and _pgfseeko64() ignore the case where it returns > FILE_TYPE_UNKNOWN and GetLastError() has something else than > NO_ERROR. The code would return EINVAL for all the errors happening. > Perhaps that's fine, though I am wondering if we should report > something more exact, based on _dosmaperr(GetLastError())? In short, I was thinking among the lines of something like the attached, where I have invented a pgwin32_get_file_type() that acts as a wrapper of GetFileType() in a new file called win32common.c, with all the error handling we would use between fstat(), fseeko() and ftello() centralized in a single code path. The refactoring with win32common.c had better be separated into its own patch, at the end, if using an approach like that. -- Michael
Вложения
On Sun, Mar 19, 2023 at 12:45 PM Michael Paquier <michael@paquier.xyz> wrote:
In short, I was thinking among the lines of something like the
attached, where I have invented a pgwin32_get_file_type() that acts as
a wrapper of GetFileType() in a new file called win32common.c, with
all the error handling we would use between fstat(), fseeko() and
ftello() centralized in a single code path.
The refactoring with win32common.c had better be separated into its
own patch, at the end, if using an approach like that.
My approach was trying to make something minimal so it could be backpatchable. This looks fine for HEAD, but are you planning on something similar for the other branches?
Doesn't pgwin32_get_file_type() fit in dirmod.c? Might be a question of personal taste, I don't really have strong feelings against win32common.c.
Regards,
Juan José Santamaría Flecha
On Sun, Mar 19, 2023 at 08:10:10PM +0100, Juan José Santamaría Flecha wrote: > My approach was trying to make something minimal so it could be > backpatchable. This looks fine for HEAD, but are you planning on something > similar for the other branches? Yes. This is actually not invasive down to 14 as the code is consistent for these branches. > Doesn't pgwin32_get_file_type() fit in dirmod.c? Might be a question of > personal taste, I don't really have strong feelings against win32common.c. Not sure about this one. I have considered it and dirmod.c includes also bits for cygwin, while being aimed for higher-level routines like rename(), unlink() or symlink(). This patch is only for WIN32, and aimed for common parts in win32*.c code, so a separate file seemed a bit cleaner to me at the end. -- Michael
Вложения
On Mon, Mar 20, 2023 at 07:06:22AM +0900, Michael Paquier wrote: > Not sure about this one. I have considered it and dirmod.c includes > also bits for cygwin, while being aimed for higher-level routines like > rename(), unlink() or symlink(). This patch is only for WIN32, and > aimed for common parts in win32*.c code, so a separate file seemed a > bit cleaner to me at the end. By the way, do you think that we could be able to get a TAP test for that? It does not seem that it needs to be that complicated, as long as we use a pg_dump command that pipes its output to a pg_restore command launched by system()? -- Michael
Вложения
On Mon, Mar 20, 2023 at 07:06:22AM +0900, Michael Paquier wrote: > Not sure about this one. I have considered it and dirmod.c includes > also bits for cygwin, while being aimed for higher-level routines like > rename(), unlink() or symlink(). This patch is only for WIN32, and > aimed for common parts in win32*.c code, so a separate file seemed a > bit cleaner to me at the end. After going through the installation of a Windows setup with meson and ninja under VS, I have checked that this is working correctly by myself, so I am going to apply that. One of the tests I have done involved feeding a dump of the regression data through a pipe to pg_restore, and the whole was able to work fine, while head broke when using a pipe. Digressing a bit, while I don't forget.. Spoiler 1: I don't think that recommending ActivePerl in the documentation is a good idea these days. They do not provide anymore a standalone installer that deploys the binaries you can use, and they've made it really difficult to even access a "perl" command as it has become necessary to use an extra command "state activate --default" to link with a project registered in their stuff, meaning a connection to their project. Once this command is launched, the terminal links to a cached state in AppData. This is very unfriendly. In comparison, relying on StrawberryPerl and Chocolatey feels like a breeze.. Spoiler 2: mingw.org seems to be dead, and we have two links in the docs referring to it. -- Michael
Вложения
On Tue, Apr 11, 2023 at 02:43:25PM +0900, Michael Paquier wrote: > After going through the installation of a Windows setup with meson and > ninja under VS, I have checked that this is working correctly by > myself, so I am going to apply that. One of the tests I have done > involved feeding a dump of the regression data through a pipe to > pg_restore, and the whole was able to work fine, while head broke when > using a pipe. Applied this one down to 14. The first responses from the buildfarm are good, I'll keep an eye on all that. -- Michael