Discussion: BUG #17543: CSVLOG malformed from disk space error
The following bug has been logged on the website:

Bug reference:      17543
Logged by:          Nathan Ballance
Email address:      nathanfballance@gmail.com
PostgreSQL version: 14.4
Operating system:   Linux/UNIX

Description:

A PostgreSQL server with the csvlog log_destination enabled will produce malformed CSV upon a disk space error. Any attempt to load the malformed *.csv log file then fails.

Steps to reproduce (applies to all PostgreSQL versions; tested using the git mirror at https://github.com/postgres/postgres):

1. Modify 'postgresql.conf' with the following parameter changes:
     log_destination = 'stderr,csvlog'
     logging_collector = on
2. Cause the system to run out of disk space.
3. postgresql-2022-XX-XX_XXXXXX.csv is now malformed.
4. Loading the file returns an error:
     \copy postgres_log FROM postgresql-2022-XX-XX_XXXXXX.csv csv

Testing example:

template1=# \copy postgres_log FROM '/home/ec2-user/postgres/test_postgres/log/postgresql-2022-07-08_012958.csv' csv
ERROR:  unterminated CSV quoted field
CONTEXT:  COPY postgres_log, line 107: "2022-07-08 01:41:59.246 UTC,,,12154,,62c78b67.2f7a,1,,2022-07-08 01:41:59 UTC,4/17,0,ERROR,53100,"co..."
template1=#

In postgresql-2022-XX-XX_XXXXXX.csv the affected line does not terminate in proper CSV format.

Log snippet from postgresql-2022-07-08_012958.csv:
=================
2022-07-08 01:41:49.969 UTC,"ec2-user","template1",12152,"[local]",62c78b2f.2f78,1,"idle",2022-07-08 01:41:03 UTC,3/51,0,ERROR,42601,"syntax error at or near ""IF""",,,,,,"DROP TABLE test_5 IF EXISTS;",19,,"psql","client backend",,0
2022-07-08 01:41:59.246 UTC,,,12154,,62c78b67.2f7a,1,,2022-07-08 01:41:59 UTC,4/17,0,ERROR,53100,"could not extend file ""base/1/16389_vm"": No space left on device",,"Check free disk space.",,,"while scanning block 32672 of relation ""public.test_5"" automatic vacuum of table ""template1.public.test_5""",,,,"","autovacuum wor2022-07-08 01:42:44.035 UTC,,,12073,,62c78896.2f29,31,,2022-07-08 01:29:58 UTC,,0,LOG,00000,"checkpoint complete: wrote 4169 buffers (25.4%); 0 WAL file(s) added, 0 removed, 33 recycled; write=269.903 s, sync=0.002 s, total=269.972 s; sync files=2, longest=0.002 s, average=0.001 s; distance=538440 kB, estimate=540448 kB; lsn=1/7E026E90, redo lsn=1/6C1D0148",,,,,,,,,"","checkpointer",,0
2022-07-08 01:43:14.040 UTC,,,12073,,62c78896.2f29,32,,2022-07-08 01:29:58 UTC,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,"","checkpointer",,0
=================

From the above, the malformed portion is:

autovacuum wor2022-07-08 01:42:44.035 UTC

The server was writing 'autovacuum worker' but was cut off before fully writing it to the CSV file, and the next log line begins immediately afterward. This is malformed. Please note that when testing, the exact location of the truncation is not repeatable; however, the CSV will always be malformed after a disk space error.

ASK: Can the CSV file be written to in a safer way which ensures proper format even upon disk error?
On Saturday, July 9, 2022, PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17543
> Logged by:          Nathan Ballance
> Email address:      nathanfballance@gmail.com
> PostgreSQL version: 14.4
> Operating system:   Linux/UNIX
>
> Description:
>
> Postgresql server with csvlog log_destination enabled will have malformed
> CSV upon a disk space error. This causes any loading of the malformed *.csv
> log file to error
>
> ASK: Can the CSV file be written to in a safer way which ensures proper
> format even upon disk error?
This list is for reporting bugs, not feature requests.
I’d have to say that there is little interest in sacrificing performance for safety here, which seems like an unavoidable proposition.
David J.
On Mon, Jul 11, 2022 at 05:31:35AM -0700, David G. Johnston wrote:
> This list is for reporting bugs, not feature requests.
>
> I’d have to say that there is little interest in sacrificing performance
> for safety here, which seems like an unavoidable proposition.

I'd be curious to hear about an actual proposal if there are things that could be improved.

That said, after playing with this area in the last couple of months to add support for the JSON format: the design of the syslogger, with its set of synced pipes to ensure the ordering of the logs sent by the backends to the syslogger process, relies on many assumptions, and it may be hard to make things more reliable without sacrificing performance. This can already become a bottleneck with short logs and a rather high logging level, and we don't want to make that worse.
--
Michael
Hi,

On 2022-07-11 05:31:35 -0700, David G. Johnston wrote:
> On Saturday, July 9, 2022, PG Bug reporting form <noreply@postgresql.org>
> wrote:
>
> > The following bug has been logged on the website:
> >
> > Bug reference:      17543
> > Logged by:          Nathan Ballance
> > Email address:      nathanfballance@gmail.com
> > PostgreSQL version: 14.4
> > Operating system:   Linux/UNIX
> >
> > Description:
> >
> > Postgresql server with csvlog log_destination enabled will have malformed
> > CSV upon a disk space error. This causes any loading of the malformed *.csv
> > log file to error
> >
> > ASK: Can the CSV file be written to in a safer way which ensures proper
> > format even upon disk error?
>
> This list is for reporting bugs, not feature requests.
>
> I’d have to say that there is little interest in sacrificing performance
> for safety here, which seems like an unavoidable proposition.

I agree in general, but this specific issue seems easy enough to address. We could just track whether the last write failed, and if so, emit an additional newline.

But that just fixes the simple case - if the last successful write contained the start of an escaped string, the newline won't necessarily be recognized as the end of a "row".

We can't generally solve this with reasonable cost, because if e.g. postgres runs out of space and then is restarted, we'll not know how to "break out" of whatever state we were in. We'd have to read the file to figure that out, obviously not a realistic proposition.

It might be possible to devise a scheme in which we force rotation in cases where corruption might have occurred?

Greetings,

Andres Freund
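The "track whether the last write failed" idea above can be sketched roughly as follows. This is a minimal standalone illustration, not PostgreSQL source; write_log_record() and the global flag are hypothetical stand-ins for syslogger's real write path:

```c
/* Sketch: remember whether the previous log write failed, and if so,
 * emit a newline before the next record so the new record starts on a
 * fresh line.  (As noted above, this does not help when the truncated
 * record ended inside a quoted CSV field.) */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool last_write_failed = false;

/* Returns true if the whole record was written. */
static bool
write_log_record(FILE *logfile, const char *record)
{
    size_t  len = strlen(record);
    size_t  written;

    /* If the previous record may have been cut short, terminate it so
     * the next row at least starts at column 1. */
    if (last_write_failed)
        fputc('\n', logfile);

    written = fwrite(record, 1, len, logfile);
    last_write_failed = (written != len);
    return !last_write_failed;
}
```

A caller would simply keep invoking write_log_record() per record; after a failed write (e.g. ENOSPC), the next successful record is preceded by a newline rather than being glued onto the truncated one.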
At Tue, 12 Jul 2022 17:48:53 -0700, Andres Freund <andres@anarazel.de> wrote in
> It might be possible to devise a scheme in which we force rotation in cases
> where corruption might have occurred?

It would be cumbersome to generate unique names for the new files and then have the server grab the newest file at restart. I'm not sure what happens if many files are quickly generated due to ENOSPC. (Maybe it fills up the directory, then moves on to another failure mode..)

Considering how rarely that can happen, I would choose to make the reader function robust, or to fix the file manually..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
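One building block for the "robust reader" option mentioned above could look like this. A sketch only (not proposed PostgreSQL code): a helper that tracks whether a physical line completes a CSV record, so a loader can buffer continuation lines and discard a trailing fragment that never completes:

```c
/* Sketch: decide whether a physical line completes a CSV record, i.e.
 * whether every double-quoted field opened so far has been closed.
 * An escaped quote ("") toggles the state twice, so it has no net
 * effect, as desired.  *in_quotes carries state across lines. */
#include <stdbool.h>

static bool
csv_record_complete(const char *line, bool *in_quotes)
{
    const char *p;

    for (p = line; *p; p++)
    {
        if (*p == '"')
            *in_quotes = !*in_quotes;
    }
    return !*in_quotes;
}
```

A reader would accumulate lines while this returns false; if end-of-file is reached with an open quote, the buffered fragment is the corrupted tail and can be skipped or repaired instead of aborting the whole load.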
On Tue, Jul 12, 2022 at 05:48:53PM -0700, Andres Freund wrote:
> On 2022-07-11 05:31:35 -0700, David G. Johnston wrote:
> > On Saturday, July 9, 2022, PG Bug reporting form <noreply@postgresql.org> wrote:
> > > Postgresql server with csvlog log_destination enabled will have malformed
> > > CSV upon a disk space error. This causes any loading of the malformed *.csv
> > > log file to error
> > >
> > > ASK: Can the CSV file be written to in a safer way which ensures proper
> > > format even upon disk error?
> >
> > I’d have to say that there is little interest in sacrificing performance
> > for safety here, which seems like an unavoidable proposition.
>
> I agree in general, but this specific issue seems easy enough to address. We
> could just track whether the last write failed, and if so, emit an additional
> newline.
>
> But that just fixes the simple case - if the last successful write contained
> the start of an escaped string, the newline won't necessarily be recognized as
> the end of a "row".

Here's one approach avoiding that problem. After ENOSPC causes the logfile to end with a prefix of a message, issue ftruncate(logfile, logfile_length - written_bytes_of_message_prefix).

An alternative would be to periodically posix_fallocate() substantial space in the logfile, and write messages only to already-allocated space. At rotation, clean shutdown, or startup, ftruncate() away trailing NUL bytes. I figure this is inferior to the other approach, because the trailing NUL bytes will be user-visible after OS crashes and when tailing active logs.

(Neither approach prevents CSV corruption if the OS crashes in the middle of syslogger's processing of one record. I don't know a low-cost, general fix for that. One tough case is a field that should have been "foo""bar" getting truncated to "foo".)