Re: Add new error_action COPY ON_ERROR "log"
От | torikoshia |
---|---|
Тема | Re: Add new error_action COPY ON_ERROR "log" |
Дата | |
Msg-id | 3bad6fd5570c5b63e0b68f79e20253f2@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: Add new error_action COPY ON_ERROR "log" (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Ответы |
Re: Add new error_action COPY ON_ERROR "log"
(Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
|
Список | pgsql-hackers |
On 2024-02-12 15:46, Bharath Rupireddy wrote: Thanks for your comments. > On Mon, Jan 29, 2024 at 8:41 AM torikoshia <torikoshia@oss.nttdata.com> > wrote: >> >> On 2024-01-27 00:43, David G. Johnston wrote: >> >> >> I'd like to have a new option "log", which skips soft errors and >> >> logs >> >> information that should have resulted in errors to PostgreSQL log. >> > >> > user-specified tables in the same database. Setting up temporary >> > tables or unlogged tables probably is going to be a more acceptable >> > methodology than trying to get to the log files. >> >> OTOH not everyone thinks saving table information is the best idea[2]. > > The added NOTICE message gives a summary of how many rows were > skipped, but not the line numbers. It's hard for the users to find the > malformed data, especially when they are bulk-inserting from data > files of multiple GBs in size (hard to open such files in any editor > too). > > I understand the value of writing the info to server log or tables of > users' choice as being discussed in a couple of other threads. > However, I'd prefer we do the simplest thing first before we go debate > server log or table - let the users know what line numbers are > containing the errors that COPY ignored something like [1] with a > simple change like [2]. Agreed. Unlike my patch, it hides the error information(i.e. 22P02: invalid input syntax for type integer: ), but I feel that it's usually sufficient to know the row number and column where the error occurred. > It not only helps users figure out which rows > and attributes were malformed, but also helps them redirect them to > server logs with setting log_min_messages = notice [3]. In the worst > case scenario, a problem with this one NOTICE per malformed row is > that it can overload the psql session if all the rows are malformed. > I'm not sure if this is a big problem, but IMO better than a single > summary NOTICE message and simpler than writing to tables of users' > choice. Maybe could we do what you suggested for the behavior when 'log' is set to on_error? > Thoughts? > > FWIW, I presented the new COPY ... ON_ERROR option feature at > Hyderabad PostgreSQL User Group meetup and it was well-received by the > audience. The audience felt it's going to be extremely helpful for > bulk-loading tasks. Thank you all for working on this feature. Thanks for informing it! I hope it will not be reverted as the audience:) > [1] > postgres=# CREATE TABLE check_ign_err (n int, m int[], k int); > CREATE TABLE > postgres=# COPY check_ign_err FROM STDIN WITH (on_error ignore); > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF > signal. >>> 1 {1} 1 > a {2} 2 > 3 {3} 3333333333 > 4 {a, 4} 4 > > 5 {5}>> >> >> >> >> 5 > \.>> > NOTICE: detected data type incompatibility at line number 2 for > column check_ign_err, COPY n > NOTICE: detected data type incompatibility at line number 3 for > column check_ign_err, COPY k > NOTICE: detected data type incompatibility at line number 4 for > column check_ign_err, COPY m > NOTICE: detected data type incompatibility at line number 5 for > column check_ign_err, COPY n > NOTICE: 4 rows were skipped due to data type incompatibility > COPY 2 > > [2] > diff --git a/src/backend/commands/copyfromparse.c > b/src/backend/commands/copyfromparse.c > index 906756362e..93ab5d4ebb 100644 > --- a/src/backend/commands/copyfromparse.c > +++ b/src/backend/commands/copyfromparse.c > @@ -961,7 +961,16 @@ NextCopyFrom(CopyFromState cstate, ExprContext > *econtext, > > &values[m])) > { > cstate->num_errors++; > - return true; > + > + if (cstate->opts.on_error != > COPY_ON_ERROR_STOP) > + { > + ereport(NOTICE, > + > errmsg("detected data type incompatibility at line number %llu for > column %s, COPY %s", > + > (unsigned long long) cstate->cur_lineno, > + > cstate->cur_relname, > + > cstate->cur_attname)); > + return true; > + } > } > > cstate->cur_attname = NULL; > > [3] > 2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type > incompatibility at line number 2 for column check_ign_err, COPY n > 2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, > line 2, column n: "a" > 2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type > incompatibility at line number 3 for column check_ign_err, COPY k > 2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, > line 3, column k: "3333333333" > 2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type > incompatibility at line number 4 for column check_ign_err, COPY m > 2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, > line 4, column m: "{a, 4}" > 2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type > incompatibility at line number 5 for column check_ign_err, COPY n > 2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, > line 5, column n: "" > 2024-02-12 06:20:29.363 UTC [427892] NOTICE: 4 rows were skipped due > to data type incompatibility > > -- > Bharath Rupireddy > PostgreSQL Contributors Team > RDS Open Source Databases > Amazon Web Services: https://aws.amazon.com -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
В списке pgsql-hackers по дате отправления: