Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

Поиск
Список
Период
Сортировка
От jian he
Тема Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Дата
Msg-id CACJufxH3KGOGhnS-J2bYnA-mpABg4HG9Xddfj9zBBgoK+iGSsQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Alena Rybakina <lena.ribackina@yandex.ru>)
Ответы Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Alena Rybakina <lena.ribackina@yandex.ru>)
Список pgsql-hackers
On Tue, Dec 5, 2023 at 6:07 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>
> Hi!
>
> Thank you for your contribution to this thread.
>
>
> I reviewed it and have a few questions.
>
> 1. I have seen that you delete a table before creating it, to which you want to add errors due to a failed "copy
from"operation. I think this is wrong because this table can save useful data for the user. 
> At a minimum, we should warn the user about this, but I think we can just add some number at the end of the name,
suchas name_table1, name_table_2. 

Sorry. I don't understand this part.
Currently, if the error table name already exists, then the copy will
fail, an error will be reported.
I try to first create a table, if no error then the error table will be dropped.
Can you demo the expected behavior?

> 2. I noticed that you are forming a table name using the type of errors that prevent rows from being added during
'copyfrom' operation. 
> I think it would be better to use the name of the source file that was used while 'copy from' was running.
> In addition, there may be several such files, it is also worth considering.
>

Another column added.
now it looks like:

SELECT * FROM save_error_csv_error;
 filename | lineno |                        line
 | field | source |                 err_message                 |
err_detail | errorcode

----------+--------+----------------------------------------------------+-------+--------+---------------------------------------------+------------+-----------
 STDIN    |      1 | 2002    232     40      50      60      70
80 | NULL  | NULL   | extra data after last expected column       |
NULL       | 22P04
 STDIN    |      1 | 2000    230     23
 | d     | NULL   | missing data for column "d"                 | NULL
      | 22P04
 STDIN    |      1 | z,,""
 | a     | z      | invalid input syntax for type integer: "z"  | NULL
      | 22P02
 STDIN    |      2 | \0,,
 | a     | \0     | invalid input syntax for type integer: "\0" | NULL
      | 22P02


> 3. I found spelling:
>
> /* no err_nsp.error_rel table then crete one. for holding error. */
>

fixed.

> 4. Maybe rewrite this comment
>
> these info need, no error will drop err_nsp.error_rel table
> to:
> this information is necessary, no error will lead to the deletion of the err_sp.error_rel table.
>

fixed.

> 5. Is this part of the comment needed? I think it duplicates the information below when we form the query.
>
>  * . column list(order by attnum, begin from ctid) =
>  *    {ctid, lineno,line,field,source,err_message,err_detail,errorcode}
>  * . data types (from attnum = -1) ={tid, int8,text,text,text,text,text,text}
>
> I'm not sure if we need to order the rows by number. It might be easier to work with these lines in the order they
appear.
>
Simplified the comment. "order by attnum" is to make sure that if
there is a table already existing, and the column name is like X and
the data type like Y, then we consider this table is good for holding
potential error info.

COPY FROM, main entry point is NextCopyFrom.
Now for non-binary mode, if you specified save_error then it will not
fail at NextCopyFrom.
all these three errors will be tolerated: extra data after last
expected column, missing data for column, data type conversion.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: remaining sql/json patches
Следующее
От: shveta malik
Дата:
Сообщение: Re: Synchronizing slots from primary to standby