Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

Поиск
Список
Период
Сортировка
От jian he
Тема Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Дата
Msg-id CACJufxEPOPfMbTPsuj91_1ra2hXJMHw_jMFnGiLBrCcE5Q_Bng@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (torikoshia <torikoshia@oss.nttdata.com>)
Ответы Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Mon, Dec 18, 2023 at 1:09 PM torikoshia <torikoshia@oss.nttdata.com> wrote:
>
> Hi,
>
> > save the error metadata to  system catalogs would be more expensive,
> > please see below explanation.
> > I have no knowledge of publications.
> > but i feel there is a feature request: publication FOR ALL TABLES
> > exclude regex_pattern.
> > Anyway, that would be another topic.
>
> I think saving error metadata to system catalog is not a good idea, too.
> And I believe Sawada-san just pointed out missing features and did not
> suggested that we use system catalog.
>
> > I don't think "specify the maximum number  of errors to tolerate
> > before raising an ERROR." is very useful....
>
> That may be so.
> I imagine it's useful in some use case since some loading tools have
> such options.
> Anyway I agree it's not necessary for initial patch as mentioned in [1].
>
> > I suppose we can specify an ERRORFILE directory. similar
> > implementation [2], demo in [3]
> > it will generate 2 files, one file shows the malform line content as
> > is, another file shows the error info.
>
> That may be a good option when considering "(2) logging errors to
> somewhere".
>
> What do you think about the proposal to develop these features in
> incrementally?
>

I am more with  tom's idea [1], that is when errors happen (data type
conversion only), do not fail, AND we save the error to a table.  I
guess we can implement this logic together, only with a new COPY
option.

imagine a case (it's not that contrived, imho), while conversion from
text to table's int, postgres isspace is different from the source
text file's isspace logic.
then all the lines are malformed. If we just say on error continue and
not save error meta info, the user is still confused which field has
the wrong data, then the user will probably try to incrementally test
which field contains malformed data.

Since we need to save the error somewhere.
Everyone has the privilege to INSERT can do COPY.
I think we also need to handle the access privilege also.
So like I mentioned above, one copy_error error table hub, then
everyone can view/select their own copy failure record.

but save to a server text file/directory, not easy for an INSERT
privilege user to see these files, I think.
similarly not easy to see these failed records in log for limited privilege.

if someone wants to fail at maxerror rows, they can do it, since we
will count how many rows failed.
even though I didn't get it.

[1] https://www.postgresql.org/message-id/900123.1699488001%40sss.pgh.pa.us



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Hayato Kuroda (Fujitsu)"
Дата:
Сообщение: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Следующее
От: Andrei Lepikhov
Дата:
Сообщение: Re: introduce dynamic shared memory registry