Re: Should CSV parsing be stricter about mid-field quotes?

Поиск
Список
Период
Сортировка
От Daniel Verite
Тема Re: Should CSV parsing be stricter about mid-field quotes?
Дата
Msg-id 5094828d-8f56-47de-a4eb-7dfbbca783c8@manitou-mail.org
обсуждение исходный текст
Ответ на Re: Should CSV parsing be stricter about mid-field quotes?  ("Joel Jacobson" <joel@compiler.org>)
Ответы Re: Should CSV parsing be stricter about mid-field quotes?  ("Joel Jacobson" <joel@compiler.org>)
Список pgsql-hackers
    Joel Jacobson wrote:

> I've been using that trick myself many times in the past, but thanks to this
> deep-dive into this topic, it looks to me like TEXT would be a better format
> fit when dealing with unquoted TSV files, or?
>
> OTOH, one would then need to inspect the TSV file doesn't contain \. on an
> empty line...

Note that this is the case for valid CSV contents, since backslash-dot
on a line by itself is both an end-of-data marker for COPY FROM and a
valid CSV line.
Having this line in the data results in either an error or having the
rest of the data silently discarded, depending on the context. There
is some previous discussion about this in [1].
Since the TEXT format doesn't have this kind of problem, one solution
is to filter the data through PROGRAM with an [untrusted CSV]->TEXT
filter. This is to be preferred over direct CSV loading when
strictness or robustness are more important than convenience.


[1]
https://www.postgresql.org/message-id/10e3eff6-eb04-4b3f-aeb4-b920192b977a@manitou-mail.org

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Assert failure of the cross-check for nullingrels
Следующее
От: Jeff Davis
Дата:
Сообщение: Re: Order changes in PG16 since ICU introduction