Re: Should CSV parsing be stricter about mid-field quotes?

From Andrew Dunstan
Subject Re: Should CSV parsing be stricter about mid-field quotes?
Date
Msg-id e819612f-f75f-ec88-0d0c-d63ffb6c8745@dunslane.net
In response to Should CSV parsing be stricter about mid-field quotes?  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Should CSV parsing be stricter about mid-field quotes?  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers


On 2023-05-11 Th 10:03, Joel Jacobson wrote:
Hi hackers,

I've come across an unexpected behavior in our CSV parser that I'd like to
bring up for discussion.

% cat example.csv
id,rating,review
1,5,"Great product, will buy again."
2,3,"I bought this for my 6" laptop but it didn't fit my 8" tablet"

% psql
CREATE TABLE reviews (id int, rating int, review text);
\COPY reviews FROM example.csv WITH CSV HEADER;
SELECT * FROM reviews;

This gives:

 id | rating |                           review
----+--------+-------------------------------------------------------------
  1 |      5 | Great product, will buy again.
  2 |      3 | I bought this for my 6 laptop but it didn't fit my 8 tablet
(2 rows)


Maybe this is unexpected by you, but it's not by me. What other sane interpretation of that data could there be? And what CSV producer outputs such horrible content? As you've noted, ours certainly does not. Our rules are clear: quotes within quotes must be escaped (default escape is by doubling the quote char). Allowing partial fields to be quoted was a deliberate decision when CSV parsing was implemented, because examples have been seen in the wild.
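
To make the rule concrete, here is a minimal Python sketch of a field splitter that treats every unescaped quote as a toggle, so quoted and unquoted runs can be mixed within one field. It is an illustration only, not PostgreSQL's actual C implementation; the function name split_csv_line and its parameters are made up for this example, and it handles a single line with no embedded newlines.

```python
def split_csv_line(line, delim=",", quote='"', escape='"'):
    """Split one CSV line, allowing quoted runs anywhere inside a field.

    Hypothetical helper for illustration; assumes the line has no
    trailing newline and no embedded newlines.
    """
    fields = []
    buf = []
    in_quote = False
    i = 0
    n = len(line)
    while i < n:
        c = line[i]
        if in_quote:
            # Inside quotes: escape + quote (or escape + escape) is a literal.
            if c == escape and i + 1 < n and line[i + 1] in (quote, escape):
                buf.append(line[i + 1])
                i += 2
                continue
            if c == quote:
                # A lone quote ends the quoted run, not the field.
                in_quote = False
            else:
                buf.append(c)
        else:
            if c == delim:
                fields.append("".join(buf))
                buf = []
            elif c == quote:
                # A quote may open a quoted run in the middle of a field.
                in_quote = True
            else:
                buf.append(c)
        i += 1
    if in_quote:
        raise ValueError("unterminated quoted field")
    fields.append("".join(buf))
    return fields


row = split_csv_line(
    '2,3,"I bought this for my 6" laptop but it didn\'t fit my 8" tablet"'
)
print(row[2])
```

Under these assumptions the third field comes out without any quote characters, matching the psql output quoted above. A producer that wants the inch marks preserved has to double them inside the quoted field, e.g. "... my 6"" laptop ...".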

So I don't think our behaviour is broken or needs fixing. As mentioned by Greg, this is an example of the adage about being liberal in what you accept.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
