Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> In the examples given, the error didn't show up till later rows, in
> >> fields where there was no \r anywhere.
>
> > Hard to imagine why a failure would occur on anything but the first row.
>
> Simple: suppose the dumped data contains \ \n in the body of a field,
> which is the currently-accepted representation for a data newline.
> Microsoft munges this to \ \r \n, which will now be read by COPY IN as a
> backslashed \r (i.e., a data \r) followed by a non-escaped newline.
> Ergo, that's the end of the current row. No error is detected (since a
> datatype that could have contained a data \n is unlikely to reject a
> data \r). Then parsing resumes for the next row at the next input
> character. We're totally out of sync and will produce lord only knows
> what peculiar error message when, eventually, some datatype input
> converter doesn't like what it's given. But that could be many rows
> later than where the problem was.
>
> Part of the problem here is that COPY IN will accept a "short" row (one
> with fewer fields than the table actually has columns). So early
> termination of a line isn't recognizable as an error unless the last
> column that receives some data manages to recognize a formatting error.
> Personally I think it would be a good idea to change that --- but I
> suppose you'll complain that that might break existing applications,
> even though the "feature" has never been documented.

No, I will not complain at all. We should not accept short rows,
period. It is reasonable to expect a full row every time. I just think
forcing the escape of \r isn't reasonable.
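
For anyone following along, here is a rough Python sketch of the desynchronization Tom describes. It is only a toy model, not the backend's COPY code: split_copy_rows, munge_newlines, and the sample dump are hypothetical names and data made up for illustration. It assumes tab-separated fields, an unescaped newline ending a row, and a backslash making the next character plain data (the real escape rules are richer than that).

```python
def split_copy_rows(data):
    """Toy COPY-style parser (assumption: not PostgreSQL's real logic).

    Tab separates fields, an unescaped newline ends the row, and a
    backslash causes the following character to be taken as data.
    """
    rows, fields, cur = [], [], []
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == "\\" and i + 1 < len(data):
            cur.append(data[i + 1])   # escaped character becomes field data
            i += 2
            continue
        if ch == "\t":                # field separator
            fields.append("".join(cur))
            cur = []
        elif ch == "\n":              # unescaped newline: end of row
            fields.append("".join(cur))
            rows.append(fields)
            fields, cur = [], []
        else:
            cur.append(ch)
        i += 1
    if cur or fields:                 # trailing row without a final newline
        fields.append("".join(cur))
        rows.append(fields)
    return rows


def munge_newlines(data):
    """Mimic a text-mode transfer that turns every \\n into \\r\\n."""
    return data.replace("\n", "\r\n")


# Hypothetical three-column dump; the first row has a data newline,
# written as backslash followed by newline.
dump = "1\tline one\\\nline two\t10\n2\tplain\t20\n"

good_rows = split_copy_rows(dump)
bad_rows = split_copy_rows(munge_newlines(dump))

# Rows that do not have exactly three fields; accepting these silently
# is what hides the problem until some later datatype check fails.
short_rows = [r for r in bad_rows if len(r) != 3]

print(good_rows)
print(bad_rows)
print(short_rows)
```

In this model the munged input splits the first row after the data \r, so the remainder of that row is read as the start of a new row, and nothing complains unless short rows are rejected.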
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026