Обсуждение: \COPY ... CSV with hex escapes

Поиск
Список
Период
Сортировка

\COPY ... CSV with hex escapes

От
Craig Ringer
Дата:
Hi folks

I have an odd csv input format to deal with. I'm about to put some
Python together to reprocess it, but I thought I'd check in and see if
I'm missing something obvious in \copy's capabilities.

The input is fairly conventional comma-delimeted text with quoted
fields, as output by Sybase SQL Anywhere 5.5's isql 'OUTPUT TO' filter.
Yes, that's ancient. It is handled quite happily by \copy in csv mode,
except that when csv mode is active, \xnn escapes do not seem to be
processed. So I can have *either* \xnn escape processing *or* csv-style
input processing.

Anyone know of a way to get escape processing in csv mode?

--
Craig Ringer

Re: \COPY ... CSV with hex escapes

От
Peter Hunsberger
Дата:
On Wed, Jul 7, 2010 at 9:21 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> Hi folks
>
> I have an odd csv input format to deal with. I'm about to put some
> Python together to reprocess it, but I thought I'd check in and see if
> I'm missing something obvious in \copy's capabilities.
>
> The input is fairly conventional comma-delimeted text with quoted
> fields, as output by Sybase SQL Anywhere 5.5's isql 'OUTPUT TO' filter.
> Yes, that's ancient. It is handled quite happily by \copy in csv mode,
> except that when csv mode is active, \xnn escapes do not seem to be
> processed. So I can have *either* \xnn escape processing *or* csv-style
> input processing.
>
> Anyone know of a way to get escape processing in csv mode?
>
Don't know if you can do it directly, but this seem like one of those
cases where a ETL tool like that from Pentaho (Kettle / Spoon) might
be in order?  One step to handle the escape chars and one to load the
actual CSV...

--
Peter Hunsberger

Re: \COPY ... CSV with hex escapes

От
Alban Hertroys
Дата:
On 8 Jul 2010, at 4:21, Craig Ringer wrote:

> Yes, that's ancient. It is handled quite happily by \copy in csv mode,
> except that when csv mode is active, \xnn escapes do not seem to be
> processed. So I can have *either* \xnn escape processing *or* csv-style
> input processing.
>
> Anyone know of a way to get escape processing in csv mode?


And what do those hex-escaped bytes mean? Are they in text strings? AFAIK CSV doesn't contain any information about
whatencoding was used to create it, so it could be about anything; UTF-8, Win1252, ISO-8859-something, or whatever
Sybasewas using. 

I'm just saying, be careful what you're parsing there ;)

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4c359d9f286212106016419!



Re: \COPY ... CSV with hex escapes

От
Craig Ringer
Дата:
On 08/07/10 17:42, Alban Hertroys wrote:
> On 8 Jul 2010, at 4:21, Craig Ringer wrote:
>
>> Yes, that's ancient. It is handled quite happily by \copy in csv mode,
>> except that when csv mode is active, \xnn escapes do not seem to be
>> processed. So I can have *either* \xnn escape processing *or* csv-style
>> input processing.
>>
>> Anyone know of a way to get escape processing in csv mode?
>
>
> And what do those hex-escaped bytes mean? Are they in text strings? AFAIK CSV doesn't contain any information about
whatencoding was used to create it, so it could be about anything; UTF-8, Win1252, ISO-8859-something, or whatever
Sybasewas using. 
>
> I'm just saying, be careful what you're parsing there ;)

Thanks for that. In this case, the escapes are just "bytes" - what's
important is that, after unescaping, the CSV data is interpreted as
latin-1. OK, Windows-1252, but close enough.

In the end Python's csv module did the trick. I just pulled in the CSV
data, and spat out Postgresql-friendly COPY format so that I didn't need
to use the COPY ... CSV modifier and Pg would interpret the escapes
during input.

In case anyone else needs to deal with this format, here's the program I
used.

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Вложения