Обсуждение: Import German Decimal Numbers

Поиск
Список
Период
Сортировка

Import German Decimal Numbers

От
Jan Christian Dittmer
Дата:
Hi!

I want to import some data from an ascii file using the COPY sql-command.
Unfortunatly the decimal numbers are given in german format, meaning the
decimal point is replaced by a comma (, instead of .). Is there any
possiblility
to switch the clients behaviourr (like 'set datestyle') for
interpretation of decimals?

Thanks!

best regards,
  Christian Dittmer

Re: Import German Decimal Numbers

От
"A. Kretschmer"
Дата:
am  Wed, dem 07.05.2008, um 14:25:13 +0200 mailte Jan Christian Dittmer folgendes:
> Hi!
>
> I want to import some data from an ascii file using the COPY sql-command.
> Unfortunatly the decimal numbers are given in german format, meaning the
> decimal point is replaced by a comma (, instead of .). Is there any
> possiblility
> to switch the clients behaviourr (like 'set datestyle') for
> interpretation of decimals?

Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Import German Decimal Numbers

От
"Ken Allen"
Дата:
I would replace the ',' with something else such as a '#' first then
replace the decimal with the ',' then replace the '#' with a decimal '.'

If you do the ',' with a '.' first then all of them will be '.' and you
wont know which ones to change.

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of A. Kretschmer
Sent: Wednesday, May 07, 2008 9:38 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Import German Decimal Numbers

am  Wed, dem 07.05.2008, um 14:25:13 +0200 mailte Jan Christian Dittmer
folgendes:
> Hi!
>
> I want to import some data from an ascii file using the COPY
sql-command.
> Unfortunatly the decimal numbers are given in german format, meaning
the
> decimal point is replaced by a comma (, instead of .). Is there any
> possiblility
> to switch the clients behaviourr (like 'set datestyle') for
> interpretation of decimals?

Don't know, but you can replace the , to . within the ascii-file (sed,
awk, ...).


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
This message has been scanned by MailScanner


Re: Import German Decimal Numbers

От
Jan Christian Dittmer
Дата:
Thank you very much!
You have remind me that the our server runs under Linux and not under
Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also
german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it
I guess.

    Christian

Ken Allen wrote:
> I would replace the ',' with something else such as a '#' first then
> replace the decimal with the ',' then replace the '#' with a decimal '.'
>
> If you do the ',' with a '.' first then all of them will be '.' and you
> wont know which ones to change.
>

> Don't know, but you can replace the , to . within the ascii-file (sed,
> awk, ...).
>
>
> Andreas

Re: Import German Decimal Numbers

От
"Ken Allen"
Дата:
Well if your doing an update, do it column by column and when you do a
date column replace the '.' with '/'

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Jan Christian
Dittmer
Sent: Wednesday, May 07, 2008 10:16 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Import German Decimal Numbers

Thank you very much!
You have remind me that the our server runs under Linux and not under
Windows as our clients :-)
So indeed I can use a sed-pipe construct to switch '.' and ','.
But wait, there is just another problem then. Our date format is also
german :-( "DD.MM.YY" or
"DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
unreadable for the import :-(
The (current) file is 1.4 GB so it will take ages to let awk chew on it
I guess.

    Christian

Ken Allen wrote:
> I would replace the ',' with something else such as a '#' first then
> replace the decimal with the ',' then replace the '#' with a decimal
'.'
>
> If you do the ',' with a '.' first then all of them will be '.' and
you
> wont know which ones to change.
>

> Don't know, but you can replace the , to . within the ascii-file (sed,
> awk, ...).
>
>
> Andreas

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
This message has been scanned by MailScanner


Re: Import German Decimal Numbers

От
Tino Wildenhain
Дата:
Jan Christian Dittmer wrote:
> Thank you very much!
> You have remind me that the our server runs under Linux and not under
> Windows as our clients :-)
> So indeed I can use a sed-pipe construct to switch '.' and ','.
> But wait, there is just another problem then. Our date format is also
> german :-( "DD.MM.YY" or
> "DD.MM.YYYY". So if I just exchange '.' and ',' the date will be
> unreadable for the import :-(
> The (current) file is 1.4 GB so it will take ages to let awk chew on it
> I guess.

If all fails you could copy into a another table where you use
text columns and then translate afterwards via to_date() and
to_number() / replace().

Cheers
Tino

Вложения

Re: Import German Decimal Numbers

От
Thomas Pundt
Дата:
On Mittwoch, 7. Mai 2008, Jan Christian Dittmer wrote:
| The (current) file is 1.4 GB so it will take ages to let awk chew on it
| I guess.

If you think awk is a bottleneck, I'd recommend using perl instead. It's waaay
faster and should process your file within minutes if not faster.

Ciao,
Thomas

--
Thomas Pundt <thomas.pundt@rp-online.de> ---- http://rp-online.de/ ----

Re: Import German Decimal Numbers

От
Tino Wildenhain
Дата:
Thomas Pundt wrote:
> On Mittwoch, 7. Mai 2008, Jan Christian Dittmer wrote:
> | The (current) file is 1.4 GB so it will take ages to let awk chew on it
> | I guess.
>
> If you think awk is a bottleneck, I'd recommend using perl instead. It's waaay
> faster and should process your file within minutes if not faster.
>
isnt perl just another frontend for awk? ;)) SCNR ;)

There is also http://pgfoundry.org/projects/pgloader/
and if not already implemented it should be fairly
easy to implement a data filter within this one.

Cheers
Tino


Вложения

Re: Import German Decimal Numbers

От
"William Temperley"
Дата:
> Jan Christian Dittmer wrote:
>
>
> >    Thank you very much!
> >    You have remind me that the our server runs under Linux and not under Windows as our clients :-)
> >    So indeed I can use a sed-pipe construct to switch '.' and ','.
> >    But wait, there is just another problem then. Our date format is also german :-( "DD.MM.YY" or
> >    "DD.MM.YYYY". So if I just exchange '.' and ',' the date will be unreadable for the import :-(
> >    The (current) file is 1.4 GB so it will take ages to let awk chew on it I guess.
> >
>
>
> If all fails you could copy into a another table where you use
> text columns and then translate afterwards via to_date() and
> to_number() / replace().
>
For my tuppence worth, this is the approach I take. Copy into a
sandbox table first, as the only reliable thing about external input
is it won't be as it's supposed to be.
This way you can copy everything into varchar columns, then run any
number of regexes over the data to ensure it's perfect, then do:
insert into livetable select * from temptable (with the appropriate
casts in place).

Bit of a tangent, but Is there any possibility of SQL injection via
data provided to copy?

Will T
-

Re: Import German Decimal Numbers

От
Tino Wildenhain
Дата:
William Temperley wrote:
...
> Bit of a tangent, but Is there any possibility of SQL injection via
> data provided to copy?

depends on how you call COPY, but usually not :-)

Cheers
Tino

Вложения

Re: Import German Decimal Numbers

От
Dimitri Fontaine
Дата:
Hi,

Le 7 mai 08 à 15:57, Tino Wildenhain a écrit :
> There is also http://pgfoundry.org/projects/pgloader/
> and if not already implemented it should be fairly
> easy to implement a data filter within this one.

pgloader indeed support user reformating modules, and comes with a
mysql to pgsql timestamp reformater. Adding a python .py module
containing one function to handle the change should be easy, the
documentation has needed details if you look for "reformat_path" and
"reformat" options.
   http://pgloader.projects.postgresql.org/

Plus, pgloader supports setting the DateStyle before running copy,
maybe this will be enough in your case? (didn't read all the thread).

If you need more help than current documentation to setup your
reformating module, please just ask!

Regards,
--
dim