Re: COPY enhancements

From: Andrew Dunstan
Subject: Re: COPY enhancements
Date:
Msg-id 4AABBAA3.30604@dunslane.net
In reply to: Re: COPY enhancements  (Greg Smith <gsmith@gregsmith.com>)
Responses: Re: COPY enhancements  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Greg Smith wrote:
>> After some thought, I think that Andrew's feature *is* generally
>> applicable, if done as IGNORE COLUMN COUNT (or, more likely,
>> column_count=ignore). I can think of a lot of data sets where column
>> count is jagged and you want to do ELT instead of ETL.
>
> Exactly, the ELT approach gives you so many more options for cleaning 
> up the data that I think it would be used more if it weren't so hard 
> to do in Postgres right now.
>
>

+1. That's exactly what my client wants to do. They know perfectly well 
that they get junk data. They want to get it into the database with a 
minimum of fuss where they will have the right tools for checking and 
cleaning it. If they have to spend effort whacking it into shape just to 
get it into the database, then their cleanup effort essentially has to 
be done in two pieces, part inside and part outside the database.
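
To make the jagged-column case concrete, here is a minimal Python sketch of what a column_count=ignore style load might do to each input row. It is only an illustration: the helper name load_jagged and the exact semantics (pad short rows with NULL, drop extra trailing fields) are assumptions, not the proposed patch or any existing COPY option.

```python
import csv
import io


def load_jagged(text, n_columns):
    """Parse CSV text, forcing every row to exactly n_columns fields.

    Short rows are padded with None (standing in for SQL NULL) and extra
    trailing fields are dropped. Hypothetical helper, for illustration only.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip completely empty lines
        # Truncate overruns, then pad underruns; a negative pad count
        # yields an empty list, so long rows are left truncated.
        padded = fields[:n_columns] + [None] * (n_columns - len(fields))
        rows.append(padded)
    return rows


# Three rows: one well-formed, one short, one with an extra field.
sample = "1,alice,a@example.com\n2,bob\n3,carol,c@example.com,extra\n"
rows = load_jagged(sample, 3)
```

With rows accepted in this form, all of the checking and cleaning can then happen inside the database, which is the ELT point made above.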


>
> While complicated, COPY is a pretty walled off command of around 3500 
> lines of code, and the hackery required here is pretty small. For 
> example, it turns out we do already have the code to get it to ignore 
> column overruns here, and it's all of 50 new lines--much of which is 
> shared with code that does other error ignoring bits too. It's easy to 
> make a case for a grand future extensibility cleanup here, but it's 
> really not necessary to provide a significant benefit here for the 
> cases I mentioned. And I would guess the maintenance burden of a more 
> general solution has to be higher than a simple implementation of the 
> feature list I gave in my last message.
>
> In short: there's a presumption that adding any error-ignoring code 
> would require significant contortions. I don't think that's really 
> true though, and would like to keep open the possibility of accepting 
> some simple but useful ad-hoc features in this area, even if they 
> don't solve every possible problem in this space just yet.
>
>

Right. What I proposed would not have been terribly invasive or 
difficult, certainly less so, by at least an order of magnitude, than 
the direction we seem to be taking. I don't for a moment accept the 
assertion that we can get a general solution for the same effort.

cheers

andrew

