Обсуждение: pg_dump enhancement proposal

Поиск
Список
Период
Сортировка

pg_dump enhancement proposal

От
Mark Hammonds
Дата:
Hello Everyone,

I recently spent some time comparing the various techniques used to  
backup/export MySQL, SQLite, and PostgreSQL databases. So far, pg_dump  
is the best tool I've used. I was well pleased with its efficient  
syntax and functionality. However, I have two feature requests that I  
think can make it even better:

1. View Based Exports

In addition to being able to dump specific tables, it seems useful to  
be able to quickly specify and dump entire views.

2.  Custom Query Exports

In my use of mysqldump, I found one feature very useful: the ability  
to execute a custom SELECT. . .WHERE statement and then dump only the  
results. This feature currently provides MySQL users with the ability  
to quickly and easily export very granular data subsets, and I see no  
reason why PostgreSQL users wouldn't benefit from the same capability.  
While it is true that this functionality can already be achieved in  
PostgreSQL using Copy, it seems to me that it would logically fit well  
as an extension to pg_dump, especially since many beginning and even  
some intermediate PostgreSQL users aren't aware of the alternatives.

If the community finds value in the above features, I would be glad to  
begin developing them for possible inclusion in the next PostgreSQL  
release.

Best Regards,
Mark Hammonds
Web Engineer
OmniTI Computer Consulting
technology.markhammonds.ws


Re: pg_dump enhancement proposal

От
Tom Lane
Дата:
Mark Hammonds <mhammonds@omniti.com> writes:
> 1. View Based Exports

> In addition to being able to dump specific tables, it seems useful to  
> be able to quickly specify and dump entire views.

Isn't this pretty much the same thing as your #2?  A view in PG isn't
materialized, it's nothing but a custom SELECT.

> 2.  Custom Query Exports

> In my use of mysqldump, I found one feature very useful: the ability  
> to execute a custom SELECT. . .WHERE statement and then dump only the  
> results. This feature currently provides MySQL users with the ability  
> to quickly and easily export very granular data subsets, and I see no  
> reason why PostgreSQL users wouldn't benefit from the same capability.  
> While it is true that this functionality can already be achieved in  
> PostgreSQL using Copy, it seems to me that it would logically fit well  
> as an extension to pg_dump, especially since many beginning and even  
> some intermediate PostgreSQL users aren't aware of the alternatives.

As you say, we already have this using COPY, and I don't agree that
it would be a good idea to plaster it into pg_dump as well.  pg_dump
is intended for dumping and restoring data, not for ETL-type tasks.
Furthermore, pg_dump is a overly complex beast already --- much more
so than one could wish, for a tool that is absolutely fundamental to
database reliability.  Putting requirements on it that are well outside
its charter seems like a short route to maintenance disaster.

There has been some occasional chatter about developing one or more
tools focused on ETL rather than dump/restore, and my thought is that
this idea would fit better there.  An ETL tool would not have the
kind of requirements pg_dump has for coping with multiple server
versions and knowing everything there is to know about database
contents, so it seems like it could address new areas of functionality
without a complexity explosion.

You might want to check the archives for previous discussions ---
I think the last go-round started with someone wanting to add an
arbitrary WHERE filter to pg_dump dumps.
        regards, tom lane


Re: pg_dump enhancement proposal

От
daveg
Дата:
On Thu, Nov 12, 2009 at 04:31:37PM -0500, Tom Lane wrote:
> Mark Hammonds <mhammonds@omniti.com> writes:
> > 2.  Custom Query Exports
> 
> > In my use of mysqldump, I found one feature very useful: the ability  
> > to execute a custom SELECT. . .WHERE statement and then dump only the  
> > results. This feature currently provides MySQL users with the ability  
> > to quickly and easily export very granular data subsets, and I see no  
> > reason why PostgreSQL users wouldn't benefit from the same capability.  
> > While it is true that this functionality can already be achieved in  
> > PostgreSQL using Copy, it seems to me that it would logically fit well  
> > as an extension to pg_dump, especially since many beginning and even  
> > some intermediate PostgreSQL users aren't aware of the alternatives.
> 
> As you say, we already have this using COPY, and I don't agree that
> it would be a good idea to plaster it into pg_dump as well.  pg_dump
> is intended for dumping and restoring data, not for ETL-type tasks.
> Furthermore, pg_dump is a overly complex beast already --- much more
> so than one could wish, for a tool that is absolutely fundamental to
> database reliability.  Putting requirements on it that are well outside
> its charter seems like a short route to maintenance disaster.
> 
> There has been some occasional chatter about developing one or more
> tools focused on ETL rather than dump/restore, and my thought is that
> this idea would fit better there.  An ETL tool would not have the
> kind of requirements pg_dump has for coping with multiple server
> versions and knowing everything there is to know about database
> contents, so it seems like it could address new areas of functionality
> without a complexity explosion.
> 
> You might want to check the archives for previous discussions ---
> I think the last go-round started with someone wanting to add an
> arbitrary WHERE filter to pg_dump dumps.

Sorry I missed this thread.

Not only has there been previous discussion, there have been at least two,
and I seem to recall three, patches implementing this. None of the patches
was very large, and none of them impacted the basic "make a backup" paths
in pg_dump.

-dg

-- 
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.