Обсуждение: pg_dump enhancement proposal
Hello Everyone, I recently spent some time comparing the various techniques used to backup/export MySQL, SQLite, and PostgreSQL databases. So far, pg_dump is the best tool I've used. I was well pleased with its efficient syntax and functionality. However, I have two feature requests that I think can make it even better: 1. View Based Exports In addition to being able to dump specific tables, it seems useful to be able to quickly specify and dump entire views. 2. Custom Query Exports In my use of mysqldump, I found one feature very useful: the ability to execute a custom SELECT. . .WHERE statement and then dump only the results. This feature currently provides MySQL users with the ability to quickly and easily export very granular data subsets, and I see no reason why PostgreSQL users wouldn't benefit from the same capability. While it is true that this functionality can already be achieved in PostgreSQL using Copy, it seems to me that it would logically fit well as an extension to pg_dump, especially since many beginning and even some intermediate PostgreSQL users aren't aware of the alternatives. If the community finds value in the above features, I would be glad to begin developing them for possible inclusion in the next PostgreSQL release. Best Regards, Mark Hammonds Web Engineer OmniTI Computer Consulting technology.markhammonds.ws
Mark Hammonds <mhammonds@omniti.com> writes: > 1. View Based Exports > In addition to being able to dump specific tables, it seems useful to > be able to quickly specify and dump entire views. Isn't this pretty much the same thing as your #2? A view in PG isn't materialized, it's nothing but a custom SELECT. > 2. Custom Query Exports > In my use of mysqldump, I found one feature very useful: the ability > to execute a custom SELECT. . .WHERE statement and then dump only the > results. This feature currently provides MySQL users with the ability > to quickly and easily export very granular data subsets, and I see no > reason why PostgreSQL users wouldn't benefit from the same capability. > While it is true that this functionality can already be achieved in > PostgreSQL using Copy, it seems to me that it would logically fit well > as an extension to pg_dump, especially since many beginning and even > some intermediate PostgreSQL users aren't aware of the alternatives. As you say, we already have this using COPY, and I don't agree that it would be a good idea to plaster it into pg_dump as well. pg_dump is intended for dumping and restoring data, not for ETL-type tasks. Furthermore, pg_dump is a overly complex beast already --- much more so than one could wish, for a tool that is absolutely fundamental to database reliability. Putting requirements on it that are well outside its charter seems like a short route to maintenance disaster. There has been some occasional chatter about developing one or more tools focused on ETL rather than dump/restore, and my thought is that this idea would fit better there. An ETL tool would not have the kind of requirements pg_dump has for coping with multiple server versions and knowing everything there is to know about database contents, so it seems like it could address new areas of functionality without a complexity explosion. You might want to check the archives for previous discussions --- I think the last go-round started with someone wanting to add an arbitrary WHERE filter to pg_dump dumps. regards, tom lane
On Thu, Nov 12, 2009 at 04:31:37PM -0500, Tom Lane wrote: > Mark Hammonds <mhammonds@omniti.com> writes: > > 2. Custom Query Exports > > > In my use of mysqldump, I found one feature very useful: the ability > > to execute a custom SELECT. . .WHERE statement and then dump only the > > results. This feature currently provides MySQL users with the ability > > to quickly and easily export very granular data subsets, and I see no > > reason why PostgreSQL users wouldn't benefit from the same capability. > > While it is true that this functionality can already be achieved in > > PostgreSQL using Copy, it seems to me that it would logically fit well > > as an extension to pg_dump, especially since many beginning and even > > some intermediate PostgreSQL users aren't aware of the alternatives. > > As you say, we already have this using COPY, and I don't agree that > it would be a good idea to plaster it into pg_dump as well. pg_dump > is intended for dumping and restoring data, not for ETL-type tasks. > Furthermore, pg_dump is a overly complex beast already --- much more > so than one could wish, for a tool that is absolutely fundamental to > database reliability. Putting requirements on it that are well outside > its charter seems like a short route to maintenance disaster. > > There has been some occasional chatter about developing one or more > tools focused on ETL rather than dump/restore, and my thought is that > this idea would fit better there. An ETL tool would not have the > kind of requirements pg_dump has for coping with multiple server > versions and knowing everything there is to know about database > contents, so it seems like it could address new areas of functionality > without a complexity explosion. > > You might want to check the archives for previous discussions --- > I think the last go-round started with someone wanting to add an > arbitrary WHERE filter to pg_dump dumps. Sorry I missed this thread. Not only has there been previous discussion, there have been at least two, and I seem to recall three, patches implementing this. None of the patches was very large, and none of them impacted the basic "make a backup" paths in pg_dump. -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.