Обсуждение: Extension table data

Поиск
Список
Период
Сортировка

Extension table data

От
Keith Fiske
Дата:
I've read the documentation for extensions and how their data is not
normally dumped by pg_dump and how to configure the table so it should
dump its data
http://www.postgresql.org/docs/9.1/static/extend-extensions.html
However, after setting this option for the tables, the data is not
being dumped when I do a pg_dump of either individual tables or the
schema I've installed the extension to. The database I'm working with
right now is far too big to do a full pg_dump or pg_dumpall, so I
haven't been able to test that. I would assume that doing just the
table or schema should work, though?

Extension I'm working on: https://github.com/omniti-labs/pg_jobmon

Anyone else having this issue or am I doing something wrong?

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
443.325.1357 x251

Re: Extension table data

От
Tom Lane
Дата:
Keith Fiske <keith@omniti.com> writes:
> I've read the documentation for extensions and how their data is not
> normally dumped by pg_dump and how to configure the table so it should
> dump its data
> http://www.postgresql.org/docs/9.1/static/extend-extensions.html
> However, after setting this option for the tables, the data is not
> being dumped when I do a pg_dump of either individual tables or the
> schema I've installed the extension to.

IIRC, the design intention is that such a table's data would be dumped
when (and only when) the extension is dumped.  That is, I'd expect to
see a "CREATE EXTENSION foo;" and then data for the extension's tables.
The partial-dump scenarios you mention wouldn't dump extensions, hence
not extension table data either.

Whether this design is a good one is still under debate, but I think
pg_dump is operating as designed here ...

            regards, tom lane

Re: Extension table data

От
Keith Fiske
Дата:
With the current design, I understand what you're saying now.  Just
doing some more testing, I was able to do a pg_dump -Fc -s for the
entire database and looking through the resulting object list with
pg_restore -l I'm actually seeing the extension table data included in
the dump file. Doing a restore on the schema I put the extension in, I
see the COPY commands to restore the data. I think this is a serious
shortcoming, and a confusing state of affairs. The only way to get
this data out seems to be to do a full database dump and the
schema-only option to pg_dump is outputting data.

Looking at the docs, I think the extension authors may have only had
configuration data in mind for extension tables. I don't see any
reason why we shouldn't be able to put any sort of table in our
extensions, some having actual data, not just config. That's actually
what I'm doing with my pg_jobmon extension, which could potentially
have millions of rows over time. Not having the tables included in the
extension definition would just make setting the whole thing up more
difficult and error prone.

If extensions really are going to be self contained like this as far
as dumps, perhaps another option to pg_dump is needed, and have the
schema-only or data-only options be honored in that case as well.

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
443.325.1357 x251


On Fri, Jun 8, 2012 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Keith Fiske <keith@omniti.com> writes:
>> I've read the documentation for extensions and how their data is not
>> normally dumped by pg_dump and how to configure the table so it should
>> dump its data
>> http://www.postgresql.org/docs/9.1/static/extend-extensions.html
>> However, after setting this option for the tables, the data is not
>> being dumped when I do a pg_dump of either individual tables or the
>> schema I've installed the extension to.
>
> IIRC, the design intention is that such a table's data would be dumped
> when (and only when) the extension is dumped.  That is, I'd expect to
> see a "CREATE EXTENSION foo;" and then data for the extension's tables.
> The partial-dump scenarios you mention wouldn't dump extensions, hence
> not extension table data either.
>
> Whether this design is a good one is still under debate, but I think
> pg_dump is operating as designed here ...
>
>                        regards, tom lane

Re: Extension table data

От
Keith Fiske
Дата:
Just found something else rather disturbing. If you try to exclude the
schema that the extension tables are in, their data is still output.
Explicitly naming other schemas doesn't seem to dump the extension
data. So the only way to avoid getting the extension data in a
schema-only dump is to explicitly name all schemas but the one your
extension is in, which I think is another bug you had actually fixed
for 9.1.3 where extension data was always being dumped.

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
443.325.1357 x251


On Sat, Jun 9, 2012 at 2:56 AM, Keith Fiske <keith@omniti.com> wrote:
> With the current design, I understand what you're saying now.  Just
> doing some more testing, I was able to do a pg_dump -Fc -s for the
> entire database and looking through the resulting object list with
> pg_restore -l I'm actually seeing the extension table data included in
> the dump file. Doing a restore on the schema I put the extension in, I
> see the COPY commands to restore the data. I think this is a serious
> shortcoming, and a confusing state of affairs. The only way to get
> this data out seems to be to do a full database dump and the
> schema-only option to pg_dump is outputting data.
>
> Looking at the docs, I think the extension authors may have only had
> configuration data in mind for extension tables. I don't see any
> reason why we shouldn't be able to put any sort of table in our
> extensions, some having actual data, not just config. That's actually
> what I'm doing with my pg_jobmon extension, which could potentially
> have millions of rows over time. Not having the tables included in the
> extension definition would just make setting the whole thing up more
> difficult and error prone.
>
> If extensions really are going to be self contained like this as far
> as dumps, perhaps another option to pg_dump is needed, and have the
> schema-only or data-only options be honored in that case as well.
>
> --
> Keith Fiske
> Database Administrator
> OmniTI Computer Consulting, Inc.
> 443.325.1357 x251
>
>
> On Fri, Jun 8, 2012 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Keith Fiske <keith@omniti.com> writes:
>>> I've read the documentation for extensions and how their data is not
>>> normally dumped by pg_dump and how to configure the table so it should
>>> dump its data
>>> http://www.postgresql.org/docs/9.1/static/extend-extensions.html
>>> However, after setting this option for the tables, the data is not
>>> being dumped when I do a pg_dump of either individual tables or the
>>> schema I've installed the extension to.
>>
>> IIRC, the design intention is that such a table's data would be dumped
>> when (and only when) the extension is dumped.  That is, I'd expect to
>> see a "CREATE EXTENSION foo;" and then data for the extension's tables.
>> The partial-dump scenarios you mention wouldn't dump extensions, hence
>> not extension table data either.
>>
>> Whether this design is a good one is still under debate, but I think
>> pg_dump is operating as designed here ...
>>
>>                        regards, tom lane

Re: Extension table data

От
Yeb Havinga
Дата:
On 2012-06-09 08:56, Keith Fiske wrote:
> Looking at the docs, I think the extension authors may have only had
> configuration data in mind for extension tables. I don't see any
> reason why we shouldn't be able to put any sort of table in our
> extensions, some having actual data, not just config.

True. We've made several extensions that consist only of table data,
such as contents of medical codesystems, where the actual terminology
comes from another extension that has functions and empty codesystem
tables. I remember there were some issues making it. IIRC the table data
would not be dropped on extension drop.

regards,
Yeb

--
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data


Re: Extension table data

От
Tom Lane
Дата:
Keith Fiske <keith@omniti.com> writes:
> Just found something else rather disturbing. If you try to exclude the
> schema that the extension tables are in, their data is still output.

This is a common misconception: extensions do not live within schemas.
(An extension might own a schema, not the other way around.)  So a
dump with a -n switch is never going to select an extension.

By and large, if the current behavior bothers you, ISTM it probably
means you are using these tables in a way other than what the concept of
an extension configuration table was meant for: namely, to hold
configuration data that would be referenced by the functions in that
extension, but would not normally be considered part of the user's data.
There has been some talk of trying to cater for a more general notion of
tables created by extensions, but we do not have a design or even a
clear idea of a set of requirements for that.  Perhaps it would be good
if you explained what is your use-case --- why are you concerned about
being able to manage these tables as if they were regular data?

            regards, tom lane

Re: Extension table data

От
Chris Travers
Дата:
On Sat, Jun 9, 2012 at 5:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Keith Fiske <keith@omniti.com> writes:
>> Just found something else rather disturbing. If you try to exclude the
>> schema that the extension tables are in, their data is still output.
>
> This is a common misconception: extensions do not live within schemas.
> (An extension might own a schema, not the other way around.)  So a
> dump with a -n switch is never going to select an extension.
>
> By and large, if the current behavior bothers you, ISTM it probably
> means you are using these tables in a way other than what the concept of
> an extension configuration table was meant for: namely, to hold
> configuration data that would be referenced by the functions in that
> extension, but would not normally be considered part of the user's data.
> There has been some talk of trying to cater for a more general notion of
> tables created by extensions, but we do not have a design or even a
> clear idea of a set of requirements for that.  Perhaps it would be good
> if you explained what is your use-case --- why are you concerned about
> being able to manage these tables as if they were regular data?
>
Here's a use case I have been thinking a lot about lately.  I am not
sure that extensions is the right vehicle for it, but it may be a good
starting point.

LedgerSMB is moving towards a more modular structure and LedgerSMB 1.5
will probably require PostgreSQL 9.1 or higher (owing to the writable
CTE's).  I have been looking at how the extension system can be used
to simplify our maintenance and it helps quite a bit.  However, one
key aspect that would be really nice would be managing schema changes
along with changing versions of an extension.  For example, one might
have a fixed asset module, and that module might have a series of
stored procedures.  We would have some tables that store configuration
data regarding the module.  For example we might have a table that
stores info on stored procedures that track depreciation methods.  The
use of tables for configuration data in the current approach fits this
nicely.

However we might have other tables which store data, things like which
fixed assets get depreciated in which ways, and the like.  These may
need to have columns added from time to time, or have other alter
table operations performed.  It would be nice to be able to manage
these schema changes and upgrading the extension in the same
framework.

I am wondering if some sort of "database modules" framework might be
helpful with modules possibly having extensions, but also having data
tables.

Best Wishes,
Chris Travers

Re: Extension table data

От
Keith Fiske
Дата:
Along with Chris, the need for having tables defined in the extension
is about keeping control of the table structure so the extension
functions are guaranteed to work as intended and to make upgrading
versions easier. As an example, the fact that I can specifically name
constraints makes it easier to write an extension upgrade script if
that constraint needs to change because I know exactly what I called
it. Trying to support extensions and requiring that we write long,
explicit instructions for creating and maintaining the associated
tables is just asking for trouble. Especially when we can have the
control we need to avoid these issues.

Honestly, the big issue I have right now is that it is dumping data
with the schema-only option and only dumping that data if you do a
complete database dump. That is making it very difficult to manage
extensions even using them as they're documented now.

I think its time to recognize the extension system is more widely
usable than it was originally intended. And that's a good thing! We
just need to try to find ways to make the existing tools work in a
more predictable manner now.

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
443.325.1357 x251


On Sat, Jun 9, 2012 at 9:56 AM, Chris Travers <chris.travers@gmail.com> wrote:
> On Sat, Jun 9, 2012 at 5:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Keith Fiske <keith@omniti.com> writes:
>>> Just found something else rather disturbing. If you try to exclude the
>>> schema that the extension tables are in, their data is still output.
>>
>> This is a common misconception: extensions do not live within schemas.
>> (An extension might own a schema, not the other way around.)  So a
>> dump with a -n switch is never going to select an extension.
>>
>> By and large, if the current behavior bothers you, ISTM it probably
>> means you are using these tables in a way other than what the concept of
>> an extension configuration table was meant for: namely, to hold
>> configuration data that would be referenced by the functions in that
>> extension, but would not normally be considered part of the user's data.
>> There has been some talk of trying to cater for a more general notion of
>> tables created by extensions, but we do not have a design or even a
>> clear idea of a set of requirements for that.  Perhaps it would be good
>> if you explained what is your use-case --- why are you concerned about
>> being able to manage these tables as if they were regular data?
>>
> Here's a use case I have been thinking a lot about lately.  I am not
> sure that extensions is the right vehicle for it, but it may be a good
> starting point.
>
> LedgerSMB is moving towards a more modular structure and LedgerSMB 1.5
> will probably require PostgreSQL 9.1 or higher (owing to the writable
> CTE's).  I have been looking at how the extension system can be used
> to simplify our maintenance and it helps quite a bit.  However, one
> key aspect that would be really nice would be managing schema changes
> along with changing versions of an extension.  For example, one might
> have a fixed asset module, and that module might have a series of
> stored procedures.  We would have some tables that store configuration
> data regarding the module.  For example we might have a table that
> stores info on stored procedures that track depreciation methods.  The
> use of tables for configuration data in the current approach fits this
> nicely.
>
> However we might have other tables which store data, things like which
> fixed assets get depreciated in which ways, and the like.  These may
> need to have columns added from time to time, or have other alter
> table operations performed.  It would be nice to be able to manage
> these schema changes and upgrading the extension in the same
> framework.
>
> I am wondering if some sort of "database modules" framework might be
> helpful with modules possibly having extensions, but also having data
> tables.
>
> Best Wishes,
> Chris Travers