Обсуждение: Constraint stuff

Поиск
Список
Период
Сортировка

Constraint stuff

От
Stephan Szabo
Дата:
I think my last message to Tom (and the list)
about the foreign key stuff and oids ended up 
in /dev/null due to a problem on the local 
mailer.

Tom had suggested storing a more 
understandable form of the foreign key constraint
to make dumping more reasonable in its own table.
I'd guess like the src stored for check constraints.
However, I noticed a few problems with this and
while thinking about it I had a few germs of
ideas which aren't any kind of proposal yet, but
I thought someone might be interested in them.

The problem with storing source is that it doesn't
get changed when things change.  Try altering
a column name that has a check constraint, then
dump the database.  I don't think this is the
responsibility of the dumper. If we store source
we should be guaranteeing it's correct.  
Plus, right now for FK constraints we do something
specific to keep track of the other table referenced
so we can remove the constraints if the table goes 
away.  But, what happens when we allow subqueries
in check constraints, etc...

So, what I was thinking is, that if we have another
table to store this kind of constraint info, it
should probably store information for all constraints.
I was thinking two tables, one (say pg_constraint)
which stores basic information about the constraint
(what type, the constraint name, primarily constraintd
table, maybe owner if constraints have owners in SQL)
and a source form (see more below).
The second table stores references from this constraint.
So any table, column, index, etc is stored here.
Probably something of the form constraintoid, 
type of thing being referenced (the oid of the table?),
the oid of the referenced thing and a number.

The number comes in to the source form thats stored.
Anywhere that we're referencing something that a name
is insufficient for (like a column name or table name)
we put something into the source for that says 
referncing column n of the referenced thing m.

Then we create something like 
format_constraint(constraintoid) which gives out
an SQL compliant version of the cconstraint.

And it means that if we deleted something, we know fairly 
easily whether or not it is being referenced by some
constraint somewhere without writing separate code for
fk constraints and check constraints, etc.. And
renaming wouldn't be a problem.

- There are some problems I see right off both conceptually
and implementation, but I thought someone might be able 
to come up with a better idea once it was presented (even 
if it's just a "not worth the effort" :) )

One of the problems I see is that if taken to its end,
would you store function oids here?  If so, that might
make it harder to allow a drop function/create function
to ever work transparently in the future.
Plus, I'm not even really sure if it would be reasonable
to get a source form like I was thinking of for check
constraints really.




Re: Constraint stuff

От
Philip Warner
Дата:
At 10:29 6/08/00 -0700, Stephan Szabo wrote:
>
>The problem with storing source is that it doesn't
>get changed when things change.  Try altering
>a column name that has a check constraint, then
>dump the database.

Or renaming a referenced table - I think the current constraint system will
handle this since OIDs don't change.


>So, what I was thinking is, that if we have another
>table to store this kind of constraint info, it
>should probably store information for all constraints.
>I was thinking two tables, one (say pg_constraint)
>which stores basic information about the constraint
>(what type, the constraint name, primarily constraintd
>table, maybe owner if constraints have owners in SQL)
>and a source form (see more below).

This sounds reasonable.


>The second table stores references from this constraint.
>So any table, column, index, etc is stored here.
>Probably something of the form constraintoid, 
>type of thing being referenced (the oid of the table?),
>the oid of the referenced thing and a number.

I would prefer to see this generalized: a dependencies table that lists
both the referrer OID *and* type, as well as the refrerenced thing oid &
type. This then allows things such as SQL functions to make entries in this
table as well as views etc etc. 


>The number comes in to the source form thats stored.
>Anywhere that we're referencing something that a name
>is insufficient for (like a column name or table name)
>we put something into the source for that says 
>referncing column n of the referenced thing m.

Don't know enough about the internals, but don't we have attr ids for this,
and/or won't OIDs work in most cases? Maybe I'm missing your point.


>- There are some problems I see right off both conceptually
>and implementation, but I thought someone might be able 
>to come up with a better idea once it was presented (even 
>if it's just a "not worth the effort" :) )

It seems to me that:

- 'format_constraint' is a good idea
- we need the dependency stuff
- dumping source in canonical form is best put in the backend
(philosophical point)
- I presume it's a first part of a full implementation of 'alter table
add/drop constraint...'

so I don't think it's a waste of time.


>One of the problems I see is that if taken to its end,
>would you store function oids here?  

Sounds sensible.


>If so, that might
>make it harder to allow a drop function/create function
>to ever work transparently in the future.

I *think* it doesn't work now; yes you can drop the function, but AFAIK,
the constraint references the old one (at least that's true for normal
triggers). What you are proposing makes people aware that they are about to
break more things than they know.


>Plus, I'm not even really sure if it would be reasonable
>to get a source form like I was thinking of for check
>constraints really.

I suspect it has to depend on how the constraint is acually checked. If
They are checkied by using table OIDs then you need to store the OID
dependency and *somehow* reconstruct the source. If they are checked using
table names (getting OID each time), then store the name and the raw source
(maybe). You need to handle renaming of tables referenced in CHECK clauses.
I hate rename (but I use it).

Maybe you can do something nasty like store the source with escape
characters and OIDs in place of names. This is not as bad as it sounds, I
think. It also gets around the problem that original source may be
unrecoverable (eg. COALESCE is translated to CASE in the parser, so a CHECK
clause that uses COALESCE will never be fully recoverable - although most
people would not see this as a problem). This messing around would have to
be done in the parser, I would guess. So:
  check exists(select * from reftbl where reffld=tbl.origfld)

might become:
  check exists(select * from %%table:<OID>%% where
%%table-attr:<Table-OID>,<Attd-ID>%%                   = %%table:<OID>%%.%%table-attr:<Table-OID>,<Attd-ID>%%

Looking at this, maybe it's not such a good idea after all...


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: Constraint stuff

От
Tom Lane
Дата:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> Tom had suggested storing a more 
> understandable form of the foreign key constraint
> to make dumping more reasonable in its own table.
> I'd guess like the src stored for check constraints.

I wasn't actually thinking of storing source, but rather precompiled
expressions (as I remarked awhile ago, I think pg_relcheck's rcsrc
column is dead weight; we could and should generate the value on demand
by reverse-listing rcbin instead).  This gets you away from
rename-induced problems since everything is table OIDs, attribute column
numbers, operator and function OIDs, etc.

However, digging those references out of the expression tree is a little
bit painful; you're right that we shouldn't expect applications to do
that for themselves.  We could store an additional list of referenced
items.  We wouldn't necessarily have to store that explicitly either,
though --- functions to say "is this OID referenced in this stored
expression" or perhaps "give me an array of all function OIDs in this
expression" would get the job done AFAICS.

> One of the problems I see is that if taken to its end,
> would you store function oids here?  If so, that might
> make it harder to allow a drop function/create function
> to ever work transparently in the future.

I don't think we should worry about that.  What's actually needed IMHO
is an "ALTER FUNCTION" command that allows you to replace the body of
an existing function, and perhaps change its name, but NOT its type
signature (result type and number/types of arguments).  Changing the
signature is inherently not a transparent operation because it'd
invalidate stored expressions that use the function.  ALTER would let
you make safe changes to a function without changing its OID and thus
without invalidating references-by-OID.
        regards, tom lane


Re: Constraint stuff

От
Stephan Szabo
Дата:
On Mon, 7 Aug 2000, Philip Warner wrote:

> >The second table stores references from this constraint.
> >So any table, column, index, etc is stored here.
> >Probably something of the form constraintoid, 
> >type of thing being referenced (the oid of the table?),
> >the oid of the referenced thing and a number.
> 
> I would prefer to see this generalized: a dependencies table that lists
> both the referrer OID *and* type, as well as the refrerenced thing oid &
> type. This then allows things such as SQL functions to make entries in this
> table as well as views etc etc. 

That makes more sense, yes. :)  Although not all of those things would
probably use it immediately.  

> 
> >The number comes in to the source form thats stored.
> >Anywhere that we're referencing something that a name
> >is insufficient for (like a column name or table name)
> >we put something into the source for that says 
> >referncing column n of the referenced thing m.
> 
> Don't know enough about the internals, but don't we have attr ids for this,
> and/or won't OIDs work in most cases? Maybe I'm missing your point.

I was thinking of it more for getting a textual representation back out
of the dependencies for constraints and thinking that I might want to 
reference something other than its name that's on something that's
referenced. And actually referncing column n, meant more like attrno
n of the row that m refers to. (Sort of like your thing below for
%%table:OID,attrno)

> >If so, that might
> >make it harder to allow a drop function/create function
> >to ever work transparently in the future.
> 
> I *think* it doesn't work now; yes you can drop the function, but AFAIK,
> the constraint references the old one (at least that's true for normal
> triggers). What you are proposing makes people aware that they are about to
> break more things than they know.

True, I just wanted to point it out in case someone had some thought on
changing it so that the system somehow fixed such references, but it
seems like alter function is more likely :)

> >Plus, I'm not even really sure if it would be reasonable
> >to get a source form like I was thinking of for check
> >constraints really.
> 
> I suspect it has to depend on how the constraint is acually checked. If
> They are checkied by using table OIDs then you need to store the OID
> dependency and *somehow* reconstruct the source. If they are checked using
> table names (getting OID each time), then store the name and the raw source
> (maybe). You need to handle renaming of tables referenced in CHECK clauses.
> I hate rename (but I use it).
>
> Maybe you can do something nasty like store the source with escape
> characters and OIDs in place of names. This is not as bad as it sounds, I
> think. It also gets around the problem that original source may be
> unrecoverable (eg. COALESCE is translated to CASE in the parser, so a CHECK
> clause that uses COALESCE will never be fully recoverable - although most
> people would not see this as a problem). This messing around would have to
> be done in the parser, I would guess. So:
> 
>    check exists(select * from reftbl where reffld=tbl.origfld)
> 
> might become:
> 
>    check exists(select * from %%table:<OID>%% where
> %%table-attr:<Table-OID>,<Attd-ID>%% 
>                    = %%table:<OID>%%.%%table-attr:<Table-OID>,<Attd-ID>%%
> 
> Looking at this, maybe it's not such a good idea after all...

:) Basically that's sort of what I was proposing with the %m.n above where
I was referencing a reference rather than the oid directly and the n was 
basically attd-id (and the reference stored the type so i didn't need
it. But if it had to be sent from the parser then your format probably
makes more sense.  I was thinking about reversing from the stored
expression in some fashion (but that wouldn't recover a coalesce or
something like that)



Re: Constraint stuff

От
Stephan Szabo
Дата:
On Mon, 7 Aug 2000, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Tom had suggested storing a more 
> > understandable form of the foreign key constraint
> > to make dumping more reasonable in its own table.
> > I'd guess like the src stored for check constraints.
> 
> I wasn't actually thinking of storing source, but rather precompiled
> expressions (as I remarked awhile ago, I think pg_relcheck's rcsrc

I guess you could store the fk_constraint node that is generated for fk
constraints, but that's not really an expression... I think I must
be missing something, because I can't quite see what the precompiled
expression for an fk constraint would be...

> However, digging those references out of the expression tree is a little
> bit painful; you're right that we shouldn't expect applications to do
> that for themselves.  We could store an additional list of referenced
> items.  We wouldn't necessarily have to store that explicitly either,
> though --- functions to say "is this OID referenced in this stored
> expression" or perhaps "give me an array of all function OIDs in this
> expression" would get the job done AFAICS.

The reason I was thinking of storing things was also so you could do
things like: is this oid stored in any constraint.  For example,
I'm removing a column, is there any constraint that references this
column, etc, rather than having to code stuff for all of the special
cases in all places that might need it.



Re: Constraint stuff

От
Hannu Krosing
Дата:
Tom Lane wrote:
> 
> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Tom had suggested storing a more
> > understandable form of the foreign key constraint
> > to make dumping more reasonable in its own table.
> > I'd guess like the src stored for check constraints.

...

> I don't think we should worry about that.  What's actually needed IMHO
> is an "ALTER FUNCTION" command that allows you to replace the body of
> an existing function, and perhaps change its name, but NOT its type
> signature (result type and number/types of arguments). 

IIRC Oracle allows the syntax CREATE OR REPLACE in many places, for 
example for changing VIEWS and PROCEDURES without affecting the things 
dependent on them.

CREATE OR REPLACE works also for not-yet-existing function which ALTER 
probably would not.

> Changing the
> signature is inherently not a transparent operation because it'd
> invalidate stored expressions that use the function.  ALTER would let
> you make safe changes to a function without changing its OID and thus
> without invalidating references-by-OID.
> 

----------
Hannu


Re: Constraint stuff

От
Don Baccus
Дата:
At 08:23 PM 8/7/00 +0300, Hannu Krosing wrote:

>IIRC Oracle allows the syntax CREATE OR REPLACE in many places

Yes, Oracle does allow this.  




- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: Constraint stuff

От
Tom Lane
Дата:
Jan Wieck <janwieck@Yahoo.com> writes:
>     PL/Tcl and PL/pgSQL will load a function's source only once a
>     session.   The  functions loaded are identified by OID, so if
>     you drop/create a function, the PL handler will simply load a
>     different  function too (from his point of view). At the time
>     we are able to ALTER a function, we  might  want  to  include
>     some version counter to pg_proc and in the fmgr_info?

More generally, the PL functions need to be able to deal with
recomputing saved plans after an ALTER of a table referenced by the
function.  I haven't really thought about how to do that ... but it
seems like Stephan's idea of a table showing referencers and referencees
might help detect the cases where plans have to be flushed.
        regards, tom lane


Re: Constraint stuff

От
Karel Zak
Дата:
On Mon, 7 Aug 2000, Tom Lane wrote:

> Jan Wieck <janwieck@Yahoo.com> writes:
> >     PL/Tcl and PL/pgSQL will load a function's source only once a
> >     session.   The  functions loaded are identified by OID, so if
> >     you drop/create a function, the PL handler will simply load a
> >     different  function too (from his point of view). At the time
> >     we are able to ALTER a function, we  might  want  to  include
> >     some version counter to pg_proc and in the fmgr_info?
> 
> More generally, the PL functions need to be able to deal with
> recomputing saved plans after an ALTER of a table referenced by the
> function.  I haven't really thought about how to do that ... but it
More and more generally, IMHO all saved plans need some validity checking and not only for ALTER, but also for all
operationthosechanging relevant system tables. And this is not problem for PL only,but for all what is based on SPI
(andVIEWs?).
 
IMHO correct solution is _one_ space and one method for plans saving= query/plan cache, and some common
validity-checkerthat will workover this cache. But how implement validity-checker is unknown...(? Call from all command
thatchanging system tables some handler, that check plans in a cache ?)
 
                    Karel

BTW. - first step in this problem is (or can be) on the way ---       query cache...