Обсуждение: postgresql table inheritance
Hi, Found this post on Slashdot which I found interesting, any comments? --- post follows --- by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173) Speak for your database -- postgresql does. Postgresql's "table inheritance" is a flawed concept and has nothing to do with the *type system*. Relations contain tuples, and tuples contain attributes, which are a name plus a VALUE. Those values are chosen from TYPES (sets of possible values). Those types are the TYPE SYSTEM. Table inheritence doesn't even make sense. Tables are analogous to relations. All relations are the same type, the relation type (think "set" or "array" to make it easier). How can one value of a type (one table) be a subtype of another value (another table)? That's like saying, "3" is a subtype of "5", if your types are integers. What if you use the expression "3+2" Is that "5" still the subtype of 3? likewise, when you make complex queries with a "base" table, does the result have any connection with the "sub" table? It's like gobbledygook, just mashing words together without any understanding. That's why the postgresql table inheritance concept doesn't see more widespread use. Many people quickly discover the limitations (and incorrectly think it's just "unfinished", when it actually is flawed). The correct way to store types and subtypes in the database is to store them in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM. Nothing else in the relational model needs to be changed. Something like this, in hypothetical SQL-like language: CREATE TABLE People ( INT id, PERSON_CLASS person ) p1 = PERSON_CLASS.new(name: "joe", etc) p2 = CUSTOMER_CLASS.new(name: "bob", etc) // assume CUSTOMER_CLASS subclass of PERSON_CLASS INSERT INTO People VALUES (1, p1), (2, p2) SELECT person FROM People WHERE person.name = "bob" SELECT person, order FROM People JOIN Orders // can't do this in the typical "object database" This is a "solved problem" (see "The Third Manifesto"). It's just a matter of getting somebody to implement it. But the vendors are clueless, thinking object databases are a "different model" and not wanting to confuse programmers, and programmers are clueless, not even understanding SQL or types and values half the time, so they don't demand anything new from vendors... we never move forward.
On Fri, Nov 30, 2007 at 09:42:53PM +0800, Lincoln Yeoh wrote: > Found this post on Slashdot which I found interesting, any comments? I think this person is slightly confused. > Table inheritence doesn't even make sense. Tables are analogous to > relations. > All relations are the same type, the relation type (think "set" or "array" > to > make it easier). How can one value of a type (one table) be a subtype of > another value (another table)? Easy, by having the columns of one table be a subset of the columns or another table. Perhaps someone should point out an example, like a table with "people" and subtables "employees" and "customers". The subtables share the columns of the parent tables. This is nothing that any OO language doesn't do. > The correct way to store types and subtypes in the database is to store them > in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM. > Nothing else in the relational model needs to be changed. Something like > this, in hypothetical SQL-like language: His example is a little wierd, but it is possible: test=# create type foo as (a text, b text); CREATE TYPE test=# create table test( id int4, vals foo ); CREATE TABLE test=# insert into test values ( 4, ROW('a', 'b')); INSERT 0 1 test=# select * from test; id | vals ----+------- 4 | (a,b) (1 row) The syntax is different but the ideas are there... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Those who make peaceful revolution impossible will make violent revolution inevitable. > -- John F Kennedy
Вложения
On Fri, 30 Nov 2007 21:42:53 +0800 Lincoln Yeoh <lyeoh@pop.jaring.my> wrote: > Hi, > > Found this post on Slashdot which I found interesting, any comments? I wrote a memo about inheritance at the bottom of which there are 2 links to good use of the feature: http://www.webthatworks.it/d1/page/postgresql_inheritance_surprises -- Ivan Sergio Borgonovo http://www.webthatworks.it
On 30/11/2007, Lincoln Yeoh <lyeoh@pop.jaring.my> wrote:
Hi,--- post follows ---
Found this post on Slashdot which I found interesting, any comments?
by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173)
Speak for your database -- postgresql does.
Postgresql's "table inheritance" is a flawed concept and has nothing to do
with the *type system*. Relations contain tuples, and tuples contain
attributes, which are a name plus a VALUE. Those values are chosen from TYPES
(sets of possible values). Those types are the TYPE SYSTEM.
Table inheritence doesn't even make sense. Tables are analogous to relations.
All relations are the same type, the relation type (think "set" or "array" to
make it easier). How can one value of a type (one table) be a subtype of
another value (another table)? That's like saying, "3" is a subtype of "5",
if your types are integers. What if you use the expression "3+2" Is that "5"
still the subtype of 3? likewise, when you make complex queries with a "base"
table, does the result have any connection with the "sub" table? It's like
gobbledygook, just mashing words together without any understanding. That's
why the postgresql table inheritance concept doesn't see more widespread use.
Many people quickly discover the limitations (and incorrectly think it's just
"unfinished", when it actually is flawed).
The correct way to store types and subtypes in the database is to store them
in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM.
Nothing else in the relational model needs to be changed. Something like
this, in hypothetical SQL-like language:
CREATE TABLE People ( INT id, PERSON_CLASS person )
p1 = PERSON_CLASS.new(name: "joe", etc)
p2 = CUSTOMER_CLASS.new(name: "bob", etc) // assume CUSTOMER_CLASS
subclass of PERSON_CLASS
INSERT INTO People VALUES (1, p1), (2, p2)
SELECT person FROM People WHERE person.name = "bob"
SELECT person, order FROM People JOIN Orders // can't do this in the
typical "object database"
This is a "solved problem" (see "The Third Manifesto"). It's just a matter of
getting somebody to implement it. But the vendors are clueless, thinking
object databases are a "different model" and not wanting to confuse
programmers, and programmers are clueless, not even understanding SQL or
types and values half the time, so they don't demand anything new from
vendors... we never move forward.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
Seams like two completely different concepts are getting confused. ie that of Table Inheritance and that of Type Inheritance. They are completely different concepts.
Table Inheritance is table structure ie a child table has all the same columns as the old one with some added columns that sore specialist items. This feature is used heavily used in Table Partitioning. Perhaps it should be renamed.
Type Inheritance is adding extra features to types eg
Varchar(5) is a child of text that adds a maximum length limit of 4 and char(5) is a type of text with a fixed length of 5. But they are all text. This is a very silly example.
Just thoughts.
Peter.
"Lincoln Yeoh" <lyeoh@pop.jaring.my> writes: > The correct way to store types and subtypes in the database is to store them > in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM. > Nothing else in the relational model needs to be changed. Something like > this, in hypothetical SQL-like language: That's what we call "denormalized" data in the database world. This is why there's such a big impedance mismatch between procedural languages and relational databases. The natural place to end up in his world would be to have every table have precisely one column which is some kind of object. Then you access fields and methods on those objects. The problem is that then you don't have a relational database since it's awfully hard to tell the database you have a foreign key from whatever this method returns here to whatever that method returns there for some record somewhere... And it's awfully hard to index and join between complex expressions picking data out from inside objects on both sides, etc. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
On Fri, 2007-11-30 at 16:06 +0000, Gregory Stark wrote: > "Lincoln Yeoh" <lyeoh@pop.jaring.my> writes: > > > The correct way to store types and subtypes in the database is to store them > > in the columns. In other words, choose attribute VALUES from a TYPE SYSTEM. > > Nothing else in the relational model needs to be changed. Something like > > this, in hypothetical SQL-like language: > > That's what we call "denormalized" data in the database world. This is why > there's such a big impedance mismatch between procedural languages and > relational databases. > CJ Date believes that normalized relations can indeed contain complex types such as images or other relations. The argument is that atomicity has no absolute meaning, but only has meaning in the context of what you're trying to do with it in the database. A string can obviously be decomposed into its parts, as can a timestamp, etc., so those aren't exactly atomic, either. [1] The impedance mismatch has more to do with the fact that the meaning of an application's internal data structures changes frequently (through revisions of the code), while data in a database needs to be consistent across long periods of time. So, a well-designed database will hold facts that have meaning in the real world and from which inferences can be made. Mapping application data structures (which contain context- sensitive information and implementation artifacts) to real-world facts is the impedance mismatch. Regards, Jeff Davis [1] Paraphrased from "Database in Depth", C.J. Date, pp 29-32
On Fri, 2007-11-30 at 21:42 +0800, Lincoln Yeoh wrote: > --- post follows --- > by Anonymous Coward on Wed Nov 28, '07 03:23 PM (#21509173) > > Speak for your database -- postgresql does. > > Postgresql's "table inheritance" is a flawed concept and has nothing to do > with the *type system*. Relations contain tuples, and tuples contain > attributes, which are a name plus a VALUE. Those values are chosen from TYPES > (sets of possible values). Those types are the TYPE SYSTEM. "Foundation for Future Database Systems: The Third Manifesto" by C.J. Date and Hugh Darwen discusses this topic in Appendix E, and a related topic in Appendix D. They also propose, in detail with a lot of analysis, how they think that type inheritance should work in the set of chapters on the Inheritance Model, or "IM". Regards, Jeff Davis
On Fri, 2007-11-30 at 14:33 +0000, Peter Childs wrote: > Table Inheritance is table structure ie a child table has all the same > columns as the old one with some added columns that sore specialist > items. This feature is used heavily used in Table Partitioning. > Perhaps it should be renamed. I think that was the point of the slashdot post: it creates confusion to call two separate concepts by the same name. Regards, Jeff Davis
At 03:17 AM 12/1/2007, Jeff Davis wrote: >The impedance mismatch has more to do with the fact that the meaning of >an application's internal data structures changes frequently (through >revisions of the code), while data in a database needs to be consistent >across long periods of time. So, a well-designed database will hold >facts that have meaning in the real world and from which inferences can >be made. Mapping application data structures (which contain context- >sensitive information and implementation artifacts) to real-world facts >is the impedance mismatch. The people who try to make a database that maps so well with the objects in a single particular program are solving a very different problem from those of us who use a database partly as a "lingua franca" (or "vehicular language") for many different programs and people. The "impedance" then is unavoidable. It's not going to be easy to change a hundred other programs anyway - probably some unknown (till they inconveniently stop working because someone decided to "match the impedances" with some pet program ;) ). But anyway, I guess postgresql's "table inheritance" thing isn't broken then just misunderstood... Link. One man's impedance mismatch is another man's layer of abstraction or "comms protocol" :).
On Sat, 2007-12-01 at 04:16 +0800, Lincoln Yeoh wrote: > The people who try to make a database that maps so well with the > objects in a single particular program are solving a very different > problem from those of us who use a database partly as a "lingua > franca" (or "vehicular language") for many different programs and people. > Replace "in a single particular program" with "in a specific revision of a specific component of an application" ;) The reason I say this is because most people don't realize that, by just making their objects "persist", that all of their data is now very context sensitive (to specific revisions of specific parts of their code). Contrast that with storing real world facts, which are both context insensitive and time insensitive. I do see your point, but in this context I don't think the two uses are very different. In the first case you mention, you are using the database as a lingua franca between the application at time T and the application at time T + N years (which is, in reality, a different application); rather than in the second case, where it's a lingua franca between two different applications at the same time. > Link. > Did you intend to include a URL? > One man's impedance mismatch is another man's layer of abstraction or > "comms protocol" :). > Good point. Regards, Jeff Davis