Обсуждение: Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
A Question for those of you who consider yourself crack Database Designers. I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL databasein the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. My question concerns establishing the relationships. Currently Relationships between tables are established via a Unique Integer ID like this: *=APrimaryKey PartTypes Customer Parts --------- -------- ----- PartTypeID CustomerID PartID *PartType *Customer PartTypeID Address CustomerID *PartNumber(2FieldPrimaryKey) *PartRevision(2FieldPrimaryKey) PartName HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: *=APrimaryKey PartTypes Customer Parts --------- -------- ----- *PartType *Customer PartType Address *PartNumber(2FieldPrimaryKey) *PartRevison(2FieldPrimaryKey) PartName Customer Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for theForeignKey while the second uses Human.Understandable.ForeignKeys Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. Thanks!
I prefer using unique integer ids generated from sequences rather than keys composed of meaningful values. Advantages: Client side applications can store/handle the unique integer ids more readily than having to deal with composite primary keys composed of varying data types. For example, I can stuff the id associated with a particular record easily in list boxes, combo boxes, edit controls, etc. via SetItemData() or some other appropriate method. Its a bit more complicated to track database records via composite keys of something like: part no, vendor no, vendor group. Updating the data doesn't require cascading updates. If you use keys with meaning, the referential integrity constraints must support cascading updates so if the key changes in the primary table the change is cascaded to all referencing tables as well. Earlier versions of most databases (Access, Oracle, etc.) only provided cascading deletes under the assumption you would be using sequence generated keys. Downside: Many queries might require more joins against the primary table to fetch the relevant information associated with the numerical id, whereas keys composed of solely the values with which they are associated might not require the joins, which will speed some applications. I now have some queries with 20-way joins. But PostgreSQL provides a way to explicitly set the path the planner will choose and so the execution of the query is instantaneous. I'm not sure about other databases. In earlier versions, I had to denormalize a bit solely for performance reasons. In the past, I used to use composite keys and switched to the purely sequence generated path and don't regret it at all. Of course, you'll still have a unique constraint on the what-would-have-been meaningful primary key. Hope that helps, Mike Mascari mascarm@mascari.com Gonzo Rock wrote: > > A Question for those of you who consider yourself crack Database Designers. > > I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL databasein the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. > > My question concerns establishing the relationships. > > Currently Relationships between tables are established via a Unique Integer ID like this: > > *=APrimaryKey > > PartTypes Customer Parts > --------- -------- ----- > PartTypeID CustomerID PartID > *PartType *Customer PartTypeID > Address CustomerID > *PartNumber(2FieldPrimaryKey) > *PartRevision(2FieldPrimaryKey) > PartName > > > HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: > > *=APrimaryKey > > PartTypes Customer Parts > --------- -------- ----- > *PartType *Customer PartType > Address *PartNumber(2FieldPrimaryKey) > *PartRevison(2FieldPrimaryKey) > PartName > Customer > > Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for theForeignKey while the second uses Human.Understandable.ForeignKeys > > Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. > > Thanks!
OK... Fair Enough... Good Points indeed y'all. Well... What about the problem of users trying to Query the Database?? You know... like when using Crystal Reports or something?. SELECT * from HistoryTable WHERE PartID = SomeInteger vs SELECT * from HistoryTable WHERE PartNum = 12345636 AND PartRev = C How are they supposed to know What the PartID is ?? Anyway, that I why I was considering changing... current users always have trouble peering into the database... They don'tquite get it. At 02:31 PM 7/27/01 -0400, Mike Mascari wrote: >I prefer using unique integer ids generated from sequences rather than >keys composed of meaningful values. > >Advantages: > >Client side applications can store/handle the unique integer ids more >readily than having to deal with composite primary keys composed of >varying data types. For example, I can stuff the id associated with a >particular record easily in list boxes, combo boxes, edit controls, etc. >via SetItemData() or some other appropriate method. Its a bit more >complicated to track database records via composite keys of something >like: part no, vendor no, vendor group. > >Updating the data doesn't require cascading updates. If you use keys >with meaning, the referential integrity constraints must support >cascading updates so if the key changes in the primary table the change >is cascaded to all referencing tables as well. Earlier versions of most >databases (Access, Oracle, etc.) only provided cascading deletes under >the assumption you would be using sequence generated keys. > >Downside: > >Many queries might require more joins against the primary table to fetch >the relevant information associated with the numerical id, whereas keys >composed of solely the values with which they are associated might not >require the joins, which will speed some applications. I now have some >queries with 20-way joins. But PostgreSQL provides a way to explicitly >set the path the planner will choose and so the execution of the query >is instantaneous. I'm not sure about other databases. In earlier >versions, I had to denormalize a bit solely for performance reasons. > >In the past, I used to use composite keys and switched to the purely >sequence generated path and don't regret it at all. Of course, you'll >still have a unique constraint on the what-would-have-been meaningful >primary key. > >Hope that helps, > >Mike Mascari >mascarm@mascari.com > >Gonzo Rock wrote: >> >> A Question for those of you who consider yourself crack Database Designers. >> >> I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL databasein the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. >> >> My question concerns establishing the relationships. >> >> Currently Relationships between tables are established via a Unique Integer ID like this: >> >> *=APrimaryKey >> >> PartTypes Customer Parts >> --------- -------- ----- >> PartTypeID CustomerID PartID >> *PartType *Customer PartTypeID >> Address CustomerID >> *PartNumber(2FieldPrimaryKey) >> *PartRevision(2FieldPrimaryKey) >> PartName >> >> >> HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: >> >> *=APrimaryKey >> >> PartTypes Customer Parts >> --------- -------- ----- >> *PartType *Customer PartType >> Address *PartNumber(2FieldPrimaryKey) >> *PartRevison(2FieldPrimaryKey) >> PartName >> Customer >> >> Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for theForeignKey while the second uses Human.Understandable.ForeignKeys >> >> Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. >> >> Thanks! >
Gonzo Rock wrote: >Is one recommended over the other??? Sure appreciate the commentary before I > get in too deep with all these tables. The second sounds OK, but only if the chosen field is truly a candidate key. "Customer" does not sound like one - suppose you have two 'John Smith's? This is why most real-world applications use unique numbers or codes. Of course you could (probably) differentiate the 'John Smith's by address, but then the address has to be typed in as well as the name. A code is much easier. It all depends on the nature of the data. -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight http://www.lfix.co.uk/oliver PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47 6B 7E 39 CC 56 E4 C1 47 GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "But the wisdom that is from above is first pure, then peaceable, gentle, and easy to be intreated, full of mercy and good fruits, without partiality, and without hypocrisy." James 3:17
SELECT * from HistoryTable WHERE PartNum = 12345636 AND PartRev = C Is equal to: SELECT t1.* from HistoryTable t1, PartTable t2 WHERE t2.PartName = 'airplane' AND t1.PartRev = 'C' AND t2.PartNum = t1.PartNum You can create these joins for your users, and show them they only need to swap out the name. -r At 01:21 PM 7/27/01 -0700, Gonzo Rock wrote: >OK... Fair Enough... Good Points indeed y'all. > >Well... What about the problem of users trying to Query the Database?? > >You know... like when using Crystal Reports or something?. > >SELECT * from HistoryTable >WHERE PartID = SomeInteger > >vs > >SELECT * from HistoryTable >WHERE PartNum = 12345636 AND PartRev = C > >How are they supposed to know What the PartID is ?? > >Anyway, that I why I was considering changing... current users always have >trouble peering into the database... They don't quite get it. > > > > >At 02:31 PM 7/27/01 -0400, Mike Mascari wrote: > >I prefer using unique integer ids generated from sequences rather than > >keys composed of meaningful values. > > > >Advantages: > > > >Client side applications can store/handle the unique integer ids more > >readily than having to deal with composite primary keys composed of > >varying data types. For example, I can stuff the id associated with a > >particular record easily in list boxes, combo boxes, edit controls, etc. > >via SetItemData() or some other appropriate method. Its a bit more > >complicated to track database records via composite keys of something > >like: part no, vendor no, vendor group. > > > >Updating the data doesn't require cascading updates. If you use keys > >with meaning, the referential integrity constraints must support > >cascading updates so if the key changes in the primary table the change > >is cascaded to all referencing tables as well. Earlier versions of most > >databases (Access, Oracle, etc.) only provided cascading deletes under > >the assumption you would be using sequence generated keys. > > > >Downside: > > > >Many queries might require more joins against the primary table to fetch > >the relevant information associated with the numerical id, whereas keys > >composed of solely the values with which they are associated might not > >require the joins, which will speed some applications. I now have some > >queries with 20-way joins. But PostgreSQL provides a way to explicitly > >set the path the planner will choose and so the execution of the query > >is instantaneous. I'm not sure about other databases. In earlier > >versions, I had to denormalize a bit solely for performance reasons. > > > >In the past, I used to use composite keys and switched to the purely > >sequence generated path and don't regret it at all. Of course, you'll > >still have a unique constraint on the what-would-have-been meaningful > >primary key. > > > >Hope that helps, > > > >Mike Mascari > >mascarm@mascari.com > > > >Gonzo Rock wrote: > >> > >> A Question for those of you who consider yourself crack Database > Designers. > >> > >> I am currently moving a large database(100+Tables) into pgSQL... with > the intention of deploying against 'any' SQL database in the future. The > development side will be rigorously using Standard SQL constructs with no > unique/proprietary extensions. > >> > >> My question concerns establishing the relationships. > >> > >> Currently Relationships between tables are established via a Unique > Integer ID like this: > >> > >> *=APrimaryKey > >> > >> PartTypes Customer Parts > >> --------- -------- ----- > >> PartTypeID CustomerID PartID > >> *PartType *Customer PartTypeID > >> Address CustomerID > >> *PartNumber(2FieldPrimaryKey) > >> *PartRevision(2FieldPrimaryKey) > >> PartName > >> > >> > >> HOWEVER; I have read lots of texts describing the Relational Design > should be instead like this: > >> > >> *=APrimaryKey > >> > >> PartTypes Customer Parts > >> --------- -------- ----- > >> *PartType *Customer PartType > >> Address *PartNumber(2FieldPrimaryKey) > >> *PartRevison(2FieldPrimaryKey) > >> PartName > >> Customer > >> > >> Both Techniques have a unique foreign key back to the parent tables > but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the > second uses Human.Understandable.ForeignKeys > >> > >> Is one recommended over the other??? Sure appreciate the commentary > before I get in too deep with all these tables. > >> > >> Thanks! > > > >---------------------------(end of broadcast)--------------------------- >TIP 5: Have you checked our extensive FAQ? > >http://www.postgresql.org/users-lounge/docs/faq.html > > > >--- >Incoming mail is certified Virus Free. >Checked by AVG anti-virus system (http://www.grisoft.com). >Version: 6.0.251 / Virus Database: 124 - Release Date: 4/26/01 --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.251 / Virus Database: 124 - Release Date: 4/26/01
Gonzo Rock wrote: > > OK... Fair Enough... Good Points indeed y'all. > > Well... What about the problem of users trying to Query the Database?? > > You know... like when using Crystal Reports or something?. > > SELECT * from HistoryTable > WHERE PartID = SomeInteger > > vs > > SELECT * from HistoryTable > WHERE PartNum = 12345636 AND PartRev = C > > How are they supposed to know What the PartID is ?? > > Anyway, that I why I was considering changing... current users always have trouble peering into the database... They don'tquite get it. Depending upon the sophistication of your users, you might want to consider constructing a number of views where the data is pre-joined (totally denormalized). We essentially do the same thing for both the reasons you provide as well as for security purposes (row security) based upon the value of CURRENT_USER. Hope that helps, Mike Mascari mascarm@mascari.com