Обсуждение: are primary keys always 'needed'
Hi, I was wondering if primary keys are always needed. What I understand; - The value of a primary key is unique per table - Primary keys are unique and not null. I have a couple of tables. Some have a serial that is used as a foreign key in another table. Some tables consist of a combination of two foreign keys (that are unique together) and a field that is uniquely related to that combination (but is not necessarily unique within the table) Others have no unique field or combination at all For the tables that are only used in the foreign part of the relationship(s), is there an added value for user of a surrogate primary key, which will never be referenced. I did a lot of googling and found very varying opinions. Mainly: - Always use a primary key, no reason why, it's just 'better' - Use primary keys when it makes sense. How do I determine what 'sense' or 'better' mean? Thanks for all the feedback Regards, Serge Fonville -- http://www.sergefonville.nl Convince Google!! They need to support Adsense over SSL https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=10528 http://www.google.com/support/forum/p/AdSense/thread?tid=1884bc9310d9f923&hl=en
Serge Fonville <serge.fonville@gmail.com> writes: > I was wondering if primary keys are always needed. As you say, there are different schools of thought on that. To my mind, the principal argument for "always have a primary key" is that it guarantees that no two rows in your table are exactly alike. The reason why that's a good idea is that SQL only allows you to reference rows by content. If you have two identical rows, and you wish to get rid of just one of them, how you gonna do that? If there is no difference you can use in DELETE WHERE ..., you're stuck. Now there is an "out" available in Postgres, because in PG there is always a hidden system column called "ctid" which is the row's physical location (to oversimplify a bit). So when your back is to the wall you can reference a specific row by ctid, even if you neglected to provide a primary key. But not all DBMSes have an equivalent concept. Anyway, I'm not in agreement with the school that says you must always have an explicitly named primary key. Sometimes there are other constraints that have the same effect of ensuring no duplicates, and sometimes you just plain don't care if there are duplicates. But it's always a good idea to think about that carefully. I believe that in *most* cases you want a primary key. regards, tom lane
Serge Fonville <serge.fonville@gmail.com> writes: > Some have a serial that is used as a foreign key in another table. > Some tables consist of a combination of two foreign keys (that are > unique together) and a field that is uniquely related to that > combination (but is not necessarily unique within the table) BTW, I forgot to mention that it's perfectly reasonable to have a multi-column primary key, which is what seems to be indicated in this type of example. I wouldn't advocate making up a surrogate primary key in a linking table, if the combination of its foreign keys can do the job. regards, tom lane
Thanks for the reply. >> Some have a serial that is used as a foreign key in another table. >> Some tables consist of a combination of two foreign keys (that are >> unique together) and a field that is uniquely related to that >> combination (but is not necessarily unique within the table) > > BTW, I forgot to mention that it's perfectly reasonable to have a > multi-column primary key, which is what seems to be indicated in > this type of example. I wouldn't advocate making up a surrogate > primary key in a linking table, if the combination of its foreign > keys can do the job. So when I have a table that exists only on the MANY-end of the relation and in now way is ever to be used as a an entity in the ONE-end of the relatonship. There are no benefits to specifying a primary key if a combination of two fields (that already have a unique not null constraint anyway) to replace those with a primary key? Are there any other benefits to a primary key other than unique not null constraints. For performance. For example, if I create a primary key that is never used in any query, but its just there' to make the row unique. Based on what you stated so far, I'd think: A primary key is not necessary, but useful in uniquely identifying a record. Thanks so far. Regards, Serge Fonville -- http://www.sergefonville.nl Convince Google!! They need to support Adsense over SSL https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=10528 http://www.google.com/support/forum/p/AdSense/thread?tid=1884bc9310d9f923&hl=en
On Feb 28, 2010, at 11:38 , Serge Fonville wrote: > Hi, > > I was wondering if primary keys are always needed. > What I understand; > - The value of a primary key is unique per table > - Primary keys are unique and not null. Yes, in the sense that according to relational theory, a relation (table) contains unique rows, each of which is (uniquely) identified by its key. Note that a key is not necessarily a single attribute (column): a key may be a "compound" key made up of a number of columns. > I have a couple of tables. > Some have a serial that is used as a foreign key in another table. > Some tables consist of a combination of two foreign keys (that are > unique together) and a field that is uniquely related to that > combination (but is not necessarily unique within the table) > Others have no unique field or combination at all this "two foreign keys (that are unique together)" is an example of a compound key. > > For the tables that are only used in the foreign part of the > relationship(s), is there an added value for user of a surrogate > primary key, which will never be referenced. Most likely not. Note this question (is there added value of a surrogate primary key?) is different from your initial question: (are primary keys always needed?). Do not confuse primary keys with surrogate keys (often generated by an auto-incrementing value such as a serial). A surrogate is a meaningless attribute used only to uniquely identify the row. If you *do* use a surrogate key, you should also identify (and enforce) the "natural" (perhaps compound) key of the table. Otherwise you may very well get duplicates of the "natural" data even though the meaningless surrogate key is unique across the table. > I did a lot of googling and found very varying opinions. > Mainly: > - Always use a primary key, no reason why, it's just 'better' > - Use primary keys when it makes sense. > > How do I determine what 'sense' or 'better' mean? This depends on your application. Here are a few things to think about. * Surrogate keys require joins or lookups to retrieve the actual data values they represent. * Surrogate keys require additional space in the table that holds the actual data values and an extra index to enforce the uniqueness of these values. * Surrogate keys are often smaller in terms of size on disk than the data they refer to. This counts for both tables including the surrogate key and the indexes that include them. * Integer surrogate keys used to represent text data are often faster to look up, as integer comparisons are faster than string comparisons. This is not an exhaustive list. Some people are vehemently opposed to surrogate keys; others use them blindly. I think there are cases where using surrogate keys does make sense, taking into account the trade- offs when using them. Hope this helps. Michael Glaesemann grzm seespotcode net
"Michael Glaesemann" <grzm@seespotcode.net> wrote in message news:E93B93F4-8616-4A23-8AC5-84749581E067@seespotcode.net... > > On Feb 28, 2010, at 11:38 , Serge Fonville wrote: snip >> I did a lot of googling and found very varying opinions. >> Mainly: >> - Always use a primary key, no reason why, it's just 'better' >> - Use primary keys when it makes sense. >> >> How do I determine what 'sense' or 'better' mean? > > This depends on your application. Here are a few things to think about. > > * Surrogate keys require joins or lookups to retrieve the actual data > values they represent. > * Surrogate keys require additional space in the table that holds the > actual data values and an extra index to enforce the uniqueness of these > values. > * Surrogate keys are often smaller in terms of size on disk than the data > they refer to. This counts for both tables including the surrogate key > and the indexes that include them. > * Integer surrogate keys used to represent text data are often faster to > look up, as integer comparisons are faster than string comparisons. > > This is not an exhaustive list. Some people are vehemently opposed to > surrogate keys; others use them blindly. I think there are cases where > using surrogate keys does make sense, taking into account the trade- offs > when using them. > If you are planning on using an ORM framework such as "Hibernate" on top of this schema you will want to pay more attention to this PK issue (surrogate vs. natural) as it was a pain for us to setup primary keys on multi-columns initially. As well most ORM frameworks will want/require a PK on a table to operate properly. There was a post/link on tips to follow if using Hibernate/PostgreSQL which I saved somewhere but can't easily find right now. Regards, Eric
On 2010-02-28, Serge Fonville <serge.fonville@gmail.com> wrote: > Hi, > > I was wondering if primary keys are always needed. The answer is no. but they are usually a very good idea. > What I understand; > - The value of a primary key is unique per table > - Primary keys are unique and not null. > > I have a couple of tables. > Some have a serial that is used as a foreign key in another table. case (a) > Some tables consist of a combination of two foreign keys (that are > unique together) and a field that is uniquely related to that > combination (but is not necessarily unique within the table) case (b) > Others have no unique field or combination at all case (c) > For the tables that are only used in the foreign part of the > relationship(s), is there an added value for user of a surrogate > primary key, which will never be referenced. In case (b) you can use the pair of external key columns as the primary key. > I did a lot of googling and found very varying opinions. > - Always use a primary key, no reason why, it's just 'better' Some software requires a single-column primary key. > - Use primary keys when it makes sense. > How do I determine what 'sense' or 'better' mean? first you need to understand what the data and relationships represent. If you never need to uniqely identify a record you don't need a primary key. consider the following table fragment mapref | placename ---------+---------- 10,10 | New York 10,10 | Noo Yawk 10,10.01 | New York 10,10 | New York you may or may not want to allow data with repeats like that. and if you want to delete or update only one of the ('10,10','New York') rows you'll neet to take special care.