Обсуждение: Case Sensitivity

Поиск
Список
Период
Сортировка

Case Sensitivity

От
"Phillip Smith"
Дата:
Hi,
I'm moving over from MS SQL. I've been googling this for ages now and suprisingly cannot find a clear answer.
 
I want my data in tables to be case insensitive.
 
This is so i can:
1. Put keys on natural key fields, for instance a product part number. I product number 'ABC123' inserted i need to disallow 'abc123' to be inserted as a second row. Please don't tell me i have to add another column holding a lowered version of the product number. My database is littered with this need and i would end up bloating my table schema.
 
2.  I need to query case insensitively. SELECT * FROM product WHERE product_number = 'ABC123' should return the same row as SELECT * FROM product WHERE product_number = 'abc123'
 
Is there a database wide collation setting i can make, or a case insensitive character type. There are lots of online posts regarding using LOWER function for querying. This is a workaround for point (2) but does not remedy point (1) above.
 
Many thanks
 
Phillip
 
Phillip Smith
 

Re: Case Sensitivity

От
Pavel Stehule
Дата:
Hello

you should to use a citext datatype

http://www.postgresql.org/docs/9.0/interactive/citext.html

regards

Pavel Stehule

2011/1/13 Phillip Smith <phillip@softworks.com>:
> Hi,
> I'm moving over from MS SQL. I've been googling this for ages now and
> suprisingly cannot find a clear answer.
>
> I want my data in tables to be case insensitive.
>
> This is so i can:
> 1. Put keys on natural key fields, for instance a product part number. I
> product number 'ABC123' inserted i need to disallow 'abc123' to be inserted
> as a second row. Please don't tell me i have to add another column holding a
> lowered version of the product number. My database is littered with this
> need and i would end up bloating my table schema.
>
> 2.  I need to query case insensitively. SELECT * FROM product WHERE
> product_number = 'ABC123' should return the same row as SELECT * FROM
> product WHERE product_number = 'abc123'
>
> Is there a database wide collation setting i can make, or a case
> insensitive character type. There are lots of online posts regarding using
> LOWER function for querying. This is a workaround for point (2) but does not
> remedy point (1) above.
>
> Many thanks
>
> Phillip
>
> Phillip Smith
>

Re: Case Sensitivity

От
"Karsten Hilbert"
Дата:
> you should to use a citext datatype
>
> http://www.postgresql.org/docs/9.0/interactive/citext.html

Or:

- use a functional index with lower() to ensure uniqueness
- use a BEFORE trigger to lower() the inserted data

Karsten

--
GMX DSL Doppel-Flat ab 19,99 Euro/mtl.! Jetzt mit
gratis Handy-Flat! http://portal.gmx.net/de/go/dsl

Re: Case Sensitivity

От
"Phillip Smith"
Дата:
Thanks Pavel,
 
Is this the only way?. I would prefer to use a collation setting if this is possible. Do you know whether there would be a performance hit using non standard character strings?
 
Phillip
 
 

Re: Case Sensitivity

От
"Phillip Smith"
Дата:
Hi Karsten,
 
modifying user inputted data is not an option for me. I need to maintain the original data. I had read about indexing on a lower function. This all seems a bit of a bodgy workaround to me. A clean Case Insensitive collation setting is cleaner. Is this supported in postgreSQL 9?
 
Regards
 
Phillip
 
Phillip

Re: Case Sensitivity

От
Pavel Stehule
Дата:
2011/1/13 Phillip Smith <phillip@softworks.com>:
> Thanks Pavel,
>
> Is this the only way?. I would prefer to use a collation setting if this is
> possible. Do you know whether there would be a performance hit using non
> standard character strings?

It is preferred way. PostgreSQL doesn't support a collations, and what
I know collation in 9.1 will not allow a comparation between case
sensitive and case insensitive text, because collation in PostgreSQL
is based on unix locale, and there are not case insensitive locale.

I don't know about performance problems. It does same work like case
insensitive collation.

Regards

Pavel Stehule


>
> Phillip
>
>

Re: Case Sensitivity

От
Andrew Sullivan
Дата:
On Thu, Jan 13, 2011 at 03:29:03PM -0000, Phillip Smith wrote:

> modifying user inputted data is not an option for me. I need to
> maintain the original data. I had read about indexing on a lower
> function. This all seems a bit of a bodgy workaround to me. A clean
> Case Insensitive collation setting is cleaner. Is this supported in
> postgreSQL 9?

There isn't such a collation setting as far as I know, and it's
actually not as clean as you think.  It turns out to work reasonably
well in ASCII-land, but not very well in other circumstances.  In
general, the case-insensitive but case-preserving technique that
English-speaking computer users have come to find normal is a bad fit
for a wide variety of languages.  (Even "Latin" characterset-using
languages have trouble, because of historic ways of handling accents.
Is the capital version of é E or É?  Well, both, it turns out,
depending on whom you believe.  I could bore you about the effects of
this in the DNS all day long, but I'll resist the temptation.)

Putting an index on lower(column) and then doing all your comparisons
with lower(datum) works fairly well, and I don't see how it's any more
bodgy than a database-wide case insensitive collation.  For instance,
I can assure you that customers named Leblanc and LeBlanc care about
whether those two compare equally.  In your customer name field, if
you have a database-wide collation setting, you can't make the
distinction.

A

--
Andrew Sullivan
ajs@crankycanuck.ca