Обсуждение: BUG #3378: UTF-8 upper() and lower() don't work
The following bug has been logged online: Bug reference: 3378 Logged by: Kenneth Christensen Email address: kec@mediatorsystems.com PostgreSQL version: 8.2 Operating system: Mac OS 10.4.9 Description: UTF-8 upper() and lower() don't work Details: I have a DB where encoding is set to UTF-8. The DB have some tables where some of the columns (varchar) contains danish chars. It looks like lower() and upper() ignores the danish chars when I try to convert to lowercase or uppercase. E.g. Case 1: ------- Column 'name' contains 'Ãble, tørret': select upper(food_name.name) from food_name Result: ÃBLE, TøRRET Expected result: ÃBLE, TÃRRET or select lower(food_name.name) from food_name Result: Ãble, tørret Expected result: æble, tørret Case 2: ------- Column 'name' contains 'æøå': select upper(food_name.name) from food_name Result: æøå Expected result: ÃÃà Case 3: ------- Column 'name' contains 'ÃÃà ': select lower(food_name.name) from food_name Result: ÃÃà Expected result: æøå --- I can see I'm not alone with this kind of bug. This bug is really a big problem for me. I really don't want to replace PostgreSQL with MySQL ! Will this bug be fixed very soon? Best regards Kenneth Christensen
SGVsbG8sCgpZb3UgaGF2ZSB0byB3ZWxsIGluaXRpYWxpemVkIGRhdGFiYXNl IGNsdXN0ZXIgd2l0aCBjb3JyZWN0IGxvY2FsZXMuCgpJIGRvbid0IGtub3cg Z29vZCBkYW5pc2ggbG9jYWxlcywgYnV0IEkgZXhwZWN0IHNvIGl0IHdpbGwg YmUgc2ltaWxhciB3aXRoIGN6ZWNoLgoKbXkgZGF0YWJhc2UgY2x1c3RlciB3 YXMgaW5pdGlhbGlzZWQgd2l0aCBjc19DWi5VVEYtOCBhbmQgZGVmYXVsdApl bmNvZGluZyBpcyBVVEY4LgoKCiBwb3N0Z3Jlcz0jIHNlbGVjdCBsb3dlcign xb1MVVTDnSBLxa7FhycpLCB1cHBlcignxb5sdXTDvSBrxa/FiCcpOwogICBs b3dlciAgIHwgICB1cHBlcgotLS0tLS0tLS0tLSstLS0tLS0tLS0tLQogxb5s dXTDvSBrxa/FiCB8IMW9TFVUw50gS8WuxYcKCgpDaGVjayB5b3VyIGxvY2Fs ZXMsIGlmIGlzIFVURjguCgpwb3N0Z3Jlcz0jIFNIT1cgbGNfY29sbGF0ZSA7 CiBsY19jb2xsYXRlCi0tLS0tLS0tLS0tLS0KIGNzX0NaLlVURi04CigxIHJv dykKClJlZ2FycwpQYXZlbCBTdGVodWxlCgoKMjAwNy82LzEwLCBLZW5uZXRo IENocmlzdGVuc2VuIDxrZWNAbWVkaWF0b3JzeXN0ZW1zLmNvbT46Cj4KPiBU aGUgZm9sbG93aW5nIGJ1ZyBoYXMgYmVlbiBsb2dnZWQgb25saW5lOgo+Cj4g QnVnIHJlZmVyZW5jZTogICAgICAzMzc4Cj4gTG9nZ2VkIGJ5OiAgICAgICAg ICBLZW5uZXRoIENocmlzdGVuc2VuCj4gRW1haWwgYWRkcmVzczogICAgICBr ZWNAbWVkaWF0b3JzeXN0ZW1zLmNvbQo+IFBvc3RncmVTUUwgdmVyc2lvbjog OC4yCj4gT3BlcmF0aW5nIHN5c3RlbTogICBNYWMgT1MgMTAuNC45Cj4gRGVz Y3JpcHRpb246ICAgICAgICBVVEYtOCB1cHBlcigpIGFuZCBsb3dlcigpIGRv bid0IHdvcmsKPiBEZXRhaWxzOgo+Cj4gSSBoYXZlIGEgREIgd2hlcmUgZW5j b2RpbmcgaXMgc2V0IHRvIFVURi04Lgo+Cj4gVGhlIERCIGhhdmUgc29tZSB0 YWJsZXMgd2hlcmUgc29tZSBvZiB0aGUgY29sdW1ucyAodmFyY2hhcikgY29u dGFpbnMgZGFuaXNoCj4gY2hhcnMuCj4KPiBJdCBsb29rcyBsaWtlIGxvd2Vy KCkgYW5kIHVwcGVyKCkgaWdub3JlcyB0aGUgZGFuaXNoIGNoYXJzIHdoZW4g SSB0cnkgdG8KPiBjb252ZXJ0IHRvIGxvd2VyY2FzZSBvciB1cHBlcmNhc2Uu Cj4KPiBFLmcuCj4KPiBDYXNlIDE6Cj4gLS0tLS0tLQo+Cj4gQ29sdW1uICdu YW1lJyBjb250YWlucyAnw4ZibGUsIHTDuHJyZXQnOgo+Cj4gc2VsZWN0IHVw cGVyKGZvb2RfbmFtZS5uYW1lKSBmcm9tIGZvb2RfbmFtZQo+Cj4gUmVzdWx0 OiDDhkJMRSwgVMO4UlJFVAo+IEV4cGVjdGVkIHJlc3VsdDogw4ZCTEUsIFTD mFJSRVQKPgo+IG9yCj4KPiBzZWxlY3QgbG93ZXIoZm9vZF9uYW1lLm5hbWUp IGZyb20gZm9vZF9uYW1lCj4KPiBSZXN1bHQ6IMOGYmxlLCB0w7hycmV0Cj4g RXhwZWN0ZWQgcmVzdWx0OiDDpmJsZSwgdMO4cnJldAo+Cj4gQ2FzZSAyOgo+ IC0tLS0tLS0KPgo+IENvbHVtbiAnbmFtZScgY29udGFpbnMgJ8Omw7jDpSc6 Cj4KPiBzZWxlY3QgdXBwZXIoZm9vZF9uYW1lLm5hbWUpIGZyb20gZm9vZF9u YW1lCj4KPiBSZXN1bHQ6IMOmw7jDpQo+IEV4cGVjdGVkIHJlc3VsdDogw4bD mMOFCj4KPiBDYXNlIDM6Cj4gLS0tLS0tLQo+Cj4gQ29sdW1uICduYW1lJyBj b250YWlucyAnw4bDmMOFJzoKPgo+IHNlbGVjdCBsb3dlcihmb29kX25hbWUu bmFtZSkgZnJvbSBmb29kX25hbWUKPgo+IFJlc3VsdDogw4bDmMOFCj4gRXhw ZWN0ZWQgcmVzdWx0OiDDpsO4w6UKPgo+IC0tLQo+Cj4gSSBjYW4gc2VlIEkn bSBub3QgYWxvbmUgd2l0aCB0aGlzIGtpbmQgb2YgYnVnLiBUaGlzIGJ1ZyBp cyByZWFsbHkgYSBiaWcKPiBwcm9ibGVtIGZvciBtZS4KPgo+IEkgcmVhbGx5 IGRvbid0IHdhbnQgdG8gcmVwbGFjZSBQb3N0Z3JlU1FMIHdpdGggTXlTUUwg IQo+IFdpbGwgdGhpcyBidWcgYmUgZml4ZWQgdmVyeSBzb29uPwo+Cj4gQmVz dCByZWdhcmRzCj4KPiBLZW5uZXRoIENocmlzdGVuc2VuCj4KPiAtLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0oZW5kIG9mIGJyb2FkY2FzdCktLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0KPiBUSVAgOTogSW4gdmVyc2lvbnMgYmVs b3cgOC4wLCB0aGUgcGxhbm5lciB3aWxsIGlnbm9yZSB5b3VyIGRlc2lyZSB0 bwo+ICAgICAgICBjaG9vc2UgYW4gaW5kZXggc2NhbiBpZiB5b3VyIGpvaW5p bmcgY29sdW1uJ3MgZGF0YXR5cGVzIGRvIG5vdAo+ICAgICAgICBtYXRjaAo+ Cg==
hi kenneth these special characters work fine here: select lower('ÃÃà '), upper('æøå'), lower('Ãble, tørret'), upper('Ãble, tørret'); result: æøå ÃÃà æble, tørret ÃBLE, TÃRRET as pavel hinted, you probably aren't using the proper locale settings cheers, thomas -------- Original Message -------- Subject: Re:[BUGS] BUG #3378: UTF-8 upper() and lower() don't work From: Pavel Stehule <pavel.stehule@gmail.com> To: Kenneth Christensen <kec@mediatorsystems.com> Date: 10.06.2007 15:36 > Hello, > > You have to well initialized database cluster with correct locales. > > I don't know good danish locales, but I expect so it will be similar > with czech. > > my database cluster was initialised with cs_CZ.UTF-8 and default > encoding is UTF8. > > > postgres=# select lower('ŽLUTà KÅ®Å'), upper('žlutý kůÅ'); > lower | upper > -----------+----------- > žlutý kůŠ| ŽLUTà KŮŠ> > > Check your locales, if is UTF8. > > postgres=# SHOW lc_collate ; > lc_collate > ------------- > cs_CZ.UTF-8 > (1 row) > > Regars > Pavel Stehule > > > 2007/6/10, Kenneth Christensen <kec@mediatorsystems.com>: >> >> The following bug has been logged online: >> >> Bug reference: 3378 >> Logged by: Kenneth Christensen >> Email address: kec@mediatorsystems.com >> PostgreSQL version: 8.2 >> Operating system: Mac OS 10.4.9 >> Description: UTF-8 upper() and lower() don't work >> Details: >> >> I have a DB where encoding is set to UTF-8. >> >> The DB have some tables where some of the columns (varchar) contains >> danish >> chars. >> >> It looks like lower() and upper() ignores the danish chars when I try to >> convert to lowercase or uppercase. >> >> E.g. >> >> Case 1: >> ------- >> >> Column 'name' contains 'Ãble, tørret': >> >> select upper(food_name.name) from food_name >> >> Result: ÃBLE, TøRRET >> Expected result: ÃBLE, TÃRRET >> >> or >> >> select lower(food_name.name) from food_name >> >> Result: Ãble, tørret >> Expected result: æble, tørret >> >> Case 2: >> ------- >> >> Column 'name' contains 'æøå': >> >> select upper(food_name.name) from food_name >> >> Result: æøå >> Expected result: ÃÃà >> >> Case 3: >> ------- >> >> Column 'name' contains 'ÃÃà ': >> >> select lower(food_name.name) from food_name >> >> Result: ÃÃà >> Expected result: æøå >> >> --- >> >> I can see I'm not alone with this kind of bug. This bug is really a big >> problem for me. >> >> I really don't want to replace PostgreSQL with MySQL ! >> Will this bug be fixed very soon? >> >> Best regards >> >> Kenneth Christensen >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 9: In versions below 8.0, the planner will ignore your desire to >> choose an index scan if your joining column's datatypes do not >> match >> > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings