Discussion: Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
mha@postgresql.org (Magnus Hagander) writes:
> Explicitly bind gettext to the correct encoding on Windows.

I have a couple of objections to this patch. First, what happens if it fails to find a matching table entry? (The existing answer is "nothing", but that doesn't seem right.) Second and more critical, it adds still another data structure that has to be maintained when the list of encodings changes, and it doesn't even live in the same file as any existing encoding-information table.

What makes more sense to me is to add a table to encnames.c that provides the gettext name of every encoding that we support.

regards, tom lane
Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Bruce Momjian
Date:
Tom Lane wrote:
> mha@postgresql.org (Magnus Hagander) writes:
> > Explicitly bind gettext to the correct encoding on Windows.
>
> I have a couple of objections to this patch. First, what happens if
> it fails to find a matching table entry? (The existing answer is
> "nothing", but that doesn't seem right.) Second and more critical,
> it adds still another data structure that has to be maintained when
> the list of encodings changes, and it doesn't even live in the same
> file as any existing encoding-information table.
>
> What makes more sense to me is to add a table to encnames.c that
> provides the gettext name of every encoding that we support.

Would someone please comment on Tom's questions above.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Magnus Hagander
Date:
Tom Lane wrote:
> mha@postgresql.org (Magnus Hagander) writes:
>> Explicitly bind gettext to the correct encoding on Windows.
>
> I have a couple of objections to this patch. First, what happens if
> it fails to find a matching table entry? (The existing answer is
> "nothing", but that doesn't seem right.) Second and more critical,
> it adds still another data structure that has to be maintained when
> the list of encodings changes, and it doesn't even live in the same
> file as any existing encoding-information table.
>
> What makes more sense to me is to add a table to encnames.c that
> provides the gettext name of every encoding that we support.

Do you mean a separate table there, or should we add a new column to one of the existing tables?

//Magnus
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> What makes more sense to me is to add a table to encnames.c that
>> provides the gettext name of every encoding that we support.

> Do you mean a separate table there, or should we add a new column to one
> of the existing tables?

Whichever seems to make more sense is fine with me. I just don't want add-an-encoding maintenance requirements spread across N different source files.

regards, tom lane
Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Tom Lane wrote:
>>> What makes more sense to me is to add a table to encnames.c that
>>> provides the gettext name of every encoding that we support.
>
>> Do you mean a separate table there, or should we add a new column to one
>> of the existing tables?
>
> Whichever seems to make more sense is fine with me. I just don't want
> add-an-encoding maintenance requirements spread across N different
> source files.

I was about to start looking at this when that other thread (http://archives.postgresql.org//pgsql-hackers/2009-03/msg01270.php) started about related issues on other platforms. Seems we should have a "coordinated fix" for this, so I'm going to wait and see what comes out of that one.

Unless I'm misunderstanding things and they're not related?

//Magnus
Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Heikki Linnakangas
Date:
Magnus Hagander wrote:
> Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> Tom Lane wrote:
>>>> What makes more sense to me is to add a table to encnames.c that
>>>> provides the gettext name of every encoding that we support.
>>> Do you mean a separate table there, or should we add a new column to one
>>> of the existing tables?
>> Whichever seems to make more sense is fine with me. I just don't want
>> add-an-encoding maintenance requirements spread across N different
>> source files.
>
> I was about to start looking at this when that other thread
> (http://archives.postgresql.org//pgsql-hackers/2009-03/msg01270.php)
> started about related issues on other platforms. Seems we should have a
> "coordinated fix" for this, so I'm going to wait and see what comes out
> of that one. Unless I'm misunderstanding things and they're not related?

I've committed a fairly trivial patch per Peter's suggestion to fix the other thread's issue. I left the table as is, so whatever refactorings were planned can now be applied.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Magnus Hagander
Date:
Heikki Linnakangas wrote:
> Magnus Hagander wrote:
>> Tom Lane wrote:
>>> Magnus Hagander <magnus@hagander.net> writes:
>>>> Tom Lane wrote:
>>>>> What makes more sense to me is to add a table to encnames.c that
>>>>> provides the gettext name of every encoding that we support.
>>>> Do you mean a separate table there, or should we add a new column to
>>>> one of the existing tables?
>>> Whichever seems to make more sense is fine with me. I just don't want
>>> add-an-encoding maintenance requirements spread across N different
>>> source files.
>>
>> I was about to start looking at this when that other thread
>> (http://archives.postgresql.org//pgsql-hackers/2009-03/msg01270.php)
>> started about related issues on other platforms. Seems we should have a
>> "coordinated fix" for this, so I'm going to wait and see what comes out
>> of that one. Unless I'm misunderstanding things and they're not related?
>
> I've committed a fairly trivial patch per Peter's suggestion to fix the
> other thread's issue. I left the table as is, so whatever refactorings
> were planned can now be applied.

Here's a patch that moves the table over to encnames.c, and renames it to look like the others.

I don't know what it should be doing if it can't find a match, so I haven't changed that behavior.

Comments?

//Magnus

*** a/src/backend/utils/mb/encnames.c
--- b/src/backend/utils/mb/encnames.c
***************
*** 431,436 **** pg_enc2name pg_enc2name_tbl[] =
--- 431,478 ----
  };

  /* ----------
+  * These are encoding names for gettext.
+  * ----------
+  */
+ pg_enc2gettext pg_enc2gettext_tbl[] =
+ {
+     {PG_UTF8, "UTF-8"},
+     {PG_LATIN1, "LATIN1"},
+     {PG_LATIN2, "LATIN2"},
+     {PG_LATIN3, "LATIN3"},
+     {PG_LATIN4, "LATIN4"},
+     {PG_ISO_8859_5, "ISO-8859-5"},
+     {PG_ISO_8859_6, "ISO_8859-6"},
+     {PG_ISO_8859_7, "ISO-8859-7"},
+     {PG_ISO_8859_8, "ISO-8859-8"},
+     {PG_LATIN5, "LATIN5"},
+     {PG_LATIN6, "LATIN6"},
+     {PG_LATIN7, "LATIN7"},
+     {PG_LATIN8, "LATIN8"},
+     {PG_LATIN9, "LATIN-9"},
+     {PG_LATIN10, "LATIN10"},
+     {PG_KOI8R, "KOI8-R"},
+     {PG_KOI8U, "KOI8-U"},
+     {PG_WIN1250, "CP1250"},
+     {PG_WIN1251, "CP1251"},
+     {PG_WIN1252, "CP1252"},
+     {PG_WIN1253, "CP1253"},
+     {PG_WIN1254, "CP1254"},
+     {PG_WIN1255, "CP1255"},
+     {PG_WIN1256, "CP1256"},
+     {PG_WIN1257, "CP1257"},
+     {PG_WIN1258, "CP1258"},
+     {PG_WIN866, "CP866"},
+     {PG_WIN874, "CP874"},
+     {PG_EUC_CN, "EUC-CN"},
+     {PG_EUC_JP, "EUC-JP"},
+     {PG_EUC_KR, "EUC-KR"},
+     {PG_EUC_TW, "EUC-TW"},
+     {PG_EUC_JIS_2004, "EUC-JP"}
+ };
+
+
+ /* ----------
   * Encoding checks, for error returns -1 else encoding id
   * ----------
   */
*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***************
*** 890,936 **** cliplen(const char *str, int len, int limit)
      return l;
  }

- #if defined(ENABLE_NLS)
- static const struct codeset_map {
-     int encoding;
-     const char *codeset;
- } codeset_map_array[] = {
-     {PG_UTF8, "UTF-8"},
-     {PG_LATIN1, "LATIN1"},
-     {PG_LATIN2, "LATIN2"},
-     {PG_LATIN3, "LATIN3"},
-     {PG_LATIN4, "LATIN4"},
-     {PG_ISO_8859_5, "ISO-8859-5"},
-     {PG_ISO_8859_6, "ISO_8859-6"},
-     {PG_ISO_8859_7, "ISO-8859-7"},
-     {PG_ISO_8859_8, "ISO-8859-8"},
-     {PG_LATIN5, "LATIN5"},
-     {PG_LATIN6, "LATIN6"},
-     {PG_LATIN7, "LATIN7"},
-     {PG_LATIN8, "LATIN8"},
-     {PG_LATIN9, "LATIN-9"},
-     {PG_LATIN10, "LATIN10"},
-     {PG_KOI8R, "KOI8-R"},
-     {PG_KOI8U, "KOI8-U"},
-     {PG_WIN1250, "CP1250"},
-     {PG_WIN1251, "CP1251"},
-     {PG_WIN1252, "CP1252"},
-     {PG_WIN1253, "CP1253"},
-     {PG_WIN1254, "CP1254"},
-     {PG_WIN1255, "CP1255"},
-     {PG_WIN1256, "CP1256"},
-     {PG_WIN1257, "CP1257"},
-     {PG_WIN1258, "CP1258"},
-     {PG_WIN866, "CP866"},
-     {PG_WIN874, "CP874"},
-     {PG_EUC_CN, "EUC-CN"},
-     {PG_EUC_JP, "EUC-JP"},
-     {PG_EUC_KR, "EUC-KR"},
-     {PG_EUC_TW, "EUC-TW"},
-     {PG_EUC_JIS_2004, "EUC-JP"}
- };
- #endif /* ENABLE_NLS */
-
  void
  SetDatabaseEncoding(int encoding)
  {
--- 890,895 ----
***************
*** 969,980 **** pg_bind_textdomain_codeset(const char *domainname)
          return;
  #endif

!     for (i = 0; i < lengthof(codeset_map_array); i++)
      {
!         if (codeset_map_array[i].encoding == encoding)
          {
              if (bind_textdomain_codeset(domainname,
!                                         codeset_map_array[i].codeset) == NULL)
                  elog(LOG, "bind_textdomain_codeset failed");
              break;
          }
--- 928,939 ----
          return;
  #endif

!     for (i = 0; pg_enc2gettext_tbl[i].name != NULL; i++)
      {
!         if (pg_enc2gettext_tbl[i].encoding == encoding)
          {
              if (bind_textdomain_codeset(domainname,
!                                         pg_enc2gettext_tbl[i].name) == NULL)
                  elog(LOG, "bind_textdomain_codeset failed");
              break;
          }
*** a/src/include/mb/pg_wchar.h
--- b/src/include/mb/pg_wchar.h
***************
*** 262,267 **** typedef struct pg_enc2name
--- 262,278 ----
  extern pg_enc2name pg_enc2name_tbl[];

  /*
+  * Encoding names for gettext
+  */
+ typedef struct pg_enc2gettext
+ {
+     pg_enc encoding;
+     const char *name;
+ } pg_enc2gettext;
+
+ extern pg_enc2gettext pg_enc2gettext_tbl[];
+
+ /*
   * pg_wchar stuff
   */
  typedef int (*mb2wchar_with_len_converter) (const unsigned char *from,
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>>> What makes more sense to me is to add a table to encnames.c that
>>> provides the gettext name of every encoding that we support.

> Here's a patch that moves the table over to encnames.c, and renames it
> to look like the others.

I think you forgot to include the NULL terminating entry that the loop seems to be expecting. Also, why isn't the array "const"?

> I don't know what it should be doing if it can't find a match, so I
> haven't changed that behavior.

As things stand, it should throw error, except in the case of SQL_ASCII; there is no excuse for any other database encoding to not be in the table. However, what seems more worrisome to me is the prospect already discussed that the codeset name we have in the table is not actually recognized by gettext/iconv. Did we have a solution for that?

Anyway, this fixes my immediate concern about where the info is located, so you may as well apply it with the array-terminator fix.

regards, tom lane
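
For readers following the review points above, here is a minimal standalone sketch of what they amount to: a const lookup table closed by a NULL sentinel, scanned by a loop that stops at the sentinel, with SQL_ASCII skipped and any other miss reported to the caller. This is not the committed PostgreSQL code. Every demo_* name is a hypothetical stand-in invented for illustration, and the three table entries are only a sample of the list in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PostgreSQL's pg_enc values (not the real enum). */
typedef enum demo_enc
{
    DEMO_SQL_ASCII = 0,
    DEMO_UTF8,
    DEMO_LATIN1,
    DEMO_WIN1252,
    DEMO_UNLISTED               /* deliberately absent from the table below */
} demo_enc;

typedef struct demo_enc2gettext
{
    demo_enc    encoding;
    const char *name;
} demo_enc2gettext;

/* The array is const and ends with a sentinel whose name is NULL. */
static const demo_enc2gettext demo_enc2gettext_tbl[] = {
    {DEMO_UTF8, "UTF-8"},
    {DEMO_LATIN1, "LATIN1"},
    {DEMO_WIN1252, "CP1252"},
    {0, NULL}                   /* terminator the lookup loop relies on */
};

typedef enum demo_lookup_result
{
    DEMO_FOUND,                 /* *name points at the codeset name */
    DEMO_SKIPPED,               /* SQL_ASCII: nothing to bind */
    DEMO_MISSING                /* caller should treat this as an error */
} demo_lookup_result;

/* Look up the gettext codeset name for an encoding. */
static demo_lookup_result
demo_gettext_codeset(demo_enc encoding, const char **name)
{
    int         i;

    assert(name != NULL);       /* the out parameter is required */
    *name = NULL;

    if (encoding == DEMO_SQL_ASCII)
        return DEMO_SKIPPED;

    /* Stop at the sentinel instead of relying on an element count. */
    for (i = 0; demo_enc2gettext_tbl[i].name != NULL; i++)
    {
        if (demo_enc2gettext_tbl[i].encoding == encoding)
        {
            *name = demo_enc2gettext_tbl[i].name;
            return DEMO_FOUND;
        }
    }
    return DEMO_MISSING;
}
```

A caller would pass the name to bind_textdomain_codeset() on DEMO_FOUND, do nothing on DEMO_SKIPPED, and raise an error on DEMO_MISSING. Without the sentinel row, the loop in the sketch would read past the end of the array, which is the problem the terminator fix addresses.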
Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Heikki Linnakangas
Date:
Tom Lane wrote:
> However, what seems more worrisome to me is the prospect already
> discussed that the codeset name we have in the table is not actually
> recognized by gettext/iconv. Did we have a solution for that?

You get English.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext to the correct encoding on Windows.
From
Magnus Hagander
Date:
Tom Lane wrote:
>> I don't know what it should be doing if it can't find a match, so I
>> haven't changed that behavior.
>
> As things stand, it should throw error, except in the case of SQL_ASCII;
> there is no excuse for any other database encoding to not be in the
> table. However, what seems more worrisome to me is the prospect already
> discussed that the codeset name we have in the table is not actually
> recognized by gettext/iconv. Did we have a solution for that?
>
> Anyway, this fixes my immediate concern about where the info is located,
> so you may as well apply it with the array-terminator fix.

Done.

//Magnus