Thread: pgsql: Fix XML tag namespace change inadvertantly missed from previous
From: adunstan@postgresql.org (Andrew Dunstan)
Date:
Log Message:
-----------
Fix XML tag namespace change inadvertantly missed from previous fix. Add
regression test for XML names and numeric entities.

Modified Files:
--------------
    pgsql/src/backend/tsearch:
        wparser_def.c (r1.11 -> r1.12)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/tsearch/wparser_def.c?r1=1.11&r2=1.12)
    pgsql/src/test/regress/expected:
        tsearch.out (r1.9 -> r1.10)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/expected/tsearch.out?r1=1.9&r2=1.10)
    pgsql/src/test/regress/sql:
        tsearch.sql (r1.4 -> r1.5)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/test/regress/sql/tsearch.sql?r1=1.4&r2=1.5)

adunstan@postgresql.org (Andrew Dunstan) writes:
> Fix XML tag namespace change inadvertantly missed from previous fix. Add
> regression test for XML names and numeric entities.

Still one gripe:

regression=# select * from ts_debug(' λ &#X3BB;');
  alias  |       description        | token | dictionaries | dictionary | lexemes
---------+--------------------------+-------+--------------+------------+---------
 blank   | Space symbols            |       | {}           |            |
 entity  | XML entity               | λ     | {}           |            |
 blank   | Space symbols            |       | {}           |            |
 blank   | Space symbols            | &#    | {}           |            |
 numword | Word, letters and digits | X3BB  | {simple}     | simple     | {x3bb}
 blank   | Space symbols            | ;     | {}           |            |
(6 rows)

Aren't hexadecimal entities supposed to be case-insensitive?

regards, tom lane

Tom Lane wrote:
> adunstan@postgresql.org (Andrew Dunstan) writes:
>> Fix XML tag namespace change inadvertantly missed from previous fix. Add
>> regression test for XML names and numeric entities.
>
> Still one gripe:
>
> regression=# select * from ts_debug(' λ &#X3BB;');
>   alias  |       description        | token | dictionaries | dictionary | lexemes
> ---------+--------------------------+-------+--------------+------------+---------
>  blank   | Space symbols            |       | {}           |            |
>  entity  | XML entity               | λ     | {}           |            |
>  blank   | Space symbols            |       | {}           |            |
>  blank   | Space symbols            | &#    | {}           |            |
>  numword | Word, letters and digits | X3BB  | {simple}     | simple     | {x3bb}
>  blank   | Space symbols            | ;     | {}           |            |
> (6 rows)
>
> Aren't hexadecimal entities supposed to be case-insensitive?

The 'x' must be lower case, the hex digits can be upper or lower. The
XML spec says:

    CharRef ::= '&#' [0-9]+ ';'
              | '&#x' [0-9a-fA-F]+ ';'

cheers

andrew
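
The XML rule Andrew cites (decimal digits, or a lower-case 'x' followed by hex digits of either case) can be written as a regular expression. The following is only an illustrative sketch in Python, not the code in wparser_def.c; the names XML_CHAR_REF and is_xml_char_ref are hypothetical.

```python
import re

# Transcription of the XML CharRef production: '&#' digits ';'
# or '&#x' hex-digits ';'. The 'x' is lower case only.
# (Hypothetical helper for illustration; not taken from PostgreSQL.)
XML_CHAR_REF = re.compile(r"&#(?:[0-9]+|x[0-9a-fA-F]+);")


def is_xml_char_ref(text: str) -> bool:
    """Return True if the whole string is an XML character reference."""
    return XML_CHAR_REF.fullmatch(text) is not None
```

Under this strict rule, "&#x3bb;" and "&#x3BB;" both match, while "&#X3BB;" does not, which is consistent with the parser output Tom reports.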

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Aren't hexadecimal entities supposed to be case-insensitive?

> The 'x' must be lower case, the hex digits can be upper or lower. The
> XML spec says:

But we're also interested in parsing HTML, and upper case X is allowed
in HTML:
http://www.w3.org/TR/REC-html40/charset.html#h-5.3.1

regards, tom lane

I wrote:
> Tom Lane wrote:
>> Aren't hexadecimal entities supposed to be case-insensitive?
>
> The 'x' must be lower case, the hex digits can be upper or lower. The
> XML spec says:
>
>     CharRef ::= '&#' [0-9]+ ';'
>               | '&#x' [0-9a-fA-F]+ ';'

But I also see that the HTML spec allows for 'X' as well as 'x', so I'll
change it.

cheers

andrew
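
The relaxed rule agreed on here, accepting 'X' as well as 'x', might look like the following in Python. This is a sketch under stated assumptions rather than the actual change to wparser_def.c; HTML_CHAR_REF and decode_char_ref are hypothetical names, and the decoding step is added only to show that both spellings denote the same character.

```python
import re

# HTML-tolerant numeric character reference: the hex marker may be
# 'x' or 'X'. Group 1 captures decimal digits, group 2 hex digits.
# (Hypothetical helper for illustration; not taken from PostgreSQL.)
HTML_CHAR_REF = re.compile(r"&#(?:([0-9]+)|[xX]([0-9a-fA-F]+));")


def decode_char_ref(text: str):
    """Return the referenced character, or None if text is not a valid reference."""
    match = HTML_CHAR_REF.fullmatch(text)
    if match is None:
        return None
    decimal, hexadecimal = match.groups()
    code_point = int(decimal) if decimal is not None else int(hexadecimal, 16)
    if code_point > 0x10FFFF:  # beyond the Unicode range; chr() would raise
        return None
    return chr(code_point)
```

With this pattern, decode_char_ref("&#x3bb;"), decode_char_ref("&#X3BB;") and decode_char_ref("&#955;") all return the same character, U+03BB.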