Re: [HACKERS] possible encoding issues with libxml2 functions

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: [HACKERS] possible encoding issues with libxml2 functions
Дата
Msg-id CAFj8pRADS6yUVwziAS_v2c_oPEse6RZgk2WnSa8o3i-GwTKmaA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] possible encoding issues with libxml2 functions  (Noah Misch <noah@leadboat.com>)
Ответы Re: [HACKERS] possible encoding issues with libxml2 functions  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers


2017-03-12 0:56 GMT+01:00 Noah Misch <noah@leadboat.com>:
On Mon, Feb 20, 2017 at 07:48:18PM +0100, Pavel Stehule wrote:
> Today I played with xml_recv function and with xml processing functions.
>
> xml_recv function ensures correct encoding from document encoding to server
> encoding. But the decl section holds original encoding info - that should
> be obsolete after encoding. Sometimes we solve this issue by removing decl
> section - see the xml_out function.
>
> Sometimes we don't do it - lot of functions uses direct conversion from
> xmltype to xmlChar.

> There are possible two fixes
>
> a) clean decl on input - the encoding info can be removed from decl part
>
> b) use xml_out_internal everywhere before transformation to
> xmlChar. pg_xmlCharStrndup can be good candidate.

I'd prefer (a) if the xml type were a new feature, because no good can come of
storing an encoding in each xml field when we know the actual encoding is the
database encoding.  However, if you implemented (a), we'd still see untreated
values brought over via pg_upgrade.  Therefore, I would try (b) first.  I
suspect the intent of xml_parse() was to implement (b); it will be interesting
to see your test case that malfunctions.

I looked there again and I found so this issue is related to xpath function only
 
Functions based on xml_parse are working without problems. xpath_internal uses own direct xmlCtxtReadMemory without correct encoding sanitation.

so fix is pretty simple

 diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c

index f81cf489d2..89aae48cb3 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3874,9 +3874,11 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
        ns_count = 0;
    }
 
-   datastr = VARDATA(data);
-   len = VARSIZE(data) - VARHDRSZ;
+   datastr = xml_out_internal(data, 0);
+   len = strlen(datastr);
+
    xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+
    if (xpath_len == 0)
        ereport(ERROR,
                (errcode(ERRCODE_DATA_EXCEPTION),

Regards

Pavel

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: [HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support