[PATCH] json_lex_string: don't overread on bad UTF8

From: Jacob Champion
Subject: [PATCH] json_lex_string: don't overread on bad UTF8
Date:
Msg-id: CAOYmi+ncM7pwLS3AnKCSmoqqtpjvA8wmCdoBtKA3ZrB2hZG6zA@mail.gmail.com
Replies: Re: [PATCH] json_lex_string: don't overread on bad UTF8  (Michael Paquier <michael@paquier.xyz>)
         Re: [PATCH] json_lex_string: don't overread on bad UTF8  (Peter Eisentraut <peter@eisentraut.org>)
List: pgsql-hackers
Hi all,

When json_lex_string() hits certain types of invalid input, it calls
pg_encoding_mblen_bounded(), which assumes that its input is
null-terminated and calls strnlen(). But the JSON lexer is constructed
with an explicit string length, and we don't ensure that the string is
null-terminated in all cases, so we can walk off the end of the
buffer. This isn't really relevant on the server side, where you'd
have to get a superuser to help you break string encodings, but for
client-side usage on untrusted input (such as my OAuth patch) it would
be more important.
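
To make the overread concrete, here is a minimal sketch of the pattern
described above. It is not the PostgreSQL source: bounded_by_nul() is a
hypothetical stand-in for a "bounded mblen" helper, and claimed_len stands
for the length that the multibyte lead byte claims. The only thing that
stops the scan early is a NUL byte; the function has no idea where the
caller's length-delimited buffer actually ends.

```c
#define _POSIX_C_SOURCE 200809L   /* for strnlen() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a NUL-bounded character length helper.
 *
 * Returns the number of bytes in the character at s, capped at claimed_len,
 * but only stops early if it sees a NUL byte. If the caller's input is
 * length-delimited rather than NUL-terminated, this can inspect bytes that
 * lie past the logical end of the input.
 */
size_t
bounded_by_nul(const char *s, size_t claimed_len)
{
    assert(s != NULL);
    return strnlen(s, claimed_len);
}
```

For example, take the six-byte array { '"', 0xF0, 'X', 'Y', 'Z', '\0' } and
treat only its first two bytes as the input. Calling bounded_by_nul() on the
0xF0 byte with a claimed length of 4 returns 4, so the three bytes after the
logical end have been examined. In that array the read is harmless; with a
buffer that really ends after 0xF0 it would be an out-of-bounds read.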

Attached is a draft patch that explicitly checks against the
end-of-string pointer and clamps the token_terminator to it. Note that
this removes the only caller of pg_encoding_mblen_bounded() and I'm
not sure what we should do with that function. It seems like a
reasonable API, just not here.
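
The clamping idea can be sketched as follows. This is an illustration under
stated assumptions, not the attached patch: utf8_claimed_len() and
clamped_char_end() are hypothetical names, the encoding is assumed to be
UTF-8, and the returned pointer plays the role that token_terminator plays
in the lexer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the byte length that a UTF-8 lead byte claims. */
int
utf8_claimed_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Return a pointer just past the character starting at s, clamped so that
 * it never exceeds end (one past the last valid input byte). Only *s is
 * read, so no byte at or beyond end is ever touched.
 */
const char *
clamped_char_end(const char *s, const char *end)
{
    size_t      remaining;
    size_t      claimed;

    assert(s < end);            /* caller must pass at least one valid byte */
    remaining = (size_t) (end - s);
    claimed = (size_t) utf8_claimed_len((unsigned char) *s);

    return s + (claimed < remaining ? claimed : remaining);
}
```

With an input that ends in a lone 0xF0 byte, the lead byte claims four
bytes but only one remains, so the result is clamped to end rather than
pointing three bytes past it.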

The new test needs to record two versions of the error message, one
for invalid token and one for invalid escape sequence. This is
because, for smaller chunk sizes, the partial-token logic in the
incremental JSON parser skips the affected code entirely when it can't
find an ending double-quote.

Tangentially: Should we maybe rethink pieces of the json_lex_string
error handling? For example, do we really want to echo an incomplete
multibyte sequence once we know it's bad? It also looks like there are
places where the FAIL_AT_CHAR_END macro is called after the `s`
pointer has already advanced past the code point of interest. I'm not
sure if that's intentional.

Thanks,
--Jacob

