pgsql: Fix performance issue in new regex match-all detection code.

Поиск
Список
Период
Сортировка
От Tom Lane
Тема pgsql: Fix performance issue in new regex match-all detection code.
Дата
Msg-id E1ldaig-0008O7-SY@gemulon.postgresql.org
обсуждение исходный текст
Список pgsql-committers
Fix performance issue in new regex match-all detection code.

Commit 824bf7190 introduced a new search of the NFAs generated by
regex compilation.  I failed to think hard about the performance
characteristics of that search, with the predictable outcome
that it's bad: weird regexes can trigger exponential search time.
Worse, there's no check-for-interrupt in that code, so you can't
even cancel the query if this happens.

Fix by introducing memo-ization of the search results, so that any one
NFA state need be examined in detail just once.  This potentially uses
a lot of memory, but we can bound the memory usage by putting a limit
on the number of states for which we'll try to prove match-all-ness.
That is sane because we already have a limit (DUPINF) on the maximum
finite string length that a matchall regex can match; and patterns
that involve much more than DUPINF states would probably exceed that
limit anyway.

Also, rearrange the logic so that we check the basic is-the-graph-
all-RAINBOW-arcs property before we start the recursive search to
determine path lengths.  This will ensure that we fall out quickly
whenever the NFA couldn't possibly be matchall.

Also stick in a check-for-interrupt, just in case these measures
don't completely eliminate the risk of slowness.

Discussion: https://postgr.es/m/3483895.1619898362@sss.pgh.pa.us

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f68970e33f4dc48094c24c78c452ad730ae9ae12

Modified Files
--------------
src/backend/regex/regc_nfa.c        | 481 ++++++++++++++++++++++++------------
src/backend/regex/regcomp.c         |   3 +-
src/test/regress/expected/regex.out |  37 +++
src/test/regress/sql/regex.sql      |   8 +
4 files changed, 366 insertions(+), 163 deletions(-)


В списке pgsql-committers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: pgsql: Prevent lwlock dtrace probes from unnecessary work
Следующее
От: Robert Haas
Дата:
Сообщение: pgsql: amcheck: Improve some confusing reports about TOAST problems.