Re: Statistics Import and Export

Поиск
Список
Период
Сортировка
От Corey Huinker
Тема Re: Statistics Import and Export
Дата
Msg-id CADkLM=c7sMxeOGMmV6MtsjFkNnji6XLfj2bTgy04f4vWWhv6PA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
Ответы Re: Statistics Import and Export  (Stephen Frost <sfrost@snowman.net>)
Re: Statistics Import and Export  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Список pgsql-hackers
On Thu, Feb 15, 2024 at 4:09 AM Corey Huinker <corey.huinker@gmail.com> wrote:
Posting v5 updates of pg_import_rel_stats() and pg_import_ext_stats(), which address many of the concerns listed earlier.

Leaving the export/import scripts off for the time being, as they haven't  changed and the next likely change is to fold them into pg_dump.



v6 posted below.

Changes:

- Additional documentation about the overall process.
- Rewording of SGML docs.
- removed a fair number of columns from the transformation queries.
- enabled require_match_oids in extended statistics, but I'm having my doubts about the value of that.
- moved stats extraction functions to an fe_utils file stats_export.c that will be used by both pg_export_stats and pg_dump.
- pg_export_stats now generates SQL statements rather than a tsv, and has boolean flags to set the validate and require_match_oids parameters in the calls to pg_import_(rel|ext)_stats.
- pg_import_stats is gone, as importing can now be done with psql.

I'm hoping to get feedback on a few areas.

1. The checks for matching oids. On the one hand, in a binary upgrade situation, we would of course want the oid of the relation to match what was exported, as well as all of the atttypids of the attributes to match the type ids exported, same for collations, etc. However, the binary upgrade is the one place where there are absolutely no middle steps that could have altered either the stats jsons or the source tables. Given that and that oid simply will never match in any situation other than a binary upgrade, it may be best to discard those checks.

2. The checks for relnames matching, and typenames of attributes matching (they are already matched by name, so the column order can change without the import missing a beat) seem so necessary that there shouldn't be an option to enable/disable them. But if that's true, then the initial relation parameter becomes somewhat unnecessary, and anyone using these functions for tuning or FDW purposes could easily transform the JSON using SQL to put in the proper relname.

3. The data integrity validation functions may belong in a separate function rather than being a parameter on the existing import functions.

4. Lastly, pg_dump. Each relation object and extended statistics object will have a statistics import statement. From my limited experience with pg_dump, it seems like we would add an additional Stmt variable (statsStmt) to the TOC entry for each object created, and the restore process would check the value of --with-statistics and in cases where the statistics flag was set AND a stats import statement exists, then execute that stats statement immediately after the creation of the object. This assumes that there is no case where additional attributes are added to a relation after it's initial CREATE statement. Indexes are independent relations in this regard.
 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: confirmed flush lsn seems to be move backward in certain error cases
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Switching XLog source from archive to streaming when primary available