Re: Statistics Import and Export

Поиск
Список
Период
Сортировка
От Corey Huinker
Тема Re: Statistics Import and Export
Дата
Msg-id CADkLM=d3XhRSySPX+CHL7KmBCgt7ZpqL-0NLxdSWNYHx=MBjag@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Statistics Import and Export  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: Statistics Import and Export  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-hackers



Having thought about it a bit more, I generally like the idea of being
able to just update one stat instead of having to update all of them at
once (and therefore having to go look up what the other values currently
are...).  That said, per below, perhaps making it strict is the better
plan.

v8 has it as strict.
 

> > Also, in some cases we allow the function to be called with a
> > NULL but then make it a no-op rather than throwing an ERROR (eg, if the
> > OID ends up being NULL).
>
> Thoughts on it emitting a WARN or NOTICE before returning false?

Eh, I don't think so?

Where this is coming from is that we can often end up with functions
like these being called inside of larger queries, and having them spit
out WARN or NOTICE will just make them noisy.

That leads to my general feeling of just returning NULL if called with a
NULL OID, as we would get with setting the function strict.

In which case we're failing nearly silently, yes, there is a null returned, but we have no idea why there is a null returned. If I were using this function manually I'd want to know what I did wrong, what parameter I skipped, etc.
 
Well, that code is for pg_statistic while I was looking at pg_class (in
vacuum.c:1428-1443, where we track if we're actually changing anything
and only make the pg_class change if there's actually something
different):

I can do that, especially since it's only 3 tuples of known types, but my reservations are summed up in the next comment.

 
Not sure why we don't treat both the same way though ... although it's
probably the case that it's much less likely to have an entire
pg_statistic row be identical than the few values in pg_class.

That would also involve comparing ANYARRAY values, yuk. Also, a matched record will never be the case when used in primary purpose of the function (upgrades), and not a big deal in the other future cases (if we use it in ANALYZE on foreign tables instead of remote table samples, users experimenting with tuning queries under hypothetical workloads).


 
Hmm, that's a valid point, so a NULL passed in would need to set that
value actually to NULL, presumably.  Perhaps then we should have
pg_set_relation_stats() be strict and have pg_set_attribute_stats()
handles NULLs passed in appropriately, and return NULL if the relation
itself or attname, or other required (not NULL'able) argument passed in
cause the function to return NULL.

That's how I have relstats done in v8, and could make it do that for attr stats.

(What I'm trying to drive at here is a consistent interface for these
functions, but one which does a no-op instead of returning an ERROR on
values being passed in which aren't allowable; it can be quite
frustrating trying to get a query to work where one of the functions
decides to return ERROR instead of just ignoring things passed in which
aren't valid.)

I like the symmetry of a consistent interface, but we've already got an asymmetry in that the pg_class update is done non-transactionally (like ANALYZE does).

One persistent problem is that there is no _safe equivalent to ARRAY_IN, so that can always fail on us, though it should only do so if the string passed in wasn't a valid array input format, or the values in the array can't coerce to the attribute's basetype.

I should also point out that we've lost the ability to check if the export values were of a type, and if the destination column is also of that type. That's a non-issue in binary upgrades, but of course if a field changed from integers to text the histograms would now be highly misleading. Thoughts on adding a typname parameter that the function uses as a cheap validity check?

v8 attached, incorporating these suggestions plus those of Bharath and Bertrand. Still no pg_dump.

As for pg_dump, I'm currently leading toward the TOC entry having either a series of commands:

    SELECT pg_set_relation_stats('foo.bar'::regclass, ...); pg_set_attribute_stats('foo.bar'::regclass, 'id'::name, ...); ...

Or one compound command

    SELECT pg_set_relation_stats(t.oid, ...)
         pg_set_attribute_stats(t.oid, 'id'::name, ...),
         pg_set_attribute_stats(t.oid, 'last_name'::name, ...),
         ...
    FROM (VALUES('foo.bar'::regclass)) AS t(oid);

The second one has the feature that if any one attribute fails, then the whole update fails, except, of course, for the in-place update of pg_class. This avoids having an explicit transaction block, but we could get that back by having restore wrap the list of commands in a transaction block (and adding the explicit lock commands) when it is safe to do so.

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michał Kłeczek
Дата:
Сообщение: Alternative SAOP support for GiST
Следующее
От: "Andrey M. Borodin"
Дата:
Сообщение: Re: UUID v7