me@benjaminarai.com (Benjamin Arai) writes:
> If you are looking for a SoC idea, I have listed a couple below. I
> am not sure how good of an idea they are but I have ran into the
> following limitations and probably other people have as well in the
> past.
Actually, I have a thought on a SoC idea...
The general notion would be to try to come up with some more rational
information on setting the default column statistics width.
http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET
http://www.postgresql.org/docs/8.2/interactive/planner-stats.html
Now, the default value has long been 10. There are cases where people
find they need to set it higher; that has always been pretty
trial-and-error.
My suspicion is that:
a) The default should probably be a bit higher than 10
b) Some analysis of stats and schema on an individual table could
perhaps provide more specific values for specific columns.
- Data type might provide guidance; there's little need for >3 values on
a binary column, for instance.
- If there is a NOT NULL UNIQUE constraint on a column, that might
suggest > 10 values
- If the column is known to have 150 unique values, that might
suggest SET STATISTICS 150
It might be worth looking at the *least* frequently occuring
values, and set stats high enough to make it likely that at least
one such value would be pulled in...
- Some kinds of values (dates, floats) are sorta continuous in value;
having 10 bins may be pretty OK for such
There are probably some other heuristics to be had; this is just some
ideas off the top of my head.
Nobody has gone through any sort of real analysis of this; there
likely is merit to doing so...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you *not* want to go today? "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>