Обсуждение: planner, newly added records and most common values

Поиск
Список
Период
Сортировка

planner, newly added records and most common values

От
Jeff Amiel
Дата:
Ive got a scenario where I've got a 2 million row table.  Data from inbound files gets processed into it.
A new file might have 10 resulting rows in this table...might have 40K...depends on the source, day of month, etc.

I've got a process that parses the file and loads the records into the table...giving it a unique file_id for the
overallload and places that value on each record. 
Another process will perform a series of queries...joining against that table (for only records with that file_id).
The problem is that the planner has no idea how many records might exist for that file_id.
If I throw a file_id at the planner that is not in the most common value list, it picks a nice number like 384 as it's
rowcount estimate.   
So when I am referencing a new file_id (that obviously isn't IN the most common value list as yet..regardless of how
many 
records I just loaded because I haven't run analyze yet),  the planner dutifully estimates that I will get only 384
rows.  
For large files, this is off by 2 (or god forbid, 3) orders of magnitude.
That yields very bad overall plans (regardless of the fact that I have indexes on the file_id column)

It seems like I am in a no-win situation.  The query I am executing is fairly complex...and when the planner is off by
multipleorders of magnitude on a rowcount, it goes way off the tracks in terms of planning. 

I COULD do an analyze after loading the file...but there is no guarantee that the file I just loaded will end up in the
mostcommon value list....and I end up with bad plan. 

Any thoughts?


Re: planner, newly added records and most common values

От
Andrew Sullivan
Дата:
On Thu, Jan 19, 2012 at 05:36:26PM -0800, Jeff Amiel wrote:

> I COULD do an analyze after loading the file...but there is no guarantee that the file I just loaded will end up in
themost common value list....and I end up with bad plan. 
>

Sounds like you need to SET STATISTICS higher for that column and do
the analyse.  Have you tried that?

A

--
Andrew Sullivan
ajs@crankycanuck.ca