Обсуждение: optimizing impossible matches
Dear hackers: Is it reasonable or possible to have the optimizer filter out impossible string/numeric matches because of length/overflow conditions? For example: (on 7.4.1) select * from t where f = '1234567' -- f = char(6) explain says index/seq scan, depending on presence of index, when obviously there can be no matches. also, for the typical case for indexed field f,select * from t where f = f' or false generates an index scan but select * from t where f = f' or f = f'' generates a seq. scan with default settings. If f'' is not in the domain of f, the first case should apply. Is this really as simple as it seems? Merlin
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes: > Is this really as simple as it seems? No. The optimizer has no knowledge of specific operators except what it finds in the system catalogs. It has no way in general to determine that a comparison involving nonconstant values must always fail. Even if we could do it, I am dubious that it would be worth expending the cycles on every query to determine whether the situation holds. AFAICS those would be wasted cycles on the huge majority of queries. BTW, your claim here is simply false: > also, for the typical case for indexed field f, > select * from t where f = f' or false > generates an index scan > but > select * from t where f = f' or f = f'' > generates a seq. scan with default settings. Depending on the statistics of the column f, the optimizer might think that a seqscan is better than a double indexscan, but it's not some kind of hardwired behavior. I don't even think it's "typical" behavior. regards, tom lane
Tom Lane wrote: > The optimizer has no knowledge of specific operators except what it > finds in the system catalogs. It has no way in general to determine > that a comparison involving nonconstant values must always fail. > Even if we could do it, I am dubious that it would be worth expending > the cycles on every query to determine whether the situation holds. > AFAICS those would be wasted cycles on the huge majority of queries. Ok, fair enough. And queries that did have this issue could be easily rewritten... Still, there is something that doesn't site quite right with me...my problems is with SQL really, not Postgres. For example, the server forbids 'abcd'::char(3) but allows 'abcd' > char(3) because the operator is not bound to the specific type, but to the general type and ignores type constraints. In other words, SQL implicitly allows comparison between objects of different domains if the domains differ only by constraint (including size). Anyways, thanks for taking the time to answer. Merlin