Обсуждение: contrib/intarray vs empty arrays
I just got rid of the contrib/intarray duplicates of <@ and @>, as we discussed here: http://archives.postgresql.org/message-id/17021.1234474178@sss.pgh.pa.us While I was testing this I realized that I wasn't getting quite the same answers :-(. In particular, it seems that the core operators consider an empty array to be contained in anything else, while intarray will only return true for two nonempty arrays. I would just go change that, except that the *real* problem is it means GIN index searches behave differently from the operators themselves. Using intarray's regression database, contrib_regression=# select * from test__int where a <@ array[1,2]; a -------{1,1} (1 row) contrib_regression=# drop index text_idx; DROP INDEX contrib_regression=# select * from test__int where a <@ array[1,2]; a -------{}{}{}{}{}{}{}{}{}{1,1} (10 rows) From what I understand of GIN's internal workings, this is unfixable because there is nothing to make an index entry on when looking at an empty array. Unless you know of a trick to get around that, we've got a problem here. [ pokes around ... ] Oh dear, it doesn't work with intarray's GIST opclasses either. Do we have to document these operators as being not quite real containment? regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: [empty arrays and containment ops] Tom> From what I understand of GIN's internal workings, this isTom> unfixable because there is nothing to make an index entryonTom> when looking at an empty array. Unless you know of a trick toTom> get around that, we've got a problem here. Umm. In theory, could the extract function return some distinct value when applied to an empty array, and the extract_query function include the same value when extracting a query? It's not very clean, since it means that the element type for an index of sometype[] is no longer sometype, but it would allow the semantics to be kept consistent. [whether it is worth the pain of doing this is a separate question] -- Andrew (irc:RhodiumToad)
> While I was testing this I realized that I wasn't getting quite the same > answers :-(. In particular, it seems that the core operators consider > an empty array to be contained in anything else, while intarray will > only return true for two nonempty arrays. Urgh. We (with Oleg) digged a bit around that. 8.1 and earlier versions (contrib/intarray) assume that empty array neither contained in nor overlap with any array. _int_contains(PG_FUNCTION_ARGS) { .... if (ARRISVOID(a) || ARRISVOID(b)) return FALSE; _int_overlap(PG_FUNCTION_ARGS) { .... if (ARRISVOID(a) || ARRISVOID(b)) return FALSE; However, that versions have an *implementation* bug: ARRISVOID returns false for empty array :(. 8.2 and later versions have built-in assumption that empty array is contained in any array, although the bug in contrib/intarray was fixed while defending intarray from nulls in array: http://archives.postgresql.org/pgsql-committers/2005-11/msg00420.php But I don't remember any complaints about changed behaviour of contains operator. On other hand, geometrical containment operator guarantees that arguments will overlaps, whereas this is not true for arrays - empty array is contained in any array but doesn't overlap. Since results of operations with empty array are not documented and we didn't find definition of "contains" and "overlap" in the documentation, we suggest following strict definitions: contains - all elements of second array are contained in the first one. Empty array has no element, so it can't be contained. overlap - there is at least one common element in both arrays. Empty array has no element, so it can't overlap. Then we need to fix: - ARRISVOID in contrib/intarray in 8.1 and later - contains/contained in 8.2 and later > From what I understand of GIN's internal workings, this is unfixable Yes :( > [ pokes around ... ] Oh dear, it doesn't work with intarray's > GIST opclasses either. That's easy fixable -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
2009/4/9 Teodor Sigaev <teodor@sigaev.ru>: > contains - all elements of second array are contained in the first one. > Empty array has no element, so it can't be contained. That sounds wrong. A <contains> B should surely always be true if B is empty. ie "for all x, x in B implies x in A". Or put another way, "contains" just means "is a superset of" and all sets are supersets of the empty set (even the empty set). -- greg