Обсуждение: Custom type, operators and operator class not sorting/indexing correctly
Dear all, I've created a new domain (debversion) derived from TEXT, which includes its own operators (< <= = >= > and <>), and also its own operator class for BTREE indices. The operators function correctly when I test them by themselves, e.g. SELECT x < y; However, if I create a table with a column of this type, ORDER BY does not result in correct ordering. I have to explicitly add 'USING <' to the query, and even this fails to work if I haven't defined the operator class: # SELECT * FROM testv ORDER BY version ASC; version ------------------ 1.0.3-3 3.0.7+1-1 3.0.7+1-2 3.0.7+1-2~lenny2 (4 rows) # SELECT * FROM testv ORDER BY version USING <; version ------------------ 1.0.3-3 3.0.7+1-1 3.0.7+1-2~lenny2 3.0.7+1-2 (4 rows) The latter shows the correct ordering. The former appears to be using the lexical ordering of the TEXT type. Adding an index does not affect the ordering, even if I explictly make it use my operator class (it's also set as the default). The SQL code to create the type and demonstrate the problem follows at the end of this mail. It requires the PL/Perl and PL/pgSQL languages to be available. It shows example queries to demonstrate the ordering issue above. I thought that I had correctly defined the type, functions, operators and operator class in order for everything to function correctly, but I must be missing some final piece of the puzzle or some PostgreSQL subtlety I'm not aware of (this is my first attempt at defining operators, and I am also a newcomer to using procedural languages). Could anyone suggest what I've done wrong here? Many thanks, Roger Leigh -- .''`. Roger Leigh : :' : Debian GNU/Linux http://people.debian.org/~rleigh/ `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
Вложения
Roger Leigh <rleigh@codelibre.net> writes: > I've created a new domain (debversion) derived from TEXT, which > includes its own operators (< <= = >= > and <>), and also its > own operator class for BTREE indices. You can't realistically attach such things to a domain; try making a separate type, perhaps with an implicit cast to text to allow use of text operators for other purposes. Elein was going to look into tweaking the coercion rules so that functions on domains had a non-negligible chance of being useful, but we've not heard much about that project lately. regards, tom lane
On Wed, Jan 21, 2009 at 02:03:03AM -0500, Tom Lane wrote: > Roger Leigh <rleigh@codelibre.net> writes: > > I've created a new domain (debversion) derived from TEXT, which > > includes its own operators (< <= = >= > and <>), and also its > > own operator class for BTREE indices. > > You can't realistically attach such things to a domain; try making > a separate type, perhaps with an implicit cast to text to allow > use of text operators for other purposes. Ah, thanks for the clarification. So I need to use CREATE TYPE rather than CREATE DOMAIN. Because I'm essentially just storing a text string with different operators, can I derive a type from TEXT (perhaps by reusing the same input, output, receive and send functions as TEXT?) I saw the textsend and textreceive functions, which I assume are the appropriate functions for send and receive? Are there any for input and output which I may reuse? Many thanks, Roger -- .''`. Roger Leigh : :' : Debian GNU/Linux http://people.debian.org/~rleigh/ `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
Re: Custom type, operators and operator class not sorting/indexing correctly
От
Martijn van Oosterhout
Дата:
On Wed, Jan 21, 2009 at 10:48:09AM +0000, Roger Leigh wrote: > Ah, thanks for the clarification. So I need to use CREATE TYPE > rather than CREATE DOMAIN. Because I'm essentially just storing > a text string with different operators, can I derive a type from > TEXT (perhaps by reusing the same input, output, receive and send > functions as TEXT?) I saw the textsend and textreceive functions, > which I assume are the appropriate functions for send and receive? > Are there any for input and output which I may reuse? Yes, you can copy all the existing attributes. Use: select * from pg_type where typname='text'; To get the relevent info (upcoming 8.4 will have CREATE TYPE xxx (LIKE=text) ). Lookup the citext project for an example. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Вложения
On Wed, Jan 21, 2009 at 08:52:19PM +0100, Martijn van Oosterhout wrote: > On Wed, Jan 21, 2009 at 10:48:09AM +0000, Roger Leigh wrote: > > Ah, thanks for the clarification. So I need to use CREATE TYPE > > rather than CREATE DOMAIN. Because I'm essentially just storing > > a text string with different operators, can I derive a type from > > TEXT (perhaps by reusing the same input, output, receive and send > > functions as TEXT?) I saw the textsend and textreceive functions, > > which I assume are the appropriate functions for send and receive? > > Are there any for input and output which I may reuse? > > Yes, you can copy all the existing attributes. Use: > > select * from pg_type where typname='text'; > > To get the relevent info (upcoming 8.4 will have CREATE TYPE xxx > (LIKE=text) ). Lookup the citext project for an example. Many thanks for this suggestion; it has been incredibly useful. Using a pre-existing C implementation, plus looking at citext for how to write the operators correctly, I now have a working pure C (C++ implementation). http://git.debian.org/?p=users/rleigh/sbuild.git;a=tree;f=db;h=0a6d3348744c59b78322ec6a4855876875613239;hb=81fd39259953853632a7d0e2198cfc745d270fe3 I just have a few points I'd like to clarify. In http://git.debian.org/?p=users/rleigh/sbuild.git;a=blob;f=db/debversion.cc;h=980786ecf7a3b7fb5769d04b0952af300723c3b9;hb=81fd39259953853632a7d0e2198cfc745d270fe3 debversioncmp (lines 49-73), I'm duplicating text* to char* by hand. Is the text_to_cstring available internally also accessible by external modules (I didn't see it in any headers)? After every PG_GETARG_TEXT_PP, I've called PG_FREE_IF_COPY before returning. However, I saw in citext (the behaviour of which I duplicated in debversion_smaller and debversion_larger (lines 221-246)) that you *don't* use PG_FREE_IF_COPY before returning one of the two values. Is this not a potential memory leak? Don't you need to duplicate your chosen value, then free both of the temporary values? I also noticed that I couldn't include some postgres headers in my code (e.g. <access/builtins.h>) due to included headers <nodes/parsenodes.h> and <nodes/primnodes.h> using invalid C++ syntax due to use of "using" and "typename" which are reserved C++ keywords. The headers are also not wrapped with extern "C" blocks for direct inclusion in C++ sources, though this isn't as big a deal--you can just include them as I did within such a block in your own code. The former problem can't be worked around due to it being invalid code (this can be rectified by renaming to not use these specific keyword names). Many thanks for your help, Roger -- .''`. Roger Leigh : :' : Debian GNU/Linux http://people.debian.org/~rleigh/ `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
Вложения
Re: Custom type, operators and operator class not sorting/indexing correctly
От
Martijn van Oosterhout
Дата:
On Sun, Jan 25, 2009 at 03:52:02PM +0000, Roger Leigh wrote: > Many thanks for this suggestion; it has been incredibly useful. No problem. > I just have a few points I'd like to clarify. In > > http://git.debian.org/?p=users/rleigh/sbuild.git;a=blob;f=db/debversion.cc;h=980786ecf7a3b7fb5769d04b0952af300723c3b9;hb=81fd39259953853632a7d0e2198cfc745d270fe3 > > debversioncmp (lines 49-73), I'm duplicating text* to char* by hand. > Is the text_to_cstring available internally also accessible by > external modules (I didn't see it in any headers)? A function like that exists, the "proper" way to do it is (untested): DatumGetCString( DirectFunctionCall1(textout, TextGetDatum(foo) ) It's used in various places, in the unreleased 8.4 there will be official functions like cstring_to_text and text_to_cstring. Many other modules already declare stuff like this. > After every PG_GETARG_TEXT_PP, I've called PG_FREE_IF_COPY before > returning. However, I saw in citext (the behaviour of which I > duplicated in debversion_smaller and debversion_larger (lines 221-246)) > that you *don't* use PG_FREE_IF_COPY before returning one of the two > values. Is this not a potential memory leak? Don't you need to > duplicate your chosen value, then free both of the temporary values? Memory leaks are not an issue generally, everything you allocate gets freed at the end of the statement, if not earlier. Normally you don't manually free at all, however code that might be used by indexes is somewhat of an exception, since they might be called often in a tight loop. smaller/larger are not index functions, but the cmp function is. Though your strings here are probably short enough you won't notice either way. > I also noticed that I couldn't include some postgres headers in > my code (e.g. <access/builtins.h>) due to included headers > <nodes/parsenodes.h> and <nodes/primnodes.h> using invalid C++ syntax C++ incompatability has been noted before and patches were posted. I don't remember right now what happened to them, check the archives. The usual workarond is to split the C++ specific stuff into a seperate file, but that's kind of ugly I agree. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Вложения
On Sun, Jan 25, 2009 at 05:08:54PM +0100, Martijn van Oosterhout wrote: > On Sun, Jan 25, 2009 at 03:52:02PM +0000, Roger Leigh wrote: > > > > I'm duplicating text* to char* by hand. > > Is the text_to_cstring available internally also accessible by > > external modules (I didn't see it in any headers)? > > A function like that exists, the "proper" way to do it is (untested): > > DatumGetCString( DirectFunctionCall1(textout, TextGetDatum(foo) ) > > It's used in various places, in the unreleased 8.4 there will be > official functions like cstring_to_text and text_to_cstring. Many other > modules already declare stuff like this. Great, it looks like I'll be able to make things much simpler with 8.4, though I will need to be backward compatible with 8.3 for some time. > > After every PG_GETARG_TEXT_PP, I've called PG_FREE_IF_COPY before > > returning. However, I saw in citext (the behaviour of which I > > duplicated in debversion_smaller and debversion_larger (lines 221-246)) > > that you *don't* use PG_FREE_IF_COPY before returning one of the two > > values. Is this not a potential memory leak? Don't you need to > > duplicate your chosen value, then free both of the temporary values? > > Memory leaks are not an issue generally, everything you allocate gets > freed at the end of the statement, if not earlier. Normally you don't > manually free at all, however code that might be used by indexes is > somewhat of an exception, since they might be called often in a tight > loop. smaller/larger are not index functions, but the cmp function is. > > Though your strings here are probably short enough you won't notice > either way. Thanks for the clarification, I wasn't aware everything was automatically freed. The stricter requirements for index operators does make sense. > > I also noticed that I couldn't include some postgres headers in > > my code (e.g. <access/builtins.h>) due to included headers > > <nodes/parsenodes.h> and <nodes/primnodes.h> using invalid C++ syntax > > C++ incompatability has been noted before and patches were posted. I > don't remember right now what happened to them, check the archives. Ah, thanks. I won't file a bug if it's already a known issue. Thanks again, Roger -- .''`. Roger Leigh : :' : Debian GNU/Linux http://people.debian.org/~rleigh/ `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.