Обсуждение: Working with PostgreSQL enums in C code
I encountered a situation while implementing JSON support where I needed to return an enum value from a C function. To clarify, here's the SQL: CREATE TYPE json_type_t AS ENUM ('null', 'string', 'number', 'bool', 'object', 'array'); CREATE OR REPLACE FUNCTION json_type(json) RETURNS json_type_t AS 'MODULE_PATHNAME','json_get_type' LANGUAGE C STRICT IMMUTABLE; I initially tried looking for another function returning an enum in the PostgreSQL source tree, but I couldn't find any. I guess this is because enums are a relatively new feature in PostgreSQL. I learned that to return an enum value from C, one needs to return the OID of the right row of the pg_enum table. I eventually managed to write the code below, which is mostly based on the enum_in function in src/backend/utils/adt/enum.c . #define PG_RETURN_ENUM(typname, label) return enumLabelToOid(typname, label) static Oid enumLabelToOid(const char *typname, const char *label) {Oid enumtypoid;HeapTuple tup;Oid ret;enumtypoid = TypenameGetTypid(typname);Assert(OidIsValid(enumtypoid));tup= SearchSysCache2(ENUMTYPOIDNAME, ObjectIdGetDatum(enumtypoid), CStringGetDatum(label));Assert(HeapTupleIsValid(tup));ret = HeapTupleGetOid(tup);ReleaseSysCache(tup);returnret; } Feel free to nitpick the code above, as I'm still learning. Note that I replaced the more robust validity checks of enum_in with (quicker?) asserts, with the assumption that correct programs would only pass valid values to PG_RETURN_ENUM . The code using the method above can be found here: http://git.postgresql.org/gitweb?p=json-datatype.git;a=tree;f=contrib/json;h=1dd813da4016b31f35cb39b01c6d5f0999da672e;hb=092fa046f95580dd7906a07370ca401692a1f818 . My testcases passed, so everything seems to work. I suppose my PG_RETURN_ENUM macro is nice and simple, except for the fact that the coder has to keep an enum names table in sync with the SQL code and the C code. However, going the other way around (PG_GETARG_ENUM) would need access to that enum names table. Hence, it'd make sense to have macros for defining this table so both PG_RETURN_ENUM and PG_GETARG_ENUM can reference it. I believe that these macros would be a useful addition to the PostgreSQL function manager API, as they would provide a decent way to receive and return custom enums from C code. Anyone agree/disagree? Joey Adams
On Fri, May 28, 2010 at 12:07 AM, Joseph Adams <joeyadams3.14159@gmail.com> wrote: > I learned that to return an enum value from C, one needs to return the > OID of the right row of the pg_enum table. I eventually managed to > write the code below, which is mostly based on the enum_in function in > src/backend/utils/adt/enum.c . PG_RETURN macros shouldn't do any nontrivial processing (see the existing ones for references). I assume you have the enum labels declared in pg_enum.h, so I think you can just return the correct OID values directly. Declare constants for them in pg_enum.h and then just do PG_RETURN_OID(whatever). #define JSONTypeNullOid ... #define JSONTypeStringOid ... #define JSONTypeNumberOid ... It really shouldn't be necessary to do a catalog lookup to retrieve a constant. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, May 28, 2010 at 12:07 AM, Joseph Adams > <joeyadams3.14159@gmail.com> wrote: >> I learned that to return an enum value from C, one needs to return the >> OID of the right row of the pg_enum table. �I eventually managed to >> write the code below, which is mostly based on the enum_in function in >> src/backend/utils/adt/enum.c . > PG_RETURN macros shouldn't do any nontrivial processing (see the > existing ones for references). Yeah, that was my first reaction too. If we don't already have one, it would be appropriate to provide a "lookup enum value" function (functionally about the same as enum_in, but designed to be called conveniently from C). Then, if you needed to work from a textual enum label, you'd call that function and then PG_RETURN_OID. However, for a built-in enum type, I agree with Robert's solution of just #define-ing fixed OIDs for the values of the type. regards, tom lane
On Fri, May 28, 2010 at 9:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, May 28, 2010 at 12:07 AM, Joseph Adams >> <joeyadams3.14159@gmail.com> wrote: >>> I learned that to return an enum value from C, one needs to return the >>> OID of the right row of the pg_enum table. I eventually managed to >>> write the code below, which is mostly based on the enum_in function in >>> src/backend/utils/adt/enum.c . > >> PG_RETURN macros shouldn't do any nontrivial processing (see the >> existing ones for references). > > Yeah, that was my first reaction too. If we don't already have one, > it would be appropriate to provide a "lookup enum value" function > (functionally about the same as enum_in, but designed to be called > conveniently from C). Then, if you needed to work from a textual > enum label, you'd call that function and then PG_RETURN_OID. Here is the function I wrote to look up enum values: Oid enum_label_to_oid(const char *typname, const char *label){ Oid enumtypoid; HeapTuple tup; Oid ret; enumtypoid = TypenameGetTypid(typname); Assert(OidIsValid(enumtypoid)); tup = SearchSysCache2(ENUMTYPOIDNAME, ObjectIdGetDatum(enumtypoid), CStringGetDatum(label)); Assert(HeapTupleIsValid(tup)); ret = HeapTupleGetOid(tup); ReleaseSysCache(tup); returnret;} If this were added to PostgreSQL proper, what source/header files would it make sense to put it in? enum.c/builtins.h ? > However, for a built-in enum type, I agree with Robert's solution of > just #define-ing fixed OIDs for the values of the type. I don't know if the JSON datatype will be a contrib module or built-in yet, but if it were contrib, would it still be better to use fixed OIDs anyway? One issue with setting this precedent is that new contrib modules using enums wouldn't be compatible with older versions of PostgreSQL. One idea might be to insert extend CREATE TYPE name AS ENUM so OIDs can be specified explicitly, but then that could lead to OID clashes. That would be a really messy problem for users. By the way, just curious: why can't the internal representation of an enum just be an INT starting from 0 by default, like in C? That would make a heck of a lot more sense, in my opinion. It might also allow users to do things like this in the future: CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3); Joey Adams
Excerpts from Joseph Adams's message of lun jun 07 17:16:12 -0400 2010: > > However, for a built-in enum type, I agree with Robert's solution of > > just #define-ing fixed OIDs for the values of the type. > > I don't know if the JSON datatype will be a contrib module or built-in > yet, but if it were contrib, would it still be better to use fixed > OIDs anyway? If it were contrib, fixed OIDs wouldn't cut it precisely for that reason. (A contrib module is not "built-in" for these purposes.) > One issue with setting this precedent is that new > contrib modules using enums wouldn't be compatible with older versions > of PostgreSQL. One idea might be to insert extend CREATE TYPE name AS > ENUM so OIDs can be specified explicitly, but then that could lead to > OID clashes. That would be a really messy problem for users. Yeah. We've just defined an interface for pg_migrator-only usage, allowing it to define the OID values of ENUMs; it wasn't considered a good idea to expose the details to the user. > By the way, just curious: why can't the internal representation of an > enum just be an INT starting from 0 by default, like in C? That would > make a heck of a lot more sense, in my opinion. It might also allow > users to do things like this in the future: > > CREATE TYPE number AS ENUM ('one' = 1, 'two' = 2, 'three' = 3); The problem is that the output function needs to be able to figure out the value starting with only the datum value. If it had only the "1" it couldn't know what enum it'd correspond to. The other alternative would have been to make enums 64 bits wide, carrying the enum OID in 32 bits and the value in the other 32. This was dismissed as too wasteful. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Jun 7, 2010 at 5:16 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote: > I don't know if the JSON datatype will be a contrib module or built-in > yet, but if it were contrib, would it still be better to use fixed > OIDs anyway? Part of the point is that EXPLAIN (FORMAT JSON) should return json, so this needs to be built-in. Otherwise, that won't work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Joseph Adams wrote: > > By the way, just curious: why can't the internal representation of an > enum just be an INT starting from 0 by default, like in C? That would > make a heck of a lot more sense, in my opinion. It might also allow > users to do things like this in the future: > > > Please review the debates over the internal representation from several years ago when enums were implemented. Essentially the difficulty is that the output function needs to get nothing more than the value itself, and that means the representation needs to carry with it some information about *which* enum set it is in. cheers andrew