Discussion: libpq and Binary Data Formats
First of all, apologies if this was not meant to be a feedback/wishlist mailing list.

Binary formats in libpq have (probably) been a long-standing issue (refer to the listings below), and I want to express my hope that the next revision of PostgreSQL will have better support for binary data types in libpq. I have no doubt that those binary vs. text debates sprouted because of PostgreSQL's (or rather libpq's) ambiguity when it comes to binary data support. One instance is the documentation itself: it doesn't really say (correct me if I'm wrong) that binary data is poorly supported or unsupported and that textual data is preferred. Moreover, those ambiguities are only cleared up in mailing lists/IRC/forums, which makes it seem that the arguments for text data are just an excuse to not have proper support for binary data (e.g. C: "Elephant doesn't support Hammer!" P: "You don't really need Hammer (we don't support it yet), you can do it with Screwdriver."). This is not meant to be a binary vs. text post, so I'll reserve my comments on that. Nevertheless, each has its own advantages and disadvantages, especially when it comes to strongly typed languages, and neither should be ignored.

I am well aware of the problems associated with binary formats and backward/forward compatibility: http://archives.postgresql.org/pgsql-hackers/1999-08/msg00374.php but nevertheless, that shouldn't stop PostgreSQL/libpq's hardworking developers from coming up with a solution. The earlier link showed interest in using CORBA to handle PostgreSQL objects, but I believe that it's overkill and would like to propose using ASN.1 instead. However, what's important is not really the binary/text representation. If we look again at the list below, not everyone needs binary formats just for speed and efficiency; rather, they need them to be able to manipulate data easily. In other words, the interfaces to extract data are also important.

Best wishes,
Wil

NOTES/History of Posts:

1. "Query regarding PostgreSQL date/time binary format for libpq" <http://archives.postgresql.org/pgsql-interfaces/2007-01/msg00040.php> One of the many (clueless) individuals who want to get the binary format of the date/time struct (I know that there's a way to do this by converting the time to an epoch using extract(epoch from time), which gives something akin to time_t).
2. "Bytea network traffic: binary vs text result format" <http://archives.postgresql.org/pgsql-interfaces/2007-06/msg00000.php> One of the many binary vs. text debates.
3. "How do you convert PostgreSQL internal binary field to C datatypes" <http://archives.postgresql.org/pgsql-interfaces/2007-05/msg00046.php> An individual disgruntled because of the "half baked C API" of PostgreSQL. Although he may be wrong in some or many aspects, he has a point with regard to binary format support. Moreover, he is probably one of the many individuals who are disappointed in PostgreSQL because of this.
4. "Array handling in libpq" <http://archives.postgresql.org/pgsql-interfaces/2007-01/msg00027.php> One of the common scenarios for the "need" of a binary format (or rather, a better interface): arrays. Also, the reply to this is one of the many redundant assurances that the overhead of text is minimal.
5. "libpq PQexecParams and arrays" <http://archives.postgresql.org/pgsql-interfaces/2006-06/msg00008.php> Another one of those array issues. This time, the posters have expressed that binary formats are "poorly documented :-(".
6. "PQgetvalue failed to return column value for non-text data in binary format" <http://archives.postgresql.org/pgsql-interfaces/2007-05/msg00045.php> Another issue about binary formats, paired with the assurance (again) that the overhead of using text is minimal.

--
(<_<)(>_>)(>_<)(<.<)(>.>)(>.<)
Life is too short for dial-up.
Wilhansen Li wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
>
> Binary formats in libpq has been (probably) a long
> issue (refer to the listings below) and I want to express my hope that the
> next revision of PostgreSQL would have better support for binary data types
> in libpq.

Um - speaking as a user, not a developer, I don't actually see a
description of what problem(s) you are suggesting be solved. Are you
saying there should be better documentation, or a new format?

--
Richard Huxton
Archonet Ltd
Basically, better support for binary formats, which includes, but is not limited to:
1) functions for converting to and from various datatypes
2) reducing the need to convert to and from network byte order
3) better documentation
My suggestion of using ASN.1 was merely a naive suggestion on how it can be implemented properly without breaking (future) compatibility, because that seems to be the main problem that prevents the use of binary formats.
--
(<_<)(>_>)(>_<)(<.<)(>.>)(>.<)
Life is too short for dial-up.
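For context on points 1 and 2 above, here is a minimal sketch of the manual work an application currently does to prepare an int4 value as a binary parameter. Integers in the binary wire format are sent in network byte order (big-endian), so the client serializes the value byte by byte. The helper name `encode_int4` is hypothetical and is not part of libpq.

```c
#include <stdint.h>

/* Hypothetical helper (not a libpq function): serialize a host-order
 * 32-bit integer into 4 bytes in network byte order (big-endian).
 * Shifting works the same way on little- and big-endian hosts. */
static void encode_int4(int32_t value, unsigned char out[4])
{
    uint32_t u = (uint32_t) value;   /* well-defined modular conversion */

    out[0] = (unsigned char) (u >> 24);
    out[1] = (unsigned char) (u >> 16);
    out[2] = (unsigned char) (u >> 8);
    out[3] = (unsigned char) u;
}
```

The resulting 4-byte buffer is what an application would pass as a parameter value to `PQexecParams`, with the matching entry in `paramLengths` set to 4 and the entry in `paramFormats` set to 1 (binary).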
On 6/5/07, Richard Huxton <dev@archonet.com> wrote:
Wilhansen Li wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
>
> Binary formats in libpq has been (probably) a long
> issue (refer to the listings below) and I want to express my hope that the
> next revision of PostgreSQL would have better support for binary data types
> in libpq.
Um - speaking as a user, not a developer, I don't actually see a
description of what problem(s) you are suggesting be solved. Are you
saying there should be better documentation, or a new format?
--
Richard Huxton
Archonet Ltd
Wilhansen Li wrote:
> Basically, better support for binary formats which includes, but not
> limited to:
> 1) functions for converting to and from various datatypes
> 2) reducing the need to convert to and from network byte order
> 3) better documentation
>
> My suggestion on using ASN.1 was merely a naive suggestion on how it can be
> implemented properly without breaking (future) compatibility because that
> seems to be the main problem which prevents the use of binary formats.

Well, it sounds to me like this is two separate items: (1+2), (3).

For (3) there is the pgsql-docs mailing list. If you have additions/changes, that's the place you want. Submissions in text are fine, you don't need to worry about SGML formatting, but do discuss them first. The documentation relies on people saying "I don't think this bit is clear", so help is always welcome.

For (1+2) it sounds like what you actually want is a "native binary for my application" protocol rather than the "internal binary format" which is sort of what's available now. Clearly "application binary" is an addition rather than a replacement (unless everyone using binary transfers thinks it's so much better they're happy to switch immediately).

A few obvious questions leap out at me:
1. What languages are you seeking to target: just "C"?
2. What platforms are you seeking to target: intel 32 bit? 64 bit? powerpc? arm?
3. How much do I gain (and lose) over text transfer, and under what circumstances?
4. What will happen with custom/user-defined types? Will they need their own "adaptor" written to support this?

Crucially, I think you want to demonstrate #3 - that there's a clear gain for all the work that's involved in defining a separate transfer encoding. If you can demonstrate the gains are felt by all the Perl/PHP/Java applications too, that'd obviously help.

Bear in mind I'm just another user of PostgreSQL, not a developer, so you could do everything I've said and still not interest core in making changes. However, I've seen a lot of changes come and go and I think you'll need to make progress on those 4 points to get anywhere.

--
Richard Huxton
Archonet Ltd
On 6/4/07, Wilhansen Li <willi.t1@gmail.com> wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
>
> Binary formats in libpq has been (probably) a long issue (refer to the
> listings below) and I want to express my hope that the next
> revision of PostgreSQL would have better support for binary data types in
> libpq. I am in no doubt that those binary vs. text debates sprouted because
> of PostgreSQL's (or rather libpq's) ambiguity when it comes to binary data
> support. One instance is the documentation itself: it didn't really say
> (correct me if I'm wrong) that binary data is poorly/not supported and that
> textual data is preferred. Moreover, those ambiguities are only cleared up
> in mailing lists/irc/forums which make it seem that the arguments for text
> data is just an excuse to not have proper support for binary data (e.g.
> C: "Elephant doesn't support Hammer!" P: "You don't really need Hammer (we
> don't support it yet), you can do it with Screwdriver."). This is not meant
> to be a binary vs. text post so I'll reserve my comments for them.
> Nevertheless, they each have their own advantages and disadvantages,
> especially when it comes to strongly typed languages, and neither should
> be ignored.
>
> I am well-aware of the problems associated with binary formats and
> backward/forward compatibility:
> http://archives.postgresql.org/pgsql-hackers/1999-08/msg00374.php
> but nevertheless, that shouldn't stop PostgreSQL/libpq's
> hardworking developers from coming up with a solution. The
> earlier link showed the interest of using CORBA to handle PostgreSQL objects
> but I believe that it's an overkill and would like to propose using ASN.1
> instead. However, what's important is not really the binary/text
> representation. If we look again at the list below, not everyone needs
> binary formats just for speed and efficiency, rather, they need it to be
> able to easily manipulate data. In other words, the interfaces to extract
> data is also important.

Personally, I wouldn't mind seeing the libpq API extended to support arrays and record structures. PostgreSQL 8.3 is bringing arrays of composite types, and the lack of client-side support for these structures is becoming increasingly glaring. If set up with a text/binary switch, this would deal with at least part of your objections.

I think most people here would agree that certain aspects of the documentation of binary formats are a bit weak and could use improvement (although it's possible that certain formats were deliberately not documented because they may change). A classy move would be to make specific suggestions in -docs and produce a patch.

ISTM that many if not most people who are looking at binary interfaces to the database are doing it for the wrong reasons, and you should consider that when reviewing historical discussions :-). Also, dealing with large bytea values in the database, which is probably the most common use case, is pretty well covered in the libpq documentation IMO.

merlin
Richard Huxton <dev@archonet.com> writes:
> Wilhansen Li wrote:
>> Basically, better support for binary formats which includes, but not
>> limited to:
>> 1) functions for converting to and from various datatypes
>> 2) reducing the need to convert to and from network byte order
>> 3) better documentation

> Well, it sounds to me like this is two separate items: (1+2), (3).

I could see adding more support in libpq for converting native int and float types to and from the existing on-the-wire binary formats, rather than making applications do it for themselves as is the case now. But I think you've got 0 chance of persuading anyone that we should try to support platform-dependent on-the-wire formats --- the potential performance advantages are minimal and the added complexity large.

IOW, 1, 3 yes, 2 no.

regards, tom lane
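To make the kind of conversion helper described above concrete, here is a rough sketch of decoding int4 and float8 values from the existing binary wire format, which sends both in network byte order. The names `decode_int4` and `decode_float8` are hypothetical, not libpq functions, and the float8 case assumes the client's `double` is a 64-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a big-endian 4-byte integer, such as the
 * bytes returned by PQgetvalue() for an int4 column in binary format. */
static int32_t decode_int4(const unsigned char *buf)
{
    uint32_t u = ((uint32_t) buf[0] << 24) |
                 ((uint32_t) buf[1] << 16) |
                 ((uint32_t) buf[2] << 8)  |
                  (uint32_t) buf[3];
    int32_t value;

    /* Copy the bit pattern instead of casting, to avoid the
     * implementation-defined unsigned-to-signed conversion. */
    memcpy(&value, &u, sizeof value);
    return value;
}

/* Hypothetical helper: read a big-endian 8-byte float8.
 * Assumes the host double is a 64-bit IEEE 754 value. */
static double decode_float8(const unsigned char *buf)
{
    uint64_t bits = 0;
    double value;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | buf[i];

    memcpy(&value, &bits, sizeof value);
    return value;
}
```

In an application, `buf` would be the pointer returned by `PQgetvalue` for a result requested in binary format, after checking `PQgetlength` against the expected size (4 or 8 bytes) and `PQgetisnull` for NULLs.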