Thread: libpq and Binary Data Formats

libpq and Binary Data Formats

From
"Wilhansen Li"
Date:
First of all, apologies if this was not meant to be a feedback/wishlist
mailing list.

Binary formats in libpq have (probably) been a long-standing issue (refer
to the listings below), and I want to express my hope that the next
revision of PostgreSQL will have better support for binary data types in
libpq. I have no doubt that those binary vs. text debates sprouted because
of PostgreSQL's (or rather libpq's) ambiguity when it comes to binary data
support. One instance is the documentation itself: it doesn't really say
(correct me if I'm wrong) that binary data is poorly/not supported and that
textual data is preferred. Moreover, those ambiguities are only cleared up
in mailing lists/IRC/forums, which makes it seem that the arguments for
text data are just an excuse not to have proper support for binary data
(e.g. C: "Elephant doesn't support Hammer!" P: "You don't really need
Hammer (we don't support it yet), you can do it with Screwdriver."). This
is not meant to be a binary vs. text post, so I'll reserve my comments on
them. Nevertheless, each has its own advantages and disadvantages,
especially when it comes to strongly typed languages, and neither should
be ignored.

I am well aware of the problems associated with binary formats and
backward/forward compatibility:
http://archives.postgresql.org/pgsql-hackers/1999-08/msg00374.php
but nevertheless, that shouldn't stop PostgreSQL/libpq's hardworking
developers from coming up with a solution. The earlier link showed
interest in using CORBA to handle PostgreSQL objects, but I believe that
it's overkill and would like to propose using ASN.1 instead. However,
what's important is not really the binary/text representation. If we look
again at the list below, not everyone needs binary formats just for speed
and efficiency; rather, they need them to be able to manipulate data
easily. In other words, the interfaces to extract data are also important.

Best wishes,
Wil

NOTES/History of Posts:

1. "Query regarding PostgreSQL date/time binary format for libpq"
   <http://archives.postgresql.org/pgsql-interfaces/2007-01/msg00040.php>
   One of the many (clueless) individuals who want to get the binary
   format of the date/time struct (I know that there's a way to do this by
   converting the time to epoch using extract(epoch from time) to convert
   it to something akin to time_t).
2. "Bytea network traffic: binary vs text result format"
   <http://archives.postgresql.org/pgsql-interfaces/2007-06/msg00000.php>
   One of the many binary vs. text debates.
3. "How do you convert PostgreSQL internal binary field to C datatypes"
   <http://archives.postgresql.org/pgsql-interfaces/2007-05/msg00046.php>
   An individual disgruntled because of the "half baked C API" of
   PostgreSQL. Although he may be wrong in some or many aspects, he has a
   point with regard to binary format support. Moreover, he is probably
   one of many individuals who are disappointed in PostgreSQL because of
   this.
4. "Array handling in libpq"
   <http://archives.postgresql.org/pgsql-interfaces/2007-01/msg00027.php>
   One of the common scenarios for the "need" of a binary format (or
   rather, a better interface): arrays. Also, the reply to this is one of
   the many/redundant assurances that the overhead of text is minimal.
5. "libpq PQexecParams and arrays"
   <http://archives.postgresql.org/pgsql-interfaces/2006-06/msg00008.php>
   Another one of those array issues. This time, the posters said that
   binary formats are "poorly documented :-(".
6. "PQgetvalue failed to return column value for non-text data in binary
   format"
   <http://archives.postgresql.org/pgsql-interfaces/2007-05/msg00045.php>
   Another issue about binary formats, paired with the assurance (again)
   that the overhead of using text is minimal.

--
(<_<)(>_>)(>_<)(<.<)(>.>)(>.<)
Life is too short for dial-up.
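
To make note 1 concrete, here is a minimal sketch of the client-side
conversion the poster in that thread was after. It assumes the server was
built with integer datetimes, in which case a binary-format timestamp
arrives as an 8-byte big-endian count of microseconds since 2000-01-01
00:00:00. Servers built with floating-point datetimes send a float8 of
seconds instead, which this sketch does not handle. The helper names
(wire_to_int64, pg_timestamp_to_time_t) are hypothetical and not part of
libpq.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Seconds between the Unix epoch (1970-01-01) and the PostgreSQL
 * timestamp epoch (2000-01-01): 10957 days * 86400 seconds. */
#define PG_EPOCH_OFFSET_SECS 946684800LL

/* Assemble an 8-byte big-endian (network order) value into an int64_t. */
static int64_t wire_to_int64(const unsigned char *buf)
{
    uint64_t u = 0;
    int64_t v;
    int i;

    assert(buf != NULL);
    for (i = 0; i < 8; i++)
        u = (u << 8) | buf[i];
    memcpy(&v, &u, sizeof v);   /* reinterpret the bits as signed */
    return v;
}

/* Convert microseconds since 2000-01-01 to time_t, rounding toward
 * negative infinity so that pre-2000 values land on the right second.
 * ASSUMPTION: the server uses integer datetimes. */
static time_t pg_timestamp_to_time_t(int64_t pg_usecs)
{
    int64_t secs = pg_usecs / 1000000;

    if (pg_usecs % 1000000 < 0)
        secs -= 1;
    return (time_t)(secs + PG_EPOCH_OFFSET_SECS);
}
```

With a binary-format result, the 8 bytes would come from PQgetvalue() for
the timestamp column. The extract(epoch from ...) approach mentioned in
note 1 avoids this decoding entirely by doing the conversion in SQL.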

Re: libpq and Binary Data Formats

From
Richard Huxton
Date:
Wilhansen Li wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
> 
> Binary formats in libpq has been (probably) a long
> issue (refer to the listings below) and I want to express my hope that the
> next revision of PostgreSQL would have better support for binary data types
> in libpq.

Um - speaking as a user, not a developer, I don't actually see a 
description of what problem(s) you are suggesting be solved. Are you 
saying there should be better documentation, or a new format?

--   Richard Huxton  Archonet Ltd


Re: libpq and Binary Data Formats

From
"Wilhansen Li"
Date:
Basically, better support for binary formats, which includes, but is not limited to:
1) functions for converting to and from various datatypes
2) reducing the need to convert to and from network byte order
3) better documentation

My suggestion to use ASN.1 was merely a naive idea of how it could be implemented properly without breaking (future) compatibility, because that seems to be the main problem preventing the use of binary formats.

On 6/5/07, Richard Huxton <dev@archonet.com> wrote:
Wilhansen Li wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
>
> Binary formats in libpq has been (probably) a long
> issue (refer to the listings below) and I want to express my hope that the
> next revision of PostgreSQL would have better support for binary data types
> in libpq.

Um - speaking as a user, not a developer, I don't actually see a
description of what problem(s) you are suggesting be solved. Are you
saying there should be better documentation, or a new format?

--
   Richard Huxton
   Archonet Ltd



--
(<_<)(>_>)(>_<)(<.<)(>.>)(>.<)
Life is too short for dial-up.

Re: libpq and Binary Data Formats

From
Richard Huxton
Date:
Wilhansen Li wrote:
> Basically, better support for binary formats which includes, but not 
> limited
> to:
> 1) functions for converting to and from various datatypes
> 2) reducing the need to convert to and from network byte order
> 3) better documentation
> 
> My suggestion on using ASN.1 was merely a naive suggestion on how it can be
> implemented properly without breaking (future) compatibility because that
> seems to be the main problem which prevents the use of binary formats.

Well, it sounds to me like this is two separate items: (1+2), (3).

For (3) there is the pgsql-docs mailing list. If you have 
additions/changes, that's the place you want. Submissions in text are 
fine, you don't need to worry about SGML formatting, but do discuss them 
first. The documentation relies on people saying "I don't think this bit 
is clear", so help is always welcome.

For (1+2) it sounds like what you actually want is a "native binary for 
my application" protocol rather than "internal binary format" which is 
sort of what's available now. Clearly "application binary" is an 
addition rather than a replacement (unless everyone using binary 
transfers thinks it's so much better they're happy to switch immediately).

A few obvious questions leap out at me:
1. What languages are you seeking to target: just "C"?
2. What platforms are you seeking to target: intel 32 bit? 64 bit? 
powerpc? arm?
3. How much do I gain (and lose) over text transfer, and under what 
circumstances?
4. What will happen with custom/user-defined types? Will they need their 
own "adaptor" written to support this?

Crucially, I think you want to demonstrate #3 - that there's a clear 
gain for all the work that's involved in defining a separate transfer 
encoding. If you can demonstrate the gains are felt by all the 
Perl/PHP/Java applications too that'd obviously help.

Bear in mind I'm just another user of PostgreSQL, not a developer, so 
you could do everything I've said and still not interest core in making 
changes. However, I've seen a lot of changes come and go and I think 
you'll need to make progress on those 4 points to get anywhere.

--   Richard Huxton  Archonet Ltd


Re: libpq and Binary Data Formats

From
"Merlin Moncure"
Date:
On 6/4/07, Wilhansen Li <willi.t1@gmail.com> wrote:
> First of all, apologies if this was not meant to be a feedback/wishlist
> mailing list.
>
> Binary formats in libpq has been (probably) a long issue (refer to the
> listings below) and I want to express my hope that the next
> revision of PostgreSQL would have better support for binary data types in
> libpq. I am in no doubt that those binary vs. text debates sprouted because
> of PostgreSQL's (or rather libpq's) ambiguity when it comes to binary data
> support. One instance is the documentation itself: it didn't really say
> (correct me if I'm wrong) that binary data is poorly/not supported and that
> textual data is preferred. Moreover, those ambiguities are only cleared up
> in mailing lists/irc/forums which make it seem that the arguments for text
> data is just an excuse to not have proper support for binary data ( e.x.
> C:"Elephant doesn't support Hammer!" P: "You don't really need Hammer (we
> don't support it yet), you can do it with Screwdriver."). This is not meant
> to be a binary vs. text post so I'll reserve my comments for them.
> Nevertheless, they each have their own advantages and disadvantages
> especially when it comes to strongly typed languages that neither shouldn't
> be ignored.
>
> I am well-aware of the problems associated with binary formats and
> backward/forward compatibility:
> http://archives.postgresql.org/pgsql-hackers/1999-08/msg00374.php
> but nevertheless, that shouldn't stop PostgreSQL/libpq's
> hardworking developers from coming up with a solution. The
> earlier link showed the interest of using CORBA to handle PostgreSQL objects
> but I believe that it's an overkill and would like to propose using ASN.1
> instead. However, what's important is not really the binary/text
> representation. If we look again at the list below, not everyone needs
> binary formats just for speed and efficiency, rather, they need it to be
> able to easily manipulate data. In other words, the interfaces to extract
> data is also important.

Personally, I wouldn't mind seeing the libpq API extended to support
arrays and record structures.  PostgreSQL 8.3 is bringing arrays of
composite types and the lack of client side support of these
structures is becoming increasingly glaring.  If set up with a
text/binary switch, this would deal with at least part of your
objections.

I think most people here would agree that certain aspects of the
documentation of binary formats are a bit weak and could use
improvement (although, it's possible that certain formats were
deliberately not documented because they may change).   A classy move
would be to make specific suggestions in -docs and produce a patch.

ISTM that many if not most people who are looking at binary
interfaces to the database are doing it for the wrong reasons and you
should consider that when reviewing historical discussions :-).  Also,
dealing with large bytea values in the database, which is probably the
most common use case, is pretty well covered in the libpq documentation
IMO.

merlin
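
To illustrate what the missing client-side array support means in
practice, here is a rough sketch of decoding a one-dimensional int4[]
value from the binary result format by hand. It assumes the layout the
server uses when sending arrays: a header of three 4-byte big-endian
fields (number of dimensions, a has-nulls flag, element type OID), then a
(length, lower bound) pair per dimension, then each element as a 4-byte
length followed by its bytes. The function names are hypothetical, nothing
here is a libpq API, and NULL elements and multi-dimensional arrays are
rejected rather than handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT4_TYPE_OID 23  /* type OID of int4 */

/* Read a 4-byte big-endian (network order) signed integer. */
static int32_t rd32(const unsigned char *p)
{
    uint32_t u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    int32_t v;

    memcpy(&v, &u, sizeof v);   /* reinterpret the bits as signed */
    return v;
}

/* Decode a 1-D int4[] in binary format into out[0..max_out-1].
 * Returns the element count, or -1 if the input is malformed, contains
 * NULLs, has more than one dimension, or does not fit in out. */
static int decode_int4_array_1d(const unsigned char *buf, size_t len,
                                int32_t *out, int max_out)
{
    int32_t ndim, n, i;
    size_t pos = 20;            /* header (12) + one dimension pair (8) */

    assert(buf != NULL && out != NULL);
    if (len < 12)
        return -1;
    ndim = rd32(buf);
    if (rd32(buf + 8) != INT4_TYPE_OID)
        return -1;
    if (ndim == 0)
        return 0;               /* empty array: no dimension pairs follow */
    if (ndim != 1 || len < 20)
        return -1;
    n = rd32(buf + 12);         /* element count; lower bound at +16 ignored */
    if (n < 0 || n > max_out)
        return -1;
    for (i = 0; i < n; i++) {
        if (len - pos < 4)
            return -1;
        if (rd32(buf + pos) != 4)   /* -1 marks NULL; int4 is always 4 */
            return -1;
        pos += 4;
        if (len - pos < 4)
            return -1;
        out[i] = rd32(buf + pos);
        pos += 4;
    }
    return (int)n;
}
```

In a real client, buf and len would come from PQgetvalue() and
PQgetlength() on a result requested in binary format. Composite types
would need a similar hand-written walk, which is the gap described above.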


Re: libpq and Binary Data Formats

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> Wilhansen Li wrote:
>> Basically, better support for binary formats which includes, but not 
>> limited
>> to:
>> 1) functions for converting to and from various datatypes
>> 2) reducing the need to convert to and from network byte order
>> 3) better documentation

> Well, it sounds to me like this is two separate items: (1+2), (3).

I could see adding more support in libpq for converting native int and
float types to and from the existing on-the-wire binary formats, rather
than making applications do it for themselves as is the case now.  But I
think you've got 0 chance of persuading anyone that we should try to
support platform-dependent on-the-wire formats --- the potential
performance advantages are minimal and the added complexity large.
IOW, 1, 3 yes, 2 no.
        regards, tom lane
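
For reference, the do-it-yourself conversion that applications currently
perform for int and float types might look roughly like the sketch below.
It assumes the existing on-the-wire binary formats for int4 and float8
are a 4-byte big-endian integer and an 8-byte big-endian IEEE 754 double,
and that the host's double is also IEEE 754 binary64. The function names
are hypothetical and are not proposed or existing libpq functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an int4 from its 4-byte network-order wire form. */
static int32_t wire_to_int32(const unsigned char *buf)
{
    uint32_t u;
    int32_t v;

    assert(buf != NULL);
    u = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    memcpy(&v, &u, sizeof v);   /* reinterpret the bits as signed */
    return v;
}

/* Encode an int4 into 4 network-order bytes, e.g. for a binary parameter. */
static void int32_to_wire(int32_t v, unsigned char *buf)
{
    uint32_t u;

    assert(buf != NULL);
    memcpy(&u, &v, sizeof u);
    buf[0] = (unsigned char)(u >> 24);
    buf[1] = (unsigned char)(u >> 16);
    buf[2] = (unsigned char)(u >> 8);
    buf[3] = (unsigned char)u;
}

/* Decode a float8 from its 8-byte network-order wire form.
 * ASSUMPTION: the host double is IEEE 754 binary64. */
static double wire_to_float8(const unsigned char *buf)
{
    uint64_t u = 0;
    double d;
    int i;

    assert(buf != NULL);
    for (i = 0; i < 8; i++)
        u = (u << 8) | buf[i];
    memcpy(&d, &u, sizeof d);
    return d;
}
```

Shifting bytes explicitly keeps the code independent of the host's byte
order, so the same source works on big-endian and little-endian machines
without the wire format itself becoming platform-dependent.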