Thread: Priorities for 6.6

Priorities for 6.6

From
Tom Lane
Date:
Jan Wieck writes (over in pgsql-sql):
>     * WE STILL NEED THE GENERAL TUPLE SPLIT CAPABILITY!!! *

I've been thinking about making this post for a while ... with 6.5
almost out the door, I guess now is a good time.

I don't know what people have had in mind for 6.6, but I propose that
there ought to be three primary objectives for our next release:

1. Eliminate arbitrary restrictions on tuple size.

2. Eliminate arbitrary restrictions on query size (textual
   length/complexity that is).

3. Cure within-statement memory leaks, so that processing large numbers
   of tuples in one statement is reliable.

All of these are fairly major projects, and it might be that we get
little or nothing else done if we take these on.  But these are the
problems we've been hearing about over and over and over.  I think
fixing these would do more to improve Postgres than almost any other
work we might do.

Comments?  Does anyone have a different list of pet peeves?  Is there
any chance of getting everyone to subscribe to a master plan like this?
        regards, tom lane


Re: [HACKERS] Priorities for 6.6

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> I don't know what people have had in mind for 6.6, but I propose that
> there ought to be three primary objectives for our next release:
> 
> 1. Eliminate arbitrary restrictions on tuple size.

This is not primary for me -:) 
Though, it's required by PL/pgSQL and so... I agreed that
this problem must be resolved in some way. Related TODO items:

* Allow compression of large fields or a compressed field type
* Allow large text type to use large objects(Peter)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I like it very much, though I don't like that LO are stored
in separate files. This is known as "multi-representation" feature
in Illustra.

> 2. Eliminate arbitrary restrictions on query size (textual
>    length/complexity that is).

Yes, this is quite an annoying thing.

> 3. Cure within-statement memory leaks, so that processing large numbers
>    of tuples in one statement is reliable.

Quite significant!

> All of these are fairly major projects, and it might be that we get
> little or nothing else done if we take these on.  But these are the
> problems we've been hearing about over and over and over.  I think
> fixing these would do more to improve Postgres than almost any other
> work we might do.
> 
> Comments?  Does anyone have a different list of pet peeves?  Is there
> any chance of getting everyone to subscribe to a master plan like this?

No chance -:))

This is what I would like to see in 6.6:

1. Referential integrity.
2. Dirty reads (will be required by 1. if we'll decide to follow
   the way proposed by Jan - using rules, - though there is another
   way I'll talk about later; dirty reads are useful anyway).
3. Savepoints (they are my primary wish-to-implement thing).
4. elog(ERROR) must return error-codes, not just messages!
   This is very important for non-interactive application...
   in conjunction with 3. -:)

Vadim


Re: [HACKERS] Priorities for 6.6

From
The Hermit Hacker
Date:
On Fri, 4 Jun 1999, Vadim Mikheev wrote:

> * Allow compression of large fields or a compressed field type

This one looks cool...

> > All of these are fairly major projects, and it might be that we get
> > little or nothing else done if we take these on.  But these are the
> > problems we've been hearing about over and over and over.  I think
> > fixing these would do more to improve Postgres than almost any other
> > work we might do.
> > 
> > Comments?  Does anyone have a different list of pet peeves?  Is there
> > any chance of getting everyone to subscribe to a master plan like this?
> 
> No chance -:))

have to agree with Vadim here...the point that has *always* been stressed
here is that if something is important to you, fix it.  Don't expect
anyone else to fall into some sort of "party line" or schedule, cause
then ppl lose the enjoyment in what they are doing *shrug*

for instance, out of the three things you listed, the only one that I'd
consider an issue is the third, as I've never hit the first two
limitations ...*shrug*

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> This is what I would like to see in 6.6:
> 
> 1. Referential integrity.

Bingo.  Item #1.  Period.  End of story.  Everything else pales in
comparison.  We just get too many requests for this, though I think it
an insignificant feature myself.  Jan, I believe you have some ideas on
this.  (Like an elephant, I never forget.)


> 4. elog(ERROR) must return error-codes, not just messages!
>    This is very important for non-interactive application...
>    in conjunction with 3. -:)

Added to TODO.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> Jan Wieck writes (over in pgsql-sql):
> >     * WE STILL NEED THE GENERAL TUPLE SPLIT CAPABILITY!!! *
> 
> I've been thinking about making this post for a while ... with 6.5
> almost out the door, I guess now is a good time.
> 
> I don't know what people have had in mind for 6.6, but I propose that
> there ought to be three primary objectives for our next release:
> 
> 1. Eliminate arbitrary restrictions on tuple size.
> 
> 2. Eliminate arbitrary restrictions on query size (textual
>    length/complexity that is).
> 
> 3. Cure within-statement memory leaks, so that processing large numbers
>    of tuples in one statement is reliable.

I think the other hot item for 6.6 is outer joins.

 


Re: [HACKERS] Priorities for 6.6

From
Vadim Mikheev
Date:
Bruce Momjian wrote:
> 
> I think the other hot item for 6.6 is outer joins.

I would like to have 48 hours in day -:)

Vadim


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
> > 
> > I think the other hot item for 6.6 is outer joins.
> 
> I would like to have 48 hours in day -:)
> 
> Vadim
> 

You and I are off the hook.  Jan volunteered for foreign keys, and
Thomas for outer joins.  We can relax.  :-)

 


Re: [HACKERS] Priorities for 6.6

From
Vadim Mikheev
Date:
Bruce Momjian wrote:
> 
> > Bruce Momjian wrote:
> > >
> > > I think the other hot item for 6.6 is outer joins.
> >
> > I would like to have 48 hours in day -:)
> >
> > Vadim
> >
> 
> You and I are off the hook.  Jan volunteered for foreign keys, and
> Thomas for outer joins.  We can relax.  :-)

I volunteered for savepoints -:))

Vadim


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
> > 
> > > Bruce Momjian wrote:
> > > >
> > > > I think the other hot item for 6.6 is outer joins.
> > >
> > > I would like to have 48 hours in day -:)
> > >
> > > Vadim
> > >
> > 
> > You and I are off the hook.  Jan volunteered for foreign keys, and
> > Thomas for outer joins.  We can relax.  :-)
> 
> I volunteered for savepoints -:))

Oh.

Hey, I thought you were going to sleep?

 


Re: [HACKERS] Priorities for 6.6

From
Vadim Mikheev
Date:
Bruce Momjian wrote:
> 
> > > > >
> > > > > I think the other hot item for 6.6 is outer joins.
> > > >
> > > > I would like to have 48 hours in day -:)
> > > >
> > > > Vadim
> > > >
> > >
> > > You and I are off the hook.  Jan volunteered for foreign keys, and
> > > Thomas for outer joins.  We can relax.  :-)
> >
> > I volunteered for savepoints -:))
> 
> Oh.
> 
> Hey, I thought you were going to sleep?

I just try to have at least 25 hours in day :)

Vadim


Re: [HACKERS] Priorities for 6.6

From
Tom Lane
Date:
Vadim Mikheev <vadim@krs.ru> writes:
> Tom Lane wrote:
>> 1. Eliminate arbitrary restrictions on tuple size.

> This is not primary for me -:) 

Fair enough; it's not something I need either.  But I see complaints
about it constantly on the mailing lists; a lot of people do need it.

> * Allow large text type to use large objects(Peter)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I like it very much, though I don't like that LO are stored
> in separate files.

But, but ... if we fixed the tuple-size problem then people could stop
using large objects at all, and instead just put their data into tuples.
I hate to see work going into improving LO support when we really ought
to be phasing out the whole feature --- it's got *so* many conceptual
and practical problems ...

>> any chance of getting everyone to subscribe to a master plan like this?

> No chance -:))

Yeah, I know ;-).  But I was hoping to line up enough people so that
these things have some chance of getting done.  I doubt that any of
these projects can be implemented by just one or two people; they all
affect too much of the code.  (For instance, eliminating query-size
restrictions will require looking at all of the interface libraries,
psql, pg_dump, and probably other apps, even though the fixes in
the backend should be somewhat localized.)
        regards, tom lane


Re: [HACKERS] Priorities for 6.6

From
Don Baccus
Date:
At 05:39 PM 6/3/99 -0400, Tom Lane wrote:

>But, but ... if we fixed the tuple-size problem then people could stop
>using large objects at all, and instead just put their data into tuples.
>I hate to see work going into improving LO support when we really ought
>to be phasing out the whole feature --- it's got *so* many conceptual
>and practical problems ...

Making them go away would be a real blessing.  Oracle folk
bitch about CLOBS and BLOBS and the like, too.  They're a 
pain.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, and other goodies at
  http://donb.photo.net


Re: [HACKERS] Priorities for 6.6

From
Vadim Mikheev
Date:
Don Baccus wrote:
> 
> At 05:39 PM 6/3/99 -0400, Tom Lane wrote:
> 
> > * Allow large text type to use large objects(Peter)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > I like it very much, though I don't like that LO are stored
> > in separate files. This is known as "multi-representation" feature
> > in Illustra.
> >
> >But, but ... if we fixed the tuple-size problem then people could stop
> >using large objects at all, and instead just put their data into tuples.
> >I hate to see work going into improving LO support when we really ought
> >to be phasing out the whole feature --- it's got *so* many conceptual
> >and practical problems ...
> 
> Making them go away would be a real blessing.  Oracle folk
> bitch about CLOBS and BLOBS and the like, too.  They're a
> pain.

Note: I told about "multi-representation" feature, not just about
LO/CLOBS/BLOBS support. "Multi-representation" means that server
stores tuple fields sometime inside the main relation file,
sometime outside of it, but this is hidden from user and so
people "just put their data into tuples". I think that putting
big fields outside of main relation file is very good thing.
BTW, this approach also allows what you are proposing - why not
put not too big field (~ 8K or so) to another block of main file?
BTW, I don't like using LOs as external storage.

Implementation seems easy:

struct varlena
{
    int32       vl_len;
    char        vl_dat[1];
};

1. make vl_len uint32;
2. use vl_len & 0x80000000 as flag that underlying data is
   in another place;
3. put oid of external "relation" (where data is stored),
   blocknumber and item position (something else?) to vl_dat.
...
...
...

Vadim
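
The three steps in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the Postgres tree: the names VL_EXTERNAL_FLAG, VL_LENGTH_MASK, ExternalRef and the helper functions are hypothetical, as is the choice to keep the value's logical length in the low 31 bits. Only the 0x80000000 flag and the (oid, block number, item position) triple come from the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; only the flag value is taken from the proposal. */
#define VL_EXTERNAL_FLAG 0x80000000u
#define VL_LENGTH_MASK   0x7FFFFFFFu

/* What vl_dat would hold when the value is stored out of line. */
typedef struct ExternalRef {
    uint32_t relation_oid;   /* oid of the external "relation" */
    uint32_t block_number;   /* block that holds the data */
    uint16_t item_position;  /* item slot within that block */
} ExternalRef;

/* True when the high bit of vl_len says the data lives elsewhere. */
static int vl_is_external(uint32_t vl_len)
{
    return (vl_len & VL_EXTERNAL_FLAG) != 0;
}

/* Length with the flag bit stripped off. */
static uint32_t vl_length(uint32_t vl_len)
{
    return vl_len & VL_LENGTH_MASK;
}

/* Build a length word for an out-of-line value (assumed: low 31 bits
 * carry the logical length of the value). */
static uint32_t vl_make_external(uint32_t length)
{
    assert(length <= VL_LENGTH_MASK);
    return length | VL_EXTERNAL_FLAG;
}

/* Copy the reference into the vl_dat bytes. */
static void vl_store_ref(char *vl_dat, const ExternalRef *ref)
{
    memcpy(vl_dat, ref, sizeof *ref);
}

/* Read the reference back out of the vl_dat bytes. */
static ExternalRef vl_load_ref(const char *vl_dat)
{
    ExternalRef ref;
    memcpy(&ref, vl_dat, sizeof ref);
    return ref;
}
```

Code that only looks at in-line values would call vl_is_external() first and follow the reference when the flag is set; how the referenced block is actually fetched is left open here, as it is in the message.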


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> Implementation seems easy:
> 
> struct varlena
> {
>     int32       vl_len;
>     char        vl_dat[1];
> };
> 
> 1. make vl_len uint32;
> 2. use vl_len & 0x80000000 as flag that underlying data is
>    in another place;
> 3. put oid of external "relation" (where data is stored),
>    blocknumber and item position (something else?) to vl_dat.
> ...

Yes, it would be very nice to have this.

 


Re: [HACKERS] Priorities for 6.6

From
Don Baccus
Date:
At 10:56 AM 6/4/99 +0800, Vadim Mikheev wrote:

>Note: I told about "multi-representation" feature, not just about
>LO/CLOBS/BLOBS support. "Multi-representation" means that server
>stores tuple fields sometime inside the main relation file,
>sometime outside of it, but this is hidden from user and so
>people "just put their data into tuples".

OK, in my first response I didn't pick up on your generalization,
but I did respond with a generalization that implementation 
details should be hidden from the user.

Which is what you're saying.

As a compiler writer, this is more or less what I devoted my
life to 20 years ago...of course, reasonable efficiency is
a pre-condition if you're going to hide details from the
user...

I'll back off a bit, though, and say that a lot of DB users
really don't need an enterprise engine like Oracle (i.e.
something that requires a suite of $100K/yr DBAs :)

There's a niche for a solid reliable, rich feature set,
reasonably well-performing db out there, and this niche
is ever-growing with the web.

With $500 web servers sitting on $29.95/mo DSL lines,
as does mine (http://donb.photo.net/tweeterdom), who
wants to pay $6K to Oracle?  





Re: [HACKERS] Priorities for 6.6

From
Don Baccus
Date:
At 10:56 AM 6/4/99 +0800, Vadim Mikheev wrote:

>Note: I told about "multi-representation" feature, not just about
>LO/CLOBS/BLOBS support. "Multi-representation" means that server
>stores tuple fields sometime inside the main relation file,
>sometime outside of it, but this is hidden from user and so
>people "just put their data into tuples". I think that putting
>big fields outside of main relation file is very good thing.

Yes, it is, though "big" is relative (as computers grow).  The
key is to hide the details of where things are stored from the
user, so the user doesn't really have to know what is "big"
(today) vs. "small" (tomorrow or today, for that matter).  I
don't think it's so much the efficiency hit of having big
items stored outside the main relation file, as the need for
the user to know what's "big" and what's "small", that's the
problem.

I mean, my background is as a compiler writer for high-level
languages...call me a 1970's idealist if you will, but I
really think such things should be hidden from the user.





Re: [HACKERS] Priorities for 6.6

From
Hannu Krosing
Date:
Tom Lane wrote:
> 
> I don't know what people have had in mind for 6.6, but I propose that
> there ought to be three primary objectives for our next release:
> 
> 1. Eliminate arbitrary restrictions on tuple size.
> 
> 2. Eliminate arbitrary restrictions on query size (textual
>    length/complexity that is).
> 
> 3. Cure within-statement memory leaks, so that processing large numbers
>    of tuples in one statement is reliable.

I would add a few that I think would be important:

A. Add outer joins

B. Add the possibility to prepare statements and then execute them
   with a set of arguments. This already exists in SPI but for many
   C/S apps it would be desirable to have this in the fe/be protocol
   as well

C. Look over the protocol and unify the _binary_ representations of
   datatypes on wire. in fact each type already has two sets of
   in/out conversion functions in its definition tuple, one for disk and
   another for net, it's only that until now they are the same for
   all types and thus probably used wrong in some parts of code.

D. After B. and C., add a possibility to insert binary data
   in "(small)binary" field without relying on LOs or expensive
   (4x the size) quoting. Allow any characters in said binary field

E. to make 2. and B., C, D. possible, some more fundamental changes in
   fe/be-protocol may be needed. There seems to be some effort for a new
   fe/be communications mechanism using CORBA.
   But my proposal would be to adopt the X11 protocol which is quite
   light but still very clean, well understood and which can transfer
   arbitrary data in an efficient way.
   There are even "low bandwidth" variants of it for using over
   really slow links. Also some kinds of "out of band" provisions exist,
   that are used by window managers.
   It should also be trivial to adapt crypto wrappers/proxies (such as
   the one in ssh)
   The protocol is described in a document available from
   http://www.x.org

F. As a lousy alternative to 1. fix the LO storage. Currently _all_ of
   the LO files are kept in the same directory as the tables and
   indexes.
   this can bog down the whole database quite fast if one has lots of LOs
   and a file system that does linear scans on open (like ext2).
   A scheme where LOs are kept in subdirectories based on the hex
   representation of their oids would avoid that (so LO with OID
   0x12345678 would be stored in $PG_DATA/DBNAME/LO/12/34/56/78.lo or
   maybe reversed $PG_DATA/DBNAME/LO/78/56/34/12.lo to distribute them
   more evenly in "buckets")

> All of these are fairly major projects, and it might be that we get
> little or nothing else done if we take these on.

But then, the other things to do _are_ little compared to these ;)

> But these are the problems we've been hearing about over and over and
> over.

The LO thing (and lack of decent full-text indexing) is what has kept me 
using hybrid solutions where I keep the LO data and home-grown full-text
indexes in file system outside of the database.

> I think fixing these would do more to improve Postgres than 
> almost any other work we might do.

Amen!

----------------
Hannu
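
The directory scheme from item F could be sketched like this. It is only an illustration: lo_bucket_path is a hypothetical helper, base_dir stands in for $PG_DATA/DBNAME, and the uppercase hex formatting is an assumption. The path layouts themselves (12/34/56/78.lo and the reversed 78/56/34/12.lo) are the ones given in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the bucketed path for a large object with the given oid.
 * base_dir stands in for $PG_DATA/DBNAME (assumption).
 * reversed != 0 puts the low-order byte first, which spreads
 * sequentially assigned oids over more top-level directories.
 * Returns what snprintf returns: the length needed, without the NUL. */
static int lo_bucket_path(char *out, size_t out_size,
                          const char *base_dir, uint32_t oid, int reversed)
{
    unsigned b0 = (oid >> 24) & 0xFF;  /* most significant byte */
    unsigned b1 = (oid >> 16) & 0xFF;
    unsigned b2 = (oid >> 8) & 0xFF;
    unsigned b3 = oid & 0xFF;          /* least significant byte */

    if (reversed)
        return snprintf(out, out_size, "%s/LO/%02X/%02X/%02X/%02X.lo",
                        base_dir, b3, b2, b1, b0);
    return snprintf(out, out_size, "%s/LO/%02X/%02X/%02X/%02X.lo",
                    base_dir, b0, b1, b2, b3);
}
```

With three directory levels of at most 256 entries each, no single directory grows past 256 entries, which is the point of the proposal for file systems that scan directories linearly.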


Re: [HACKERS] Priorities for 6.6

From
Peter Galbavy
Date:
On Thu, Jun 03, 1999 at 11:27:14PM -0400, Bruce Momjian wrote:
> > Implementation seems easy:
> > 
> > struct varlena
> > {
> >     int32       vl_len;
> >     char        vl_dat[1];
> > };
> > 
> > 1. make vl_len uint32;
> > 2. use vl_len & 0x80000000 as flag that underlying data is
> >    in another place;
> > 3. put oid of external "relation" (where data is stored),
> >    blocknumber and item position (something else?) to vl_dat.
> > ...
> 
> Yes, it would be very nice to have this.

I hate to be fussy - normally I am just watching, but could we
*please* keep any flag like above in another field. That way, when
the size of an object reaches 2^31 we will not have legacy problems..

struct varlena
{
    size_t  vl_len;
    int     vl_flags;
    caddr_t vl_dat[1];
};

(Please:)

Regards,
-- 
Peter Galbavy
Knowledge Matters Ltd
http://www.knowledge.com/
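
A small sketch of the difference this makes, with hypothetical names (VarlenaHeader, VL_FLAG_EXTERNAL, header_is_external) and only the header fields shown: once the flag lives in its own word, a length of 2^31 or more is just a length and cannot be mistaken for the "stored elsewhere" marker.

```c
#include <assert.h>
#include <stddef.h>

#define VL_FLAG_EXTERNAL 0x1   /* hypothetical flag bit */

/* Header part of the proposed layout; the data bytes are omitted. */
typedef struct VarlenaHeader {
    size_t vl_len;     /* whole range of size_t is available for the length */
    int    vl_flags;   /* storage flags kept apart from the length */
} VarlenaHeader;

/* The flag is tested on vl_flags only, never on the length. */
static int header_is_external(const VarlenaHeader *h)
{
    assert(h != NULL);
    return (h->vl_flags & VL_FLAG_EXTERNAL) != 0;
}
```

The cost is an extra word in every variable-length value, which is the trade-off against packing the flag into vl_len.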


Re: [HACKERS] Priorities for 6.6

From
Tom Lane
Date:
Hannu Krosing <hannu@trust.ee> writes:
> E. to make 2. and B., C, D. possible, some more fundamental changes in
>    fe/be-protocol may be needed. There seems to be some effort for a new
>    fe/be communications mechanism using CORBA. 
>    But my proposal would be to adopt the X11 protocol which is quite
>    light but still very clean, well understood and which can transfer
>    arbitrary data in an efficient way.

... but no one uses it for database work.  If we're going to go to the
trouble of overhauling the fe/be protocol, I think we should adopt
something fairly standard, and that seems to mean CORBA.

> F. As a lousy alternative to 1. fix the LO storage. Currently _all_ of
>    the LO files are kept in the same directory as the tables and
>    indexes. this can bog down the whole database quite fast

Yes.  I was thinking last night that there's no good reason not to
just stick all the LOs into a single relation --- or actually two
relations, one having a row per LO (which would really just act to tell
you what LOs exist, and perhaps store access-privileges info) and one
that has a row per LO chunk, with columns LONumber, Offset, Data rather
than just Offset and Data as is done now.  The existing index on Offset
would be replaced by a multi-index on LONumber and Offset.  In this
scheme the LONumbers need not be tied hard-and-fast to OIDs, but could
actually be anything you wanted, which would be much nicer for
dump/reload purposes.

However, I am loath to put *any* work into improving LOs, since I think
the right answer is to get rid of the need for the durn things by
eliminating the size restrictions on regular tuples.
        regards, tom lane
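
The chunk relation described above, with rows keyed on (LONumber, Offset), can be modelled in memory to show how the two-column ordering is used for lookups. This is only a sketch: LoChunk, chunk_key_cmp and find_chunk are hypothetical names, a sorted array stands in for the relation plus its multi-column index, and the length field is an assumption added so that a chunk knows where it ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the per-chunk relation: LONumber, Offset, Data. */
typedef struct LoChunk {
    uint32_t    lo_number;  /* which large object the chunk belongs to */
    uint32_t    offset;     /* byte offset of the chunk within that LO */
    uint32_t    length;     /* number of data bytes (assumed field) */
    const char *data;
} LoChunk;

/* Ordering of the multi-column key: LONumber first, then Offset. */
static int chunk_key_cmp(uint32_t lo_a, uint32_t off_a,
                         uint32_t lo_b, uint32_t off_b)
{
    if (lo_a != lo_b)
        return lo_a < lo_b ? -1 : 1;
    if (off_a != off_b)
        return off_a < off_b ? -1 : 1;
    return 0;
}

/* Find the chunk of lo_number that contains byte position pos.
 * rows must be sorted by (lo_number, offset). Returns NULL if none. */
static const LoChunk *find_chunk(const LoChunk *rows, size_t n,
                                 uint32_t lo_number, uint32_t pos)
{
    size_t lo = 0, hi = n;

    assert(rows != NULL || n == 0);
    /* Binary search for the first row whose key is > (lo_number, pos). */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (chunk_key_cmp(rows[mid].lo_number, rows[mid].offset,
                          lo_number, pos) <= 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;

    /* The row just before it is the last one with key <= target. */
    const LoChunk *c = &rows[lo - 1];
    if (c->lo_number != lo_number || pos - c->offset >= c->length)
        return NULL;
    return c;
}
```

Because the LO number is only a key column here, nothing forces it to be an OID, which matches the remark about dump/reload.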


Re: [HACKERS] Priorities for 6.6

From
Tom Lane
Date:
Vadim Mikheev <vadim@krs.ru> writes:
> Note: I told about "multi-representation" feature, not just about
> LO/CLOBS/BLOBS support. "Multi-representation" means that server
> stores tuple fields sometime inside the main relation file,
> sometime outside of it, but this is hidden from user and so
> people "just put their data into tuples". I think that putting
> big fields outside of main relation file is very good thing.

Ah, I see what you mean.  If you think that is easier than splitting
tuples, we could go that way.  We'd have a limit of about 500 fields in
a tuple (maybe less if the tuple contains "small" fields that are not
pushed to another place).  That's annoying if the goal is to eliminate
limits, but I think it would be unlikely to be a big problem in
practice.

Perhaps a better way is to imagine these "pointers to another place"
to be just part of the tuple structure on disk, without tying them to
individual fields.  In other words, the tuple's data is still a string
of fields, but now you can have that data either right there with the
tuple header, or pointed to by a list of "indirect links" that are
stored with the tuple header.  (Kinda like direct vs indirect blocks in
Unix filesystem.)  You can chop the tuple data into blocks without
regard for field boundaries if you do it that way.  I think that might
be better than altering the definition of varlena --- it'd be visible
only to the tuple read and write mechanisms, not to everything in the
executor that deals with varlena fields...
        regards, tom lane
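
A rough model of the "indirect links" idea: the tuple's data is one logical byte string that happens to be cut into blocks, and a reader copies a byte range out of it without caring where the cuts fall. Everything here is an assumption for illustration: TupleChunk and tuple_read are hypothetical names, and plain in-memory buffers stand in for the blocks that the links would point to on disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One piece of a tuple's data; in the real scheme this would be
 * reached through an indirect link stored with the tuple header. */
typedef struct TupleChunk {
    const char *data;
    size_t      length;
} TupleChunk;

/* Copy up to len bytes of the logical tuple data, starting at offset,
 * into out. Chunk boundaries need not line up with field boundaries.
 * Returns the number of bytes actually copied. */
static size_t tuple_read(const TupleChunk *chunks, size_t nchunks,
                         size_t offset, char *out, size_t len)
{
    size_t copied = 0;

    assert(chunks != NULL || nchunks == 0);
    for (size_t i = 0; i < nchunks && copied < len; i++) {
        if (offset >= chunks[i].length) {
            /* The requested range starts after this chunk: skip it. */
            offset -= chunks[i].length;
            continue;
        }
        size_t avail = chunks[i].length - offset;
        size_t want = len - copied;
        size_t take = want < avail ? want : avail;

        memcpy(out + copied, chunks[i].data + offset, take);
        copied += take;
        offset = 0;  /* later chunks are read from their start */
    }
    return copied;
}
```

Only a routine like this needs to know about the chunking; code that works on fields sees a contiguous buffer, which is the advantage claimed over changing varlena.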


Re: [HACKERS] Priorities for 6.6

From
Vince Vielhaber
Date:
On 04-Jun-99 Tom Lane wrote:
> However, I am loath to put *any* work into improving LOs, since I think
> the right answer is to get rid of the need for the durn things by
> eliminating the size restrictions on regular tuples.

Is this doable?  I just looked at the list of datatypes and didn't see
binary as one of them.  Imagining a Real Estate database with pictures
of homes (inside and out), etc. or an employee database with mugshots of
the employees, what datatype would you use to store the pictures (short 
of just storing a filename of the pic)?

Vince.
-- 
==========================================================================
Vince Vielhaber -- KA8CSH   email: vev@michvhf.com   flame-mail: /dev/null
       # include <std/disclaimers.h>                   TEAM-OS2
        Online Campground Directory    http://www.camping-usa.com
       Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================




Re: [HACKERS] Priorities for 6.6

From
Tom Lane
Date:
Vince Vielhaber <vev@michvhf.com> writes:
> On 04-Jun-99 Tom Lane wrote:
>> However, I am loath to put *any* work into improving LOs, since I think
>> the right answer is to get rid of the need for the durn things by
>> eliminating the size restrictions on regular tuples.

> Is this doable?  I just looked at the list of datatypes and didn't see
> binary as one of them.

bytea ... even if we didn't have one, inventing it would be trivial.
(Although I wonder whether pg_dump copes with arbitrary data in fields
properly ... I think there are still some issues about COPY protocol
not being fully 8-bit-clean...)

As someone else pointed out, you'd still want an equivalent of
lo_read/lo_write, but now it would mean fetch or put N bytes at an
offset of M bytes within the value of field X of tuple Y in some
relation.  Otherwise field X is pretty much like any other item in the
database.  I suppose it'd only make sense to allow random data to be
fetched/stored in a bytea field --- other datatypes would want to
constrain the data to valid values...
        regards, tom lane


Re: [HACKERS] Priorities for 6.6

From
Thomas Lockhart
Date:
> > eliminating the size restrictions on regular tuples.
> Is this doable?

Presumably we would have to work out a "chunking" client/server
protocol to allow sending very large tuples. Also, it would need to
report the size of the tuple before it shows up, to allow very large
rows to be caught correctly.
                - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Priorities for 6.6

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>>>> eliminating the size restrictions on regular tuples.
>> Is this doable?

> Presumably we would have to work out a "chunking" client/server
> protocol to allow sending very large tuples.

I don't really see a need to change the protocol.  It's true that
a single tuple containing a couple dozen megabytes (per someone's
recent example) would stress the system unpleasantly, but that would
be true in a *lot* of ways.  Perhaps we should plan on keeping the
LO feature to allow for really huge objects.

As far as I've seen, 99% of users are not interested in storing objects
that are so large that handling them as single tuples would pose serious
performance problems.  It's just that a hard limit at 8K (or any other
particular small number) is annoying.
        regards, tom lane


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
> C. Look over the protocol and unify the _binary_ representations of
>    datatypes on wire. in fact each type already has two sets of
>    in/out conversion functions in its definition tuple, one for disk and
>    another for net, it's only that until now they are the same for
>    all types and thus probably used wrong in some parts of code.

Added to TODO:
* remove duplicate type in/out functions for disk and net

> 
> D. After B. and C., add a possibility to insert binary data
>    in "(small)binary" field without relying on LOs or expensive
>    (4x the size) quoting. Allow any characters in said binary field

I will add this to the TODO list if you can tell me how the user
passes this into the backend via a query.
* Add non-large-object binary field


> F. As a lousy alternative to 1. fix the LO storage. Currently _all_ of
>    the LO files are kept in the same directory as the tables and
>    indexes.
>    this can bog down the whole database quite fast if one has lots of LOs
>    and a file system that does linear scans on open (like ext2).
>    A scheme where LOs are kept in subdirectories based on the hex
>    representation of their oids would avoid that (so LO with OID
>    0x12345678 would be stored in $PG_DATA/DBNAME/LO/12/34/56/78.lo or
>    maybe reversed $PG_DATA/DBNAME/LO/78/56/34/12.lo to distribute them
>    more evenly in "buckets")

I have already added a TODO item to use hash directories for large
objects.  Probably single or double-level 256 directory buckets are
enough:
    04/4A/file
    09/B3/file




 


Re: [HACKERS] Priorities for 6.6

From
Bruce Momjian
Date:
OK, question answered, TODO item added:
* Add non-large-object binary field

> > Is this doable?  I just looked at the list of datatypes and didn't see
> > binary as one of them.
> 
> bytea ... even if we didn't have one, inventing it would be trivial.
> (Although I wonder whether pg_dump copes with arbitrary data in fields
> properly ... I think there are still some issues about COPY protocol
> not being fully 8-bit-clean...)
> 
> As someone else pointed out, you'd still want an equivalent of
> lo_read/lo_write, but now it would mean fetch or put N bytes at an
> offset of M bytes within the value of field X of tuple Y in some
> relation.  Otherwise field X is pretty much like any other item in the
> database.  I suppose it'd only make sense to allow random data to be
> fetched/stored in a bytea field --- other datatypes would want to
> constrain the data to valid values...
> 
>             regards, tom lane
> 
> 

