Thread: Do we need use more meaningful variables to replace 0 in catalog head files?


Do we need use more meaningful variables to replace 0 in catalog head files?

From
Hao Lee
Date:
Hi guys,
   Although we usually do not change the system catalogs, modify the catalog schema, or add a new system catalog, I think the system catalog header files, such as pg_xxx.h, should use more meaningful variables. As we know, the pg_xxx.h files insert some initial values into the system catalogs, as shown below for pg_class.h.

DATA(insert OID = 1247 (  pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f f t n 3 1 _null_ _null_ ));
DESCR("");
DATA(insert OID = 1249 (  pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 21 0 f f f f f f f t n 3 1 _null_ _null_ ));
DESCR("");

It is tedious to figure out what these numbers really mean. For example, if I want to know what the value '71' represents, I have to go back to the definition of the pg_class struct. That is tedious, and it is neither maintainable nor readable. I think we should use a meaningful variable instead of '71'. For example:

#define PG_TYPE_RELTYPE 71



Regards,

Hom.

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Michael Paquier
Date:
On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee <mixtrue@gmail.com> wrote:
> It's a tedious work to figure out these numbers real meaning. for example,
> if i want to know the value of '71'  represent what it is. I should go back
> to refer to definition of pg_class struct. It's a tedious work and it's not
> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
> instead of '71'. For Example:
>
> #define PG_TYPE_RELTYPE 71

You'd need to make genbki.pl smarter about associating those names in
the DATA lines with their #define'd values, which would greatly increase
the amount of work it does as well as its maintenance burden (see the
PGUID handling, for example). I am not saying that this is undoable, just
that the complexity may not be worth the potential readability gains.
-- 
Michael



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Robert Haas
Date:
On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee <mixtrue@gmail.com> wrote:
>> It's a tedious work to figure out these numbers real meaning. for example,
>> if i want to know the value of '71'  represent what it is. I should go back
>> to refer to definition of pg_class struct. It's a tedious work and it's not
>> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
>> instead of '71'. For Example:
>>
>> #define PG_TYPE_RELTYPE 71
>
> You'd need to make genbki.pl smarter regarding the way to associate
> those variables with the defined variables, greatly increasing the
> amount of work it is doing as well as its maintenance (see for PGUID
> handling for example). I am not saying that this is undoable, just
> that the complexity may not be worth the potential readability gains.

Most of these files don't have that many entries, and they're not
modified that often.  The elephant in the room is pg_proc.h, which is
huge, frequently-modified, and hard to decipher.  But I think that's
going to need more surgery than just introducing named constants -
which would also have the downside of making the already-long lines
even longer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Hao Lee
Date:
Yes, I agree with you. These catalogs are not modified often. As you said,
pg_proc is modified often, so there is another issue: the dependency
between these system catalogs and the system views. It is hard to keep
these catalogs and views consistent, and modifying them needs extra care.
So I wonder whether there is some smarter approach.

On Wed, Nov 9, 2016 at 6:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee <mixtrue@gmail.com> wrote:
> >> It's a tedious work to figure out these numbers real meaning. for example,
> >> if i want to know the value of '71'  represent what it is. I should go back
> >> to refer to definition of pg_class struct. It's a tedious work and it's not
> >> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
> >> instead of '71'. For Example:
> >>
> >> #define PG_TYPE_RELTYPE 71
> >
> > You'd need to make genbki.pl smarter regarding the way to associate
> > those variables with the defined variables, greatly increasing the
> > amount of work it is doing as well as its maintenance (see for PGUID
> > handling for example). I am not saying that this is undoable, just
> > that the complexity may not be worth the potential readability gains.
>
> Most of these files don't have that many entries, and they're not
> modified that often.  The elephant in the room is pg_proc.h, which is
> huge, frequently-modified, and hard to decipher.  But I think that's
> going to need more surgery than just introducing named constants -
> which would also have the downside of making the already-long lines
> even longer.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Craig Ringer
Date:
On 9 November 2016 at 10:20, Hao Lee <mixtrue@gmail.com> wrote:
> yes, i agree with you. These catalogs are not modified often. As your said,
> the pg_proc modified often, therefore, there are another issues, the
> dependency between these system catalogs and system views. it's hard to gain
> maintenance the consistency between these catalogs and views. It's need more
> cares when do modifying. So that i think that whether there are some more
> smarter approaches to make it smarter or not.
>
> On Wed, Nov 9, 2016 at 6:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Mon, Nov 7, 2016 at 9:10 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>> > On Tue, Nov 8, 2016 at 10:57 AM, Hao Lee <mixtrue@gmail.com> wrote:
>> >> It's a tedious work to figure out these numbers real meaning. for
>> >> example,
>> >> if i want to know the value of '71'  represent what it is. I should go
>> >> back
>> >> to refer to definition of pg_class struct. It's a tedious work and it's
>> >> not
>> >> maintainable or readable.  I THINK WE SHOULD USE a meaningful variable
>> >> instead of '71'. For Example:
>> >>
>> >> #define PG_TYPE_RELTYPE 71
>> >
>> > You'd need to make genbki.pl smarter regarding the way to associate
>> > those variables with the defined variables, greatly increasing the
>> > amount of work it is doing as well as its maintenance (see for PGUID
>> > handling for example). I am not saying that this is undoable, just
>> > that the complexity may not be worth the potential readability gains.
>>
>> Most of these files don't have that many entries, and they're not
>> modified that often.  The elephant in the room is pg_proc.h, which is
>> huge, frequently-modified, and hard to decipher.  But I think that's
>> going to need more surgery than just introducing named constants -
>> which would also have the downside of making the already-long lines
>> even longer.

I'd be pretty happy to see pg_proc.h in particular replaced with some
pg_proc.h.in with something sane doing the preprocessing. It's a
massive pain right now.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Most of these files don't have that many entries, and they're not
> modified that often.  The elephant in the room is pg_proc.h, which is
> huge, frequently-modified, and hard to decipher.  But I think that's
> going to need more surgery than just introducing named constants -
> which would also have the downside of making the already-long lines
> even longer.

I don't think we need "named constants", especially not
manually-maintained ones.  The thing that would help in pg_proc.h is for
numeric type OIDs to be replaced by type names.  We talked awhile back
about introducing some sort of preprocessing step that would allow doing
that --- ie, it would look into some precursor file for pg_type.h and
extract the appropriate OID automatically.  I'm too tired to go find the
thread right now, but it was mostly about building the long-DATA-lines
representation from something easier to edit.
        regards, tom lane
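
The preprocessing step described above (look up each type name in a precursor of pg_type.h and substitute its OID) can be sketched roughly as follows. This is only an illustration in Python, not the project's tooling: the sample DATA lines are abbreviated, and `load_type_oids` and `resolve_types` are hypothetical helper names.

```python
import re

# Two lines in the style of pg_type.h's DATA entries. The trailing columns are
# elided with "..." because only the OID and the type name matter here.
PG_TYPE_SAMPLE = r"""
DATA(insert OID = 16 (  bool     PGNSP PGUID  1 t b B t t \054 0 0 1000 ... ));
DATA(insert OID = 2275 ( cstring PGNSP PGUID -2 f p P f t \054 0 0 1263 ... ));
"""

# Captures the OID and the first column (the type name) of each DATA line.
DATA_LINE = re.compile(r"DATA\(insert OID = (\d+) \(\s*(\S+)")


def load_type_oids(header_text):
    """Build a {type name: OID} map from pg_type.h-style DATA lines."""
    return {name: int(oid) for oid, name in DATA_LINE.findall(header_text)}


def resolve_types(type_names, type_oids):
    """Turn a space-separated list of type names into the matching OIDs."""
    return " ".join(str(type_oids[name]) for name in type_names.split())


if __name__ == "__main__":
    oids = load_type_oids(PG_TYPE_SAMPLE)
    print(resolve_types("cstring", oids))  # argument types of boolin
    print(resolve_types("bool", oids))     # return type of boolin
```

An unknown type name raises a KeyError here, which is probably the behaviour a build step wants: a typo in the editable file should fail loudly rather than produce a bogus OID.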



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Michael Paquier
Date:
On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't think we need "named constants", especially not
> manually-maintained ones.  The thing that would help in pg_proc.h is for
> numeric type OIDs to be replaced by type names.  We talked awhile back
> about introducing some sort of preprocessing step that would allow doing
> that --- ie, it would look into some precursor file for pg_type.h and
> extract the appropriate OID automatically.  I'm too tired to go find the
> thread right now, but it was mostly about building the long-DATA-lines
> representation from something easier to edit.

You mean that I guess:
https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed734@mail.gmail.com
-- 
Michael



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't think we need "named constants", especially not
>> manually-maintained ones.  The thing that would help in pg_proc.h is for
>> numeric type OIDs to be replaced by type names.  We talked awhile back
>> about introducing some sort of preprocessing step that would allow doing
>> that --- ie, it would look into some precursor file for pg_type.h and
>> extract the appropriate OID automatically.  I'm too tired to go find the
>> thread right now, but it was mostly about building the long-DATA-lines
>> representation from something easier to edit.

> You mean that I guess:
> https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed734@mail.gmail.com

Hmm, that's from 2009.  I thought I remembered something much more recent,
like last year or so.
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Amit Langote
Date:
On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Wed, Nov 9, 2016 at 1:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I don't think we need "named constants", especially not
>>> manually-maintained ones.  The thing that would help in pg_proc.h is for
>>> numeric type OIDs to be replaced by type names.  We talked awhile back
>>> about introducing some sort of preprocessing step that would allow doing
>>> that --- ie, it would look into some precursor file for pg_type.h and
>>> extract the appropriate OID automatically.  I'm too tired to go find the
>>> thread right now, but it was mostly about building the long-DATA-lines
>>> representation from something easier to edit.
>
>> You mean that I guess:
>> https://www.postgresql.org/message-id/4d191a530911041228v621286a7q6a98d9ab8a2ed734@mail.gmail.com
>
> Hmm, that's from 2009.  I thought I remembered something much more recent,
> like last year or so.

This perhaps:

* Re: Bootstrap DATA is a pita *
https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com

Thanks,
Amit



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, that's from 2009.  I thought I remembered something much more recent,
>> like last year or so.

> This perhaps:

> * Re: Bootstrap DATA is a pita *
> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com

Yeah, that's the thread I remembered.  I think the basic conclusion was
that we needed a Perl script that would suck up a bunch of data from some
representation that's more edit-friendly than the DATA lines, expand
symbolic representations (regprocedure etc) into numeric OIDs, and write
out the .bki script from that.  I thought some people had volunteered to
work on that, but we've seen no results ...
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Corey Huinker
Date:
On Wed, Nov 9, 2016 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, that's the thread I remembered.  I think the basic conclusion was
> that we needed a Perl script that would suck up a bunch of data from some
> representation that's more edit-friendly than the DATA lines, expand
> symbolic representations (regprocedure etc) into numeric OIDs, and write
> out the .bki script from that.  I thought some people had volunteered to
> work on that, but we've seen no results ...

If there are no barriers to adding it to our toolchain, could that
more-edit-friendly representation be a SQLite database?

I'm not suggesting we store a .sqlite file in our repo. I'm suggesting
that we store the dump-restore script in our repo, and the program that
generates the .bki script would query the generated SQLite db.

From that initial dump, any changes to pg_proc.h would be appended to the
dumped script:

    ...
    /* add new frombozulation feature */
    ALTER TABLE pg_proc_template ADD frombozulator text;
    /* bubbly frombozulation is the default for volatile functions */
    UPDATE pg_proc_template SET frombozulator = 'bubbly' WHERE provolatile = 'v';
    /* proposed new function */
    INSERT INTO pg_proc_template(proname,proleakproof) VALUES ('new_func','f');

That'd communicate the meaning of our changes rather nicely. A way to eat
our own conceptual dogfood.

Eventually it'd get cluttered and we'd replace the populate script with a
fresh ".dump". Maybe we do that as often as we reformat our C code.

I think Stephen Frost suggested something like this a while back, but I
couldn't find it after a short search.

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Corey Huinker <corey.huinker@gmail.com> writes:
> On Wed, Nov 9, 2016 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, that's the thread I remembered.  I think the basic conclusion was
>> that we needed a Perl script that would suck up a bunch of data from some
>> representation that's more edit-friendly than the DATA lines, expand
>> symbolic representations (regprocedure etc) into numeric OIDs, and write
>> out the .bki script from that.  I thought some people had volunteered to
>> work on that, but we've seen no results ...

> If there are no barriers to adding it to our toolchain, could that
> more-edit-friendly representation be a SQLite database?

I think you've fundamentally missed the point here.  A data dump from a
table would be semantically indistinguishable from the lots-o-DATA-lines
representation we have now.  What we want is something that isn't that.
In particular I don't see how that would let us have any extra level of
abstraction that's not present in the finished form of the catalog tables.
(An example that's already there is FLOAT8PASSBYVAL for the value of
typbyval appropriate to float8 and allied types.)

I'm not very impressed with the suggestion of making a competing product
part of our build dependencies, either.  If we wanted to get into build
dependency circularities, we could consider using a PG database in this
way ... but I prefer to leave such headaches to compiler authors for whom
it comes with the territory.
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Jan de Visser
Date:
On 2016-11-09 10:47 AM, Tom Lane wrote:

> Amit Langote <amitlangote09@gmail.com> writes:
>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Hmm, that's from 2009.  I thought I remembered something much more recent,
>>> like last year or so.
>> This perhaps:
>> * Re: Bootstrap DATA is a pita *
>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
> Yeah, that's the thread I remembered.  I think the basic conclusion was
> that we needed a Perl script that would suck up a bunch of data from some
> representation that's more edit-friendly than the DATA lines, expand
> symbolic representations (regprocedure etc) into numeric OIDs, and write
> out the .bki script from that.  I thought some people had volunteered to
> work on that, but we've seen no results ...
>
>             regards, tom lane
>
>

Would a python script converting something like json or yaml be 
acceptable? I think right now only perl is used, so it would be a new 
build chain tool, albeit one that's in my (very humble) opinion much 
better suited to the task.




Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Corey Huinker
Date:

On Thu, Nov 10, 2016 at 6:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think you've fundamentally missed the point here.  A data dump from a
> table would be semantically indistinguishable from the lots-o-DATA-lines
> representation we have now.  What we want is something that isn't that.
> In particular I don't see how that would let us have any extra level of
> abstraction that's not present in the finished form of the catalog tables.

I was thinking several tables, with the central table having column values which we find semantically descriptive, and having lookup tables to map those semantically descriptive keys to the value we actually want in the pg_proc column. It'd be a tradeoff of macros for entries in lookup tables.

> I'm not very impressed with the suggestion of making a competing product
> part of our build dependencies, either.

I don't see the products as competing, nor did the presenter of https://www.pgcon.org/2014/schedule/events/736.en.html (title: SQLite: Protégé of PostgreSQL). That talk made the case that SQLite's goal is to be the foundation of file formats, not an RDBMS. I do understand wanting to minimize build dependencies.

> If we wanted to get into build
> dependency circularities, we could consider using a PG database in this
> way ... but I prefer to leave such headaches to compiler authors for whom
> it comes with the territory.

Agreed, bootstrapping builds aren't fun. This suggestion was a way to have a self-contained format that uses concepts (joining a central table to lookup tables) already well understood in our community.
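
To make the "central table plus lookup tables" idea concrete, here is a minimal sketch using Python's standard sqlite3 module. The table and column names (`volatility_lookup`, `pg_proc_template.volatility`) and the function `build_rows` are assumptions made up for this illustration; only the one-letter provolatile codes i/s/v come from the catalog itself.

```python
import sqlite3

# Hypothetical schema: a lookup table maps descriptive labels to the
# one-letter codes stored in pg_proc.provolatile, and the central table
# stores only the descriptive label.
SCHEMA = """
CREATE TABLE volatility_lookup (
    label TEXT PRIMARY KEY,
    code  TEXT NOT NULL
);
INSERT INTO volatility_lookup VALUES
    ('immutable', 'i'), ('stable', 's'), ('volatile', 'v');

CREATE TABLE pg_proc_template (
    proname    TEXT PRIMARY KEY,
    volatility TEXT NOT NULL REFERENCES volatility_lookup(label)
);
INSERT INTO pg_proc_template VALUES ('boolin', 'immutable');
"""

# Joining the central table to the lookup table yields the raw catalog value.
QUERY = """
SELECT t.proname, v.code AS provolatile
FROM pg_proc_template AS t
JOIN volatility_lookup AS v ON v.label = t.volatility
ORDER BY t.proname
"""


def build_rows():
    """Load the schema into an in-memory database and resolve the labels."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for proname, provolatile in build_rows():
        print(proname, provolatile)
```

The lookup table plays the role that a macro such as FLOAT8PASSBYVAL plays in the header files: the editable data carries the descriptive name, and the generator substitutes the stored value.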

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Magnus Hagander
Date:
On Nov 11, 2016 00:53, "Jan de Visser" <jan@de-visser.net> wrote:
>
> On 2016-11-09 10:47 AM, Tom Lane wrote:
>
>> Amit Langote <amitlangote09@gmail.com> writes:
>>>
>>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>
>>>> Hmm, that's from 2009.  I thought I remembered something much more recent,
>>>> like last year or so.
>>>
>>> This perhaps:
>>> * Re: Bootstrap DATA is a pita *
>>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
>>
>> Yeah, that's the thread I remembered.  I think the basic conclusion was
>> that we needed a Perl script that would suck up a bunch of data from some
>> representation that's more edit-friendly than the DATA lines, expand
>> symbolic representations (regprocedure etc) into numeric OIDs, and write
>> out the .bki script from that.  I thought some people had volunteered to
>> work on that, but we've seen no results ...
>>
>>                         regards, tom lane
>>
>>
>
> Would a python script converting something like json or yaml be
> acceptable? I think right now only perl is used, so it would be a new
> build chain tool, albeit one that's in my (very humble) opinion much
> better suited to the task.
>

Python or perl is not what matters here really. For something as simple as
this (for the script) it doesn't make a real difference. I personally
prefer python over perl in most cases, but our standard is perl so we
should stick to that.

The issue is coming up with a format that people like and think is an
improvement.

If we have that and a python script for it, someone would surely volunteer
to convert that part. But we need to start by solving the actual problem.

/Magnus

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andrew Dunstan
Date:

On 11/11/2016 03:03 AM, Magnus Hagander wrote:
>
> On Nov 11, 2016 00:53, "Jan de Visser" <jan@de-visser.net> wrote:
> >
> > On 2016-11-09 10:47 AM, Tom Lane wrote:
> >
> >> Amit Langote <amitlangote09@gmail.com> writes:
> >>>
> >>> On Wed, Nov 9, 2016 at 11:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>>>
> >>>> Hmm, that's from 2009.  I thought I remembered something much more recent,
> >>>> like last year or so.
> >>>
> >>> This perhaps:
> >>> * Re: Bootstrap DATA is a pita *
> >>> https://www.postgresql.org/message-id/flat/CAOjayEfKBL-_Q9m3Jsv6V-mK1q8h%3Dca5Hm0fecXGxZUhPDN9BA%40mail.gmail.com
> >>
> >> Yeah, that's the thread I remembered.  I think the basic conclusion was
> >> that we needed a Perl script that would suck up a bunch of data from some
> >> representation that's more edit-friendly than the DATA lines, expand
> >> symbolic representations (regprocedure etc) into numeric OIDs, and write
> >> out the .bki script from that.  I thought some people had volunteered to
> >> work on that, but we've seen no results ...
> >>
> >>                         regards, tom lane
> >>
> >>
> >
> > Would a python script converting something like json or yaml be
> > acceptable? I think right now only perl is used, so it would be a new
> > build chain tool, albeit one that's in my (very humble) opinion much
> > better suited to the task.
> >
>
> Python or perl is not what matters here really. For something as
> simple as this (for the script) it doesn't make a real difference. I
> personally prefer python over perl in most cases, but our standard is
> perl so we should stick to that.
>
> The issue is coming up with a format that people like and think is an
> improvement.
>
> If we have that and a python script for it, someone would surely
> volunteer to convert that part. But we need to start by solving the
> actual problem.
>
>


+1. If we come up with an agreed format I will undertake to produce the 
requisite perl script. So let's reopen the debate on the data format. I 
want something that doesn't consume large numbers of lines per entry. If 
we remove defaults in most cases we should be able to fit a set of 
key/value pairs on just a handful of lines.

cheers

andrew




Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> +1. If we come up with an agreed format I will undertake to produce the 
> requisite perl script. So let's reopen the debate on the data format. I 
> want something that doesn't consume large numbers of lines per entry. If 
> we remove defaults in most cases we should be able to fit a set of 
> key/value pairs on just a handful of lines.

The other reason for keeping the entries short is to prevent patch
misapplications: you want three or less lines of context to be enough
to uniquely identify which line you're changing.  So something with,
say, a bunch of <tag></tag> overhead, with that markup split onto
separate lines, would be a disaster.  This may mean that we can't
get too far away from the DATA-line approach :-(.

Or maybe what we need to do is ensure that there's identification info on
every line, something like (from the first entry in pg_proc.h)

boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s

(I'm imagining the prefix as having no particular semantic significance,
except that identical values on successive lines denote fields for a
single catalog row.)

With this approach, even if you had blocks of boilerplate-y lines
that were the same for many successive functions, the prefixes would
keep them looking unique to "patch".

On the other hand, Andrew might be right that with reasonable defaults
available, the entries would mostly be short enough that there wouldn't
be much of a problem anyway.  This example certainly looks that way.
        regards, tom lane
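
A reader for the prefixed format sketched above could look roughly like this. It is a Python illustration under assumptions, not proposed tooling: `parse_prefixed` is a hypothetical name, and the only rule implemented is the one stated in the message, namely that identical prefixes on successive lines denote fields of a single catalog row.

```python
import shlex

SAMPLE = """\
boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s
"""


def parse_prefixed(text):
    """Group successive lines with the same prefix into one row dict."""
    rows = []
    last_prefix = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no fields
        prefix, _, rest = line.partition(":")
        # shlex.split honours the double quotes around multi-word values.
        fields = dict(token.split("=", 1) for token in shlex.split(rest))
        if prefix == last_prefix:
            rows[-1].update(fields)  # continuation of the previous row
        else:
            rows.append(fields)      # a new prefix starts a new row
        last_prefix = prefix
    return rows


if __name__ == "__main__":
    for row in parse_prefixed(SAMPLE):
        print(row)
```

The prefix itself is discarded after grouping, which matches the idea that it has no semantic significance and exists only to keep each line unique for "patch".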



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andrew Dunstan
Date:

On 11/11/2016 11:10 AM, Tom Lane wrote:
> boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> boolin: prosrc=boolin provolatile=i proparallel=s
>
>


I have written a little perl script to turn the pg_proc DATA lines into
something like the format suggested. In order to keep the space used as
small as possible, I used a prefix based on the OID. See attached result.

Still plenty of work to go, e.g. grabbing the DESCR lines, and turning
this all back into DATA/DESCR lines, but I wanted to get this out there
before going much further.

The defaults I used are below (commented out keys are not defaulted,
they are just there for completeness).

    my %defaults = (
    #                oid =>
    #                name =>
                     namespace => 'PGNSP',
                     owner => 'PGUID',
                     lang => '12',
                     cost => '1',
                     rows => '0',
                     variadic => '0',
                     transform => '0',
                     isagg => 'f',
                     iswindow => 'f',
                     secdef => 'f',
                     leakproof => 'f',
                     isstrict => 'f',
                     retset => 'f',
                     volatile => 'v',
                     parallel => 'u',
    #                nargs =>
                     nargdefaults => '0',
    #                rettype =>
    #                argtypes =>
                     allargtypes => '_null_',
                     argmodes => '_null_',
                     argnames => '_null_',
                     argdefaults => '_null_',
                     trftypes => '_null_',
    #                src =>
                     bin => '_null_',
                     config => '_null_',
                     acl => '_null_',
    );


cheers

andrew
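
The effect of such a defaults table can be sketched in a few lines. This is a Python illustration, not the attached Perl script: the `DEFAULTS` dict copies only a subset of the keys listed above, and `strip_defaults` / `apply_defaults` are hypothetical names.

```python
# A subset of the defaults listed above, copied for illustration.
DEFAULTS = {
    "namespace": "PGNSP",
    "owner": "PGUID",
    "lang": "12",
    "cost": "1",
    "isstrict": "f",
    "volatile": "v",
    "parallel": "u",
}


def strip_defaults(row):
    """Drop every field whose value equals its default (editable form)."""
    return {key: value for key, value in row.items() if DEFAULTS.get(key) != value}


def apply_defaults(entry):
    """Fill in defaulted fields again (form needed to emit a full row)."""
    return {**DEFAULTS, **entry}


if __name__ == "__main__":
    full_row = {
        "oid": "1242", "name": "boolin", "namespace": "PGNSP",
        "owner": "PGUID", "lang": "12", "cost": "1", "isstrict": "t",
        "volatile": "i", "parallel": "s",
    }
    short = strip_defaults(full_row)
    print(short)
    print(apply_defaults(short) == full_row)
```

Keys without a default (oid, name, and so on) always survive stripping, so the two functions round-trip as long as every defaulted key is present in the full row.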


Attachments

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Peter Eisentraut
Date:
On 11/11/16 11:10 AM, Tom Lane wrote:
> boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> boolin: prosrc=boolin provolatile=i proparallel=s

Then we're not very far away from just using CREATE FUNCTION SQL commands.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andres Freund
Date:
On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> On 11/11/16 11:10 AM, Tom Lane wrote:
> > boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> > boolin: prosrc=boolin provolatile=i proparallel=s
> 
> Then we're not very far away from just using CREATE FUNCTION SQL commands.

Well, those do a lot of syscache lookups, which in turn do lookups for
functions...

Andres



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andrew Dunstan
Date:

On 11/13/2016 04:54 AM, Andres Freund wrote:
> On 2016-11-12 12:30:45 -0500, Andrew Dunstan wrote:
>>
>> On 11/11/2016 11:10 AM, Tom Lane wrote:
>>> boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
>>> boolin: prosrc=boolin provolatile=i proparallel=s
>>>
>>>
>>
>> I have written a little perl script to turn the pg_proc DATA lines into
>> something like the format suggested. In order to keep the space used as
>> small as possible, I used a prefix based on the OID. See attached result.
>>
>> Still plenty of work to go, e.g. grabbing the DESCR lines, and turning this
>> all back into DATA/DESCR lines, but I wanted to get this out there before
>> going much further.
>>
>> The defaults I used are below (commented out keys are not defaulted, they
>> are just there for completeness).
> In the referenced thread I'd started to work on something like this,
> until other people also said they'd be working on it.  I chose a
> different output format (plain Data::Dumper), but I'd added the parsing
> of DATA/DESCR and such to genbki.
>
> Note that I found that initdb performance is greatly increased *and*
> legibility is improved, if types and such in the data files are
> expanded, and converted to their oids when creating postgres.bki.


Yeah, I have the type name piece, it was close to trivial. I just read 
in pg_type.h and stored the names/oids in a hash.
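
A minimal sketch of the name/OID hash described above, written in Python rather than Perl and not taken from the actual script. The sample text is a shortened, hypothetical excerpt in the style of the pg_type.h DATA lines; only the OID and the first field (the type name) are read.

```python
import re

# Shortened, hypothetical excerpt in the style of pg_type.h DATA lines.
# Real lines carry many more fields; this sketch ignores everything after
# the type name.
SAMPLE_PG_TYPE = r"""
DATA(insert OID = 16 (  bool       PGNSP PGUID  1 t b B t t \054 0 0 1000 boolin boolout ));
DATA(insert OID = 25 (  text       PGNSP PGUID -1 f b S t t \054 0 0 1009 textin textout ));
DATA(insert OID = 2275 ( cstring   PGNSP PGUID -2 f p P f t \054 0 0 1263 cstring_in cstring_out ));
"""

# Capture the OID and the first whitespace-delimited field after the "(".
DATA_RE = re.compile(r"^DATA\(insert\s+OID\s*=\s*(\d+)\s*\(\s*(\S+)", re.MULTILINE)


def build_type_oid_map(header_text):
    """Return a dict mapping type name -> OID (int)."""
    return {name: int(oid) for oid, name in DATA_RE.findall(header_text)}


type_oids = build_type_oid_map(SAMPLE_PG_TYPE)
print(type_oids)
```

With such a map in hand, a symbolic name like "cstring" in a data entry can be swapped for its numeric OID when the .bki output is written.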

Data::Dumper is too wasteful of space. The thing I like about Tom's 
format is that it's nice and concise.

I'm not convinced the line prefix part is necessary, though. What I'm 
thinking of is something like this:

PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
    rettype=bool argtypes="cstring" src=boolin );

Teaching Catalog.pm how to parse that and turn the type names back into 
oids won't be difficult. I already have code for the prefix version, and 
this would be easier since there is an end marker.
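
As a rough illustration of why the end marker helps, here is a sketch in Python (not the Catalog.pm code) that parses entries of the proposed PROCDATA( ... ); shape and turns type names back into OIDs. The keyword names follow the example above; the TYPE_OIDS table and the function name are assumptions made for this sketch.

```python
import re
import shlex

# Assumed lookup table; in practice this would be built from pg_type.h.
TYPE_OIDS = {"bool": 16, "cstring": 2275}

SAMPLE = '''
PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
    rettype=bool argtypes="cstring" src=boolin );
'''

# The ");" end marker lets one entry span several lines unambiguously.
ENTRY_RE = re.compile(r"PROCDATA\((.*?)\);", re.DOTALL)


def parse_procdata(text, type_oids):
    """Parse PROCDATA entries into dicts, resolving type names to OIDs."""
    entries = []
    for body in ENTRY_RE.findall(text):
        # shlex handles the double-quoted argtypes list and newlines.
        fields = dict(tok.split("=", 1) for tok in shlex.split(body))
        fields["rettype"] = type_oids[fields["rettype"]]
        fields["argtypes"] = [type_oids[t] for t in fields["argtypes"].split()]
        entries.append(fields)
    return entries


entries = parse_procdata(SAMPLE, TYPE_OIDS)
print(entries[0])
```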

I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
they strike me as a bit of a wart. But I'm flexible on that.

If we can generalize this to other catalogs, then that will be good, but 
my inclination is to handle the elephant in the room (pg_proc.h) and 
worry about the gnats later.

>
> I basically made genbki/Catalog.pm accept text whenever a column is of
> type regtype/regprocedure/. To then make use of that I converted a bunch
> of plain oid columns to their reg* equivalent. That's also nice
> for just plain querying of the catalogs ;)
>
> I don't think the code is going to be much use for you directly, but it
> might be worthwhile to crib some stuff from the 0002 of the attached
> patches (based on 74811c4050921959d54d42e2c15bb79f0e2c37f3).


Thanks, I will take a look.

cheers

andrew





Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> I'm not convinced the line prefix part is necessary, though. What I'm 
> thinking of is something like this:

> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>      rettype=bool argtypes="cstring" src=boolin );

We could go in that direction too, but the apparent flexibility to split
entries into multiple lines is an illusion, at least if you try to go
beyond a few lines; you'd end up with duplicated line sequences in
different entries and thus ambiguity for patch(1).  I don't have any
big objection to the above, but it's not obviously better either.

Some things we should try to resolve before settling definitively on
a data representation:

1. Are we going to try to keep these things in the .h files, or split
them out?  I'd like to get them out, as that eliminates both the need
to keep the things looking like macro calls, and the need for the data
within the macro call to be at least minimally parsable as C.

2. Andrew's example above implies some sort of mapping between the
keywords and the actual column names (or at least column positions).
Where and how is that specified?

3. Also where are we going to provide the per-column default values?
How does the converter script know which columns to convert to type oids,
proc oids, etc?  Is it going to do any data validation beyond that, and
if so on what basis?

4. What will we do about the #define's that some of the .h files provide
for (some of) their object OIDs?  I assume that we want to move in the
direction of autogenerating those macros a la fmgroids.h, but this needs
a concrete spec as well.  If we don't want this change to result in a big
hit to the source code, we're probably going to need to be able to specify
the macro names to generate in the data files.

5. One of the requirements that was mentioned in previous discussions
was to make it easier to add new columns to catalogs.  This format
does that only to the extent that you don't have to touch entries that
can use the default value for such a column.  Is that good enough, and
if not, what might we be able to do to make it better?


> I'd actually like to roll up the DESCR lines in pg_proc.h into this too, 
> they strike me as a bit of a wart. But I'm flexible on that.

+1, if we can come up with a better syntax.  This together with the
OID-macro issue suggests that there will be items in each data entry that
correspond to something other than columns of the target catalog.  But
that seems fine.

> If we can generalize this to other catalogs, then that will be good, but 
> my inclination is to handle the elephant in the room (pg_proc.h) and 
> worry about the gnats later.

I think we want to do them all.  pg_proc.h is actually one of the easier
catalogs to work on presently, IMO, because the only kind of
cross-references it has are type OIDs.  Things like pg_amop are a mess.
And I really don't want to be dealing with multiple notations for catalog
data.  Also I think this will be subject to Polya's paradox: designing a
general solution will be easier and cleaner than a hack that works only
for one catalog.
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
>> Then we're not very far away from just using CREATE FUNCTION SQL commands.

> Well, those do a lot of syscache lookups, which in turn do lookups for
> functions...

We can't use CREATE FUNCTION as the representation in the .bki file,
because of the circularities involved (you can't fill pg_proc before
pg_type nor vice versa).  But I think Peter was suggesting that the
input to the bki-generator script could look like CREATE commands.
That's true, but I fear it would greatly increase the complexity
of the script for not much benefit.  It does little for the question of
"how do you update the data when adding a new pg_proc column", for
instance.  And you'd still need some non-SQL warts, like how to specify
manually-assigned OIDs for types and functions.  (I'm not sure whether
we could get away with dropping fixed assignments of function OIDs,
but we absolutely can't do so for types.  Lots of client code knows
that text is oid 25, for example.)
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andres Freund
Date:
On 2016-11-13 11:11:37 -0500, Tom Lane wrote:
> 1. Are we going to try to keep these things in the .h files, or split
> them out?  I'd like to get them out, as that eliminates both the need
> to keep the things looking like macro calls, and the need for the data
> within the macro call to be at least minimally parsable as C.

I vote for splitting them out.


> 2. Andrew's example above implies some sort of mapping between the
> keywords and the actual column names (or at least column positions).
> Where and how is that specified?

I don't know what Andrew was planning, but before I stopped I had a 1:1
mapping between column names and keywords. Catalog.pm parses the
pg_*.h headers and thus knows the table definition via the CATALOG()
stuff.


> 3. Also where are we going to provide the per-column default values?

That's a good question, I suspect we should move that knowledge to the
headers as well. Possibly using something like BKI_DEFAULT(...)?
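
To make the idea concrete, here is a sketch in Python of how per-column defaults could be read from the header and applied with a 1:1 keyword-to-column mapping. The BKI_DEFAULT(...) annotation is only the suggestion floated above; its exact syntax, the default values shown, and the helper names are all assumptions.

```python
import re

# Hypothetical header excerpt: the column names are real pg_proc columns,
# but the BKI_DEFAULT(...) annotations and their values are assumptions.
HEADER = """
CATALOG(pg_proc,1255)
{
    NameData    proname;
    float4      procost BKI_DEFAULT(1);
    float4      prorows BKI_DEFAULT(0);
    bool        proisstrict BKI_DEFAULT(f);
    char        provolatile BKI_DEFAULT(v);
}
"""

# "<type> <column> [BKI_DEFAULT(<value>)];" on a single line.
COLUMN_RE = re.compile(
    r"^[ \t]*(\w+)[ \t]+(\w+)(?:[ \t]+BKI_DEFAULT\(([^)]*)\))?[ \t]*;",
    re.MULTILINE,
)


def parse_columns(header_text):
    """Return [(column, type, default_or_None)] in declaration order."""
    return [(m.group(2), m.group(1), m.group(3))
            for m in COLUMN_RE.finditer(header_text)]


def build_row(columns, entry):
    """Expand a sparse entry (keyword == column name) into a full row."""
    row = []
    for name, _ctype, default in columns:
        if name in entry:
            row.append(entry[name])
        elif default is not None:
            row.append(default)
        else:
            raise KeyError(f"no value and no default for column {name}")
    return row


columns = parse_columns(HEADER)
row = build_row(columns, {"proname": "boolin", "proisstrict": "t", "provolatile": "i"})
print(row)
```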


> How does the converter script know which columns to convert to type oids,
> proc oids, etc?

I simply had that based on the underlying reg* type. I.e. if a column
was regtype the script would map it to type oids and so on.  That
required some type changes, which does have some compatibility concerns.


> Is it going to do any data validation beyond that, and if so on what basis?

Hm, not sure if we really need something.


> 4. What will we do about the #define's that some of the .h files provide
> for (some of) their object OIDs?  I assume that we want to move in the
> direction of autogenerating those macros a la fmgroids.h, but this needs
> a concrete spec as well.

I suspect at least type oids we'll continue to have to maintain
manually. A good number of things rely on the builtin type oids being
essentially stable.


> > If we can generalize this to other catalogs, then that will be good, but 
> > my inclination is to handle the elephant in the room (pg_proc.h) and 
> > worry about the gnats later.
> 
> I think we want to do them all.

+1


Greetings,

Andres Freund



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andres Freund
Date:
On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2016-11-13 00:20:22 -0500, Peter Eisentraut wrote:
> >> Then we're not very far away from just using CREATE FUNCTION SQL commands.
>
> > Well, those do a lot of syscache lookups, which in turn do lookups for
> > functions...
>
> We can't use CREATE FUNCTION as the representation in the .bki file,
> because of the circularities involved (you can't fill pg_proc before
> pg_type nor vice versa).  But I think Peter was suggesting that the
> input to the bki-generator script could look like CREATE commands.
> That's true, but I fear it would greatly increase the complexity
> of the script for not much benefit.

It'd also be very pg_proc specific, which isn't where I think this
should go..

Greetings,

Andres Freund



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-11-13 11:23:09 -0500, Tom Lane wrote:
>> We can't use CREATE FUNCTION as the representation in the .bki file,
>> because of the circularities involved (you can't fill pg_proc before
>> pg_type nor vice versa).  But I think Peter was suggesting that the
>> input to the bki-generator script could look like CREATE commands.
>> That's true, but I fear it would greatly increase the complexity
>> of the script for not much benefit.

> It'd also be very pg_proc specific, which isn't where I think this
> should go..

The presumption is that we have a CREATE command for every type of
object that we need to put into the system catalogs.  But yes, the
other problem with this approach is that you need to do a lot more
work per-catalog to build the converter script.  I'm not sure how
much of that could be imported from gram.y, but I'm afraid the
answer would be "not enough".
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andres Freund
Date:
On 2016-11-12 12:30:45 -0500, Andrew Dunstan wrote:
>
>
> On 11/11/2016 11:10 AM, Tom Lane wrote:
> > boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
> > boolin: prosrc=boolin provolatile=i proparallel=s
> >
> >
>
>
> I have written a little perl script to turn the pg_proc DATA lines into
> something like the format suggested. In order to keep the space used as
> small as possible, I used a prefix based on the OID. See attached result.
>
> Still plenty of work to go, e.g. grabbing the DESCR lines, and turning this
> all back into DATA/DESCR lines, but I wanted to get this out there before
> going much further.
>
> The defaults I used are below (commented out keys are not defaulted, they
> are just there for completeness).

In the referenced thread I'd started to work on something like this,
until other people also said they'd be working on it.  I chose a
different output format (plain Data::Dumper), but I'd added the parsing
of DATA/DESCR and such to genbki.

Note that I found that initdb performance is greatly increased *and*
legibility is improved, if types and such in the data files are
expanded, and converted to their oids when creating postgres.bki.

I basically made genbki/Catalog.pm accept text whenever a column is of
type regtype/regprocedure/. To then make use of that I converted a bunch
of plain oid columns to their reg* equivalent. That's also nice
for just plain querying of the catalogs ;)

I don't think the code is going to be much use for you directly, but it
might be worthwhile to crib some stuff from the 0002 of the attached
patches (based on 74811c4050921959d54d42e2c15bb79f0e2c37f3).

Greetings,

Andres Freund

Attachments

Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Andrew Dunstan
Date:

On 11/13/2016 11:11 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> I'm not convinced the line prefix part is necessary, though. What I'm
>> thinking of is something like this:
>> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>>       rettype=bool argtypes="cstring" src=boolin );
> We could go in that direction too, but the apparent flexibility to split
> entries into multiple lines is an illusion, at least if you try to go
> beyond a few lines; you'd end up with duplicated line sequences in
> different entries and thus ambiguity for patch(1).  I don't have any
> big objection to the above, but it's not obviously better either.


Yeah, I looked and there are too many cases where the name would be 
outside the normal 3 lines of context.
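
The context problem can be reproduced with a small Python sketch. The multi-line entry layout and the make_entry helper are invented for illustration; the point is only that a unified diff with the default 3 lines of context need not include the line that names the entry, and that the old side of the hunk can match more than one place in the file.

```python
import difflib


def make_entry(name, nargs):
    """Build one hypothetical multi-line entry, one field per line."""
    return [
        f"PROCDATA( name={name}",
        "    isstrict=t",
        "    volatile=i",
        "    parallel=s",
        f"    nargs={nargs}",
        "    rettype=bool );",
    ]


def count_occurrences(lines, block):
    """Count how many positions in `lines` match `block` exactly."""
    n = len(block)
    return sum(1 for i in range(len(lines) - n + 1) if lines[i:i + n] == block)


old = make_entry("foo", 1) + make_entry("bar", 1)
new = make_entry("foo", 1) + make_entry("bar", 2)  # change only "bar"

diff = list(difflib.unified_diff(old, new, lineterm="", n=3))
hunk_body = [l for l in diff if not l.startswith(("---", "+++", "@@"))]

# Lines patch(1) would look for in the old file: context plus removals.
old_side = [l[1:] for l in hunk_body if l[0] in " -"]

print("\n".join(diff))
print("name visible in hunk:", any("name=" in l for l in hunk_body))
print("places the old side matches:", count_occurrences(old, old_side))
```

Here the hunk never mentions "bar", and its old side matches both entries, so only the line numbers in the hunk header tell them apart.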


>
> Some things we should try to resolve before settling definitively on
> a data representation:
>
> 1. Are we going to try to keep these things in the .h files, or split
> them out?  I'd like to get them out, as that eliminates both the need
> to keep the things looking like macro calls, and the need for the data
> within the macro call to be at least minimally parsable as C.


That would work fine for pg_proc.h, less so for pg_type.h where we have 
a whole lot of
   #define FOOOID nn

directives in among the data lines. Moving these somewhere remote from 
the catalog lines they relate to seems like quite a bad idea.


>
> 2. Andrew's example above implies some sort of mapping between the
> keywords and the actual column names (or at least column positions).
> Where and how is that specified?


There are several possibilities. The one I was leaning towards was to 
parse out the Anum_pg_foo_* definitions.


>
> 3. Also where are we going to provide the per-column default values?
> How does the converter script know which columns to convert to type oids,
> proc oids, etc?  Is it going to do any data validation beyond that, and
> if so on what basis?


a) something like DATA_DEFAULTS( foo=bar );
b) something like DATA_TYPECONV ( rettype argtypes allargtypes );


Hadn't thought about procoids, but something similar.

>
> 4. What will we do about the #define's that some of the .h files provide
> for (some of) their object OIDs?  I assume that we want to move in the
> direction of autogenerating those macros a la fmgroids.h, but this needs
> a concrete spec as well.  If we don't want this change to result in a big
> hit to the source code, we're probably going to need to be able to specify
> the macro names to generate in the data files.


Yeah, as I noted above it's a bit messy.


>
> 5. One of the requirements that was mentioned in previous discussions
> was to make it easier to add new columns to catalogs.  This format
> does that only to the extent that you don't have to touch entries that
> can use the default value for such a column.  Is that good enough, and
> if not, what might we be able to do to make it better?


I think it is good enough, at least for a first cut.

>
>> I'd actually like to roll up the DESCR lines in pg_proc.h into this too,
>> they strike me as a bit of a wart. But I'm flexible on that.
> +1, if we can come up with a better syntax.  This together with the
> OID-macro issue suggests that there will be items in each data entry that
> correspond to something other than columns of the target catalog.  But
> that seems fine.
>
>> If we can generalize this to other catalogs, then that will be good, but
>> my inclination is to handle the elephant in the room (pg_proc.h) and
>> worry about the gnats later.
> I think we want to do them all.  pg_proc.h is actually one of the easier
> catalogs to work on presently, IMO, because the only kind of
> cross-references it has are type OIDs.  Things like pg_amop are a mess.
> And I really don't want to be dealing with multiple notations for catalog
> data.  Also I think this will be subject to Polya's paradox: designing a
> general solution will be easier and cleaner than a hack that works only
> for one catalog.


I don't know that we need to handle everything at once, as long as the 
solution is sufficiently general.



cheers

andrew



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 11/13/2016 11:11 AM, Tom Lane wrote:
>> 1. Are we going to try to keep these things in the .h files, or split
>> them out?  I'd like to get them out, as that eliminates both the need
>> to keep the things looking like macro calls, and the need for the data
>> within the macro call to be at least minimally parsable as C.

> That would work fine for pg_proc.h, less so for pg_type.h where we have 
> a whole lot of
>     #define FOOOID nn
> directives in among the data lines. Moving these somewhere remote from 
> the catalog lines they relate to seems like quite a bad idea.

We certainly don't want multiple files to be sources of truth for that.
What I was anticipating is that those #define's would also be generated
from the same input files, much as fmgroids.h is handled today.  We
could imagine driving the creation of a macro off an additional, optional
field in the data entries, say "macro=FOOOID", if we want only selected
entries to have #defines.  Or we could do like we do with pg_proc.h and
generate macros for everything according to some fixed naming rule.
I could see approaching pg_type that way, but am less excited about
pg_operator, pg_opclass, etc, where we only need macros for a small
fraction of the entries.
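
A sketch of the optional "macro=" idea, in Python for brevity. The entry dicts and the function name are assumptions; the only behavior shown is that entries carrying a macro field produce a #define and the rest are skipped.

```python
# Hypothetical parsed data entries; only some carry a "macro" field.
TYPE_ENTRIES = [
    {"oid": 16, "typname": "bool", "macro": "BOOLOID"},
    {"oid": 25, "typname": "text", "macro": "TEXTOID"},
    {"oid": 71, "typname": "pg_type"},  # no macro requested
]


def generate_oid_macros(entries):
    """Emit one '#define NAME oid' line per entry that asks for a macro."""
    lines = []
    for entry in entries:
        macro = entry.get("macro")
        if macro:
            lines.append(f"#define {macro} {entry['oid']}")
    return "\n".join(lines)


generated = generate_oid_macros(TYPE_ENTRIES)
print(generated)
```

The fixed-naming-rule alternative would drop the per-entry field and derive the macro name from the object name instead, at the cost of generating macros nobody uses.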

>> 2. Andrew's example above implies some sort of mapping between the
>> keywords and the actual column names (or at least column positions).
>> Where and how is that specified?

> There are several possibilities. The one I was leaning towards was to 
> parse out the Anum_pg_foo_* definitions.

I'm okay with that if the field labels used in the data entries are to be
exactly the same as the column names.  Your example showed abbreviated
names (omitting "pro"), which is something I doubt we want to try to
hard-wire a rule for.  Also, if we are going to abbreviate at all,
I think it might be useful to abbreviate *a lot*, say like "v" for
"provolatile", and that would be something that ought to be set up with
some explicit manually-provided declarations.

>> 3. Also where are we going to provide the per-column default values?
>> How does the converter script know which columns to convert to type oids,
>> proc oids, etc?  Is it going to do any data validation beyond that, and
>> if so on what basis?

> a) something like DATA_DEFAULTS( foo=bar );
> b) something like DATA_TYPECONV ( rettype argtypes allargtypes );

I'm thinking a bit about per-column declarations in the input file,
along the line of this for provolatile:

declare v col=15 type=char default='v'

Some of those items could be gotten out of pg_proc.h, but not all.
I guess a second alternative would be to add the missing info to
pg_proc.h and have the conversion script parse it out of there.
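
For illustration only, a Python sketch of how such "declare" lines might be read and applied. The "declare v ..." line is the one shown above; the other two declarations, the abbreviations "n" and "s", and the helper names are assumptions.

```python
import shlex

# Only the "v" line comes from the example above; the others are made up
# in the same style for this sketch.
DECLARATIONS = """
declare n col=1 type=name
declare s col=13 type=bool default='f'
declare v col=15 type=char default='v'
"""


def parse_declarations(text):
    """Return {abbreviation: {"col": int, "type": str, "default": str?}}."""
    decls = {}
    for line in text.splitlines():
        tokens = shlex.split(line)  # strips the quotes around default values
        if not tokens or tokens[0] != "declare":
            continue
        attrs = dict(tok.split("=", 1) for tok in tokens[2:])
        attrs["col"] = int(attrs["col"])
        decls[tokens[1]] = attrs
    return decls


def resolve(entry, decls):
    """Map a sparse entry keyed by abbreviation to {column number: value}."""
    row = {}
    for abbrev, attrs in decls.items():
        if abbrev in entry:
            row[attrs["col"]] = entry[abbrev]
        elif "default" in attrs:
            row[attrs["col"]] = attrs["default"]
        else:
            raise KeyError(f"'{abbrev}' has no value and no default")
    return row


decls = parse_declarations(DECLARATIONS)
resolved = resolve({"n": "boolin", "s": "t"}, decls)
print(resolved)
```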

>> I think we want to do them all.  pg_proc.h is actually one of the easier
>> catalogs to work on presently, IMO, because the only kind of
>> cross-references it has are type OIDs.  Things like pg_amop are a mess.
>> And I really don't want to be dealing with multiple notations for catalog
>> data.  Also I think this will be subject to Polya's paradox: designing a
>> general solution will be easier and cleaner than a hack that works only
>> for one catalog.

> I don't know that we need to handle everything at once, as long as the 
> solution is sufficiently general.

Well, we could convert the catalogs one at a time if that seems useful,
but I don't want to be rewriting the bki-generation script repeatedly.
        regards, tom lane



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Robert Haas
Date:
On Sun, Nov 13, 2016 at 9:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> I'm not convinced the line prefix part is necessary, though. What I'm
> thinking of is something like this:
>
> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>     rettype=bool argtypes="cstring" src=boolin );

I liked Tom's format a lot better.  If we put this in a separate file
rather than in the header, which I favor, the PROCDATA stuff is just
noise.  On the other hand, having the name as the first thing on the
line seems *excellent* for readability.
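
For comparison, a Python sketch of reading the name-prefixed format, where every line of an entry starts with the same label. The boolin lines are the ones quoted earlier in the thread; the boolout lines and the function name are added here as assumptions.

```python
import shlex

SAMPLE = '''
boolin: OID=1242 proname=boolin proargtypes="cstring" prorettype=bool
boolin: prosrc=boolin provolatile=i proparallel=s
boolout: OID=1243 proname=boolout proargtypes="bool" prorettype=cstring
boolout: prosrc=boolout provolatile=i proparallel=s
'''


def parse_prefixed(text):
    """Group 'label: key=value ...' lines into one dict per label."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        label, rest = line.split(":", 1)
        fields = dict(tok.split("=", 1) for tok in shlex.split(rest))
        # Lines sharing a label are merged into a single entry.
        entries.setdefault(label, {}).update(fields)
    return entries


entries = parse_prefixed(SAMPLE)
print(entries["boolin"])
```

Because the label repeats on every line, any diff hunk touching an entry shows which entry it belongs to.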

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Greg Stark
Date:
On Tue, Nov 15, 2016 at 4:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Nov 13, 2016 at 9:48 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I'm not convinced the line prefix part is necessary, though. What I'm
>> thinking of is something like this:
>>
>> PROCDATA( oid=1242 name=boolin isstrict=t volatile=i parallel=s nargs=1
>>     rettype=bool argtypes="cstring" src=boolin );
>
> I liked Tom's format a lot better.  If we put this in a separate file
> rather than in the header, which I favor, the PROCDATA stuff is just
> noise.  On the other hand, having the name as the first thing on the
> line seems *excellent* for readability.


Just throwing this out there....

It would be neat if the file format was precisely a tab or comma
separated file suitable for loading into the appropriate table with
COPY or loading into a spreadsheet. Then we might be able to maintain
it by editing the table using SQL updates and/or other tools without
having to teach them a particular input format.

The trick would then be to have a preprocessing step in the build
which loaded the CSV/TSV files into hash tables and replaced all the
strings or other tokens with OIDs and magic values.
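
A rough Python sketch of that preprocessing step, under the assumption that each catalog has a CSV file with a header row. The column subsets, sample rows, and function names are made up for illustration.

```python
import csv
import io

# Hypothetical CSV data with a header row; real catalogs have more columns.
TYPES_CSV = "oid,typname\n16,bool\n2275,cstring\n"
PROCS_CSV = (
    "oid,proname,prorettype,proargtypes\n"
    "1242,boolin,bool,cstring\n"
    "1243,boolout,cstring,bool\n"
)


def load_type_oids(csv_text):
    """Load the type table into a hash: type name -> OID string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["typname"]: row["oid"] for row in reader}


def resolve_procs(csv_text, type_oids):
    """Replace symbolic type names in the proc rows with their OIDs."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["prorettype"] = type_oids[row["prorettype"]]
        row["proargtypes"] = " ".join(
            type_oids[t] for t in row["proargtypes"].split()
        )
        rows.append(row)
    return rows


type_oids = load_type_oids(TYPES_CSV)
rows = resolve_procs(PROCS_CSV, type_oids)
print(rows)
```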


-- 
greg



Re: Re: Do we need use more meaningful variables to replace 0 in catalog head files?

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> Just throwing this out there....

> It would be neat if the file format was precisely a tab or comma
> separated file suitable for loading into the appropriate table with
> COPY or loading into a spreadsheet.

Actually, I'd say that's a very accurate description of what we DO NOT
want.  That has all of the disadvantages of the DATA-line format, and
more besides, namely that we don't even get any macro-substitution-like
abilities.  The right way to think about this, IMO, is that we want to
abstract the representation as much as we easily can.  We definitely
need a concept of default values for omitted columns, and we need at least
some limited ability to insert symbolic values that will be resolved at
compile or initdb time (see FLOAT8PASSBYVAL and PGUID for existing
examples in that line).  And we want symbolic representations for OID
references, whether the associated column is declared as reg-something
or just a plain OID column.  (I don't want to end up having to invent
a reg-something type for every system catalog, so the idea that was
mentioned upthread of having that be driven off the declared column
type seems like a nonstarter to me, even if we were willing to take
the compatibility hit of changing the declared column types of a lot
of system catalog columns.)

Now, some of that could be gotten by brute force in a CSV-type file
format, but I do not see how replacing

DATA(insert OID = 1242 (  boolin           PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 16 "2275" _null_ _null_ _null_ _null_ _null_ boolin _null_ _null_ _null_ ));

with

1242,boolin,PGNSP,PGUID,internal,1,0,0,0,f,f,f,f,t,f,i,s,1,0,bool,{cstring},,,,,,boolin,,,

is any real improvement --- it certainly isn't making it any more readily
editable --- and replacing most of those fields with some spelling of
"default" isn't much better.

I follow the point about wishing that you could do bulk transformations in
some kind of SQL environment, but I think that direction leads to the same
sort of fruitless discussions we've had about adopting tooling for images
in the SGML docs.  Namely that any tooling you do like that is probably
going to have a hard time producing reproducible reductions to text form,
which is going to create issues when reviewing and when tracking git
changes.  I think our reference representation needs to be what's in git,
not some theoretically-equivalent form in a database somewhere.
        regards, tom lane



Re: [HACKERS] Do we need use more meaningful variables to replace 0 in catalog head files?

From
Peter Eisentraut
Date:
On 11/13/16 12:19 PM, Tom Lane wrote:
>> It'd also be very pg_proc specific, which isn't where I think this
>> should go..
> 
> The presumption is that we have a CREATE command for every type of
> object that we need to put into the system catalogs.  But yes, the
> other problem with this approach is that you need to do a lot more
> work per-catalog to build the converter script.  I'm not sure how
> much of that could be imported from gram.y, but I'm afraid the
> answer would be "not enough".

I'd think about converting about 75% of what is currently in the catalog
headers into some sort of built-in extension that is loaded via an SQL
script.  There are surely some details about that that would need to be
worked out, but I think that's a more sensible direction than inventing
another custom format.

I wonder how big the essential bootstrap set of pg_proc.h would be and
how manageable the file would be if it were to be reduced like that.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services