Thread: documentation structure

documentation structure

From
Robert Haas
Date:
I was looking at the documentation index this morning[1], and I can't
help feeling like there are some parts of it that are over-emphasized
and some parts that are under-emphasized. I'm not sure what we can do
about this exactly, but I thought it worth writing an email and seeing
what other people think.

The two sections of the documentation that seem really
under-emphasized to me are the GUC documentation and the SQL
reference. The GUC documentation is all buried under "20. Server
Configuration" and the SQL command reference is under "I. SQL
commands". For reasons that I don't understand, all chapters except
for those in "VI. Reference" are numbered, but the chapters in that
section have Roman numerals instead.

I don't know what other people's experience is, but for me, wanting to
know what a command does or what a setting does is extremely common.
Therefore, I think these chapters are disproportionately important and
should be emphasized more. In the case of the GUC reference, one idea
I have is to split up "III. Server Administration". My proposal is
that we divide it into three sections. The first would be called "III.
Server Installation" and would cover chapters 16 (installation from
binaries) through 19 (server setup and operation). The second would be
called "IV. Server Configuration" -- so every section that's currently
a subsection of "server configuration" would become a top-level
chapter. The third division would be "V. Server Administration," and
would cover the current chapters 21-33. This is probably far from
perfect, but it seems like a relatively simple change and better than
what we have now.
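
As a rough illustration of the split proposed above (an editorial sketch, not part of any patch posted in this thread), the top-level layout might look like the DocBook fragment below. The id values are hypothetical, and the chapter contents are elided as comments.

```xml
<!-- Sketch only: the id values are hypothetical, not taken from the PostgreSQL sources. -->
<book>
 <part id="server-installation">
  <title>Server Installation</title>
  <!-- current chapters 16 (installation from binaries) through 19 (server setup and operation) -->
 </part>

 <part id="server-configuration">
  <title>Server Configuration</title>
  <!-- each current subsection of "20. Server Configuration" promoted to its own chapter -->
 </part>

 <part id="server-administration">
  <title>Server Administration</title>
  <!-- current chapters 21 through 33 -->
 </part>
</book>
```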

I don't know what to do about "I. SQL commands". It's obviously
impractical to promote that to a top-level section, because it's got a
zillion sub-pages which I don't think we want in the top-level
documentation index. But having it as one of several unnumbered
chapters interposed between 51 and 52 doesn't seem great either.

The stuff that I think is over-emphasized is as follows: (a) chapters
1-3, the tutorial; (b) chapters 4-6, which are essentially a
continuation of the tutorial, and not at all similar to chapters 8-11
which are chock-full of detailed technical information; (c) chapters
43-46, one per procedural language; perhaps these could just be
demoted to sub-sections of chapter 42 on procedural languages; (d)
chapters 47 (server programming interface), 50 (replication progress
tracking), and 51 (archive modules), all of which are important to
document but none of which seem important enough to put them in the
top-level documentation index; and (e) large parts of section "VII.
Internals," which again contain tons of stuff of very marginal
interest. The first ~4 chapters of the internals section seem like
they might be mainstream enough to justify the level of prominence
that we give them, but the rest has got to be of interest to a tiny
minority of readers.

I think it might be possible to consolidate the internals section by
grouping a bunch of existing entries together by category. Basically,
after the first few chapters, you've got stuff that is of interest to
C programmers writing core or extension code; and you've got
explainers on things like GEQO and index op-classes and support
functions which might be of interest even to non-programmers. I think
for example that we don't need separate top-level chapters on writing
procedural language handlers, FDWs, tablesample methods, custom scan
providers, table access methods, index access methods, and WAL
resource managers. Some or all of those could be grouped under a
single chapter, perhaps, e.g. Using PostgreSQL Extensibility
Interfaces.

Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
bikeshedding, and it's inevitable that not everybody is going to
agree. But I hope we can agree that it's completely silly that it's
vastly easier to find the documentation about the backup manifest
format than it is to find the documentation on CREATE TABLE or
shared_buffers, and if we can agree on that, then perhaps we can agree
on some way to make things better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/docs/16/index.html



Re: documentation structure

From
Matthias van de Meent
Date:
On Mon, 18 Mar 2024 at 15:12, Robert Haas <robertmhaas@gmail.com> wrote:

I'm not going into detail about the other docs comments; I don't have
much of an opinion either way on the mentioned sections. You make good
arguments, but I don't usually use those sections of the docs and
rather do code searches.

> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.

Could "SQL Commands" be a top-level construct, with subsections for
SQL/DML, SQL/DDL, SQL/Transaction management, and PG's
extensions/administrative/misc features? I sometimes find myself
trying to mentally organize what SQL commands users can use vs those
accessible to database owners and administrators, which is not
currently organized as such in the SQL Commands section.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Re: documentation structure

From
Robert Haas
Date:
On Mon, Mar 18, 2024 at 10:55 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> > I don't know what to do about "I. SQL commands". It's obviously
> > impractical to promote that to a top-level section, because it's got a
> > zillion sub-pages which I don't think we want in the top-level
> > documentation index. But having it as one of several unnumbered
> > chapters interposed between 51 and 52 doesn't seem great either.
>
> Could "SQL Commands" be a top-level construct, with subsections for
> SQL/DML, SQL/DDL, SQL/Transaction management, and PG's
> extensions/administrative/misc features? I sometimes find myself
> trying to mentally organize what SQL commands users can use vs those
> accessible to database owners and administrators, which is not
> currently organized as such in the SQL Commands section.

Yeah, I wondered about that, too. Or for example you could group all
CREATE commands together, all ALTER commands together, all DROP
commands together, etc. But I can't really see a future in such
schemes, because having a single page that links to the reference
documentation for every single command we have in alphabetical order
is incredibly handy, or at least I have found it so. So my feeling -
at least at present - is that it's more fruitful to look into cutting
down the amount of clutter that appears in the top-level documentation
index, and maybe finding ways to make important sections like the SQL
reference more prominent.

Given how much documentation we have, it's just not going to be
possible to make everything that matters conveniently visible at the
top level. I think if people have to click down a level for the SQL
reference, that's fine, as long as the link they need to click on is
reasonably visible. What annoys me about the present structure is that
it isn't. You don't get any visual clue that the "SQL Commands" page
with ~100 subpages is more important than "51. Archive Modules" or
"33. Regression Tests" or "58. Writing a Procedural Language Handler,"
but it totally is.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Roberto Mello
Date:
On Mon, Mar 18, 2024 at 10:12 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I was looking at the documentation index this morning[1], and I can't
> help feeling like there are some parts of it that are over-emphasized
> and some parts that are under-emphasized. I'm not sure what we can do
> about this exactly, but I thought it worth writing an email and seeing
> what other people think.

I agree, and my usage patterns of the docs are similar.

As the project progresses and more features are added and tacked on to existing docs, things can get
murky or buried. I imagine that web access and search logs could paint a picture of documentation usage.

> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea

+1

> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from

I like all of those.

> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.

I think it'd be easier to read if current "VI. Reference" came right after "Server Administration",
ahead of "Client Interfaces" and "Server Programming", which are of interest to a much smaller
subset of users.

It would also help if the subchapters were numbered like the rest of them. I don't think the roman numerals are
particularly helpful.

> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a
> ...

Also +1

> Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
> bikeshedding, and it's inevitable that not everybody is going to
> agree. But I hope we can agree that it's completely silly that it's
> vastly easier to find the documentation about the backup manifest
> format than it is to find the documentation on CREATE TABLE or
> shared_buffers, and if we can agree on that, then perhaps we can agree
> on some way to make things better.

Impossible to please everyone, but I'm sure we can improve things.

I've contributed to different parts of the docs over the years, and would be happy
to help with this work.

Roberto

Re: documentation structure

From
Laurenz Albe
Date:
On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:
> The two sections of the documentation that seem really
> under-emphasized to me are the GUC documentation and the SQL
> reference. The GUC documentation is all buried under "20. Server
> Configuration" and the SQL command reference is under "I. SQL
> commands". For reasons that I don't understand, all chapters except
> for those in "VI. Reference" are numbered, but the chapters in that
> section have Roman numerals instead.

That last fact is very odd indeed and could be easily fixed.

> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea
> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from
> perfect, but it seems like a relatively simple change and better than
> what we have now.

I'm fine with splitting up "Server Administration" into three sections
like you propose.

> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.

I think that both the GUCs and the SQL reference could be top-level
sections.  For the GUCs there is an obvious split in sub-chapters,
and the SQL reference could be a top-level section without any chapters
under it.

> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a
> continuation of the tutorial, and not at all similar to chapters 8-11
> which are chock-full of detailed technical information; (c) chapters
> 43-46, one per procedural language; perhaps these could just be
> demoted to sub-sections of chapter 42 on procedural languages; (d)
> chapters 47 (server programming interface), 50 (replication progress
> tracking), and 51 (archive modules), all of which are important to
> document but none of which seem important enough to put them in the
> top-level documentation index; and (e) large parts of section "VII.
> Internals," which again contain tons of stuff of very marginal
> interest. The first ~4 chapters of the internals section seem like
> they might be mainstream enough to justify the level of prominence
> that we give them, but the rest has got to be of interest to a tiny
> minority of readers.

I disagree that the tutorial is over-emphasized.

I also disagree that chapters 4 to 6 are a continuation of the tutorial.
Or at least, they shouldn't be.
When I am looking for a documentation reference on something like
security considerations of SECURITY DEFINER functions, my first
impulse is to look in chapter 5 (Data Definition) or in chapter 38
(Extending SQL), and I am surprised to find it discussed in the
SQL reference of CREATE FUNCTION.

Another case in point is the "Notes" section for CREATE VIEW.  Why is
that not somewhere under "Data Definition"?

For me, the reference should be terse and focused on the syntax.

Changing that is probably a lost cause by now, but I feel that we need
not encourage that development any more by playing down the earlier
chapters.

> I think it might be possible to consolidate the internals section by
> grouping a bunch of existing entries together by category. Basically,
> after the first few chapters, you've got stuff that is of interest to
> C programmers writing core or extension code; and you've got
> explainers on things like GEQO and index op-classes and support
> functions which might be of interest even to non-programmers. I think
> for example that we don't need separate top-level chapters on writing
> procedural language handlers, FDWs, tablesample methods, custom scan
> providers, table access methods, index access methods, and WAL
> resource managers. Some or all of those could be grouped under a
> single chapter, perhaps, e.g. Using PostgreSQL Extensibility
> Interfaces.

I have no strong feelings about that.

Yours,
Laurenz Albe



Re: documentation structure

From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:
>> I don't know what to do about "I. SQL commands". It's obviously
>> impractical to promote that to a top-level section, because it's got a
>> zillion sub-pages which I don't think we want in the top-level
>> documentation index. But having it as one of several unnumbered
>> chapters interposed between 51 and 52 doesn't seem great either.

> I think that both the GUCs and the SQL reference could be top-level
> sections.  For the GUCs there is an obvious split in sub-chapters,
> and the SQL reference could be a top-level section without any chapters
> under it.

I'd be in favor of promoting all three of the "Reference" things to
the top level, except that as Robert says, it seems likely that that
would end in having a hundred individual command reference pages
visible in the topmost table of contents.  Also, if we manage to
suppress that, did we really make it any more prominent?  Not sure.

Making "SQL commands" top-level with half a dozen subsections would
solve the visibility problem, but I'm not real eager to go there,
because I foresee endless arguments about which subsection a given
command goes in.  Robert's point about wanting a single alphabetized
list is valid too (although you could imagine that being a list in an
introductory section, similar to what we have for system catalogs).

This might be a silly suggestion, but: could we just render the
"most important" chapter titles in a larger font?

            regards, tom lane



Re: documentation structure

From
Robert Haas
Date:
On Mon, Mar 18, 2024 at 6:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This might be a silly suggestion, but: could we just render the
> "most important" chapter titles in a larger font?

It's not the silliest suggestion ever -- you could have proposed
<blink>! -- but I also suspect it's not the right answer. Of course,
varying the font size can be a great way of emphasizing some things
more than others, but it doesn't usually work out well to just take a
document that was designed to be displayed in a uniform font size and
enlarge bits of text here and there. You usually want to have some
kind of overall plan of which font size is a single component.

For example, on a corporate home page, it's quite common to have two
nav bars, the larger of which has entries that correspond to the
company's product offerings and/or marketing materials, and the
smaller of which has "utility functions" like "login", "contact us",
and "search". Font size can be an effective tool for emphasizing the
relative importance of one nav bar versus the other, but you don't
start by deciding which things are going to get displayed in a larger
font. You start with an overall idea of the layout and then the font
size flows out of that.

Just riffing a bit, you could imagine adding a nav bar to our
documentation, either across the top or along the side, that is always
there on every page of the documentation and contains those links that
we want to make sure are always visible. Necessarily, these must be
limited in number. Then on the home page you could have the whole
table of contents as we do today, and you use that to navigate to
everything that isn't one of the quick links.

Or you can imagine that the home page of our documentation isn't just
a tree view like it is today; it might instead be written in paragraph
form. "Welcome to the PostgreSQL documentation! If you're new here,
check out our <link>tutorial</link>! Otherwise, you might be
interested in our <link>SQL reference</link>, our <link>configuration
reference</link>, or our <link>banana plantation</link>. If none of
those sound like what you want, check out the <link>documentation
index</link>." Obviously in order to actually work, something like
this would need to be expanded into enough paragraphs to actually
cover all of the important sections of the documentation, and probably
not mention banana plantations. Or maybe it wouldn't be just
paragraphs, but a two-column table, with each row of the table having
a main title and link in the narrower lefthand column and a blurb with
more links in the wider righthand column.

I'm sure there are a lot of other ways to do this, too. Our main
documentation page is very old-school, and there are probably a bunch
of ways to do better.

But I'm not sure how easy it would be to get agreement on something
specific, and I don't know how well our toolchain can support anything
other than what we've already got. I've also learned from painful
experience that you can't fix bad content with good markup. I think it
is worth spending some effort on trying to beat the existing format
into submission, promoting things that seem to deserve it and demoting
those that seem to deserve that. At some point, we'll probably reach a
point of diminishing returns, either because we all agree we've done
as well as we can, or because we can't agree on what else to do, and
maybe at that point the only way to improve further is with better web
design and/or a different documentation toolchain. But I think it's
fairly clear that we're not at that point now.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Daniel Gustafsson
Date:
> On 18 Mar 2024, at 22:40, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:

>> For reasons that I don't understand, all chapters except
>> for those in "VI. Reference" are numbered, but the chapters in that
>> section have Roman numerals instead.
>
> That last fact is very odd indeed and could be easily fixed.

It's actually not very odd, the reference section is using <reference> elements
and we had missed the arabic numerals setting on those.  The attached fixes
that for me.  That being said, we've had roman numerals for the reference
section since forever (all the way down to the 7.2 docs online have it) so maybe
it was intentional?  Or no one managed to see it until Robert did, I've
certainly never noticed it until now.
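
The attachment is not reproduced in this archive. As a hedged sketch of what such a setting could look like, assuming the DocBook XSL stylesheets' `reference.autolabel` parameter is the relevant knob (an assumption; the message does not name it), a stylesheet customization might read:

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Assumption: label <reference> elements with Arabic numerals (1, 2, 3)
       instead of the stylesheet's default Roman numerals (I, II, III). -->
  <xsl:param name="reference.autolabel">1</xsl:param>
</xsl:stylesheet>
```

Where such a parameter would live in the PostgreSQL stylesheet set, and whether this matches the attached patch, is not stated in the thread.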

--
Daniel Gustafsson


Attachments

Re: documentation structure

From
Tom Lane
Date:
Daniel Gustafsson <daniel@yesql.se> writes:
> It's actually not very odd, the reference section is using <reference> elements
> and we had missed the arabic numerals setting on those.  The attached fixes
> that for me.  That being said, we've had roman numerals for the reference
> section since forever (all the way down to the 7.2 docs online have it) so maybe
> it was intentional?

I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
it's not that way simply because nobody thought about it.

            regards, tom lane



Re: documentation structure

From
Andrew Dunstan
Date:


On Mon, Mar 18, 2024 at 10:12 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I was looking at the documentation index this morning[1], and I can't
> help feeling like there are some parts of it that are over-emphasized
> and some parts that are under-emphasized. I'm not sure what we can do
> about this exactly, but I thought it worth writing an email and seeing
> what other people think.
>
> The two sections of the documentation that seem really
> under-emphasized to me are the GUC documentation and the SQL
> reference. The GUC documentation is all buried under "20. Server
> Configuration" and the SQL command reference is under "I. SQL
> commands". For reasons that I don't understand, all chapters except
> for those in "VI. Reference" are numbered, but the chapters in that
> section have Roman numerals instead.
>
> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea
> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from
> perfect, but it seems like a relatively simple change and better than
> what we have now.
>
> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.
>
> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a
> continuation of the tutorial, and not at all similar to chapters 8-11
> which are chock-full of detailed technical information; (c) chapters
> 43-46, one per procedural language; perhaps these could just be
> demoted to sub-sections of chapter 42 on procedural languages; (d)
> chapters 47 (server programming interface), 50 (replication progress
> tracking), and 51 (archive modules), all of which are important to
> document but none of which seem important enough to put them in the
> top-level documentation index; and (e) large parts of section "VII.
> Internals," which again contain tons of stuff of very marginal
> interest. The first ~4 chapters of the internals section seem like
> they might be mainstream enough to justify the level of prominence
> that we give them, but the rest has got to be of interest to a tiny
> minority of readers.
>
> I think it might be possible to consolidate the internals section by
> grouping a bunch of existing entries together by category. Basically,
> after the first few chapters, you've got stuff that is of interest to
> C programmers writing core or extension code; and you've got
> explainers on things like GEQO and index op-classes and support
> functions which might be of interest even to non-programmers. I think
> for example that we don't need separate top-level chapters on writing
> procedural language handlers, FDWs, tablesample methods, custom scan
> providers, table access methods, index access methods, and WAL
> resource managers. Some or all of those could be grouped under a
> single chapter, perhaps, e.g. Using PostgreSQL Extensibility
> Interfaces.
>
> Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
> bikeshedding, and it's inevitable that not everybody is going to
> agree. But I hope we can agree that it's completely silly that it's
> vastly easier to find the documentation about the backup manifest
> format than it is to find the documentation on CREATE TABLE or
> shared_buffers, and if we can agree on that, then perhaps we can agree
> on some way to make things better.



+many for improving the index.

My own pet docs peeve is a purely editorial one: func.sgml is a 30k line beast, and I think there's a good case for splitting out at least the larger chunks of it.

cheers

andrew

Re: documentation structure

From
Robert Haas
Date:
On Mon, Mar 18, 2024 at 5:40 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> I also disagree that chapters 4 to 6 are a continuation of the tutorial.
> Or at least, they shouldn't be.
> When I am looking for a documentation reference on something like
> security considerations of SECURITY DEFINER functions, my first
> impulse is to look in chapter 5 (Data Definition) or in chapter 38
> (Extending SQL), and I am surprised to find it discussed in the
> SQL reference of CREATE FUNCTION.

I looked at this a bit more closely. There's actually a lot of
detailed technical information in chapters 4 and 5, but chapter 6 is
extremely short and mostly recapitulates chapter 2.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Robert Haas
Date:
On Tue, Mar 19, 2024 at 5:39 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> +many for improving the index.

Here's a series of four patches. Taken together, they cut down the
number of numbered chapters from 76 to 68. I think we could easily
save that much again if I wrote a few more patches along similar
lines, but I'm posting these first to see what people think.

0001 removes the "Installation from Binaries" chapter. The whole thing
is four sentences. I moved the most important information into the
"Installation from Source Code" chapter and retitled it
"Installation".

0002 removes the "Monitoring Disk Usage" chapter by folding it into
the immediately-preceding "Monitoring Database Activity" chapter. I
kind of feel like the "Monitoring Disk Usage" chapter might be in need
of a bigger rewrite or just outright removal, but there's surely not
enough content here to justify making it a top-level chapter.

0003 merges all of the "Internals" chapters whose names are the names
of built-in index access methods (Btree, Gin, etc.) into a single
chapter called "Built-In Index Access Methods". All of these chapters
have a very similar structure and none of them are very long, so it
makes a lot of sense, at least in my mind, to consolidate them into
one.

0004 merges the "Generic WAL Records" and "Custom WAL Resource
Managers" chapters together, creating a new chapter called "Write Ahead
Logging for Extensions".

Overall, I think this achieves a minor but pleasant level of
de-cluttering of the index. It's going to take a lot more than one
morning's work to produce a major improvement, but at least this is
something.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: documentation structure

From
Bruce Momjian
Date:
On Wed, Mar 20, 2024 at 12:43:08PM -0400, Robert Haas wrote:
> Overall, I think this achieves a minor but pleasant level of
> de-cluttering of the index. It's going to take a lot more than one
> morning's work to produce a major improvement, but at least this is
> something.

I think this kind of doc structure review is long overdue.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
Robert Haas
Date:
On Wed, Mar 20, 2024 at 1:35 PM Bruce Momjian <bruce@momjian.us> wrote:
> On Wed, Mar 20, 2024 at 12:43:08PM -0400, Robert Haas wrote:
> > Overall, I think this achieves a minor but pleasant level of
> > de-cluttering of the index. It's going to take a lot more than one
> > morning's work to produce a major improvement, but at least this is
> > something.
>
> I think this kind of doc structure review is long overdue.

Thanks, Bruce!

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Alvaro Herrera
Date:
On 2024-Mar-20, Robert Haas wrote:

> 0003 merges all of the "Internals" chapters whose names are the names
> of built-in index access methods (Btree, Gin, etc.) into a single
> chapter called "Built-In Index Access Methods". All of these chapters
> have a very similar structure and none of them are very long, so it
> makes a lot of sense, at least in my mind, to consolidate them into
> one.

I think you can achieve this with a much smaller patch that just changes
the outer tag in each file so that each file is a <sect1>, then create a
single file that includes all of these plus an additional outer tag for
the <chapter> (or maybe just add the <chapter> in postgres.sgml).  This
has the advantage that each AM continues to be a separate single file,
and you still have your desired structure.
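
To make the suggestion above concrete, here is a minimal editorial sketch of the second variant (adding the <chapter> wrapper directly in postgres.sgml). It is a fragment, not a standalone document. The chapter id is hypothetical, and the entity names are assumed to be already declared for the existing per-AM files, each of which would have its outer element changed to <sect1>.

```xml
<!-- Fragment of a hypothetical postgres.sgml after the change. -->
<!-- Assumes entities such as &btree; are already declared for the per-AM files, -->
<!-- and that each of those files now has <sect1> as its outer element. -->
<chapter id="builtin-index-ams">
 <title>Built-In Index Access Methods</title>
 &btree;
 &gist;
 &spgist;
 &gin;
 &brin;
 &hash;
</chapter>
```

With this layout each access method stays in its own file, and only the wrapper and the outer tags change.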

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: documentation structure

From
Robert Haas
Date:
On Wed, Mar 20, 2024 at 5:05 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I think you can achieve this with a much smaller patch that just changes
> the outer tag in each file so that each file is a <sect1>, then create a
> single file that includes all of these plus an additional outer tag for
> the <chapter> (or maybe just add the <chapter> in postgres.sgml).  This
> has the advantage that each AM continues to be a separate single file,
> and you still have your desired structure.

Right, that could also be done, and not just for 0003. I just wasn't
sure that was the right approach. It would mean that the division of
the SGML into files continues to reflect the original chapter
divisions rather than the current ones forever. In the short run
that's less churn, less back-patching pain, etc.; but in the long term
it means you've got relics of a structure that doesn't exist any more
sticking around forever.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 20, 2024 at 5:05 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> I think you can achieve this with a much smaller patch that just changes
>> the outer tag in each file so that each file is a <sect1>, then create a
>> single file that includes all of these plus an additional outer tag for
>> the <chapter> (or maybe just add the <chapter> in postgres.sgml).  This
>> has the advantage that each AM continues to be a separate single file,
>> and you still have your desired structure.

> Right, that could also be done, and not just for 0003. I just wasn't
> sure that was the right approach. It would mean that the division of
> the SGML into files continues to reflect the original chapter
> divisions rather than the current ones forever. In the short run
> that's less churn, less back-patching pain, etc.; but in the long term
> it means you've got relics of a structure that doesn't exist any more
> sticking around forever.

I'd say that a separate file per AM is a good thing regardless.
Elsewhere in this same thread are grumblings about how big func.sgml
is; why would you think it good to start down that same path for the
AM documentation?

            regards, tom lane



Re: documentation structure

From
Robert Haas
Date:
On Wed, Mar 20, 2024 at 5:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'd say that a separate file per AM is a good thing regardless.
> Elsewhere in this same thread are grumblings about how big func.sgml
> is; why would you think it good to start down that same path for the
> AM documentation?

Well, I suppose I thought it was a good idea because (1) we don't seem
to have any existing precedent for file-per-sect1 rather than
file-per-chapter and (2) all of the per-AM files combined are less
than 20% of the size of func.sgml.

But, OK, if you want to establish a new paradigm here, sure. I see two
ways to do it. We can either put the <chapter> tag directly in
postgres.sgml, or I can still create a new indextypes.sgml and put
&btree; etc. inside of it. Which way do you prefer?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Well, I suppose I thought it was a good idea because (1) we don't seem
> to have any existing precedent for file-per-sect1 rather than
> file-per-chapter and (2) all of the per-AM files combined are less
> than 20% of the size of func.sgml.

We have done (1) in places, eg. json.sgml, array.sgml,
rangetypes.sgml, rowtypes.sgml, and the bulk of extend.sgml is split
out into xaggr, xfunc, xindex, xoper, xtypes.  I'd be the first to
concede it's a bit haphazard, but it's not like there's no precedent.

As for (2), func.sgml likely should have been split years ago.

> But, OK, if you want to establish a new paradigm here, sure. I see two
> ways to do it. We can either put the <chapter> tag directly in
> postgres.sgml, or I can still create a new indextypes.sgml and put
> &btree; etc. inside of it. Which way do you prefer?

I'd follow the extend.sgml precedent: have a file corresponding to the
chapter and containing any top-level text we need, then that includes
a file per sect1.
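
A rough sketch of that layout, for illustration only and not taken from any of the posted patches. The file name indextypes.sgml comes from the question quoted above; the chapter id, the titles, and the placeholder text are assumptions, and only two of the per-AM files are shown.

```sgml
<!-- filelist.sgml: one entity per file (the indextypes entry is the new, hypothetical one) -->
<!ENTITY indextypes SYSTEM "indextypes.sgml">
<!ENTITY btree      SYSTEM "btree.sgml">
<!ENTITY gist       SYSTEM "gist.sgml">

<!-- indextypes.sgml: the chapter file, holding any top-level text
     and pulling in one file per sect1 -->
<chapter id="indextypes">
 <title>Built-In Index Access Methods</title>

 <para>
  Short introductory text for the chapter goes here.
 </para>

 &btree;
 &gist;
</chapter>

<!-- btree.sgml: outer tag changed from <chapter> to <sect1>;
     nested sections inside it drop one level accordingly -->
<sect1 id="btree">
 <title>B-Tree Indexes</title>
 ...
</sect1>
```

In this arrangement postgres.sgml would reference only &indextypes;, and each access method stays in its own file.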

            regards, tom lane



Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'd follow the extend.sgml precedent: have a file corresponding to the
> chapter and containing any top-level text we need, then that includes
> a file per sect1.

OK, here's a new patch set. I've revised 0003 and 0004 to use this
approach, and I've added a new 0005 that does essentially the same
thing for the PL chapters.

0001 and 0002 are changed. Should 0002 use the include-an-entity
approach as well?

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 10:31 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 0001 and 0002 are changed. Should 0002 use the include-an-entity
> approach as well?

Woops. I meant to say that 0001 and 0002 are *unchanged*.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Alvaro Herrera
Date:
On 2024-Mar-21, Robert Haas wrote:

> On Thu, Mar 21, 2024 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I'd follow the extend.sgml precedent: have a file corresponding to the
> > chapter and containing any top-level text we need, then that includes
> > a file per sect1.
> 
> OK, here's a new patch set. I've revised 0003 and 0004 to use this
> approach, 

Great, thanks.  Looking at the index in the PDF after (only) 0003, we
now have this structure

62. Table Access Method Interface Definition
63. Index Access Method Interface Definition
    63.1. Basic API Structure for Indexes
    63.2. Index Access Method Functions
    63.3. Index Scanning
    63.4. Index Locking Considerations
    63.5. Index Uniqueness Checks
    63.6. Index Cost Estimation Functions
64. Generic WAL Records
65. Custom WAL Resource Managers
66. Built-in Index Access Methods

which is a bit odd: why are the two WAL chapters in the middle of the
chapters 62 and 63 talking about AMs?  Maybe put 66 right after 63
instead.    Also, is it really better to have 62/63 first and 66
later?  It sounds to me like 66 is more user-oriented and the other two
are developer-oriented, so I'm inclined to suggest putting them the
other way around, but I'm not really sure about this.  (Also, starting
chapter 66 straight with 66.1 BTree without any intro text looks a bit
odd; maybe one short introductory paragraph is sufficient?)

> and I've added a new 0005 that does essentially the same
> thing for the PL chapters.

I was looking at the PL chapters earlier today too, wondering whether
this would be valuable; but I worry that there are too many
sub-sub-sections there, so it could end up being a bit messy.  I didn't
look at the resulting output though.

> 0001 and 0002 are [un]changed. Should 0002 use the include-an-entity
> approach as well?

Shrug, I wouldn't, doesn't look worth it.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"



Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 12:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> which is a bit odd: why are the two WAL chapters in the middle of the
> chapters 62 and 63 talking about AMs?  Maybe put 66 right after 63
> instead.    Also, is it really better to have 62/63 first and 66
> later?  It sounds to me like 66 is more user-oriented and the other two
> are developer-oriented, so I'm inclined to suggest putting them the
> other way around, but I'm not really sure about this.  (Also, starting
> chapter 66 straight with 66.1 BTree without any intro text looks a bit
> odd; maybe one short introductory paragraph is sufficient?)

I had similar thoughts. I think that we should consider some changes
to the chapter ordering, but I didn't want to try to change too many
things all at once, since somebody only has to hate one thing about
the patch to sink the whole thing.

But since you brought it up, what I've been thinking about is that the
whole division into parts might need to be rethought a bit. I feel
like "VII. Internals" is a mix of about four different kinds of
content. First, the biggest portion of it is information about
developing certain kinds of C extensions -- all the "Writing a
Whatever" chapters, the "Whatever Access Method Interface Definition"
chapters, "Generic WAL Records", "Custom WAL Resource Managers", and
all the index-related chapters. Second, we've got some information
that I think is mainly of interest to people developing PostgreSQL
itself, namely, "PostgreSQL Coding Conventions", "Native Language
Support", and "System Catalog Declarations and Initial Contents". You
*might* care about these if you're developing extensions, or even if
you're not a developer at all, but then again you might not. Third,
we've got some reference material, namely "System Catalogs", "System
Views", and perhaps "Frontend/Backend Protocol". I distinguish these
from the previous two categories because I think you could care about
this stuff as a random user, or a developer of products that
interoperate with PostgreSQL but don't link with it or share any
common code. Finally, there's just a bunch of random bits and bobs
that we've decided to document here for one reason or another, either
because somebody else did a bunch of the work, like "Overview of
PostgreSQL Internals", or because some developer did something and
someone said "hey, that should be documented!", like "Backup Manifest
Format."

So my first thought is to pull out the stuff that's mainly for
PostgreSQL core developers and move it to an appendix. I propose we
create an appendix called "Developer Guide" and that it absorb the
existing appendix I, "The Source Code Repository", possibly all or
part of appendix J, "Documentation", and the chapters from "VII.
Internals" that are mostly of developer interest. I think that
possibly some of what's in "J. Documentation" should actually be moved
into the "Installation" chapter where we talk about building the
source code, because it doesn't make much sense to document the build
tool chain in one part of the documentation and the documentation
toolchain someplace else entirely, but "J.6. Style Guide" is developer
information, not build instructions.

My second thought is that the stuff from "VII. Internals" that I
categorized as reference material should move into section "VI.
Reference". I think we should also consider moving appendix F,
"Additional Supplied Modules and Extensions," and appendix G,
"Additional Supplied Programs" to the reference section. However,
prior to doing that, I think that appendix G needs some cleanup or
possibly we should just find a way to remove it outright. We're
shipping an appendix G with two major subsections, one of which is
completely empty and has been since v14, and the other of which
contains only two things. I think we should just remove the empty
sub-section entirely. I'm not sure what to do about the one with only
2 things in it (vacuumlo and oid2name). Would it be a bad idea to just
merge those bits into the client applications reference section?

My third thought is about what to do with the material in "VII.
Internals" that is about developing specific kinds of extensions, like,
say, "Writing a Foreign Data Wrapper." If you look at "V. Server
Programming", you see that we actually have some very similar sections
there, like chapter 47, "Background Worker Processes" and chapter 50,
"Archive Modules". I think it's not very clear in the current
structure whether topics that are relevant for extension developers
should go under "Server Programming" or under "Internals", and it
looks to me like different people have just done different things and
it's all a bit haphazard. One idea is to decide that the correct
answer is "Server Programming" and move all of the internals chapters
that fall into this category over to there. I don't think this is the
right answer, because that section also contains information about a
bunch of stuff that's strictly SQL-level, like rules and triggers. So
what I think we should do is create either [A] a new top-level part,
just before or just after what's currently called "VI. Reference" or
[B] a new appendix or [C] a new "Reference" section, that is
specifically for documentation of server APIs intended for extension
use. And then all the chapters under "V. Server Programming" or "VII.
Internals" that are documenting APIs would get moved there.

If we adopted all of the patches that I proposed in my previous email
and all of the suggestions that I just dropped in the preceding wall
of text, then the internals section would be left with only these
chapters:

- Overview of PostgreSQL Internals
- Genetic Query Optimizer
- Database Physical Storage
- Transaction Processing
- How the Planner Uses Statistics
- Backup Manifest Format

A lot of those chapters are pretty dated and maybe not that useful in
2024, but this email is already long enough and full of
sufficiently-aggressive proposals that I'm not inclined to opine too
much further on what we might want to do if and when we've done
everything I just proposed. For now, suffice it to say that I think we
might choose to either rewrite and expand some of these to make them
more useful, or demote some of them to some less prominent place in
the documentation, or just delete some of them entirely; but we can
figure that out if and when we get there.

> I was looking at the PL chapters earlier today too, wondering whether
> this would be valuable; but I worry that there are too many
> sub-sub-sections there, so it could end up being a bit messy.  I didn't
> look at the resulting output though.

That thought occurred to me as well. I certainly think that if we
perform the sort of aggressive purging of the top-level index for
which I'm advocating, there are going to be some people who are grumpy
that the stuff they're trying to find isn't where it used to be, or
who legitimately have trouble finding the content that they want. It
seems to me that if you're looking for the documentation on one of the
individual procedural languages and you don't see it, you'll try
clicking on "Procedural Languages" and then you'll find that it's now
under there. Now, what's maybe a bit unfortunate is that chapter
indexes only show two levels of section headings, and the PL/pgsql
chapter in particular has a lot of <sect2> items. If those get demoted
to <sect3> as I am proposing, they won't show up in the chapter index any
more. I do think there's a possibility that this could be a problem
for someone.

On the other hand, the table of contents for
https://www.postgresql.org/docs/devel/plpgsql.html is so long right
now that it doesn't fit on the page, so maybe losing a level of
subsections won't be so bad. Alternately, maybe we could revise the
structure of the section a bit to ameliorate the problem. It seems to
me that most of the first-level section headers are actually pretty
clear about what you're likely to find underneath. If you're looking
for WHILE and you see a section called "Control Structures", it seems
like your chances of guessing that WHILE will be underneath that
section are pretty good. The major exception that I see is 45.2,
"Basic Statements," which isn't very clear about what might be covered
there. But what if we split that apart into separate sections called
"Assignment", "Executing SQL", and "Doing Nothing at All"? And maybe
we'd even pull "Returning" out of "Control Structures" as well. I
think that would be clear enough for people to find what they need
without the extra level of headers.

(For the sake of completeness, let me note that PL/python and PL/perl
have a few <sect2> headings as well, but I don't think it would create
a problem for users if all of those got changed to <sect3>. PL/Tcl has
no <sect2> headings.)

I'm not sure if this kind of rearrangement is actually necessary or
not; but my point here is that if we think that people will have
problems or we find out that they actually did have problems, we can
look at doing this kind of stuff to compensate. What I don't think we
should do is decide that the only workable solution is to keep having
so many separate chapters at the top level. We're way, way beyond the
point where you can easily find anything on that page, and trying to
emphasize everything just ends up emphasizing nothing. We need to push
in a direction where every chapter and every appendix is expected to
have a large amount of content under it, so that the top-level index
becomes a way of finding the kind of content you want (SQL reference
pages, extension APIs, built-in SQL-callable functions, whatever) and
then you use that page to find the specific content that you want
within that category.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
"David G. Johnston"
Date:
On Wed, Mar 20, 2024 at 9:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 19, 2024 at 5:39 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> > +many for improving the index.
>
> Here's a series of four patches.

I reviewed the most recent set of 5 patches.

> Taken together, they cut down the
> number of numbered chapters from 76 to 68. I think we could easily
> save that much again if I wrote a few more patches along similar
> lines, but I'm posting these first to see what people think.
>
> 0001 removes the "Installation from Binaries" chapter. The whole thing
> is four sentences. I moved the most important information into the
> "Installation from Source Code" chapter and retitled it
> "Installation".

Makes sense

> 0002 removes the "Monitoring Disk Usage" chapter by folding it into
> the immediately-preceding "Monitoring Database Activity" chapter. I
> kind of feel like the "Monitoring Disk Usage" chapter might be in need
> of a bigger rewrite or just outright removal, but there's surely not
> enough content here to justify making it a top-level chapter.

Just going to note that the section on the cumulative statistics views being a single page is still a strongly bothersome issue here.  Though the quick fix actually requires upgrading the section to chapter status...

Maybe we can stub out that section in the "Monitoring Database Activity" chapter and move that entire section after "System Views" in the Internals part?

I agree with subordinating Monitoring Disk Usage.

> 0003 merges all of the "Internals" chapters whose names are the names
> of built-in index access methods (Btree, Gin, etc.) into a single
> chapter called "Built-In Index Access Methods". All of these chapters
> have a very similar structure and none of them are very long, so it
> makes a lot of sense, at least in my mind, to consolidate them into
> one.

One of the more impactful and wanted improvements, IMO.

> 0004 merges the "Generic WAL Records" and "Custom WAL Resource
> Managers" chapter together, creating a new chapter called "Write Ahead
> Logging for Extensions".

The positioning of this and the preceding Built-in Index Access Methods chapter seem like they should be switched.

If this sticks we should add an introductory paragraph for the chapter.

> and I've added a new 0005 that does essentially the same
> thing for the PL chapters.

The following page needs to be reworded to take the new structure into account:


Not having pl/pgsql appear on the main ToC seems like a loss but the others make sense and a special exception for it probably isn't warranted.

Maybe "pl/pgsql and Other Procedural Languages" as the title?

David J.

Re: documentation structure

From
"David G. Johnston"
Date:
On Thu, Mar 21, 2024 at 11:30 AM Robert Haas <robertmhaas@gmail.com> wrote:

> My second thought is that the stuff from "VII. Internals" that I
> categorized as reference material should move into section "VI.
> Reference". I think we should also consider moving appendix F,
> "Additional Supplied Modules and Extensions," and appendix G,
> "Additional Supplied Programs" to the reference section.


For "VI. Reference" I propose the following Chapters:

SQL Commands
PL/pgSQL
Cumulative Statistics Views
System Views
System Catalogs
Client Applications
Server Applications
Modules and Extensions

-- Remove Appendix G (Programs) altogether and just note for the two that are listed that they are in contrib as opposed to core.

-- The PostgreSQL qualifier doesn't seem helpful and once you add the additional chapters its unusual presence stands out even more.

-- PL/pgSQL gets its own reference chapter since we wrote it.  Stuff like Perl and Python have entire books that the user can consult as reference material for those languages.

David J.

Re: documentation structure

From
Peter Eisentraut
Date:
On 19.03.24 14:50, Tom Lane wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:
>> It's actually not very odd, the reference section is using <reference> elements
>> and we had missed the arabic numerals setting on those.  The attached fixes
>> that for me.  That being said, we've had roman numerals for the reference
>> section since forever (all the way down to the 7.2 docs online has it) so maybe
>> it was intentional?
> 
> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
> it's not that way simply because nobody thought about it.

Looks to me it was just that way because it's the default setting of the 
stylesheets.




Re: documentation structure

From
Peter Eisentraut
Date:
On 20.03.24 17:43, Robert Haas wrote:
> 0001 removes the "Installation from Binaries" chapter. The whole thing
> is four sentences. I moved the most important information into the
> "Installation from Source Code" chapter and retitled it
> "Installation".

But this separation was explicitly added a few years ago, because most 
people just want to read about the binaries.




Re: documentation structure

From
Peter Eisentraut
Date:
On 21.03.24 15:31, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 9:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'd follow the extend.sgml precedent: have a file corresponding to the
>> chapter and containing any top-level text we need, then that includes
>> a file per sect1.
> 
> OK, here's a new patch set. I've revised 0003 and 0004 to use this
> approach, and I've added a new 0005 that does essentially the same
> thing for the PL chapters.

I'm highly against this.  If I want to read about PL/Python, why should 
I have to wade through PL/Perl and PL/Tcl?

I think, abstractly, in a book, PL/Python should be a chapter of its 
own.  Just like GiST should be a chapter of its own.  Because they are 
self-contained topics.




Re: documentation structure

From
Daniel Gustafsson
Date:
> On 22 Mar 2024, at 00:33, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 19.03.24 14:50, Tom Lane wrote:
>> Daniel Gustafsson <daniel@yesql.se> writes:
>>> It's actually not very odd, the reference section is using <reference> elements
>>> and we had missed the arabic numerals setting on those.  The attached fixes
>>> that for me.  That being said, we've had roman numerals for the reference
>>> section since forever (all the way down to the 7.2 docs online has it) so maybe
>>> it was intentional?
>> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
>> it's not that way simply because nobody thought about it.
>
> Looks to me it was just that way because it's the default setting of the stylesheets.

That's quite possible.  I don't have strong opinions on whether we should
change, or keep it the way it is.

--
Daniel Gustafsson




Re: documentation structure

From
Bruce Momjian
Date:
On Fri, Mar 22, 2024 at 01:12:30AM +0100, Daniel Gustafsson wrote:
> > On 22 Mar 2024, at 00:33, Peter Eisentraut <peter@eisentraut.org> wrote:
> > 
> > On 19.03.24 14:50, Tom Lane wrote:
> >> Daniel Gustafsson <daniel@yesql.se> writes:
> >>> It's actually not very odd, the reference section is using <reference> elements
> >>> and we had missed the arabic numerals setting on those.  The attached fixes
> >>> that for me.  That being said, we've had roman numerals for the reference
> >>> section since forever (all the way down to the 7.2 docs online has it) so maybe
> >>> it was intentional?
> >> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
> >> it's not that way simply because nobody thought about it.
> > 
> > Looks to me it was just that way because it's the default setting of the stylesheets.
> 
> That's quite possible.  I don't have strong opinions on whether we should
> change, or keep it the way it is.

If we can't justify why it should be different, it should be like the
surrounding sections.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 6:32 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Just going to note that the section on the cumulative statistics views being a single page is still a strongly
> bothersome issue here.  Though the quick fix actually requires upgrading the section to chapter status...

Yeah, I've been bothered by this in the past, too. I'm not very keen
to start promoting things to the top-level, though. I think we need a
more thoughtful fix than that.

One question I have is why all of these views are documented here
rather than in chapter 53, "System Views," because surely they are
system views. I feel like if our documentation index weren't a mile
long and if you could easily find the entry for "System Views," that's
where you would naturally look for these details. I don't think it's
natural for a user to expect that most of the system views are going
to be documented in section VII, chapter 53 but one particular kind is
going to be documented in section III, chapter 27, under a chapter
title that gives no hint that it will document any views.

> Maybe "pl/pgsql and Other Procedural Languages" as the title?

I guess I have a hard time seeing this as an improvement. It would
help someone who knows that plpgsql exists but doesn't know that it
falls into the general category called procedural languages, but I
suspect that's not a very common confusion. I think it's better to
keep the chapter titles short and to the point.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 7:37 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 20.03.24 17:43, Robert Haas wrote:
> > 0001 removes the "Installation from Binaries" chapter. The whole thing
> > is four sentences. I moved the most important information into the
> > "Installation from Source Code" chapter and retitled it
> > "Installation".
>
> But this separation was explicitly added a few years ago, because most
> people just want to read about the binaries.

I really doubt that this is true. I've been installing software on
UNIX-like operating systems for more than 30 years now, and I don't
think there's been a single time when I have ever consulted the
documentation for a software package to find the download location for
that package. When I first started out, everything was ftp rather than
www, so you went to ftp.whatever.{com,org,net,gov,edu} and tried to
download the distribution bundle, and then you untarred it and ran
configure and make. Then you read the README or the documentation or
whatever afterward. These days, I think what people do is either (a)
use their package manager to install PostgreSQL and then come to the
documentation afterward to find out how to use it or (b) do a search
for "PostgreSQL download" and click on whatever comes up. I'm not
saying there's never been a user who made use of this section of the
documentation to find the download location, but surely the normal
thing to do if you come to www.postgresql.org and you want to download
the software is to click "Download" on the nav bar, not
"Documentation," then a specific version, then chapter 16, then the
exact same download link that's already there on the nav bar.

I do agree that it is very questionable whether "Installation from
Source Code" is of sufficient interest to ordinary users to justify
including it in "III. Server Administration." Most people, probably
including many extension developers, are only going to install the
binary packages. But the solution to that isn't to have a
four-sentence chapter telling me about a download location that I
likely found long before I looked at the documentation, and that I can
certainly find very easily without needing the documentation. Rather,
what we should do if we think that installing from source code is of
marginal interest is move it to an appendix. As I said to Alvaro
yesterday, I think that a "Developer Guide" appendix could be a good
place to house a number of things that currently have top-level
chapters but don't really need them because they're only of interest
to a small minority of users. This might be another thing that could
go there.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Peter Eisentraut
Date:
On 22.03.24 13:50, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 7:37 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>> On 20.03.24 17:43, Robert Haas wrote:
>>> 0001 removes the "Installation from Binaries" chapter. The whole thing
>>> is four sentences. I moved the most important information into the
>>> "Installation from Source Code" chapter and retitled it
>>> "Installation".
>>
>> But this separation was explicitly added a few years ago, because most
>> people just want to read about the binaries.
> 
> I really doubt that this is true.

Here is the thread: 
https://www.postgresql.org/message-id/flat/CABUevExRCf8waYOsrCO-QxQL50XGapMf5dnWScOXj7X%3DMXW--g%40mail.gmail.com




Re: documentation structure

From
Robert Haas
Date:
On Thu, Mar 21, 2024 at 7:40 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> I'm highly against this.  If I want to read about PL/Python, why should
> I have to wade through PL/Perl and PL/Tcl?
>
> I think, abstractly, in a book, PL/Python should be a chapter of its
> own.  Just like GiST should be a chapter of its own.  Because they are
> self-contained topics.

On the other hand, in a book, chapters tend to be of relatively
uniform length. People don't usually write a book with some chapters
that are 100+ pages long, and others that are a single page, or even
just a couple of sentences. I mean, I'm sure it's been done, but it's
not a normal way to write a book.

And I don't believe that if someone were writing a physical book about
PostgreSQL from scratch, they'd ever end up with a top-level chapter
that looks anything like our GiST chapter. All of the index AM
chapters are quite obviously clones of each other, and they're all
quite short. Surely you'd make them sections within a chapter, not
entire chapters.

I do agree that PL/pgsql is more arguable. I can imagine somebody
writing a book about PostgreSQL and choosing to make that topic into a
whole chapter.

However, I also think that people don't make decisions about what
should be a chapter in a vacuum. If you've got 100 people writing a
book together, which is essentially what we actually do have, and each
of those people makes decisions in isolation about what is worthy of
being a chapter, then you end up with exactly the kind of mess that we
now have. Some chapters are long and some are short. Some are
well-written and some are poorly written. Some are updated regularly
and others have hardly been touched in a decade. Books have editors to
straighten out those kinds of inconsistencies so that there's some
uniformity to the product as a whole.

The problem with that, of course, is that it invites bike-shedding. As
you say, every decision that is reflected in our documentation was
made for some reason, and most of them will have been made by
prominent, active committers. So discussions about how to improve
things can easily bog down even when people agree on the overall
goals, simply because few individual changes find consensus. I hope
that doesn't happen here, because I think most people who have
commented so far agree that there is a problem here and that we should
try to fix it. Let's not let the perfect be the enemy of the good.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 9:35 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> >> But this separation was explicitly added a few years ago, because most
> >> people just want to read about the binaries.
> >
> > I really doubt that this is true.
>
> Here is the thread:
> https://www.postgresql.org/message-id/flat/CABUevExRCf8waYOsrCO-QxQL50XGapMf5dnWScOXj7X%3DMXW--g%40mail.gmail.com

Sorry. I didn't mean to dispute the point that the section was added a
few years ago, nor the point that most people just want to read about
the binaries. I am confident that both of those things are true. What
I do want to dispute is that having a four-sentence chapter in the
documentation index that tells people something they can find much
more easily without using the documentation at all is a good plan. I
agree with the concern that Magnus expressed on the thread, i.e:

> It's kind of strange that if you start your PostgreSQL journey by
> reading our instructions, you get nothing useful about installing
> PostgreSQL from binary packages other than "go ask somebody else
> about it".

But I don't agree that this was the right way to address that problem.
I think it would have been better to just add the download link to the
existing installation chapter. That's actually what we had in chapter
18, "Installation from Source Code on Windows", since removed. But for
some reason we decided that on non-Windows platforms, it needed a
whole new chapter rather than an extra sentence in the existing one. I
think that's massively overkill.

Alternately, I think it would be reasonable to address the concern by
just moving all the stuff about building from source code to an
appendix, and assume people can figure out how to download the
software without us needing to say anything in the documentation at
all. What was weird about the state before that patch, IMHO, was that
we both talked about building from source code and didn't talk about
binary packages. That can be addressed either by adding a mention of
binary packages, or by deemphasizing the idea of installing from
source code.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
"David G. Johnston"
Date:
On Fri, Mar 22, 2024 at 7:10 AM Robert Haas <robertmhaas@gmail.com> wrote:

> That's actually what we had in chapter
> 18, "Installation from Source Code on Windows", since removed. But for
> some reason we decided that on non-Windows platforms, it needed a
> whole new chapter rather than an extra sentence in the existing one. I
> think that's massively overkill.


I agree with the premise that we should have a single chapter, in the main documentation flow, named "Installation".  It should cover the architectural overview and point people to where they can find the stuff they need to install PostgreSQL in the various ways available to them.  I agree with moving the source installation material to the appendix.  None of the sections under Installation would then actually detail how to install the software since that isn't something the project itself handles but has delegated to packagers for the vast majority of cases and the source install details are in the appendix for the one "supported" mechanism that most people do not use.

David J.

Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 11:50 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> On Fri, Mar 22, 2024 at 7:10 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> That's actually what we had in chapter
>> 18, "Installation from Source Code on Windows", since removed. But for
>> some reason we decided that on non-Windows platforms, it needed a
>> whole new chapter rather than an extra sentence in the existing one. I
>> think that's massively overkill.
>
> I agree with the premise that we should have a single chapter, in the
> main documentation flow, named "Installation".  It should cover the
> architectural overview and point people to where they can find the
> stuff they need to install PostgreSQL in the various ways available to
> them.  I agree with moving the source installation material to the
> appendix.  None of the sections under Installation would then actually
> detail how to install the software since that isn't something the
> project itself handles but has delegated to packagers for the vast
> majority of cases and the source install details are in the appendix
> for the one "supported" mechanism that most people do not use.

Hmm, that's not quite the same as my position. I'm fine with either
moving the installation from source material to an appendix, or
leaving it where it is. But I'm strongly against continuing to have a
chapter with four sentences in it that says to use the same download
link that is on the main navigation bar of every page on the
postgresql.org web site. We're never going to get the chapter index
down to a reasonable size if we insist on having chapters that have a
totally de minimis amount of content.

So my feeling is that if we keep the installation from source material
where it is, then we can make it also mention the download link, just
as we used to do in the installation-on-windows chapter. But if we
banish installation from source to the appendixes, then we shouldn't
keep a whole chapter in the main documentation to tell people
something that is anyway obvious. I don't really think that material
needs to be there at all, but if we want to have it, surely we can
find someplace to put it such that it doesn't require a whole chapter
to say that and nothing else. It could, for example, go at the beginning
of the "Server Setup and Operation" chapter; if that were the first
chapter of section III, I think that would be natural enough.

I notice that you say that the "Installation" section should "cover
the architectural overview and point people to where they can find the
stuff they need to install PostgreSQL in the various ways available to
them" so maybe you're not imagining a four-sentence chapter, either.
But this project is going to be impossible unless we stick to limited
goals. We can, and should, rewrite some sections of the documentation
to be more useful; but if we try to do that as part of the same
project that aims to tidy up the index, the chances of us getting
stuck in an endless bikeshedding loop go from "high" to "certain". So
I don't want to hypothesize the existence of an installation chapter
that isn't any of the things we have today. Let's try to get the
things we have into places that make sense, and then consider other
improvements separately.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
"David G. Johnston"
Date:
On Fri, Mar 22, 2024, 09:32 Robert Haas <robertmhaas@gmail.com> wrote:


> I notice that you say that the "Installation" section should "cover
> the architectural overview and point people to where they can find the
> stuff they need to install PostgreSQL in the various ways available to
> them" so maybe you're not imagining a four-sentence chapter, either.

Fair point but I posit that new users are looking for a chapter named Installation in the documentation.  At least the ones willing to read documentation.  Having two of them isn't needed but having zero doesn't make sense either.

The current proposal does that so I'm ok as-is but it can be further improved by moving source install talk elsewhere and having the installation chapter redirect the reader there for details.  I'm not concerned with how long or short the resultant installation chapter is.

David J.

Re: documentation structure

From
Bruce Momjian
Date:
On Fri, Mar 22, 2024 at 08:32:14AM -0400, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 6:32 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> > Just going to note that the section on the cumulative statistics
> > views being a single page is still a strongly bothersome issue here.
> > Though the quick fix actually requires upgrading the section to
> > chapter status...
>
> Yeah, I've been bothered by this in the past, too. I'm not very keen
> to start promoting things to the top-level, though. I think we need a
> more thoughtful fix than that.
> 
> One question I have is why all of these views are documented here
> rather than in chapter 53, "System Views," because surely they are
> system views. I feel like if our documentation index weren't a mile
> long and if you could easily find the entry for "System Views," that's
> where you would naturally look for these details. I don't think it's
> natural for a user to expect that most of the system views are going
> to be documented in section VII, chapter 53 but one particular kind is
> going to be documented in section III, chapter 27, under a chapter

Well, until this commit in 2022, the system views were _under_ the
system catalogs chapter:

    commit 64d364bb39c
    Author: Bruce Momjian <bruce@momjian.us>
    Date:   Thu Jul 14 16:07:12 2022 -0400
    
        doc:  move system views section to its own chapter
    
        Previously it was inside the system catalogs chapter.
    
        Reported-by: Peter Smith
    
        Discussion: https://postgr.es/m/CAHut+PsMc18QP60D+L0hJBOXrLQT5m88yVaCDyxLq34gfPHsow@mail.gmail.com
    
        Backpatch-through: 15

The thread contains more discussion of the issue, and I think it still needs help:


https://www.postgresql.org/message-id/flat/CAHut%2BPsMc18QP60D%2BL0hJBOXrLQT5m88yVaCDyxLq34gfPHsow%40mail.gmail.com

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 1:35 PM Bruce Momjian <bruce@momjian.us> wrote:
> > One question I have is why all of these views are documented here
> > rather than in chapter 53, "System Views," because surely they are
> > system views. I feel like if our documentation index weren't a mile
> > long and if you could easily find the entry for "System Views," that's
> > where you would naturally look for these details. I don't think it's
> > natural for a user to expect that most of the system views are going
> > to be documented in section VII, chapter 53 but one particular kind is
> > going to be documented in section III, chapter 27, under a chapter
>
> Well, until this commit in 2022, the system views were _under_ the
> system catalogs chapter:

Even before that commit, the statistics collector views were
documented in a completely separate part of the documentation from all
of the other system views.

I think that commit was a good idea, even though it made the top-level
documentation index bigger, because in v14, the "System Catalogs"
chapter looks like this:

...
52.61. pg_ts_template
52.62. pg_type
52.63. pg_user_mapping
52.64. System Views
52.65. pg_available_extensions
52.66. pg_available_extension_versions
52.67. pg_backend_memory_contexts
...

If you were actually looking for the section called "System Views",
you weren't likely to see it here unless you already knew it was
there, because it was 64 items into a 97-item list. Having one of
these two sections inside the other just doesn't work at all. We could
have alternatively chosen to have one chapter with two <sect1> tags
inside of it, but I think what you actually did was perfectly fine.
IMHO, "System Views" is important enough (and big enough) that giving
it its own chapter is perfectly reasonable.

But that all seems like a separate question from why we have the
statistics collector views in a completely different part of the
documentation from the rest of the system views. My guess is that it's
just kind of a historical accident, but maybe there was some other
logic to it.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Bruce Momjian
Date:
On Fri, Mar 22, 2024 at 02:19:29PM -0400, Robert Haas wrote:
> If you were actually looking for the section called "System Views",
> you weren't likely to see it here unless you already knew it was
> there, because it was 64 items into a 97-item list. Having one of
> these two sections inside the other just doesn't work at all. We could
> have alternatively chosen to have one chapter with two <sect1> tags
> inside of it, but I think what you actually did was perfectly fine.
> IMHO, "System Views" is important enough (and big enough) that giving
> it its own chapter is perfectly reasonable.
> 
> But that all seems like a separate question from why we have the
> statistics collector views in a completely different part of the
> documentation from the rest of the system views. My guess is that it's
> just kind of a historical accident, but maybe there was some other
> logic to it.

I assume statistics collector views are in "Monitoring Database
Activity" because that is their purpose.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
"David G. Johnston"
Date:
On Fri, Mar 22, 2024 at 11:19 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Mar 22, 2024 at 1:35 PM Bruce Momjian <bruce@momjian.us> wrote:
>
> But that all seems like a separate question from why we have the
> statistics collector views in a completely different part of the
> documentation from the rest of the system views. My guess is that it's
> just kind of a historical accident, but maybe there was some other
> logic to it.


The details underpinning the cumulative statistics subsystem are definitely large enough to warrant their own subsection. And it isn't like placing them into the monitoring chapter is wrong; aside from a couple of views, those under System Views don't fit into what we've defined as monitoring.  I don't have any desire to lump them under the generic system views, which itself could probably use a level of categorization, since the nature of pg_locks and pg_cursors is decidedly different from that of pg_indexes and pg_config.  This all becomes more appealing to work on once we solve the problem of all sect2 entries being placed on a single page.

I struggled for a long while where I'd always look for pg_stat_activity under system views instead of monitoring.  Amending my prior suggestion in light of this I would suggest we move the Cumulative Statistics Views into Reference but as its own Chapter, not part of System Views, and change its name to "Monitoring Views" (going more generalized here feels like a win to me). I'd move pg_locks, pg_cursors, pg_backend_memory_contexts, pg_prepared_*, pg_shmem_allocations, and pg_replication_*.  Those all have the same general monitoring nature to them compared to the others that basically provide details regarding schema and static or session configuration.

The original server admin monitoring section can go into detail regarding Cumulative Statistics versus other kinds of monitoring.  We can use section ordering to fulfill logical grouping desires until we are able to make section3 entries appear on their own pages.

David J.

Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 2:59 PM Bruce Momjian <bruce@momjian.us> wrote:
> I assume statistics collector views are in "Monitoring Database
> Activity" because that is their purpose.

Well, yes. :-)

But the point is that all other system views are in a single
section regardless of their purpose. We don't document pg_roles in the
"Database Roles" chapter, for example.

And on the flip side, pg_locks and pg_replication_origin_status are
also for monitoring database activity, but they're in the "System
Views" chapter anyway. The only system views that are in "Monitoring
Database Activity" rather than "System Views" are the ones where the
name starts with "pg_stat_".

So the reason you state is why these views are under "Monitoring
Database Activity" rather than a chapter chosen at random. But it
doesn't really explain why they're separate from the other system
views at all. That seems to be a pretty much random choice, AFAICT.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Bruce Momjian
Date:
On Fri, Mar 22, 2024 at 03:13:29PM -0400, Robert Haas wrote:
> On Fri, Mar 22, 2024 at 2:59 PM Bruce Momjian <bruce@momjian.us> wrote:
> > I assume statistics collector views are in "Monitoring Database
> > Activity" because that is their purpose.
> 
> Well, yes. :-)
> 
> But the point is that all other system views are in a single
> section regardless of their purpose. We don't document pg_roles in the
> "Database Roles" chapter, for example.
> 
> And on the flip side, pg_locks and pg_replication_origin_status are
> also for monitoring database activity, but they're in the "System
> Views" chapter anyway. The only system views that are in "Monitoring
> Database Activity" rather than "System Views" are the ones where the
> name starts with "pg_stat_".
> 
> So the reason you state is why these views are under "Monitoring
> Database Activity" rather than a chapter chosen at random. But it
> doesn't really explain why they're separate from the other system
> views at all. That seems to be a pretty much random choice, AFAICT.

I agree and they should be with the other views.  I was just explaining
why, at the time, I didn't touch them.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 3:17 PM Bruce Momjian <bruce@momjian.us> wrote:
> I agree and they should be with the other views.  I was just explaining
> why, at the time, I didn't touch them.

Ah, OK. That makes total sense.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Peter Eisentraut
Date:
On 22.03.24 14:59, Robert Haas wrote:
> And I don't believe that if someone were writing a physical book about
> PostgreSQL from scratch, they'd ever end up with a top-level chapter
> that looks anything like our GiST chapter. All of the index AM
> chapters are quite obviously clones of each other, and they're all
> quite short. Surely you'd make them sections within a chapter, not
> entire chapters.
> 
> I do agree that PL/pgSQL is more arguable. I can imagine somebody
> writing a book about PostgreSQL and choosing to make that topic into a
> whole chapter.

Yeah, I think there is probably a range of things from pretty obvious
to mostly controversial.



Re: documentation structure

From
Peter Eisentraut
Date:
On 22.03.24 15:10, Robert Haas wrote:
> Sorry. I didn't mean to dispute the point that the section was added a
> few years ago, nor the point that most people just want to read about
> the binaries. I am confident that both of those things are true. What
> I do want to dispute is that having a four-sentence chapter in the
> documentation index that tells people something they can find much
> more easily without using the documentation at all is a good plan.

I think a possible problem we need to consider with these proposals to 
combine chapters is that they could make the chapters themselves too 
deep and harder to navigate.  For example, if we combined the 
installation from source and binaries chapters, the structure of the new 
chapter would presumably be

<chapter> Installation
  <sect1>   Installation from Binaries
  <sect1>   Installation from Source
   <sect2>   Requirements
   <sect2>   Getting the Source
   <sect2>   Building and Installation with Autoconf and Make
   <sect2>   Building and Installation with Meson
etc.

This would mean that the entire "Installation from Source" part would be 
rendered on a single HTML page.

The rendering can be adjusted to some degree, but then we also need to 
make sure any new chunking makes sense in other chapters.  (And it might 
also change a bunch of externally known HTML links.)
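
To make the chunking concern concrete, here is a minimal Python sketch of the rule described above: the chapter gets a page, and so does every section down to a fixed depth. The `chunk_pages` helper and the toy outline are illustrative assumptions, not the DocBook XSL implementation; the model ignores details such as how the first section of a chapter is handled, and the titles simply reuse the structure shown above.

```python
import xml.etree.ElementTree as ET

# Toy outline mirroring the combined chapter sketched above (illustrative only).
OUTLINE = """
<chapter title="Installation">
  <sect1 title="Installation from Binaries"/>
  <sect1 title="Installation from Source">
    <sect2 title="Requirements"/>
    <sect2 title="Getting the Source"/>
    <sect2 title="Building and Installation with Autoconf and Make"/>
    <sect2 title="Building and Installation with Meson"/>
  </sect1>
</chapter>
"""


def chunk_pages(chapter, depth):
    """Return the titles that would get their own HTML page.

    Simplified model: the chapter itself is a page, and so is every
    <sectN> with N <= depth. Deeper sections are rendered inline on
    the page of their nearest chunked ancestor.
    """
    pages = [chapter.get("title")]
    for elem in chapter.iter():
        if elem.tag.startswith("sect") and int(elem.tag[4:]) <= depth:
            pages.append(elem.get("title"))
    return pages


chapter = ET.fromstring(OUTLINE)
pages_depth1 = chunk_pages(chapter, depth=1)  # one page per <sect1>
pages_depth2 = chunk_pages(chapter, depth=2)  # one page per <sect2> as well
print(len(pages_depth1), len(pages_depth2))
```

Under this model, a depth of 1 leaves "Requirements" and its siblings inline on the single "Installation from Source" page, while a depth of 2 splits each of them out, which is the trade-off the paragraphs above describe.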

I think maybe more could also be done at the top-level structure, too. 
Right now, we have <book> -> <part> -> <chapter>.  We could add <set> on 
top of that.

We could also play with CSS or JavaScript to make the top-level table of 
contents more navigable, with collapsing subsections or whatever.

We could also render additional tables of contents or indexes, so there 
is more than one way to navigate into the content from the top.

We could also build better search.




Re: documentation structure

From
Robert Haas
Date:
On Mon, Mar 25, 2024 at 11:40 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> I think a possible problem we need to consider with these proposals to
> combine chapters is that they could make the chapters themselves too
> deep and harder to navigate.  For example, if we combined the
> installation from source and binaries chapters, the structure of the new
> chapter would presumably be

I agree with this in theory, but in practice I think the patches that
I posted don't have this issue to a degree that is problematic, and I
posted some specific proposals on adjustments that we could make to
ameliorate the problem if other people feel differently.

> I think maybe more could also be done at the top-level structure, too.
> Right now, we have <book> -> <part> -> <chapter>.  We could add <set> on
> top of that.
>
> We could also play with CSS or JavaScript to make the top-level table of
> contents more navigable, with collapsing subsections or whatever.
>
> We could also render additional tables of contents or indexes, so there
> is more than one way to navigate into the content from the top.
>
> We could also build better search.

These are all reasonable ideas. I think some better CSS and JavaScript
could definitely help, and I also wondered whether the entrypoint to
the documentation has to be the index page, or whether it could maybe
be a page we've crafted specifically for that purpose, that might
include some text as well as a bunch of links.

But that having been said, I don't believe that any of those ideas (or
anything else we do) will obviate the need for some curation of the
toplevel index. If you're going to add another level, as you propose
in the first point, you still need to make decisions about which
things properly go at which levels. If you're going to allow for
collapsing subsections, you still want the overall tree in which
subsections are expanded and collapsed to make logical sense. If
you have multiple ways to navigate to the content, one of them will
probably still be the index, and it should be good. And good search is
good, but it shouldn't be the only convenient way to find the content.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Robert Haas
Date:
OK, so I'm coming back to this thread after giving it a few days to
cool off. My last series of patches proposed to do five things:

1. Merge the four-sentence "Installation from Binaries" chapter back
into "Installation from Source". I thought this was a slam-dunk, but
Peter pointed out that exactly the opposite of this was done a few
years ago to create the "Installation from Binaries" chapter in the
first place. Based on subsequent discussion, what I'm now inclined to
do is come up with a new proposal that involves moving the information
about compiling from source to an appendix. So never mind about this
one for now.

2. Demote "Monitoring Disk Usage" from a chapter on its own to a
section of the "Monitoring Database Activity" chapter. I haven't seen
any objections to this, and I'd like to move ahead with it.

3. Merge the separate chapters on various built-in index AMs into one.
Peter didn't think this was a good idea, but Tom and Alvaro's comments
focused on how to do it mechanically, and on whether the chapters
needed to be reordered afterwards, which I took to mean that they were
OK with the basic concept. David Johnston was also clearly in favor of
it. So I'd like to move ahead with this one, too.

4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
Managers" chapters, which cover related topics, into a single one. I
didn't see anyone object to this, but David Johnston pointed out that
the patch I posted was a few bricks short of a load, because it really
needed to put some introductory text into the new chapter. I'll study
this a bit more and propose a new patch that does the same thing a bit
more carefully than my previous version did.

5. Consolidate all of the procedural language chapters into one. This
was clearly the most controversial part of the proposal. I'm going to
lay this one aside for now and possibly come back to it at a later
time.

I hope that this way of proceeding makes sense to people.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Robert Haas
Date:
On Fri, Mar 29, 2024 at 9:40 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 2. Demote "Monitoring Disk Usage" from a chapter on its own to a
> section of the "Monitoring Database Activity" chapter. I haven't seen
> any objections to this, and I'd like to move ahead with it.
>
> 3. Merge the separate chapters on various built-in index AMs into one.
> Peter didn't think this was a good idea, but Tom and Alvaro's comments
> focused on how to do it mechanically, and on whether the chapters
> needed to be reordered afterwards, which I took to mean that they were
> OK with the basic concept. David Johnston was also clearly in favor of
> it. So I'd like to move ahead with this one, too.

I committed these two patches.

> 4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
> Managers" chapters, which cover related topics, into a single one. I
> didn't see anyone object to this, but David Johnston pointed out that
> the patch I posted was a few bricks short of a load, because it really
> needed to put some introductory text into the new chapter. I'll study
> this a bit more and propose a new patch that does the same thing a bit
> more carefully than my previous version did.

Here is a new version of this patch. I think this is v18 material at
this point, absent an outcry to the contrary. Sometimes we're flexible
about doc patches.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: documentation structure

From
Robert Haas
Date:
On Mon, Mar 25, 2024 at 11:40 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> I think a possible problem we need to consider with these proposals to
> combine chapters is that they could make the chapters themselves too
> deep and harder to navigate.

I looked into various options for further combining chapters and/or
appendixes and found that this is indeed a huge problem. For example,
I had thought of creating a Developer Information chapter in the
appendix and moving various existing chapters and appendixes inside of
it, but that means that the <sect1> elements in those chapters get
demoted to <sect2>, and what used to be a whole chapter or appendix
becomes a <sect1>. And since you get one HTML page per <sect1>, that
means that instead of a bunch of individual HTML pages of very
pleasant length, you suddenly get one very long HTML page that is,
exactly as you say, hard to navigate.

> The rendering can be adjusted to some degree, but then we also need to
> make sure any new chunking makes sense in other chapters.  (And it might
> also change a bunch of externally known HTML links.)

I looked into this and I'm unclear how much customization is possible.
I gather that the current scheme comes from having chunk.section.depth
of 1, and I guess you can change that to 2 to get an HTML page per
<sect2>, but it seems like it would take a LOT of restructuring to
make that work. It would be much easier if you could vary this across
different parts of the documentation; for instance, if you could say,
well, in this particular chapter or appendix, I want
chunk.section.depth of 2, but elsewhere 1, that would be quite handy,
but after several hours reading various things about DocBook on the
Internet, I was still unable to determine conclusively whether this
was possible. There's an interesting comment in
stylesheet-speedup-xhtml.xsl that says "Since we set a fixed
$chunk.section.depth, we can do away with a bunch of complicated XPath
searches for the previous and next sections at various levels." That
sounds like it's suggesting that it is in fact possible for this
setting to vary, but I don't know if that's true, or how to do it, and
it sounds like there might be performance consequences, too.
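
As a way to picture what a per-chapter chunking depth would mean, here is a small Python model. It does not answer whether the DocBook XSL stylesheets can actually vary chunk.section.depth; that remains the open question above. The `Section` class, the `pages_with_overrides` helper, and all titles other than "Developer Information" are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Section:
    """A node in a toy documentation outline (hypothetical model)."""
    title: str
    children: List["Section"] = field(default_factory=list)


def pages(chapter: Section, depth: int) -> List[str]:
    """Titles that become separate pages when sections down to `depth` are chunked."""
    result = [chapter.title]

    def walk(node: Section, level: int) -> None:
        for child in node.children:
            if level <= depth:
                result.append(child.title)
            walk(child, level + 1)

    walk(chapter, 1)
    return result


def pages_with_overrides(chapters: List[Section], default_depth: int,
                         overrides: Dict[str, int]) -> Dict[str, List[str]]:
    """Apply a default depth, except for chapters named in `overrides`."""
    return {
        ch.title: pages(ch, overrides.get(ch.title, default_depth))
        for ch in chapters
    }


# A merged chapter whose level-1 sections used to be whole chapters.
dev = Section("Developer Information", [
    Section("Former Chapter A", [Section("A.1"), Section("A.2")]),
    Section("Former Chapter B", [Section("B.1")]),
])
other = Section("Some Other Chapter", [Section("S.1", [Section("S.1.1")])])

result = pages_with_overrides([dev, other], default_depth=1,
                              overrides={"Developer Information": 2})
for title, page_titles in result.items():
    print(title, "->", len(page_titles), "pages")
```

In this model the merged chapter keeps one page per former section because its depth is 2, while every other chapter still chunks at depth 1. Whether that can be achieved without the performance cost hinted at in stylesheet-speedup-xhtml.xsl is not something the sketch can show.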

> I think maybe more could also be done at the top-level structure, too.
> Right now, we have <book> -> <part> -> <chapter>.  We could add <set> on
> top of that.

Does this let you create structures of non-uniform depth? i.e. is
there a way that we can group some chapters into sets while leaving
others as standalone chapters, or somesuch?

I'm not 100% confident that non-uniform depth (either via <set> or via
chunk.section.depth or via some other mechanism) is a good idea.
There's a sort of uniformity to our navigation right now that does
have some real appeal. The downside, though, is that if you want
something to be a single HTML page, it's got to either be a chapter
(or appendix) by itself with no sections inside of it, or it's got to
be a <sect1> inside of a chapter, and so anything that's long enough
that it should be an HTML page by itself can never be more than one
level below the index. And that seems to make it quite difficult to
keep the index small.

Without some kind of variable-depth structure, the only other ways
that I can see to improve things are:

1. Make chunk.section.depth 2 and restructure the entire documentation
until the results look reasonable. This might be possible but I bet
it's difficult. We have, at present, chapters of *wildly* varying
length, from a few sentences to many, many pages. That is perhaps a
bad thing; you most likely wouldn't do that in a printed book. But
fixing it is a huge project. We don't necessarily have the same amount
of content about each topic, and there isn't necessarily a way of
grouping related topics together that produces units of relatively
uniform length. I think it's sensible to try to make improvements
where we can, by pushing stuff down that's short and not that
important, but finding our way to a chunk.section.depth=2 world that
feels good to most people compared to what we have today seems like
it's going to be challenging.

2. Replace the current index with a custom index or landing page of
some kind. Or keep the current index and add a new landing page
alongside it. Something that isn't derived automatically from the
documentation structure but is created by hand.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
"David G. Johnston"
Date:
On Fri, Apr 5, 2024 at 9:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

> > The rendering can be adjusted to some degree, but then we also need to
> > make sure any new chunking makes sense in other chapters.  (And it might
> > also change a bunch of externally known HTML links.)
>
> I looked into this and I'm unclear how much customization is possible.


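Here is a link to my attempt at this a couple of years ago.  It basically "abuses" refentry.

https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com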


I never did dive into the man page or PDF dynamics of this particular change but it seemed to solve HTML pagination without negative consequences and with minimal risk of unintended consequences since only the markup on the pages we want to alter is changed, not global configuration.

David J.

Re: documentation structure

From
Robert Haas
Date:
On Fri, Apr 5, 2024 at 12:15 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Here is a link to my attempt at this a couple of years ago.  It basically "abuses" refentry.
>
> https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com
>
> I never did dive into the man page or PDF dynamics of this particular change but it seemed to solve HTML pagination without negative consequences and with minimal risk of unintended consequences since only the markup on the pages we want to alter is changed, not global configuration.

Hmm, but it seems like that might have generated some man page entries
that we don't want?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
"David G. Johnston"
Date:
On Fri, Apr 5, 2024 at 9:18 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 5, 2024 at 12:15 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> > Here is a link to my attempt at this a couple of years ago.  It basically "abuses" refentry.
> >
> > https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com
> >
> > I never did dive into the man page or PDF dynamics of this particular change but it seemed to solve HTML pagination without negative consequences and with minimal risk of unintended consequences since only the markup on the pages we want to alter is changed, not global configuration.
>
> Hmm, but it seems like that might have generated some man page entries
> that we don't want?

If so (didn't check) maybe just remove them in post?

David J.

Re: documentation structure

From
Peter Eisentraut
Date:
On 05.04.24 17:11, Robert Haas wrote:
>> 4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
>> Managers" chapters, which cover related topics, into a single one. I
>> didn't see anyone object to this, but David Johnston pointed out that
>> the patch I posted was a few bricks short of a load, because it really
>> needed to put some introductory text into the new chapter. I'll study
>> this a bit more and propose a new patch that does the same thing a bit
>> more carefully than my previous version did.
> 
> Here is a new version of this patch. I think this is v18 material at
> this point, absent an outcry to the contrary. Sometimes we're flexible
> about doc patches.

Looks good to me.  I think this could go into PG17.



Re: documentation structure

From
jian he
Date:
On Wed, Mar 20, 2024 at 5:40 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> +many for improving the index.
>
> My own pet docs peeve is a purely editorial one: func.sgml is a 30k line beast, and I think there's a good case for splitting out at least the larger chunks of it.
>

I think I successfully reduced func.sgml from 311322 lines to 13167 lines.
(base-commit: 93582974315174d544592185d797a2b44696d1e5)

a patch for this would be unreviewable.
the key gotcha is to move everything from the opening `<sect1>` to the closing
`</sect1>` (both inclusive)
into a new file.
in func.sgml, use `&entity;` to reference the new file.
also update filelist.sgml.

here is how I do it:

I found out that these built html files are the biggest ones:
doc/src/sgml/html/functions-string.html
doc/src/sgml/html/functions-matching.html
doc/src/sgml/html/functions-datetime.html
doc/src/sgml/html/functions-json.html
doc/src/sgml/html/functions-aggregate.html
doc/src/sgml/html/functions-info.html
doc/src/sgml/html/functions-admin.html

so create these new sgml files to hold the corresponding content:
func-string.sgml
func-matching.sgml
func-datetime.sgml
func-json.sgml
func-aggregate.sgml
func-info.sgml
func-admin.sgml

based on the func.sgml structure pattern:
<sect1 id="functions-string">
next section1 line number:
<sect1 id="functions-binarystring">

<sect1 id="functions-matching">
next section1 line number:
<sect1 id="functions-formatting">

<sect1 id="functions-datetime">
next section1 line number:
<sect1 id="functions-enum">

<sect1 id="functions-json">
next section1 line number:
<sect1 id="functions-sequence">

<sect1 id="functions-aggregate">
next section1 line number:
<sect1 id="functions-window">

<sect1 id="functions-info">
next section1 line number:
<sect1 id="functions-admin">

<sect1 id="functions-admin">
next section1 line number:
<sect1 id="functions-trigger">
------------------------------------
step1:   pipe the relevant line ranges into the new sgml files.
(example: lines 2407 to 4177 include all the content corresponding to
functions-string.html)

sed -n '2407,4177 p' func.sgml > func-string.sgml
sed -n '5328,7756 p' func.sgml >  func-matching.sgml
sed -n '8939,11122 p' func.sgml > func-datetime.sgml
sed -n '15498,19348 p' func.sgml > func-json.sgml
sed -n '21479,22896 p' func.sgml > func-aggregate.sgml
sed -n '24257,27896 p' func.sgml > func-info.sgml
sed -n '27898,30579 p' func.sgml > func-admin.sgml

step2:
delete these line ranges in place in func.sgml:
sed --in-place  "2407,4177d ; 5328,7756d ; 8939,11122d ; 15498,19348d ; 21479,22896d ; 24257,27896d ; 27898,30579d" \
    func.sgml
reference: https://unix.stackexchange.com/questions/676210/matching-multiple-ranges-with-sed-range-expressions
           https://www.gnu.org/software/sed/manual/sed.html#Command_002dLine-Options

step3:
put the following lines into the corresponding positions in func.sgml:
(use the structure pattern above to quickly locate each position)

`
&func-string;
&func-matching;
&func-datetime;
&func-json;
&func-aggregate;
&func-info;
&func-admin;
`

step4: update filelist.sgml:
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 3fb0709f..0b78a361 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -18,6 +18,13 @@
 <!ENTITY ddl        SYSTEM "ddl.sgml">
 <!ENTITY dml        SYSTEM "dml.sgml">
 <!ENTITY func       SYSTEM "func.sgml">
+<!ENTITY func-string       SYSTEM "func-string.sgml">
+<!ENTITY func-matching       SYSTEM "func-matching.sgml">
+<!ENTITY func-datetime       SYSTEM "func-datetime.sgml">
+<!ENTITY func-json       SYSTEM "func-json.sgml">
+<!ENTITY func-aggregate       SYSTEM "func-aggregate.sgml">
+<!ENTITY func-info       SYSTEM "func-info.sgml">
+<!ENTITY func-admin       SYSTEM "func-admin.sgml">
 <!ENTITY indices    SYSTEM "indices.sgml">
 <!ENTITY json       SYSTEM "json.sgml">
 <!ENTITY mvcc       SYSTEM "mvcc.sgml">

 doc/src/sgml/filelist.sgml       |     7 +
 doc/src/sgml/func-admin.sgml     |  2682 +++++
 doc/src/sgml/func-aggregate.sgml |  1418 +++
 doc/src/sgml/func-datetime.sgml  |  2184 ++++
 doc/src/sgml/func-info.sgml      |  3640 ++++++
 doc/src/sgml/func-json.sgml      |  3851 ++++++
 doc/src/sgml/func-matching.sgml  |  2429 ++++
 doc/src/sgml/func-string.sgml    |  1771 +++
 doc/src/sgml/func.sgml           | 17979 +----------------------------

we can do it one by one, but it's still worth it.
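
The steps above rely on hard-coded line ranges, which only match one base commit. As a rough sketch of the same idea without line numbers, the following Python locates each block by its `<sect1 id="...">` tag instead. The function names `split_sect1` and `entity_declarations` are made up for illustration, and the sketch assumes that `<sect1>` elements never nest and that the new file names follow the `functions-x` to `func-x` rule used in the list above.

```python
import re


def split_sect1(text, sect_ids):
    """Move selected <sect1> blocks out of `text`.

    Returns (remaining_text, files): `files` maps a new file name to the
    extracted block, and each block is replaced by an entity reference.
    """
    files = {}
    for sect_id in sect_ids:
        # Assumption: <sect1> elements do not nest, so the first closing tag
        # after the opening tag ends the block (both tags inclusive).
        pattern = re.compile(
            r'[ \t]*<sect1 id="' + re.escape(sect_id) + r'"[^>]*>.*?</sect1>[ \t]*\n?',
            re.DOTALL,
        )
        match = pattern.search(text)
        if match is None:
            raise ValueError("no <sect1> with id %r" % sect_id)
        # Naming rule from the file list above: functions-string -> func-string
        entity = sect_id.replace("functions-", "func-", 1)
        files[entity + ".sgml"] = match.group(0)
        text = text[:match.start()] + "&" + entity + ";\n" + text[match.end():]
    return text, files


def entity_declarations(files):
    """Build the filelist.sgml lines for the new files."""
    return [
        '<!ENTITY %s SYSTEM "%s">' % (name[:-len(".sgml")], name)
        for name in sorted(files)
    ]
```

Calling `split_sect1(open("func.sgml").read(), ["functions-string", "functions-json"])` would return the trimmed func.sgml text plus the contents for func-string.sgml and func-json.sgml. Writing the files back out is left to the caller.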



Re: documentation structure

From
Robert Haas
Date:
On Mon, Apr 8, 2024 at 10:15 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> > Here is a new version of this patch. I think this is v18 material at
> > this point, absent an outcry to the contrary. Sometimes we're flexible
> > about doc patches.
>
> Looks good to me.  I think this could go into PG17.

Hearing no objections, done.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: documentation structure

From
Andres Freund
Date:
Hi,

On 2024-03-19 17:39:39 -0400, Andrew Dunstan wrote:
> My own pet docs peeve is a purely editorial one: func.sgml is a 30k line
> beast, and I think there's a good case for splitting out at least the
> larger chunks of it.

I think we should work on generating a lot of func.sgml.  Particularly the
signature etc should just come from pg_proc.dat, it's pointlessly painful to
generate that by hand. And for a lot of the functions we should probably move
the existing func.sgml comments to the description in pg_proc.dat.

I suspect that we can't just generate all the documentation from pg_proc,
because of xrefs etc.  Although perhaps we could just strip those out for
pg_proc.

We'd need to add some more metadata to pg_proc, for grouping kinds of
functions together. But that seems doable.

Greetings,

Andres Freund



Re: documentation structure

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I think we should work on generating a lot of func.sgml.  Particularly the
> signature etc should just come from pg_proc.dat, it's pointlessly painful to
> generate that by hand. And for a lot of the functions we should probably move
> the existing func.sgml comments to the description in pg_proc.dat.

Where are you going to get the examples and text descriptions from?
(And no, I don't agree that the pg_description string should match
what's in the docs.  The description string has to be a short
one-liner in just about every case.)

This sounds to me like it would be a painful exercise with not a
lot of benefit in the end.

I do agree with Andrew that splitting func.sgml into multiple files
would be beneficial.

            regards, tom lane



Re: documentation structure

From
Bruce Momjian
Date:
On Tue, Apr 16, 2024 at 03:05:32PM -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think we should work on generating a lot of func.sgml.  Particularly the
> > signature etc should just come from pg_proc.dat, it's pointlessly painful to
> > generate that by hand. And for a lot of the functions we should probably move
> > the existing func.sgml comments to the description in pg_proc.dat.
> 
> Where are you going to get the examples and text descriptions from?
> (And no, I don't agree that the pg_description string should match
> what's in the docs.  The description string has to be a short
> one-liner in just about every case.)
> 
> This sounds to me like it would be a painful exercise with not a
> lot of benefit in the end.

Maybe we could _verify_ the contents of func.sgml against pg_proc.
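
A minimal sketch of what such a check might look like, assuming it only compares function names and ignores signatures, operators, and overloads. The regexes are guesses based on the func.sgml and pg_proc.dat fragments quoted in this thread, and `compare_names` is a hypothetical helper, not an existing tool.

```python
import re

# First <function> name inside each func_signature paragraph of func.sgml.
FUNC_SIGNATURE = re.compile(
    r'<para role="func_signature">.*?<function>([^<]+)</function>',
    re.DOTALL,
)
# proname => '...' fields as they appear in pg_proc.dat entries.
PRONAME = re.compile(r"proname\s*=>\s*'([^']+)'")


def compare_names(func_sgml, pg_proc_dat):
    """Report names present in one source but missing from the other."""
    documented = set(FUNC_SIGNATURE.findall(func_sgml))
    catalogued = set(PRONAME.findall(pg_proc_dat))
    return {
        "undocumented": sorted(catalogued - documented),
        "unknown": sorted(documented - catalogued),
    }
```

A real check would also have to cope with internal functions that are intentionally undocumented, so the "undocumented" list is a starting point for review rather than a list of errors.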

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: documentation structure

From
Andres Freund
Date:
Hi,

On 2024-04-16 15:05:32 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think we should work on generating a lot of func.sgml.  Particularly the
> > signature etc should just come from pg_proc.dat, it's pointlessly painful to
> > generate that by hand. And for a lot of the functions we should probably move
> > the existing func.sgml comments to the description in pg_proc.dat.
>
> Where are you going to get the examples and text descriptions from?

I think there's a few different ways to do that. E.g. having long_desc, example
fields in pg_proc.dat. Or having examples and description in a separate file
and "enriching" that with auto-generated function signatures.


> (And no, I don't agree that the pg_description string should match
> what's in the docs.  The description string has to be a short
> one-liner in just about every case.)

Definitely shouldn't be the same in all cases, but I think there's a decent
number of cases where they can be the same. The differences between the two are
often minimal today.

Entirely randomly chosen example:

{ oid => '2825',
  descr => 'slope of the least-squares-fit linear equation determined by the (X, Y) pairs',
  proname => 'regr_slope', prokind => 'a', proisstrict => 'f',
  prorettype => 'float8', proargtypes => 'float8 float8',
  prosrc => 'aggregate_dummy' },

and

      <row>
       <entry role="func_table_entry"><para role="func_signature">
        <indexterm>
         <primary>regression slope</primary>
        </indexterm>
        <indexterm>
         <primary>regr_slope</primary>
        </indexterm>
        <function>regr_slope</function> ( <parameter>Y</parameter> <type>double precision</type>, <parameter>X</parameter> <type>double precision</type> )
        <returnvalue>double precision</returnvalue>
       </para>
       <para>
        Computes the slope of the least-squares-fit linear equation determined
        by the (<parameter>X</parameter>, <parameter>Y</parameter>)
        pairs.
       </para></entry>
       <entry>Yes</entry>
      </row>


The description is quite similar, the pg_proc entry lacks argument names. 
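
To make the idea concrete, here is a rough sketch of generating the signature markup from a pg_proc.dat-style entry. It is only an illustration: `render_signature` and `DOC_TYPE_NAMES` are hypothetical, the argument names have to be passed in separately because the entry above has none, and the type-name mapping lists only what this example needs.

```python
from xml.sax.saxutils import escape

# Assumed mapping from catalog type names to the names used in the docs;
# only the entry needed for this example is listed.
DOC_TYPE_NAMES = {"float8": "double precision"}


def doc_type(name):
    return DOC_TYPE_NAMES.get(name, name)


def render_signature(entry, argnames):
    """Render a func_signature-style fragment for one catalog entry."""
    argtypes = entry["proargtypes"].split()
    if len(argnames) != len(argtypes):
        raise ValueError("expected %d argument names" % len(argtypes))
    args = ", ".join(
        "<parameter>%s</parameter> <type>%s</type>" % (escape(n), escape(doc_type(t)))
        for n, t in zip(argnames, argtypes)
    )
    inner = " %s " % args if args else " "
    return "<function>%s</function> (%s)\n<returnvalue>%s</returnvalue>" % (
        escape(entry["proname"]),
        inner,
        escape(doc_type(entry["prorettype"])),
    )
```

With the regr_slope entry and the names `["Y", "X"]`, this yields the same signature line and return value as the hand-written markup, minus the index terms.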


> This sounds to me like it would be a painful exercise with not a
> lot of benefit in the end.

I think the manual work for writing signatures in sgml is not insignificant,
nor is the volume of sgml for them. Manually maintaining the signatures makes
it impractical to significantly improve the presentation - which I don't think
is all that great today.

And the lack of argument names in the pg_proc entries is occasionally fairly
annoying, because a \df+ doesn't provide enough information to use functions.

It'd also be quite useful if clients could render more of the documentation
for functions. People are used to language servers providing full
documentation for functions etc...

Greetings,

Andres Freund



Re: documentation structure

From
Corey Huinker
Date:
> > This sounds to me like it would be a painful exercise with not a
> > lot of benefit in the end.
>
> Maybe we could _verify_ the contents of func.sgml against pg_proc.

All of the functions redefined in catalog/system_functions.sql complicate using pg_proc.dat as a doc generator or source of validation. We'd probably do better to validate against a live instance, and even then the benefit wouldn't be great.

Re: documentation structure

From
Dagfinn Ilmari Mannsåker
Date:
Andres Freund <andres@anarazel.de> writes:

> Definitely shouldn't be the same in all cases, but I think there's a decent
> number of cases where they can be the same. The differences between the two are
> often minimal today.
>
> Entirely randomly chosen example:
>
> { oid => '2825',
>   descr => 'slope of the least-squares-fit linear equation determined by the (X, Y) pairs',
>   proname => 'regr_slope', prokind => 'a', proisstrict => 'f',
>   prorettype => 'float8', proargtypes => 'float8 float8',
>   prosrc => 'aggregate_dummy' },
>
> and
>
>       <row>
>        <entry role="func_table_entry"><para role="func_signature">
>         <indexterm>
>          <primary>regression slope</primary>
>         </indexterm>
>         <indexterm>
>          <primary>regr_slope</primary>
>         </indexterm>
>         <function>regr_slope</function> ( <parameter>Y</parameter> <type>double precision</type>, <parameter>X</parameter> <type>double precision</type> )
>         <returnvalue>double precision</returnvalue>
>        </para>
>        <para>
>         Computes the slope of the least-squares-fit linear equation determined
>         by the (<parameter>X</parameter>, <parameter>Y</parameter>)
>         pairs.
>        </para></entry>
>        <entry>Yes</entry>
>       </row>
>
>
> The description is quite similar, the pg_proc entry lacks argument names. 
>
>
>> This sounds to me like it would be a painful exercise with not a
>> lot of benefit in the end.
>
> I think the manual work for writing signatures in sgml is not insignificant,
> nor is the volume of sgml for them. Manually maintaining the signatures makes
> it impractical to significantly improve the presentation - which I don't think
> is all that great today.

And it's very inconsistent.  For example, some functions use <optional>
tags for optional parameters, others use square brackets, and some use
<literal>VARIADIC</literal> to indicate variadic parameters, others use
ellipses (sometimes in <optional> tags or brackets).
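
As an illustration of how these styles could be told apart mechanically, here is a small sketch. The helper `optional_styles` and its style labels are made up for this example, and the patterns are simple text matches rather than a real SGML parse.

```python
import re


def optional_styles(signature):
    """Return the set of optional/variadic markup styles used in a signature."""
    styles = set()
    if "<optional>" in signature:
        styles.add("optional-tag")
    # "[," that is not the start of "[, ...]" opens a bracketed optional parameter.
    if re.search(r"\[,(?!\s*\.\.\.)", signature):
        styles.add("square-brackets")
    if "VARIADIC" in signature:
        styles.add("variadic-keyword")
    # "[, ...]" marks a repeated (variadic) argument.
    if re.search(r"\[,\s*\.\.\.\s*\]", signature):
        styles.add("ellipsis")
    return styles
```

Running this over every func_signature paragraph and counting the resulting sets would give a quick picture of how widespread each style is.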

> And the lack of argument names in the pg_proc entries is occasionally fairly
> annoying, because a \df+ doesn't provide enough information to use functions.

I was also annoyed by this the other day (specifically wrt. the boolean
arguments to pg_ls_dir), and started whipping up a Perl script to parse
func.sgml and generate missing proargnames values for pg_proc.dat, which
is how I discovered the above.  The script currently has a pile of hacky
regexes to cope with that, so I'd be happy to submit a doc patch to turn
it into actual markup to get rid of that, if people think that's a
worthwhile use of time and won't clash with any other plans for the
documentation.
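
For readers who want a feel for what such a script does, here is a very small Python sketch (the real script is Perl and is not shown in this thread). It assumes every argument is marked up as a `<parameter>` immediately followed by a `<type>`, and that pg_proc.dat spells the field as `proargnames => '{a,b}'`. The helper name `proargnames_field` is made up.

```python
import re

# A parameter name immediately followed by its type.
PARAMETER = re.compile(r"<parameter>([^<]+)</parameter>\s*<type>[^<]+</type>")


def proargnames_field(signature_sgml):
    """Build a proargnames field from a func_signature fragment, or None."""
    names = PARAMETER.findall(signature_sgml)
    if not names:
        return None
    return "proargnames => '{%s}'" % ",".join(names)
```

Signatures that use brackets, ellipses, or VARIADIC in inconsistent ways are where a simple pattern like this breaks down, which is what motivates normalizing the markup first.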

> It'd also be quite useful if clients could render more of the documentation
> for functions. People are used to language servers providing full
> documentation for functions etc...

A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
with \h for commands?) would certainly be nice.

> Greetings,
>
> Andres Freund

- ilmari



Re: documentation structure

From
Corey Huinker
Date:
> And it's very inconsistent.  For example, some functions use <optional>
> tags for optional parameters, others use square brackets, and some use
> <literal>VARIADIC</literal> to indicate variadic parameters, others use
> ellipses (sometimes in <optional> tags or brackets).

Having just written a couple of those functions, I wasn't able to find any guidance on how to document them with regards to <optional> vs [], etc. Having such a thing would be helpful.

While we're throwing out ideas, does it make sense to have function parameters and return values be things that can accept COMMENTs? Like so:

COMMENT ON FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] ARGUMENT argname IS '....';
COMMENT ON FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] RETURN VALUE IS '....';

I don't think this is a great idea, but if we're going to auto-generate documentation then we've got to store the metadata somewhere, and pg_proc.dat is already lacking relevant details.

Re: documentation structure

From
Andres Freund
Date:
Hi,

On 2024-04-17 02:46:53 -0400, Corey Huinker wrote:
> > > This sounds to me like it would be a painful exercise with not a
> > > lot of benefit in the end.
> >
> > Maybe we could _verify_ the contents of func.sgml against pg_proc.
> >
> 
> All of the functions redefined in catalog/system_functions.sql complicate
> using pg_proc.dat as a doc generator or source of validation. We'd probably
> do better to validate against a live instance, and even then the benefit
> wouldn't be great.

There are 80 'CREATE OR REPLACE's in system_functions.sql, 1016 occurrences of
func_table_entry in func.sgml and 3.3k functions in pg_proc. I'm not saying
that differences due to system_functions.sql wouldn't be annoying to deal
with, but it'd also be far from the end of the world.

Greetings,

Andres Freund



Re: documentation structure

From
Andres Freund
Date:
Hi,

On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think the manual work for writing signatures in sgml is not insignificant,
> > nor is the volume of sgml for them. Manually maintaining the signatures makes
> > it impractical to significantly improve the presentation - which I don't think
> > is all that great today.
> 
> And it's very inconsistent.  For example, some functions use <optional>
> tags for optional parameters, others use square brackets, and some use
> <literal>VARIADIC</literal> to indicate variadic parameters, others use
> ellipses (sometimes in <optional> tags or brackets).

That seems almost inevitably the outcome of many people having to manually
infer the recommended semantics, for writing something boring but nontrivial,
from a 30k line file.


> > And the lack of argument names in the pg_proc entries is occasionally fairly
> > annoying, because a \df+ doesn't provide enough information to use functions.
> 
> I was also annoyed by this the other day (specifically wrt. the boolean
> arguments to pg_ls_dir),

My bane is regexp_match et al, I have given up on remembering the argument
order.


> and started whipping up a Perl script to parse func.sgml and generate
> missing proargnames values for pg_proc.dat, which is how I discovered the
> above.

Nice.


> The script currently has a pile of hacky regexes to cope with that,
> so I'd be happy to submit a doc patch to turn it into actual markup to get
> rid of that, if people think that's a worthwhile use of time and won't clash
> with any other plans for the documentation.

I guess it's a bit hard to say without knowing how voluminous the changes
would be. If we end up rewriting the whole file the tradeoff is less clear
than if it's a dozen inconsistent entries.


> > It'd also be quite useful if clients could render more of the documentation
> > for functions. People are used to language servers providing full
> > documentation for functions etc...
> 
> A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
> with \h for commands?) would certainly be nice.

Indeed.

Greetings,

Andres Freund



Re: documentation structure

From
Dagfinn Ilmari Mannsåker
Date:
Andres Freund <andres@anarazel.de> writes:

> Hi,
>
> On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Andres Freund <andres@anarazel.de> writes:
>> > I think the manual work for writing signatures in sgml is not insignificant,
>> > nor is the volume of sgml for them. Manually maintaining the signatures makes
>> > it impractical to significantly improve the presentation - which I don't think
>> > is all that great today.
>> 
>> And it's very inconsistent.  For example, some functions use <optional>
>> tags for optional parameters, others use square brackets, and some use
>> <literal>VARIADIC</literal> to indicate variadic parameters, others use
>> ellipses (sometimes in <optional> tags or brackets).
>
> That seems almost inevitably the outcome of many people having to manually
> infer the recommended semantics, for writing something boring but nontrivial,
> from a 30k line file.

As Corey mentioned elsethread, having a markup style guide (maybe a
comment at the top of the file?) would be nice.

>> > And the lack of argument names in the pg_proc entries is occasionally fairly
>> > annoying, because a \df+ doesn't provide enough information to use functions.
>> 
>> I was also annoyed by this the other day (specifically wrt. the boolean
>> arguments to pg_ls_dir),
>
> My bane is regexp_match et al, I have given up on remembering the argument
> order.

There's a thread elsewhere about those specifically, but I can't be
bothered to find the link right now.

>> and started whipping up a Perl script to parse func.sgml and generate
>> missing proargnames values for pg_proc.dat, which is how I discovered the
>> above.
>
> Nice.
>
>> The script currently has a pile of hacky regexes to cope with that,
>> so I'd be happy to submit a doc patch to turn it into actual markup to get
>> rid of that, if people think that's a worthwhile use of time and won't clash
>> with any other plans for the documentation.
>
> I guess it's a bit hard to say without knowing how voluminous the changes
> would be. If we end up rewriting the whole file the tradeoff is less clear
> than if it's a dozen inconsistent entries.

It turned out to not be that many that used [] for optional parameters,
see the attached patch. 
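
The kind of rewrite the patch performs can be sketched roughly as follows. This is not how the patch was produced; `brackets_to_optional` is a hypothetical helper that treats `[,` as the start of an optional parameter and leaves `[, ...]` and array types such as `text[]` untouched.

```python
import re

# Tokens of interest, longest first: "[, ...]", "[]", "[," and "]".
TOKEN = re.compile(r"\[,\s*\.\.\.\s*\]|\[\]|\[,|\]")


def brackets_to_optional(signature):
    """Rewrite "[, param ]" groups as "<optional>, param </optional>"."""
    out = []
    depth = 0  # number of currently open "[," groups
    pos = 0
    for match in TOKEN.finditer(signature):
        out.append(signature[pos:match.start()])
        token = match.group(0)
        if token == "[,":
            depth += 1
            out.append("<optional>,")
        elif token == "]" and depth > 0:
            depth -= 1
            out.append("</optional>")
        else:
            # "[, ...]", "[]" in array types, or an unmatched "]": keep as-is.
            out.append(token)
        pos = match.end()
    out.append(signature[pos:])
    return "".join(out)
```

Nested groups come out as nested `<optional>` elements, matching the shape of the regexp_count and regexp_instr hunks in the patch below.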

I haven't dealt with variadic yet, since the two styles are visually
different, not just markup (<optional>...</optional> renders as [...]).

The two styles for variadic are what I call caller-style:

   concat ( val1 "any" [, val2 "any" [, ...] ] )
   format(formatstr text [, formatarg "any" [, ...] ])

which shows more clearly how you'd call it, versus definition-style:

   num_nonnulls ( VARIADIC "any" )
   jsonb_extract_path ( from_json jsonb, VARIADIC path_elems text[] )

which matches the CREATE FUNCTION statement.  I don't have a strong
opinion on which we should use, but we should be consistent.

> Greetings,
>
> Andres Freund

- ilmari

From f71e0669eb25b205bd5065f15657ba6d749261f3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Wed, 17 Apr 2024 16:00:52 +0100
Subject: [PATCH] func.sgml: Consistently use <optional> to indicate optional
 parameters

Some functions were using square brackets instead.
---
 doc/src/sgml/func.sgml | 54 +++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8dfb42ad4d..afaaf61d69 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3036,7 +3036,7 @@
          <primary>concat</primary>
         </indexterm>
         <function>concat</function> ( <parameter>val1</parameter> <type>"any"</type>
-         [, <parameter>val2</parameter> <type>"any"</type> [, ...] ] )
+         <optional>, <parameter>val2</parameter> <type>"any"</type> [, ...] </optional> )
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3056,7 +3056,7 @@
         </indexterm>
         <function>concat_ws</function> ( <parameter>sep</parameter> <type>text</type>,
         <parameter>val1</parameter> <type>"any"</type>
-        [, <parameter>val2</parameter> <type>"any"</type> [, ...] ] )
+        <optional>, <parameter>val2</parameter> <type>"any"</type> [, ...] </optional> )
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3076,7 +3076,7 @@
          <primary>format</primary>
         </indexterm>
         <function>format</function> ( <parameter>formatstr</parameter> <type>text</type>
-        [, <parameter>formatarg</parameter> <type>"any"</type> [, ...] ] )
+        <optional>, <parameter>formatarg</parameter> <type>"any"</type> [, ...] </optional> )
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3170,7 +3170,7 @@
          <primary>parse_ident</primary>
         </indexterm>
         <function>parse_ident</function> ( <parameter>qualified_identifier</parameter> <type>text</type>
-        [, <parameter>strict_mode</parameter> <type>boolean</type> <literal>DEFAULT</literal> <literal>true</literal> ] )
+        <optional>, <parameter>strict_mode</parameter> <type>boolean</type> <literal>DEFAULT</literal> <literal>true</literal> </optional> )
         <returnvalue>text[]</returnvalue>
        </para>
        <para>
@@ -3309,8 +3309,8 @@
          <primary>regexp_count</primary>
         </indexterm>
         <function>regexp_count</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
-         [, <parameter>start</parameter> <type>integer</type>
-         [, <parameter>flags</parameter> <type>text</type> ] ] )
+         <optional>, <parameter>start</parameter> <type>integer</type>
+         <optional>, <parameter>flags</parameter> <type>text</type> </optional> </optional> )
         <returnvalue>integer</returnvalue>
        </para>
        <para>
@@ -3331,11 +3331,11 @@
          <primary>regexp_instr</primary>
         </indexterm>
         <function>regexp_instr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
-         [, <parameter>start</parameter> <type>integer</type>
-         [, <parameter>N</parameter> <type>integer</type>
-         [, <parameter>endoption</parameter> <type>integer</type>
-         [, <parameter>flags</parameter> <type>text</type>
-         [, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] ] )
+         <optional>, <parameter>start</parameter> <type>integer</type>
+         <optional>, <parameter>N</parameter> <type>integer</type>
+         <optional>, <parameter>endoption</parameter> <type>integer</type>
+         <optional>, <parameter>flags</parameter> <type>text</type>
+         <optional>, <parameter>subexpr</parameter> <type>integer</type> </optional> </optional> </optional> </optional> </optional> )
         <returnvalue>integer</returnvalue>
        </para>
        <para>
@@ -3360,7 +3360,7 @@
          <primary>regexp_like</primary>
         </indexterm>
         <function>regexp_like</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
-         [, <parameter>flags</parameter> <type>text</type> ] )
+         <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
@@ -3380,7 +3380,7 @@
         <indexterm>
          <primary>regexp_match</primary>
         </indexterm>
-        <function>regexp_match</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+        <function>regexp_match</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>text[]</returnvalue>
        </para>
        <para>
@@ -3400,7 +3400,7 @@
         <indexterm>
          <primary>regexp_matches</primary>
         </indexterm>
-        <function>regexp_matches</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+        <function>regexp_matches</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>setof text[]</returnvalue>
        </para>
        <para>
@@ -3426,8 +3426,8 @@
          <primary>regexp_replace</primary>
         </indexterm>
         <function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type>
-         [, <parameter>start</parameter> <type>integer</type> ]
-         [, <parameter>flags</parameter> <type>text</type> ] )
+         <optional>, <parameter>start</parameter> <type>integer</type> </optional>
+         <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3447,7 +3447,7 @@
         <function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type>,
          <parameter>start</parameter> <type>integer</type>,
          <parameter>N</parameter> <type>integer</type>
-         [, <parameter>flags</parameter> <type>text</type> ] )
+         <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3467,7 +3467,7 @@
         <indexterm>
          <primary>regexp_split_to_array</primary>
         </indexterm>
-        <function>regexp_split_to_array</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+        <function>regexp_split_to_array</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>text[]</returnvalue>
        </para>
        <para>
@@ -3486,7 +3486,7 @@
         <indexterm>
          <primary>regexp_split_to_table</primary>
         </indexterm>
-        <function>regexp_split_to_table</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+        <function>regexp_split_to_table</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> <optional>, <parameter>flags</parameter> <type>text</type> </optional> )
         <returnvalue>setof text</returnvalue>
        </para>
        <para>
@@ -3510,10 +3510,10 @@
          <primary>regexp_substr</primary>
         </indexterm>
         <function>regexp_substr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
-         [, <parameter>start</parameter> <type>integer</type>
-         [, <parameter>N</parameter> <type>integer</type>
-         [, <parameter>flags</parameter> <type>text</type>
-         [, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] )
+         <optional>, <parameter>start</parameter> <type>integer</type>
+         <optional>, <parameter>N</parameter> <type>integer</type>
+         <optional>, <parameter>flags</parameter> <type>text</type>
+         <optional>, <parameter>subexpr</parameter> <type>integer</type> </optional> </optional> </optional>
</optional>)
 
         <returnvalue>text</returnvalue>
        </para>
        <para>
@@ -3980,7 +3980,7 @@
 
     <para>
 <synopsis>
-<function>format</function>(<parameter>formatstr</parameter> <type>text</type> [, <parameter>formatarg</parameter>
<type>"any"</type>[, ...] ])
 
+<function>format</function>(<parameter>formatstr</parameter> <type>text</type> <optional>,
<parameter>formatarg</parameter><type>"any"</type> [, ...] </optional>)
 
 </synopsis>
      <parameter>formatstr</parameter> is a format string that specifies how the
      result should be formatted.  Text in the format string is copied
@@ -10568,7 +10568,7 @@
 
    <para>
 <synopsis>
-date_trunc(<replaceable>field</replaceable>, <replaceable>source</replaceable> [, <replaceable>time_zone</replaceable>
])
+date_trunc(<replaceable>field</replaceable>, <replaceable>source</replaceable> <optional>,
<replaceable>time_zone</replaceable></optional>)
 
 </synopsis>
     <replaceable>source</replaceable> is a value expression of type
     <type>timestamp</type>, <type>timestamp with time zone</type>,
@@ -29308,11 +29308,11 @@
         <indexterm>
          <primary>pg_logical_emit_message</primary>
         </indexterm>
-        <function>pg_logical_emit_message</function> ( <parameter>transactional</parameter> <type>boolean</type>,
<parameter>prefix</parameter><type>text</type>, <parameter>content</parameter> <type>text</type> [,
<parameter>flush</parameter><type>boolean</type> <literal>DEFAULT</literal> <literal>false</literal>] )
 
+        <function>pg_logical_emit_message</function> ( <parameter>transactional</parameter> <type>boolean</type>,
<parameter>prefix</parameter><type>text</type>, <parameter>content</parameter> <type>text</type> <optional>,
<parameter>flush</parameter><type>boolean</type> <literal>DEFAULT</literal> <literal>false</literal></optional> )
 
         <returnvalue>pg_lsn</returnvalue>
        </para>
        <para role="func_signature">
-        <function>pg_logical_emit_message</function> ( <parameter>transactional</parameter> <type>boolean</type>,
<parameter>prefix</parameter><type>text</type>, <parameter>content</parameter> <type>bytea</type> [,
<parameter>flush</parameter><type>boolean</type> <literal>DEFAULT</literal> <literal>false</literal>] )
 
+        <function>pg_logical_emit_message</function> ( <parameter>transactional</parameter> <type>boolean</type>,
<parameter>prefix</parameter><type>text</type>, <parameter>content</parameter> <type>bytea</type> <optional>,
<parameter>flush</parameter><type>boolean</type> <literal>DEFAULT</literal> <literal>false</literal></optional> )
 
         <returnvalue>pg_lsn</returnvalue>
        </para>
        <para>
-- 
2.39.2
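
For anyone wanting to apply the same bracket-to-tag change to other entries, here is a minimal Python sketch of the transformation the patch above makes by hand. It is not the tool used to produce the patch, and the name brackets_to_optional is made up for this illustration. It assumes every '[' other than an empty '[]' array suffix opens an optional group, so unlike the patch (which leaves the '[, ...]' in format() alone) it converts every bracket pair it finds.

```python
def brackets_to_optional(signature: str) -> str:
    """Rewrite '[ ... ]' optional groups as <optional> ... </optional>.

    Hypothetical helper: an empty '[]' (array type suffix, e.g. text[])
    is copied through unchanged; every other bracket pair is converted.
    """
    out = []
    depth = 0  # number of currently open optional groups
    i = 0
    while i < len(signature):
        if signature.startswith("[]", i):
            out.append("[]")
            i += 2
        elif signature[i] == "[":
            out.append("<optional>")
            depth += 1
            i += 1
        elif signature[i] == "]":
            if depth == 0:
                raise ValueError("unmatched ']'")
            out.append("</optional>")
            depth -= 1
            i += 1
        else:
            out.append(signature[i])
            i += 1
    if depth:
        raise ValueError("unmatched '['")
    return "".join(out)


old = "<parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )"
print(brackets_to_optional(old))
```

The result would still need the manual HTML check mentioned later in the thread, since the sketch knows nothing about brackets that are meant literally.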


Re: documentation structure

From
jian he
Date:
On Thu, Apr 18, 2024 at 2:37 AM Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
>
> > Hi,
> >
> > On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >> > I think the manual work for writing signatures in sgml is not insignificant,
> >> > nor is the volume of sgml for them. Manually maintaining the signatures makes
> >> > it impractical to significantly improve the presentation - which I don't think
> >> > is all that great today.
> >>
> >> And it's very inconsistent.  For example, some functions use <optional>
> >> tags for optional parameters, others use square brackets, and some use
> >> <literal>VARIADIC</literal> to indicate variadic parameters, others use
> >> ellipses (sometimes in <optional> tags or brackets).
> >
> > That seems almost inevitably the outcome of many people having to manually
> > infer the recommended semantics, for writing something boring but nontrivial,
> > from a 30k line file.
>
> As Corey mentioned elsethread, having a markup style guide (maybe a
> comment at the top of the file?) would be nice.
>
> >> > And the lack of argument names in the pg_proc entries is occasionally fairly
> >> > annoying, because a \df+ doesn't provide enough information to use functions.
> >>
> >> I was also annoyed by this the other day (specifically wrt. the boolean
> >> arguments to pg_ls_dir),
> >
> > My bane is regexp_match et al, I have given up on remembering the argument
> > order.
>
> There's a thread elsewhere about those specifically, but I can't be
> bothered to find the link right now.
>
> >> and started whipping up a Perl script to parse func.sgml and generate
> >> missing proargnames values for pg_proc.dat, which is how I discovered the
> >> above.
> >
> > Nice.
> >
> >> The script currently has a pile of hacky regexes to cope with that,
> >> so I'd be happy to submit a doc patch to turn it into actual markup to get
> >> rid of that, if people think that's a worthwhile use of time and won't clash
> >> with any other plans for the documentation.
> >
> > I guess it's a bit hard to say without knowing how voluminous the changes
> > would be. If we end up rewriting the whole file the tradeoff is less clear
> > than if it's a dozen inconsistent entries.
>
> It turned out to not be that many that used [] for optional parameters,
> see the attached patch.
>

Hi.
I manually checked the HTML output. It looks good to me.
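
As an aside on the proargnames idea in the quoted text: once the signatures use consistent markup, pulling the argument names out becomes simple. The following is a rough Python sketch, not the Perl script mentioned above. The function name proargnames_entry is hypothetical, and the "proargnames => '{...}'" output shape is an assumption about how such a value would be written in pg_proc.dat.

```python
import re

# Matches the name inside each <parameter>...</parameter> element.
PARAM_RE = re.compile(r"<parameter>([^<]+)</parameter>")


def proargnames_entry(signature_markup: str) -> str:
    """Collect parameter names, in order, from one signature's markup."""
    names = PARAM_RE.findall(signature_markup)
    # Assumed output format; adjust to whatever the catalog data file expects.
    return "proargnames => '{" + ",".join(names) + "}'"


markup = (
    "<function>regexp_split_to_array</function> ( "
    "<parameter>string</parameter> <type>text</type>, "
    "<parameter>pattern</parameter> <type>text</type> "
    "<optional>, <parameter>flags</parameter> <type>text</type> </optional> )"
)
print(proargnames_entry(markup))
```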



Re: documentation structure

From
Corey Huinker
Date:
> I haven't dealt with variadic yet, since the two styles are visually
> different, not just markup (<optional>...</optional> renders as [...]).
>
> The two styles for variadic are what I call caller-style:
>
>    concat ( val1 "any" [, val2 "any" [, ...] ] )
>    format(formatstr text [, formatarg "any" [, ...] ])

While this style is obviously clumsier for us to compose, it does avoid relying on the user understanding what the word variadic means. Searching through online documentation of the Python *args parameter, the word variadic never comes up; the closest they get is "variable length argument". I realize that Python is not SQL, but I think it's a good point of reference for what concepts the average reader is likely to know.

Looking at the patch, I think it is good, though I'd consider doing some indentation for the nested <optional>s to allow the author to do more visual tag-matching. The ']'s were sufficiently visually distinct that we didn't really need or want nesting, but <optional> is just another tag to my eyes in a sea of tags.
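
To illustrate the tag-matching concern, here is a small Python sketch of a check that could stand in for eyeballing the nesting. It is only an illustration under the assumption that <optional> tags appear literally, without attributes; the name max_optional_depth is made up.

```python
import re

# Finds opening and closing <optional> tags in document order.
OPTIONAL_TAG_RE = re.compile(r"</?optional>")


def max_optional_depth(markup: str) -> int:
    """Return the deepest <optional> nesting; raise if tags do not balance."""
    depth = 0
    deepest = 0
    for tag in OPTIONAL_TAG_RE.findall(markup):
        if tag == "<optional>":
            depth += 1
            deepest = max(deepest, depth)
        else:
            depth -= 1
            if depth < 0:
                raise ValueError("</optional> without a matching <optional>")
    if depth:
        raise ValueError(f"{depth} unclosed <optional> tag(s)")
    return deepest


sample = "a <optional>, b <optional>, c </optional> </optional>"
print(max_optional_depth(sample))
```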

Re: documentation structure

From
Dagfinn Ilmari Mannsåker
Date:
Corey Huinker <corey.huinker@gmail.com> writes:

>>
>> I haven't dealt with variadic yet, since the two styles are visually
>> different, not just markup (<optional>...</optional> renders as [...]).
>>
>> The two styles for variadic are what I call caller-style:
>>
>>    concat ( val1 "any" [, val2 "any" [, ...] ] )
>>    format(formatstr text [, formatarg "any" [, ...] ])
>>
>
> While this style is obviously clumsier for us to compose, it does avoid
> relying on the user understanding what the word variadic means. Searching
> through online documentation of the Python *args parameter, the word
> variadic never comes up; the closest they get is "variable length
> argument". I realize that Python is not SQL, but I think it's a good point
> of reference for what concepts the average reader is likely to know.

Yeah, we can't expect everyone wanting to call a built-in function to
know how they would define an equivalent one themselves. In that case I
propose marking it up like this:

    <function>format</function> (
    <parameter>formatstr</parameter> <type>text</type>
    <optional>, <parameter>formatarg</parameter> <type>"any"</type>
    <optional>, ...</optional> </optional> )
    <returnvalue>text</returnvalue>
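
A quick way to see how this markup would read to a caller is to flatten it to plain text. The Python sketch below assumes, as noted earlier in the thread, that <optional>...</optional> renders as [...]; the "->" before the return type and the name render_plain are choices made for this illustration, not what the docs toolchain does.

```python
import re

# Any remaining DocBook-style tag such as <function> or </type>.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")


def render_plain(markup: str) -> str:
    """Flatten signature markup into a one-line, caller-style synopsis."""
    text = markup.replace("<optional>", "[").replace("</optional>", "]")
    text = text.replace("<returnvalue>", "-> ")
    text = TAG_RE.sub("", text)
    return " ".join(text.split())  # collapse newlines and runs of spaces


format_markup = """<function>format</function> (
<parameter>formatstr</parameter> <type>text</type>
<optional>, <parameter>formatarg</parameter> <type>"any"</type>
<optional>, ...</optional> </optional> )
<returnvalue>text</returnvalue>"""
print(render_plain(format_markup))
```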


> Looking at the patch, I think it is good, though I'd consider doing some
> indentation for the nested <optional>s to allow the author to do more
> visual tag-matching. The ']'s were sufficiently visually distinct that we
> didn't really need or want nesting, but <optional> is just another tag to
> my eyes in a sea of tags.

The requisite nesting when there are multiple optional parameters makes
it annoying to wrap and indent it "properly" per XML convention, but how
about something like this, with each parameter on a line of its own, and
all the closing </optional> tags on one line?

    <function>regexp_substr</function> (
    <parameter>string</parameter> <type>text</type>,
    <parameter>pattern</parameter> <type>text</type>
    <optional>, <parameter>start</parameter> <type>integer</type>
    <optional>, <parameter>N</parameter> <type>integer</type>
    <optional>, <parameter>flags</parameter> <type>text</type>
    <optional>, <parameter>subexpr</parameter> <type>integer</type>
    </optional> </optional> </optional> </optional> )
    <returnvalue>text</returnvalue>

A lot of functions mostly follow this style, except they tend to put the
first parameter on the same line as the function name, even when that
makes the line overly long. I propose going the other way, with each
parameter on a line of its own, even if the first one would fit after
the function name, unless the whole parameter list fits after the
function name.

Also, when there's only one optional argument, or they're independently
optional, not nested, the </optional> tag should go on the same line as
the parameter.

    <function>substring</function> (
    <parameter>bits</parameter> <type>bit</type>
    <optional> <literal>FROM</literal> <parameter>start</parameter> <type>integer</type> </optional>
    <optional> <literal>FOR</literal> <parameter>count</parameter> <type>integer</type> </optional> )
    <returnvalue>bit</returnvalue>
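
The nested layout proposed above is regular enough to generate. As a hedged sketch in Python (the function render_signature and its argument shapes are invented here, and it covers only the nested-optional case, not independently optional parameters like the substring example):

```python
def render_signature(name, required, optional, returns):
    """Emit one parameter per line, with all </optional> tags on one line.

    required and optional are lists of (parameter_name, type_name) pairs;
    each optional parameter is nested inside the previous one.
    """
    def param(pair):
        pname, ptype = pair
        return f"<parameter>{pname}</parameter> <type>{ptype}</type>"

    lines = [f"<function>{name}</function> ("]
    for index, pair in enumerate(required):
        comma = "," if index < len(required) - 1 else ""
        lines.append(param(pair) + comma)
    for pair in optional:
        lines.append("<optional>, " + param(pair))
    if optional:
        lines.append(" ".join(["</optional>"] * len(optional)) + " )")
    else:
        lines.append(")")
    lines.append(f"<returnvalue>{returns}</returnvalue>")
    return "\n".join(lines)


print(render_signature(
    "regexp_substr",
    [("string", "text"), ("pattern", "text")],
    [("start", "integer"), ("N", "integer"), ("flags", "text"), ("subexpr", "integer")],
    "text",
))
```

Generating the closing tags from the count of optional parameters is what makes the "count the vertical, count the horizontal" check hold by construction.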


I'm not quite sure what to do with things like json_object, which have even
more complex nesting of optional parameters, but I do think the current
200+ character lines are too long.

- ilmari



Re: documentation structure

From
Corey Huinker
Date:
> Yeah, we can't expect everyone wanting to call a built-in function to
> know how they would define an equivalent one themselves. In that case I
> propose marking it up like this:
>
>     <function>format</function> (
>     <parameter>formatstr</parameter> <type>text</type>
>     <optional>, <parameter>formatarg</parameter> <type>"any"</type>
>     <optional>, ...</optional> </optional> )
>     <returnvalue>text</returnvalue>

Looks good, but I guess I have to ask: is there a parameter-list tag out there instead of (, and should we be using that?

 
> The requisite nesting when there are multiple optional parameters makes
> it annoying to wrap and indent it "properly" per XML convention, but how
> about something like this, with each parameter on a line of its own, and
> all the closing </optional> tags on one line?
>
>     <function>regexp_substr</function> (
>     <parameter>string</parameter> <type>text</type>,
>     <parameter>pattern</parameter> <type>text</type>
>     <optional>, <parameter>start</parameter> <type>integer</type>
>     <optional>, <parameter>N</parameter> <type>integer</type>
>     <optional>, <parameter>flags</parameter> <type>text</type>
>     <optional>, <parameter>subexpr</parameter> <type>integer</type>
>     </optional> </optional> </optional> </optional> )
>     <returnvalue>text</returnvalue>

Yes, that has an easy count-the-vertical, count-the-horizontal, do-they-match flow to it.
 
> A lot of functions mostly follow this style, except they tend to put the
> first parameter on the same line as the function name, even when that
> makes the line overly long. I propose going the other way, with each
> parameter on a line of its own, even if the first one would fit after
> the function name, unless the whole parameter list fits after the
> function name.

+1
 

> Also, when there's only one optional argument, or they're independently
> optional, not nested, the </optional> tag should go on the same line as
> the parameter.
>
>     <function>substring</function> (
>     <parameter>bits</parameter> <type>bit</type>
>     <optional> <literal>FROM</literal> <parameter>start</parameter> <type>integer</type> </optional>
>     <optional> <literal>FOR</literal> <parameter>count</parameter> <type>integer</type> </optional> )
>     <returnvalue>bit</returnvalue>

+1

Re: documentation structure

From
jian he
Date:
On Wed, Apr 17, 2024 at 7:07 PM Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:
>
>
> > It'd also be quite useful if clients could render more of the documentation
> > for functions. People are used to language servers providing full
> > documentation for functions etc...
>
> A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
> with \h for commands?) would certainly be nice.
>

I think `\hf` would be useful.
Otherwise people first need to Google the function's HTML page,
then use Ctrl+F to locate the specific function entry.

For \hf we may need to offer a doc URL link,
but currently many functions are unlinkable in the docs.
Also, one section can have many sections.
I guess just linking directly to a nearby position in the HTML page
should be fine.


We could also add a URL anchor for each function, decorated with an underscore as MySQL does
(https://dev.mysql.com/doc/refman/8.3/en/string-functions.html#function_concat).
I am not sure it is an elegant solution.
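
To make the linking idea concrete, here is a minimal Python sketch of how a client-side helper might build such a link. Everything about the fragment is hypothetical: the "function_" anchor prefix is borrowed from the MySQL example and is not claimed to exist in the PostgreSQL docs, and function_doc_url is a made-up name.

```python
def function_doc_url(page: str, function_name: str, version: str = "current") -> str:
    """Build a docs URL pointing at a per-function anchor.

    page is the HTML page name without extension, e.g. "functions-string".
    The "#function_<name>" fragment is an assumed naming scheme only.
    """
    base = f"https://www.postgresql.org/docs/{version}/{page}.html"
    return f"{base}#function_{function_name.lower()}"


print(function_doc_url("functions-string", "concat"))
```

If per-function anchors are not available, the same helper could fall back to the bare page URL, which matches the "nearby position" suggestion.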



Re: documentation structure

From
jian he
Date:
On Mon, Apr 15, 2024 at 1:00 PM jian he <jian.universality@gmail.com> wrote:
>
> On Wed, Mar 20, 2024 at 5:40 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> >
> > +many for improving the index.
> >
> > My own pet docs peeve is a purely editorial one: func.sgml is a 30k line beast, and I think there's a good case for splitting out at least the larger chunks of it.
> >
>
> I think I successfully reduced func.sgml from 311322 lines to 13167 lines.
> (base-commit: 93582974315174d544592185d797a2b44696d1e5)
>
> writing a patch would be unreviewable.

I've split it into 7 patches.
Each patch splits one <sect1> out into a separate new file.

> func-string.sgml
> func-matching.sgml
> func-datetime.sgml
> func-json.sgml
> func-aggregate.sgml
> func-info.sgml
> func-admin.sgml

The files above will be newly created, each by its corresponding
individual patch.

Attachments

Re: documentation structure

From
Corey Huinker
Date:
> I've split it into 7 patches.
> Each patch splits one <sect1> out into a separate new file.

Seems like a good start. Looking at the diffs of these, I wonder if we would be better off with a func/ directory, each function gets its own file in that dir, and either these files above include the individual files, or the original func.sgml just becomes the organizer of all the functions. That would allow us to do future reorganizations with minimal churn, make validation of this patch a bit more straightforward, and make it easier for future editors to find the function they need to edit.
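
As a rough illustration of the "original file becomes the organizer" idea, the Python sketch below moves each <sect1> block into its own file and leaves an entity reference behind. It is not how the attached patches were produced. It assumes <sect1> elements carry an id attribute written exactly as id="..." and are never nested, and it names files after that id, which differs from the func-*.sgml names listed above. The matching <!ENTITY ...> declarations would still have to be added separately.

```python
import re

# One whole <sect1 id="..."> ... </sect1> block, plus its trailing newline.
SECT1_RE = re.compile(r'<sect1 id="([^"]+)">.*?</sect1>\n?', re.DOTALL)


def split_sect1(sgml: str):
    """Return (organizer_text, files), where files maps filename -> content."""
    files = {}

    def move_out(match):
        sect_id = match.group(1)
        files[sect_id + ".sgml"] = match.group(0)
        return f"&{sect_id};\n"  # entity reference left in the organizer

    organizer = SECT1_RE.sub(move_out, sgml)
    return organizer, files


sample = (
    '<chapter id="functions">\n'
    '<sect1 id="functions-string">\n<title>String</title>\n</sect1>\n'
    '<sect1 id="functions-json">\n<title>JSON</title>\n</sect1>\n'
    '</chapter>\n'
)
organizer, files = split_sect1(sample)
print(organizer)
print(sorted(files))
```

The same approach would extend to one file per function by matching a finer-grained element, at the cost of many more small files.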