Re: [PGdocs] fix description for handling pf non-ASCII characters

Поиск
Список
Период
Сортировка
От Karl O. Pinc
Тема Re: [PGdocs] fix description for handling pf non-ASCII characters
Дата
Msg-id 20230926010328.2c556046@slate.karlpinc.com
обсуждение исходный текст
Ответ на RE: [PGdocs] fix description for handling pf non-ASCII characters  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Ответы RE: [PGdocs] fix description for handling pf non-ASCII characters  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Список pgsql-hackers
Hello Hayato and Jian,

On Tue, 4 Jul 2023 01:30:56 +0000
"Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote:

> Dear Jian,

> > looks fine. Do you need to add to commitfest?  
> 
> Thank you for your confirmation. ! I registered to following:
> 
> https://commitfest.postgresql.org/44/4437/

The way the Postgres commitfest process works is that
someone has to update the page to mark "reviewed" and the
reviewer has to use the commitfest website to pass
the patches to a committer.

I see a few problems with the English and style of the patches
and am commenting below and have signed up as a reviewer.  At
commitfest.postgresql.org I have marked the thread
as needing author attention.  Hayato, you will need
to mark the thread as needing review when you have
replied to this message.

Jian, you might want to sign on as a reviewer as well.
It can be nice to have that record of your participation.

Now, to reviewing the patch:

First, it is now best practice in the PG docs to
put a line break at the end of each sentence.
At least for the sentences on the lines you change.
(No need to update the whole document!)  Please
do this in the next version of your patch.  I don't
know if this is a requirement for acceptance by
a committer, but it won't hurt.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e700782d3c..a4ce99ba4d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7040,9 +7040,8 @@ local0.*    /var/log/postgresql
         The name will be displayed in the
<structname>pg_stat_activity</structname> view and included in CSV log
entries.  It can also be included in regular log entries via the <xref
linkend="guc-log-line-prefix"/> parameter.
-        Only printable ASCII characters may be used in the
-        <varname>application_name</varname> value. Other characters
will be
-        replaced with question marks (<literal>?</literal>).
+        Non-ASCII characters used in the
<varname>application_name</varname>
+        will be replaced with hexadecimal strings.
        </para>
       </listitem>
      </varlistentry>

Don't use the future tense to describe the system's behavior.  Instead
of "will be" write "are".  (Yes, this change would be an improvement
on the original text.  We should fix it while we're working on it
and our attention is focused.)

It is more accurate, if I understand the issue, to say that characters
are replaced with hexadecimal _representations_ of the input bytes.
Finally, it would be good to know what representation we're talking
about.  Perhaps just give the \xhh example and say: the Postgres
C-style escaped hexadecimal byte value.  And hyperlink to
https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE

(The docbook would be, depending on text you want to link:

<link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal
byte value</link>.

I think.  You link to id="someidvalue" attribute values.)


@@ -8037,10 +8036,9 @@ COPY postgres_log FROM
'/full/path/to/logfile.csv' WITH csv; <para>
         The name can be any string of less
         than <symbol>NAMEDATALEN</symbol> characters (64 characters in
a standard
-        build). Only printable ASCII characters may be used in the
-        <varname>cluster_name</varname> value. Other characters will be
-        replaced with question marks (<literal>?</literal>).  No name
is shown
-        if this parameter is set to the empty string
<literal>''</literal> (which is
+        build). Non-ASCII characters used in the
<varname>cluster_name</varname>
+        will be replaced with hexadecimal strings. No name is shown if
this
+        parameter is set to the empty string <literal>''</literal>
(which is the default). This parameter can only be set at server start.
        </para>
       </listitem>

Same review as for the first patch hunk.

diff --git a/doc/src/sgml/postgres-fdw.sgml
b/doc/src/sgml/postgres-fdw.sgml index 5062d712e7..98785e87ea 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -1067,9 +1067,8 @@ postgres=# SELECT postgres_fdw_disconnect_all();
       of any length and contain even non-ASCII characters.  However
when it's passed to and used as <varname>application_name</varname>
       in a foreign server, note that it will be truncated to less than
-      <symbol>NAMEDATALEN</symbol> characters and anything other than
-      printable ASCII characters will be replaced with question
-      marks (<literal>?</literal>).
+      <symbol>NAMEDATALEN</symbol> characters and non-ASCII characters
will be
+      replaced with hexadecimal strings.
       See <xref linkend="guc-application-name"/> for details.
      </para>
 
Same review as for the first patch hunk.

Since the both of you have looked and confirmed that the
actual behavior matches the new documentation I have not
done this.

But, have either of you checked that we're really talking about
replacing everything outside the 7-bit ASCII encodings?  
My reading of the commit referenced in the first email of this
thread says that it's everything outside of the _printable_
ASCII encodings, ASCII values outside of the range 32 to 127,
inclusive.  

Please check.  The docs should probably say "printable ASCII",
or "non-printable ASCII", depending.  I think the meaning
of "printable ASCII" is widely enough known to be 32-127.
So "printable" is good enough, right?

Regards,

Karl <kop@karlpinc.com>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Lepikhov
Дата:
Сообщение: Re: POC: GROUP BY optimization
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set