On 2017-05-21 03:00, Tom Lane wrote:
> I wrote:
>> Also, I found two places where an overlength comment line is simply busted
>> altogether --- notice that a character is missing at the split point:
>
> I found the cause of that: you need to apply this patch:
>
> --- freebsd_indent/pr_comment.c~ 2017-05-17 14:59:31.548442801 -0400
> +++ freebsd_indent/pr_comment.c 2017-05-20 20:51:16.447332977 -0400
> @@ -344,8 +353,8 @@ pr_comment(void)
> {
> int len = strlen(t_ptr);
>
> - CHECK_SIZE_COM(len);
> - memmove(e_com, t_ptr, len);
> + CHECK_SIZE_COM(len + 1);
> + memmove(e_com, t_ptr, len + 1);
> last_bl = strpbrk(e_com, " \t");
> e_com += len;
> }
>
> As the code stands, the strpbrk call is being applied to a
> not-null-terminated string and therefore is sometimes producing an
> insane value of last_bl, messing up decisions later in the comment.
> Having the memmove include the trailing \0 resolves that.
I have been analyzing this and came to different conclusions. Foremost,
a strpbrk() call like that finds the first occurrence of either space or
a tab, but last_bl means "last blank" - it's used for marking where to
wrap a comment line if it turns out to be too long. The previous coding
moved the character sequence byte after byte, updating last_bl every
time it was copying one of the two characters. I've rewritten that part as: CHECK_SIZE_COM(len);
memmove(e_com, t_ptr, len);
- last_bl = strpbrk(e_com, " \t"); e_com += len;
+ last_bl = NULL;
+ for (t_ptr = e_com - 1; t_ptr > e_com - len; t_ptr--)
+ if (*t_ptr == ' ' || *t_ptr == '\t') {
+ last_bl = t_ptr;
+ break;
+ } }
But then I also started to wonder if there is any case when there's more
than one character to copy and I haven't found one yet. It looks like } while (!memchr("*\n\r\b\t", *buf_ptr, 6) &&
(now_col <= adj_max_col || !last_bl));
guarantees that if we're past adj_max_col, it'll only be one non-space
character. But I'm not sure yet.