Re: Broken links in mailinglist archive due to percent-encoding

From Erik Wienhold
Subject Re: Broken links in mailinglist archive due to percent-encoding
Date
Msg-id 1805604733.531229.1694729117448@office.mailbox.org
In response to Broken links in mailinglist archive due to percent-encoding  (Erik Wienhold <ewie@ewie.name>)
List pgsql-www
On 29/08/2023 21:38 CEST Erik Wienhold <ewie@ewie.name> wrote:

> It looks like the archive percent-encodes subcomponent delimiters in the query
> component.  Perhaps the encoding is allowed and it's just git.postgresql.org
> that can't handle it.  But I'm pretty sure that links to git.postgresql.org
> from the archive worked in the past.

I've been digging around a bit more because this is an odd bug.

Turns out it's the result of applying Django's urlize filter to the message
body [1]:

    >>> from django.template.defaultfilters import urlize
    >>> urlize('http://example.net/foo?bar=baz;abc=123')
    '<a href="http://example.net/foo?bar=baz%3Babc%3D123" rel="nofollow">http://example.net/foo?bar=baz;abc=123</a>'

Looks like a bug in Django because it does not percent-encode any sub-delimiters
outside the query component:

    >>> urlize('http://example.net/foo;bar=baz')
    '<a href="http://example.net/foo;bar=baz" rel="nofollow">http://example.net/foo;bar=baz</a>'
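One plausible mechanism for the difference, sketched with the standard library
only: if the query component is parsed into key/value pairs on ampersands and
then re-encoded, a semicolon-separated query collapses into a single pair whose
value gets percent-encoded, while the path is never touched.  This is not
Django's actual code and has not been checked against Django's source here; the
function name reencode_query is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def reencode_query(url):
    """Parse the query on '&' only, then re-encode it (hypothetical helper)."""
    parts = urlsplit(url)
    # With '&' as the only separator, 'bar=baz;abc=123' is a single pair:
    # key 'bar', value 'baz;abc=123'.
    pairs = parse_qsl(parts.query, keep_blank_values=True, separator="&")
    # urlencode() percent-encodes ';' and '=' inside keys and values.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(reencode_query("http://example.net/foo?bar=baz;abc=123"))
print(reencode_query("http://example.net/foo;bar=baz"))
```

The first call reproduces the href from the urlize output above, and the second
leaves the URL unchanged because its semicolon sits in the path rather than in
the query component.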

And regarding git.postgresql.org: gitweb generates URLs with a semicolon as the
separator between query pairs [2] instead of an ampersand, although the W3C no
longer recommends the semicolon.  gitweb also accepts query components that use
ampersands, though, which is why links [1] and [3] below work: I manually
replaced all semicolons with ampersands.
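The manual replacement could be scripted along these lines.  This is a minimal
sketch, assuming that every semicolon in the query component is a pair separator
(true for the gitweb links here, not for URLs in general); the helper name
semicolons_to_ampersands is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def semicolons_to_ampersands(url):
    """Swap ';' for '&' in the query component only (hypothetical helper)."""
    parts = urlsplit(url)
    # Path and fragment are left alone; only the query string is rewritten.
    return urlunsplit(parts._replace(query=parts.query.replace(";", "&")))


print(semicolons_to_ampersands(
    "https://git.postgresql.org/gitweb/?p=pgarchives.git;a=blob;hb=HEAD#l64"
))
```

Because the rewritten query no longer contains semicolons, a later
parse-and-re-encode step that splits on ampersands has nothing to
percent-encode.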

[1]
https://git.postgresql.org/gitweb/?p=pgarchives.git&a=blob&f=django/archives/mailarchives/templates/_message.html&h=c90a80afea418fc4800ae81bb517978fa56f7a4d&hb=HEAD#l64
[2] https://git.kernel.org/pub/scm/git/git.git/tree/gitweb/gitweb.perl#n1505
[3]
https://git.postgresql.org/gitweb/?p=postgresql.git&a=blob&f=src/bin/psql/describe.c&h=bac94a338cfbc497200f0cf960cbabce2dadaa33&hb=9b581c53418666205938311ef86047aa3c6b741f#l1420

--
Erik


