On Tue, Apr 19, 2022 at 11:18 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 18 Apr 2022, at 20:04, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Magnus Hagander <magnus@hagander.net> writes:
> >> What would be the actual *advantage* of excluding them?
> >
> > The immediate problem is that Google is still preferentially returning old
> > pages in some cases, e.g. top hit for "postgres gist gin index" is still
> >
> > https://www.postgresql.org/docs/9.1/textsearch-indexes.html
> >
> > Now maybe that just means they've not completely reindexed since we made
> > the canonical-version change, so I'm content to wait awhile longer
> > before concluding that that change wasn't sufficient. But we should be
> > considering the possibility that it wasn't.
>
> That particular 9.1 page is the second hit for "postgres gin index" after the /current/ page for the Gin Index chapter. (I first thought it was the first hit since I dismissed the "featured snippet" result as an ad.) DuckDuckGo returns the 9.1 page or the current page seemingly at random for "postgres gin gist index".
>
> Searching for "postgres gist gin index <version>" on Google returns the correct page for versions 8.3 through 9.4, for any other version (including lower) it returns /current/.
This seems to indicate it just hasn't picked that up yet? That's the behaviour we saw before it found the rel=canonical parts, isn't it?
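
For anyone who wants to check what a given docs page actually declares, here is a rough sketch that pulls the rel=canonical target out of a page's HTML using only the Python standard library. The sample markup and its href are made up for illustration and are not copied from the live site; the names CanonicalFinder and find_canonical are likewise hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can carry several space-separated tokens, so split before matching
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html_text):
    """Return the list of canonical URLs declared in html_text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical markup for an old-version page (an assumption, not real site output).
SAMPLE = """<html><head>
<title>Example old-version page</title>
<link rel="canonical" href="https://www.postgresql.org/docs/current/textsearch-indexes.html">
</head><body></body></html>"""

print(find_canonical(SAMPLE))
```

Feeding it the fetched source of an old-version page would show whether that page points at a /current/ URL. It says nothing about whether a search engine has re-crawled the page yet.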
> Removing the old content might improve search results, but it might also just remove it altogether bumping non-postgresql.org content higher.
Yeah, if we remove them completely then presumably they also stop counting as "link score" for us.