Discussion: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
On Thu, Apr 3, 2014 at 08:32:21PM +0200, Antony wrote:
> May I suggest that a link rel=canonical be added to the documentation of the current version (now 9.3)?
>
> Ex:
> http://www.postgresql.org/docs/9.3/static/
>
> could have
>
> <link rel="canonical" href="http://www.postgresql.org/docs/current/static/" />
>
> I believe this could help Google offer the current documentation as a first choice.
> Right now it probably considers it duplicate content.
>
> Probably not the right place to say this, but I couldn't guess who to send it to.

Are we using the rel="canonical" suggestion in our web docs now?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
On Wed, Aug 27, 2014 at 6:00 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Are we using the rel="canonical" suggestion in our web docs now?

Apparently not. I looked into this and I'm not 100% certain we should do it. But if we decide so, I'm willing to code up a patch.

https://tools.ietf.org/html/rfc6596 states:
==== 8< ====
The target (canonical) IRI MUST identify content that is either duplicative or a superset of the content at the context (referring) IRI. Authors who declare the canonical link relation ought to anticipate that applications such as search engines can:

o Index content only from the target IRI (i.e., content from the context IRIs will be likely disregarded as duplicative).

o Consolidate IRI properties, such as link popularity, to the target IRI.

o Display the target IRI as the representative IRI.
==== 8< ====

We certainly want property 2, but property 1 suggests that older versions of docs are dropped from search engines altogether. It's not clear whether they are that strict in reality -- does anyone know?

This would not be a problem if we also retained notes about earlier supported versions in the current version, which would make our latest version a "superset" of earlier ones.

But I believe we very rarely remove material from docs, so I believe the upsides outweigh the cons.

----
Another question is whether we should make "interactive" point to "static" -- again, actually the interactive one is the superset, since static doesn't include user comments. But do we care about search engines indexing comments anyway? They're not present in sitemap.xml either and I've never landed on the interactive version when coming from Google.

My proposal:
1. Doc pages that are *older* than current, and exist in the current version have canonical URL /docs/current/static/pagename.html
2. If it doesn't exist in current, we link to the last version that includes this page, like /docs/8.4/static/install-win32.html
3. Newer versions (devel/beta) should perhaps point to itself and not /current/? This would make new features googleable for testers. The doc links use rel=nofollow when linking to them, so they're already ranked lower by search engines.

It appears there are already lots of places that hardcode the http://www.postgresql.org/ URL, so it makes sense to use absolute URLs for canonical too?

Did I miss anything?

Regards,
Marti
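The three rules in the proposal above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from the website: the function name `canonical_doc_path`, its parameters, and the representation of versions as integer tuples are all hypothetical. The proposal does not say what the current version's own pages should point to; the sketch assumes /docs/current/, following the original suggestion. It returns a host-relative path and leaves the question of absolute URLs open.

```python
def canonical_doc_path(page, version, current, versions_with_page):
    """Return a canonical path for one documentation page (hypothetical helper).

    page               -- file name, e.g. "install-win32.html"
    version            -- version of the page being rendered, e.g. (8, 3)
    current            -- the current release, e.g. (9, 3)
    versions_with_page -- set of versions whose docs contain this page
    """
    if version > current:
        # Rule 3: newer (devel/beta) docs point to themselves.
        target = version
    elif current in versions_with_page:
        # Rule 1: the page exists in the current docs.
        # Assumption: the current version's own pages also land here.
        return "/docs/current/static/" + page
    else:
        # Rule 2: fall back to the last released version that has the page.
        released = [v for v in versions_with_page if v <= current]
        target = max(released) if released else version
    return "/docs/%s/static/%s" % (".".join(map(str, target)), page)
```

For example, a page that was dropped after 8.4 would resolve to the 8.4 copy when rendered for 8.3, while a page that still exists in the current release would resolve to /docs/current/ from any older version.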
Marti Raudsepp wrote:
> Another question is whether we should make "interactive" point to
> "static" -- again, actually the interactive one is the superset, since
> static doesn't include user comments. But do we care about search
> engines indexing comments anyway? They're not present in sitemap.xml
> either and I've never landed on the interactive version when coming from Google.

Please see this thread:
http://www.postgresql.org/message-id/CABUevEySZgGdTaKJz=DYoYYkPqhV2Pi4RAeY2vLTsAGV0me3Ug@mail.gmail.com

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 10/07/2014 06:46 PM, Marti Raudsepp wrote:
> On Wed, Aug 27, 2014 at 6:00 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> Are we using the rel="canonical" suggestion in our web docs now?
>
> Apparently not. I looked into this and I'm not 100% certain we should
> do it. But if we decide so, I'm willing to code up a patch.
>
> https://tools.ietf.org/html/rfc6596 states:
> ==== 8< ====
> The target (canonical) IRI MUST identify content that is either
> duplicative or a superset of the content at the context (referring)
> IRI. Authors who declare the canonical link relation ought to
> anticipate that applications such as search engines can:
>
> o Index content only from the target IRI (i.e., content from the
> context IRIs will be likely disregarded as duplicative).
>
> o Consolidate IRI properties, such as link popularity, to the target
> IRI.
>
> o Display the target IRI as the representative IRI.
> ==== 8< ====
>
> We certainly want property 2, but property 1 suggests that older
> versions of docs are dropped from search engines altogether. It's not
> clear whether they are that strict in reality -- does anyone know?
>
> This would not be a problem if we also retained notes about earlier
> supported versions in the current version, which would make our latest
> version a "superset" of earlier
> ones.
>
> But I believe we very rarely remove material from docs, so I believe
> the upsides outweigh the cons.

I'm not sure how search engines really behave here - don't we have any SEO experts on the list who can shed some light on this?

> ----
> Another question is whether we should make "interactive" point to
> "static" -- again, actually the interactive one is the superset, since
> static doesn't include user comments. But do we care about search
> engines indexing comments anyway? They're not present in sitemap.xml
> either and I've never landed on the interactive version when coming from Google.
>
> My proposal:
> 1. Doc pages that are *older* than current, and exist in the current
> version have canonical URL /docs/current/static/pagename.html
> 2. If it doesn't exist in current, we link to the last version that
> includes this page, like /docs/8.4/static/install-win32.html
> 3. Newer versions (devel/beta) should perhaps point to itself and not
> /current/? This would make new features googleable for testers. The
> doc links use rel=nofollow when linking to them, so they're already
> ranked lower by search engines.
>
> It appears there are already lots of places that hardcode the
> http://www.postgresql.org/ URL, so it makes sense to use absolute URLs
> for canonical too?

I would actually strongly prefer to _NOT_ use even more absolute URLs on the website, for multiple reasons: one is that it will make moving the website to https-only more difficult, and the other is that it makes playing with your own copy of it (running under a different URL) a pain. I actually did a round of cleanups the other day (mostly on the presskit) to remove some of the hardcoded URLs.

Stefan
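One way to avoid hardcoding the host, sketched here in Python purely as an illustration: build the link element from a path plus an optional, configurable site root. The helper name `canonical_link_tag` and the `site_root` parameter are hypothetical and do not come from the website code. Whether search engines treat a host-relative href as well as an absolute one is not something this sketch settles.

```python
from html import escape


def canonical_link_tag(path, site_root=""):
    """Render a canonical link element for a host-relative path.

    site_root is a hypothetical setting: leave it empty for a relative
    href, or set it once per deployment (e.g. a local copy or an
    https-only site) instead of hardcoding the host in every template.
    """
    href = site_root.rstrip("/") + path
    # Escape the href so it is safe inside a double-quoted attribute.
    return '<link rel="canonical" href="%s" />' % escape(href, quote=True)
```

With an empty `site_root` the tag carries only the path, so a copy running under a different URL needs no changes; a deployment that wants absolute URLs sets the root in one place.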