Block web crawlers on v1.9-branch #3862

Open
wants to merge 5 commits into
base: v1.9-branch

@thesuperzapper thesuperzapper commented Sep 4, 2024

Google keeps surfacing links from very old versions of the Kubeflow docs website, which is confusing to users.

This PR adds the <meta name="robots" content="noindex"> meta tag to tell Google to stop indexing the pages from the v1.9-branch branch.

It will probably take a few months (or longer) for Google to re-index these pages. I have left nofollow off so that Google re-indexes the whole site more quickly by following links.
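
For reviewers who want to sanity-check a rendered page, here is a minimal sketch (not part of this PR) that reads the robots directives out of a page's HTML using only the Python standard library. The `RobotsMetaParser` class, the `robots_directives` helper, and the sample page are hypothetical names and data made up for illustration; the only thing taken from this PR is the `<meta name="robots" content="noindex">` tag itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical sample page: noindex is set, nofollow is deliberately absent.
sample_page = """<html><head>
<meta name="robots" content="noindex">
<title>Archived docs</title>
</head><body><a href="/docs/">Docs</a></body></html>"""

found = robots_directives(sample_page)
print("noindex" in found, "nofollow" in found)
```

A page that matches the intent described above should report `noindex` as present and `nofollow` as absent, so crawlers can still follow its links while dropping the page from the index.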


It also backports the changes from #3863 so the version selector is consistent across each archive version.


You might be wondering: why should we do this for the 1.9 branch? Isn't that the latest version?

  1. There is still no reason for Google to index these pages:
    • The latest content is on the main site.
    • Google will take a long time to remove things from the index if we add this metadata at a later date.
  2. The "The site that you are currently viewing is an archived snapshot" header is applicable:
    • It's a snapshot from the time of the 1.9 release.
    • We don't typically make any further updates to the versioned branches.
    • It's hard for users to realize they ended up on a snapshotted version of the docs without that warning header.

@thesuperzapper thesuperzapper changed the base branch from master to v1.9-branch September 4, 2024 23:02

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign james-jwu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

githubbranch = "master"
url = "https://master.kubeflow.org"
url = "https://www.kubeflow.org"

@ederign ederign Sep 6, 2024

Out of curiosity, does using the www prefix instead of the bare kubeflow.org change anything for Google indexing?

Member

Or did you change the other PRs for the sake of consistency?

Member Author

Google prefers direct links, and only adds to the "backlinks" count for real links, not ones that are redirected (at least that's my experience).
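
As a rough illustration of what "direct links" means here, the sketch below rewrites links that point at a redirecting host so they target the final host directly. This is not code from the PR. `to_direct_link`, `CANONICAL_HOST`, and `REDIRECTING_HOSTS` are hypothetical names, and the claim that the bare `kubeflow.org` domain redirects to `www.kubeflow.org` is an assumption inferred from this thread, not something verified.

```python
from urllib.parse import urlsplit, urlunsplit

# Taken from the url value discussed in this review thread.
CANONICAL_HOST = "www.kubeflow.org"
# Assumption: these hosts answer with a redirect to CANONICAL_HOST.
REDIRECTING_HOSTS = {"kubeflow.org"}


def to_direct_link(url):
    """Point a link at the canonical host if its host is known to redirect."""
    parts = urlsplit(url)
    if parts.hostname in REDIRECTING_HOSTS:
        # Swap only the host; scheme, path, query and fragment are kept.
        parts = parts._replace(netloc=CANONICAL_HOST)
    return urlunsplit(parts)


print(to_direct_link("https://kubeflow.org/docs/started/"))
print(to_direct_link("https://www.kubeflow.org/docs/started/"))
```

Links that already use the canonical host, or that point at unrelated hosts, pass through unchanged.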

ederign commented Sep 6, 2024

/lgtm

@google-oss-prow google-oss-prow bot removed the lgtm label Sep 6, 2024

New changes are detected. LGTM label has been removed.
