
Sitemap #175

Open
arildm opened this issue Dec 15, 2023 · 1 comment
Labels
effort 3 high · enhancement (New feature or request) · prio 1 low (Nice-to-have)

Comments


arildm commented Dec 15, 2023

Google (and other bots) might not easily find and index all the URLs when crawling the site, since most works and terms are only listed after searching or paging/scrolling, which I doubt the bots will be doing.

Background:

  • A sitemap is a list of the URLs on a site [Google: Build a Sitemap]
  • It can be in XML, RSS, or plain text. The plain-text format is literally a list of URLs, one per line. The other formats can carry metadata, but that doesn't seem particularly valuable in our case.
  • The sitemap can be submitted in the Google Search Console or referenced from robots.txt
  • We could generate one by running API queries and outputting route paths
    • WordPress can presumably generate a sitemap for the /om/ pages. We could then combine the two sitemaps with a sitemap index
  • I don't think we should generate the sitemap as a static file at build time, as it would need to be regenerated manually after relevant changes in Libris or the QLIT backend
  • Better, perhaps, to generate the sitemap in a Vue route view. It would take a long time to load, but hopefully that's not a problem.
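A rough sketch of the generation step described above, as a pure function (the base URL and the work/term route shapes are assumptions for illustration, not the site's actual routes; in the Vue view, the ID lists would come from the Libris and QLIT API queries):

```typescript
// Build a plain-text sitemap: one URL per line, as Google accepts.
// Route shapes (/work/:id, /subjects/:name) are hypothetical placeholders.
function buildSitemap(
  base: string,
  workIds: string[],
  termNames: string[],
): string {
  const urls = [
    base,
    ...workIds.map((id) => `${base}/work/${encodeURIComponent(id)}`),
    ...termNames.map((name) => `${base}/subjects/${encodeURIComponent(name)}`),
  ];
  return urls.join("\n") + "\n";
}
```

The output could then be merged with the WordPress sitemap via a sitemap index file.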
@arildm added the enhancement (New feature or request), effort 3 high, prio 2 medium, and prio 1 low (Nice-to-have) labels and removed the prio 2 medium label on Dec 15, 2023

arildm commented Jan 8, 2024

Another approach that would probably benefit search-bot indexing is #57 "Search query as url parameters", especially if it includes the page number. If I understand correctly, the bot could then navigate through the result list and reach all the work URLs by itself.
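To illustrate what that would give the crawler, each result page would get a stable URL like the ones below (the `q` and `page` parameter names are assumptions pending #57):

```typescript
// Hypothetical helper: enumerate the search-result pages for a query so that
// each page has a stable, crawlable URL a bot can follow via next-page links.
function pageUrls(base: string, query: string, totalPages: number): string[] {
  return Array.from(
    { length: totalPages },
    (_, i) => `${base}/search?q=${encodeURIComponent(query)}&page=${i + 1}`,
  );
}
```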
