diff --git a/robots.txt b/robots.txt
index 44f6d6e8..f775195e 100644
--- a/robots.txt
+++ b/robots.txt
@@ -3,6 +3,9 @@
 # Page is under FOPL-ZERO
 # https://owly.fans/license/fopl-zero
 
+# Want to know when this page is updated? Follow the RSS!
+# https://owly.fans/rss/robots.xml
+
 # This page is also on git:
 # Please feel free to suggest a change to this file
 # https://github.com/DynTylluan/owly.fans/blob/main/robots.txt (main)
@@ -64,6 +67,7 @@ Disallow: /
 # Used to «improve language models for our speech recognition technology»,
 # so more AI rubbish that I don't want from a company that I don't like.
 User-Agent: FacebookBot
+User-Agent: meta-externalagent
 Disallow: /
 
 # Google's AdSense/StoreBot bots
@@ -150,7 +154,7 @@ Disallow: /
 # it, simply remove the «#» before the «User-agent» and «Disallow» part.
 
 # DuckDuckGo
-# The search engine website uses the following bot to index sites.
+# The search engine website uses the following bots to index sites.
 # https://duckduckgo.com/duckduckbot
 # https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot
 # https://duckduckgo.com/duckduckgo-help-pages/results/sources
diff --git a/rss/index.html b/rss/index.html
index f622398c..80de8a59 100644
--- a/rss/index.html
+++ b/rss/index.html
@@ -84,6 +84,13 @@

RSS Feeds

Doom: Rediscovering History is a blog all about Doom (1993) and its many, many mods made for it.

Follow every time a new issue is published: RSS Feed icon. + + + +

robots.txt

+

My robots.txt is used on a few websites by a number of sysops, so as a way of letting people know when a change is made only to this file, this feed was made.

+ +

Follow every time a new version is published: RSS Feed icon.


diff --git a/rss/robot.png b/rss/robot.png
new file mode 100644
index 00000000..34d5efdf
Binary files /dev/null and b/rss/robot.png differ
diff --git a/rss/robots.xml b/rss/robots.xml
new file mode 100755
index 00000000..71dc09af
--- /dev/null
+++ b/rss/robots.xml
@@ -0,0 +1,41 @@
+
+
+
+ robots.txt updates
+ https://owly.fans/rss/robots.xml
+ See when OwlyFans updates their robots.txt
+ en-us
+
+ https://owly.fans/rss/robot.png
+
+
+
+
+ 2024-07-29: This feed is set up
+ Mon, 29 Jul 2024
+
+
+ The first update comes thanks to a post by Seirdy, who writes that ®Facebook/Meta updated its robots.txt entry for opting out of GenAI data scraping. If you blocked FacebookBot before, you should block meta-externalagent now [as the bot was renamed]?.

+

+ It is legitimately scummy that Facebook chose to do this, but regardless, I have decided to block both FacebookBot and meta-externalagent, even if it is technically incorrect to block the former. +

+

+ Thank you to Piper of yarrie.net for showing me this originally. +

+

+ The Seirdy post: https://pleroma.envs.net/notice/AkLKKvKad7mzVYN8bY +

+ ]]> +
+
+ +
+
\ No newline at end of file
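
Review note: the robots.txt hunk relies on two consecutive User-Agent lines sharing one Disallow rule. A minimal sketch for sanity-checking that grouping, assuming Python's standard urllib.robotparser is an acceptable stand-in for how a well-behaved crawler reads the file. The ROBOTS_TXT excerpt and the is_blocked helper are hypothetical names made up for this check; they are not part of the commit, and real crawlers may parse the file differently.

```python
from urllib.robotparser import RobotFileParser

# Excerpt mirroring the group touched by the diff: two User-Agent lines
# sharing a single Disallow rule. This is only a fragment, not the full file.
ROBOTS_TXT = """\
User-Agent: FacebookBot
User-Agent: meta-externalagent
Disallow: /
"""


def is_blocked(user_agent, url="https://owly.fans/"):
    """Return True if the excerpt above forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # DuckDuckBot is included as an agent the excerpt does not mention.
    for agent in ("FacebookBot", "meta-externalagent", "DuckDuckBot"):
        print(agent, "blocked" if is_blocked(agent) else "allowed")
```

Because the check runs against an inline excerpt, it only confirms that the grouped form behaves as intended under this parser; it says nothing about whether Meta's crawler honours the rule.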