Releases: bee4/robots.txt

Official RFC support

24 Feb 22:41

Now, with the help of @ranvis, the lib fully supports the official RFC.

All the details are accessible here: http://www.robotstxt.org/norobots-rfc.txt

This release includes:

  • Inline comments parsing - #7,
  • Multiple / Missing spaces before and after the rule value - #8,
  • Case-insensitive rule matching.
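
As an illustration of the three points above, here is a minimal Python sketch of tolerant line parsing. It is not the library's code or API: `parse_line` is a hypothetical helper, and it assumes that case insensitivity applies to the field name and that a `#` always starts a comment.

```python
import re

# Hypothetical helper, not the library's API.
# Matches "field : value" with any amount of whitespace (including none)
# around the field name, the colon, and the value.
_LINE = re.compile(r"^\s*([A-Za-z-]+)\s*:\s*(.*?)\s*$")


def parse_line(line):
    """Return (field, value) for a rule line, or None if there is no rule."""
    # Drop an inline comment: everything from the first '#' onward.
    content = line.split("#", 1)[0]
    match = _LINE.match(content)
    if match is None:
        return None
    field, value = match.groups()
    # Normalize the field name so "Disallow" and "DISALLOW" compare equal.
    return field.lower(), value
```

With this sketch, a line such as `Disallow:/private   # keep out` yields the field `disallow` and the value `/private`, while a comment-only line yields no rule.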

A great move!

16 Feb 16:10

All issues have been fixed \o/

The test suite is now run with atoum. It now has better coverage and some integration tests on real-world examples.

This lib follows all the official robots.txt guidelines 😄:

  • Multiple User-Agent definition
  • Case insensitive User-Agent rules
  • Pattern building using $, *
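
To make the `$` and `*` pattern support concrete, here is a minimal Python sketch of one common way to match such patterns, by translating them to a regular expression. The function names `pattern_to_regex` and `matches` are hypothetical, and the approach is an assumption for illustration, not the library's implementation.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    # A trailing '$' anchors the match to the end of the URL path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" so that
    # every '*' in the pattern matches any sequence of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if the path matches the pattern from its start."""
    return pattern_to_regex(pattern).match(path) is not None
```

Under these assumptions, `/*.php$` matches `/index.php` but not `/index.php?x=1`, and a pattern without `$` behaves as a plain prefix match.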

This release also introduces distinct exceptions to identify errors more precisely:

  • InvalidUrlException
  • InvalidContentException

Hope this helps!

Hotfix release

05 May 09:11

This release includes some updates to the parsing of empty robots.txt files.

Some cases were not properly handled by v0.0.0:

  • If the file is empty, it is treated like an Allow All directive
  • If the file contains two rules with the same UserAgent, a specific exception is thrown: Bee4\RobotsTxt\Exception\DuplicateRuleException
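
The two cases above can be sketched in Python as follows. This is a deliberately simplified illustration, not the library's code: `DuplicateRuleError` is a hypothetical stand-in for the PHP exception, and only `Disallow` prefixes are considered.

```python
class DuplicateRuleError(Exception):
    """Hypothetical Python analogue of DuplicateRuleException."""


def group_by_user_agent(content):
    """Map each user agent (lowercased) to its list of Disallow prefixes."""
    groups = {}
    current = None
    for raw in content.splitlines():
        # Strip inline comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            agent = value.lower()
            # The same user agent declared twice is reported as an error.
            if agent in groups:
                raise DuplicateRuleError(agent)
            groups[agent] = []
            current = agent
        elif field == "disallow" and current is not None:
            groups[current].append(value)
    return groups


def is_allowed(content, agent, path):
    """Return True unless a non-empty Disallow prefix covers the path."""
    groups = group_by_user_agent(content)
    # Fall back to the wildcard group, then to "no rules at all".
    prefixes = groups.get(agent.lower(), groups.get("*", []))
    return not any(p and path.startswith(p) for p in prefixes)
```

An empty file produces no groups, so every path is allowed, which mirrors the Allow All behavior described above.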

First release !

13 Mar 14:03

For details on the implemented rules, see Google's documentation: https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt

It includes a simple API to manipulate robots.txt files.