Using the link checker

Philip Colmer edited this page Aug 15, 2019 · 3 revisions

The link checker is an essential part of the validation process that Linaro applies to every pull request submitted against a Linaro-built static web site. Contributors can run the same tool before raising a pull request to confirm that their contribution doesn't contain any broken links.

The simplest way to run the tool is to go into the directory containing the built site and run this command:

../check-links.sh

Note the two full-stops, which refer to the directory above the one you are currently in. They are needed because you must run the tool from inside the directory to be checked, while the script itself is in the directory above that.

Command-line options

--skip-dns-check <file>: Specifies a text file listing domains that may occasionally cause errors when resolving names to IP addresses. With this option, the tool ignores name resolution errors for those domains. Note that the file path is relative to the directory containing the script, not the directory you are in. For example, if you were checking the 96Boards web site, you might use:

--skip-dns-check _data/fqdn_exceptions.txt

--skip-path <path> or -s <path>: Specifies a file or directory that the tool should not check. This is typically used when a file or directory is known to contain broken links to an external resource that is no longer available, and it is better to keep the original content than to try to fix it. The option can be used multiple times.

--verbose or -v: Outputs some debugging information from the tool. Using the option more than once increases the level of verbosity.

--file <file> or -f <file>: Restricts checking to the specified file. Everything else is ignored. The option can be used multiple times. Note that you must specify files and not directories.

--nointernal: Tells the tool to skip checking of internal references.

--noexternal: Tells the tool to skip checking of external references.

There are a couple of other options but they are not useful when running the tool interactively.
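As an illustration of how the options above can be combined, here is a minimal sketch. The skipped paths (blog/2015 and legacy/old-page.html) are hypothetical and only stand in for content you know contains stale links; the exceptions file is the 96Boards example from above. The sketch prints the command instead of running it, so you can review it first.

```shell
# Sketch only: the skipped paths below are hypothetical examples.
CHECKER=../check-links.sh

# Collect the options as positional parameters so each stays a separate word.
set -- \
  --skip-dns-check _data/fqdn_exceptions.txt \
  --skip-path blog/2015 \
  --skip-path legacy/old-page.html \
  -v -v

# Dry run: show the command that would be executed.
preview="$CHECKER $*"
echo "$preview"

# To run it for real, from inside the built site directory:
# "$CHECKER" "$@"
```

Repeating -v is what raises the verbosity level, and repeating --skip-path is how more than one path is excluded.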

External link checking

The tool checks multiple external links in parallel to get the final results as quickly as possible. While it is doing so, it outputs one character per result to indicate progress and possible reasons for failure. Note that the pages are not necessarily checked in any particular order, so the characters are for information only.

. - success

Failures:

D - DNS resolution failure

X - 404 or 405 error

_ - other 4xx error

It is also possible for the tool to get a variety of "non-fatal" errors. These are errors that prevented the tool from confirming that the URL was correct but that are not treated as page failures. They are indicated by lower-case letters from a to h.

When the tool finishes, it produces a list of any failed external links and which pages reference them.
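If you capture the progress characters (for example, by copying them from the terminal), a short script can tally them by category. This is a rough sketch, not part of the tool: the sample string is made up, the count_chars helper is hypothetical, and it assumes the captured text contains only the progress characters described above.

```shell
# Made-up sample of progress output; real output will differ.
progress='..D.X._a..h.'

# Count how many characters of $1 belong to the tr character set $2.
count_chars() {
  printf '%s' "$1" | tr -cd "$2" | wc -c | tr -d ' '
}

ok=$(count_chars "$progress" '.')          # successful checks
dns=$(count_chars "$progress" 'D')         # DNS resolution failures
not_found=$(count_chars "$progress" 'X')   # 404 or 405 errors
other_4xx=$(count_chars "$progress" '_')   # other 4xx errors
non_fatal=$(count_chars "$progress" 'a-h') # non-fatal errors

echo "success=$ok dns=$dns 404/405=$not_found other-4xx=$other_4xx non-fatal=$non_fatal"
```

The tally only summarises the run; the list the tool prints when it finishes remains the place to find which links failed and which pages reference them.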
