Mura Link Checker

Usage

To check every link on the site, click the "Start checking the site" button.

The timeout can be increased to reduce the risk of connections timing out, or reduced to make the check faster.

Redirects can be ignored, in which case URLs that return redirection codes are not reported. The link checker still follows each redirect and reports the link if the redirection target returns an error code.
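
As a rough illustration of how the timeout and redirect options might interact, the Java sketch below sends a HEAD request with a configurable timeout and decides whether the resulting status should be reported. The names LinkProbe, headStatus and isReportable are hypothetical and not taken from the plugin's source. Applying the same timeout to the connect and read phases, and treating every status of 400 or above as an error, are assumptions too.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not the plugin's actual code.
final class LinkProbe {

    /** Sends a HEAD request and returns the HTTP status code. */
    static int headStatus(String url, int timeoutMillis, boolean ignoreRedirects)
            throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        // Assumption: one timeout value covers both connecting and reading.
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        // When redirects are ignored, follow them so that the status of the
        // final target is returned instead of the 3xx code itself.
        conn.setInstanceFollowRedirects(ignoreRedirects);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** Decides whether a status should appear in the list of issues. */
    static boolean isReportable(int status, boolean ignoreRedirects) {
        if (status >= 400) {
            return true; // client and server errors are always reported
        }
        boolean redirect = status >= 300 && status < 400;
        return redirect && !ignoreRedirects;
    }
}
```

Under these assumptions, a link that redirects to a missing page would be reported with the target's error status when redirects are ignored, and with the redirection code itself otherwise.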

Results

Each issue is reported with the page containing the link, the status, the HTML element that carries the link (the link checker checks a, img, video, audio, source, track, embed, script and iframe), and the link itself.
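
To make the list of checked elements concrete, here is a minimal Java sketch that pulls links out of those elements with regular expressions. It assumes that the link is the href attribute for a and the src attribute for the other elements, which is standard HTML but may not cover every attribute the plugin inspects. LinkExtractor is a hypothetical name, and a real implementation would more likely use an HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the plugin's actual code.
final class LinkExtractor {

    // Opening tag of one of the checked elements; group 2 holds its attributes.
    private static final Pattern TAG = Pattern.compile(
        "<(a|img|video|audio|source|track|embed|script|iframe)\\b([^>]*)>",
        Pattern.CASE_INSENSITIVE);

    // A quoted href or src attribute; the lookbehind skips data-src and similar.
    private static final Pattern ATTR = Pattern.compile(
        "(?<![\\w-])(href|src)\\s*=\\s*([\"'])(.*?)\\2",
        Pattern.CASE_INSENSITIVE);

    /** Returns {element, link} pairs in document order. */
    static List<String[]> extract(String html) {
        List<String[]> found = new ArrayList<>();
        Matcher tag = TAG.matcher(html);
        while (tag.find()) {
            String element = tag.group(1).toLowerCase();
            // Assumption: a uses href, every other checked element uses src.
            String wanted = element.equals("a") ? "href" : "src";
            Matcher attr = ATTR.matcher(tag.group(2));
            while (attr.find()) {
                if (attr.group(1).equalsIgnoreCase(wanted)) {
                    found.add(new String[] { element, attr.group(3) });
                    break;
                }
            }
        }
        return found;
    }
}
```

This sketch ignores unquoted attribute values and attribute values containing ">", so it is only meant to show which element and attribute pairs are involved.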

The status is usually the returned HTTP status code, but it can be a string for internal links or when the link checker cannot connect. For instance, if no Mura page is found for an internal link, the status is "not found". In this case the link checker does not attempt an HTTP connection; it only looks in the Mura database to see whether the page exists, which is much faster.
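
The internal-link shortcut could be sketched in Java roughly as follows. Everything here is hypothetical: the class InternalLinkCheck, the siteHost parameter, and the in-memory set of known paths that stands in for the Mura database lookup. Only the "not found" status string comes from the description above.

```java
import java.net.URI;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not the plugin's actual code.
final class InternalLinkCheck {

    /** A link is treated as internal if it is relative or points at the site's own host. */
    static boolean isInternal(String link, String siteHost) {
        URI uri = URI.create(link);
        if (uri.isOpaque()) {
            return false; // mailto:, tel: and similar are not pages
        }
        String host = uri.getHost();
        return host == null || host.equalsIgnoreCase(siteHost);
    }

    /**
     * Looks the page up locally instead of opening an HTTP connection.
     * knownPaths stands in for the Mura database query.
     */
    static Optional<String> internalIssue(String link, Set<String> knownPaths) {
        String path = URI.create(link).getPath();
        boolean exists = path != null && knownPaths.contains(path);
        return exists ? Optional.empty() : Optional.of("not found");
    }
}
```

In this sketch a caller would test isInternal first and fall back to an HTTP request only for external links.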

A list of HTTP status codes is available on mozilla.org.

Known Issues

  • Some SSL issues are simply caused by Java Virtual Machine limitations in handling SSL certificates and can be disregarded.

  • Some sites do not support the HEAD method used by the link checker and return a 405 status code, which is then reported as an issue.

  • When DNS resolution is very slow, the status can be "unknown host" even when the timeout is increased. Unfortunately, Java does not provide a way to set a DNS resolution timeout, and DNS resolution is not covered by the connection timeout entered in the link checker.
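
For readers who want to double-check a 405 result by hand or in their own tooling, one possible approach is to retry the same URL with GET. The Java sketch below shows that idea only; it does not describe what the link checker does, and the names HeadFallback, Requester and overHttp are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not the plugin's actual code.
final class HeadFallback {

    /** Performs one request with the given method and returns the status code. */
    interface Requester {
        int status(String method, String url) throws IOException;
    }

    /** Tries HEAD first and retries with GET only when the server answers 405. */
    static int check(String url, Requester requester) throws IOException {
        int status = requester.status("HEAD", url);
        if (status == HttpURLConnection.HTTP_BAD_METHOD) { // 405 Method Not Allowed
            status = requester.status("GET", url);
        }
        return status;
    }

    /** A Requester backed by HttpURLConnection. */
    static Requester overHttp(int timeoutMillis) {
        return (method, url) -> {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod(method);
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        };
    }
}
```

A call such as HeadFallback.check(url, HeadFallback.overHttp(5000)) would then return the GET status for servers that reject HEAD, at the cost of a second request.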

Source

The source code is available at https://github.com/MSU-NatSci/MuraLinkChecker.
