Dactyl includes a link-checking script to automatically detect and report on broken hyperlinks in your generated documentation.
First, build some documentation to an output path. Depending on your configuration, you may want to build multiple targets to different output directories before running the link checker.
Then, run the link checker as follows:
This checks all the files in the output directory for links and confirms that any HTTP(S) links, including relative links to other files, are valid. For anchor links, it checks that an element with the correct ID exists in the target file. It also checks that the `src` of all image tags exists.
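The anchor and image checks described above can be illustrated with a minimal sketch using only the Python standard library. This is not Dactyl's actual implementation; the class and function names (`LinkCollector`, `collect`, `missing_anchors`) are hypothetical, and the sketch only handles same-page anchors in a single HTML string.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect element IDs, link targets, and image sources from one HTML page."""

    def __init__(self):
        super().__init__()
        self.ids = set()      # every id="..." value seen on the page
        self.hrefs = []       # href values of <a> tags
        self.img_srcs = []    # src values of <img> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])
        if tag == "img" and attrs.get("src"):
            self.img_srcs.append(attrs["src"])


def collect(html_text):
    """Parse an HTML string and return the populated collector."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser


def missing_anchors(html_text):
    """Return same-page anchor links (#some-id) that match no element ID."""
    page = collect(html_text)
    return [
        href for href in page.hrefs
        if href.startswith("#") and len(href) > 1 and href[1:] not in page.ids
    ]


sample = (
    '<h1 id="intro">Intro</h1>'
    '<a href="#intro">ok</a><a href="#gone">bad</a>'
    '<img src="a.png">'
)
print(missing_anchors(sample))
print(collect(sample).img_srcs)
```

A real checker would also resolve relative links and image paths against the output directory and look up anchors in the target file rather than only the current page.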
If some links are always reported as broken but you don't want to remove them (for example, URLs that block Python's user-agent), you can add them to the `known_broken_links` array in the config.
In quiet mode (`-q`), the link checker still reports in every 30 seconds so that continuous integration software (e.g. Jenkins) doesn't treat it as stalled and kill it.
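One way to picture the periodic check-in is a small timer that emits a message only when enough time has passed since the last one. The sketch below is an illustration under stated assumptions, not Dactyl's code: the `Heartbeat` class, its parameters, and the message text are all hypothetical.

```python
import time


class Heartbeat:
    """Emit a short status line at most once per interval (hypothetical helper)."""

    def __init__(self, interval=30.0, clock=time.monotonic, emit=print):
        self.interval = interval
        self.clock = clock      # injectable so the timing can be tested
        self.emit = emit        # injectable so output can be captured
        self._last = clock()

    def tick(self, message="Still checking links..."):
        """Call this inside the checking loop; returns True if a message was emitted."""
        now = self.clock()
        if now - self._last >= self.interval:
            self.emit(message)
            self._last = now
            return True
        return False


# Usage: call tick() once per link checked, even when other output is suppressed.
beat = Heartbeat(interval=30.0)
for _ in range(3):
    beat.tick()
```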
To reduce the number of meaningless failure reports (caused by a website that happened to be down momentarily while you ran the link checker), the link checker waits 2 minutes after finishing if it found any broken remote links, then retries those links in case they came back up. Links that succeed on the retry are not counted as broken in the final report.
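The wait-and-retry behavior can be sketched roughly as follows. This is a simplified illustration, not the link checker's real code: `recheck_broken` and the `is_link_ok` callable are hypothetical names, and the 120-second default simply mirrors the 2-minute wait described above.

```python
import time


def recheck_broken(broken_links, is_link_ok, wait_seconds=120, sleep=time.sleep):
    """Wait once, then retest links that failed; return the ones still broken.

    broken_links: remote URLs that failed on the first pass.
    is_link_ok:   hypothetical callable that returns True if a URL now responds.
    """
    if not broken_links:
        return []  # nothing failed, so there is no reason to wait
    sleep(wait_seconds)
    return [url for url in broken_links if not is_link_ok(url)]


# Usage with a stand-in checker and no real waiting:
still_broken = recheck_broken(
    ["https://a.example/", "https://b.example/"],
    is_link_ok=lambda url: url.startswith("https://a."),
    sleep=lambda seconds: None,
)
print(still_broken)
```

Passing `sleep` as a parameter keeps the sketch testable without actually pausing for two minutes.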
You can also run the link checker in offline mode (`-o`) to skip any remote links and just check that the files and anchors referenced exist in the output directory.
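As a rough illustration of how an offline pass might decide which links to skip, the sketch below classifies a link as remote when it has an `http` or `https` scheme, or no scheme but a host (a `//host/path` style URL). The function names `is_remote` and `links_to_check` are hypothetical, and the exact rules Dactyl applies may differ.

```python
from urllib.parse import urlparse


def is_remote(href):
    """Return True for links that would require a network request."""
    parsed = urlparse(href)
    if parsed.scheme in ("http", "https"):
        return True
    # No scheme but a host part, e.g. "//example.com/page"
    return not parsed.scheme and bool(parsed.netloc)


def links_to_check(hrefs, offline=False):
    """In offline mode, keep only the links that can be checked locally."""
    if not offline:
        return list(hrefs)
    return [href for href in hrefs if not is_remote(href)]


hrefs = ["https://example.com/", "//example.com/x", "page.html#setup", "#top"]
print(links_to_check(hrefs, offline=True))
```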
You can add pages to the `ignore_anchors_in` array in your config to skip checking for links that go to anchors in those pages.
Unusual Link Types
Some unusual types of links that you may encounter in HTML or Markdown include:
- Protocol relative URL - These start with `//` and refer to "whatever protocol is being used now". Dactyl assumes HTTPS should work for these URLs.
- Mailto URL - Email addresses. The link checker ignores these and other links that use other URI schemes.
- Empty anchor - A link to
<img>paths but OK for