Broken Link Checkers
As previously mentioned, I'm looking for an online program that will check my site for broken links, both internal and external. I've tried a couple now but haven't got the reports I wanted.
The first site I tried was the W3C Link Checker. Who better to check my site than the masters of the web? The options default to checking a single page, but there's an option to specify recursion, and how deep. It also tells you about redirects (both '301 permanent' and '302 temporary' redirects). Unfortunately it's pretty academic as there's a hard limit of 150 pages checked. It blew most of those checking all the photos in my gallery.
The dead links checker seemed to be what I was looking for. It has a maximum execution time of 45 minutes, which seemed to be enough for this site. Unfortunately, to spider entire sites in that time, the response timeout for each request is rather short. Their disclaimer:
The dead link checker can report as a broken link any address if the time to get any response is higher than a fixed limit. The timeout is deliberately short to make a faster analysis. Repeat the analysis if you have any doubt.
The very first link on my site that it found and followed came back as not found:
http://surferbill.com/ 258 links found
Link towards http://www.dead-links.com/ found!.
Maximum documents to crawl in host surferbill.com: 15000
1 visited - 94 in the host - 112 out
http://surferbill.com/gallery/6am 404 Not Found
The occasional false positive might be OK, but this short timeout flagged so many links as 'missing' when they weren't that going through the report was still too time-consuming.
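A more forgiving check would retry a slow link with a longer timeout before calling it dead. Here is a minimal Python sketch of that idea. It is not how the dead link checker works; the function name, the 5 and 30 second timeouts and the "timeout" / "unreachable" labels are all my own assumptions.

```python
import socket
import urllib.error
import urllib.request


def check_link(url, timeouts=(5, 30), opener=urllib.request.urlopen):
    """Return the HTTP status code, "timeout", or "unreachable".

    `timeouts` is an assumed escalation schedule: a quick first try,
    then a more patient retry, so a slow server is not reported as missing.
    `opener` exists only so the network call can be swapped out in tests.
    """
    for timeout in timeouts:
        try:
            with opener(url, timeout=timeout) as response:
                return response.status
        except urllib.error.HTTPError as err:
            # The server answered, so this is a real result (e.g. 404).
            return err.code
        except (socket.timeout, TimeoutError):
            continue  # slow, not necessarily dead: try the next timeout
        except urllib.error.URLError as err:
            if isinstance(err.reason, (socket.timeout, TimeoutError)):
                continue
            return "unreachable"  # DNS failure, refused connection, etc.
    return "timeout"
```

Reporting "timeout" separately from 404 means the slow links can be re-run on their own instead of being mixed in with the genuinely missing ones.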
There's also no way to exclude a single site, so it also tried to follow all the Blogger links built into my posts. Blogger doesn't like being spidered, so it blocks HEAD requests, resulting in '400 Bad Request' errors. The checker also (logically) followed the links to delete comments on my posts, and even reported some links as 404 Not Found...
http://www2.blogger.com/comment.g?blogID=8541327&postID=111900168983833811 400 Bad Request
http://www2.blogger.com/delete-comment.g?blogID=8541327&postID=112895650015539451 405 Method Not Allowed
http://www2.blogger.com/comment.g?blogID=8541327&postID=111841173066060753 400 Bad Request
http://www2.blogger.com/comment.g?blogID=8541327&postID=113801380150834838 400 Bad Request
http://www2.blogger.com/comment.g?blogID=8541327&postID=113691168261357715 404 Not Found
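Two things would have spared me the Blogger noise above: a way to skip a host entirely, and a retry with GET when a HEAD request is refused. Here's a rough Python sketch of both, assuming my guess about Blogger blocking HEAD is right. The exclusion list, the set of "retry" status codes and the fetch(method, url) callback are all made up for illustration; neither checker I tried offers anything like this.

```python
from urllib.parse import urlsplit

# Hypothetical exclusion list: hosts that should never be requested.
EXCLUDED_HOSTS = {"blogger.com"}
# Assumed set of statuses that may mean "HEAD refused", not "page missing".
RETRY_WITH_GET = {400, 403, 405, 501}


def is_excluded(url, excluded_hosts=EXCLUDED_HOSTS):
    """True if the URL's host is an excluded host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in excluded_hosts)


def probe(url, fetch, excluded_hosts=EXCLUDED_HOSTS):
    """Return a status code, or None if the URL was skipped.

    fetch(method, url) -> int is supplied by the caller; it stands in
    for whatever HTTP client actually does the request.
    """
    if is_excluded(url, excluded_hosts):
        return None
    status = fetch("HEAD", url)
    if status in RETRY_WITH_GET:
        # HEAD is cheaper, but some servers only answer GET properly.
        status = fetch("GET", url)
    return status
```

With "blogger.com" excluded, every www2.blogger.com link in the report above would be skipped, including the delete-comment ones that a spider arguably shouldn't be touching at all.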
Frustrating. Anyway, I'm still on the lookout for a FREE, reliable, site-wide (and preferably online) broken link checker that will follow internal links recursively and check direct external links without crawling any further into other sites, so let me know what you recommend. If I don't find anything online I might be tempted to install something that'll do the job.
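To spell out the crawl policy I'm after, here is a small Python sketch: pages on my own host are crawled recursively, while external links are checked once and never followed. The get_links callback is hypothetical (it stands in for fetching a page and pulling out its links), and the 15000 page cap is just borrowed from the dead link checker's report.

```python
from collections import deque
from urllib.parse import urlsplit


def plan_checks(start_url, get_links, max_pages=15000):
    """Return the set of URLs that should be checked.

    get_links(url) -> iterable of absolute URLs found on that page.
    It is a hypothetical callback; fetching and HTML parsing are left out.
    """
    site = urlsplit(start_url).hostname
    to_check = {start_url}
    queue = deque([start_url])
    crawled = 0
    while queue and crawled < max_pages:
        page = queue.popleft()
        crawled += 1
        for link in get_links(page):
            if link in to_check:
                continue
            to_check.add(link)
            # Only pages on the starting host are crawled further;
            # external links are checked once and never followed.
            if urlsplit(link).hostname == site:
                queue.append(link)
    return to_check
```

Each URL in the returned set would then get a single status check, so an external site is only ever hit for the exact pages I link to.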






