Psience Quest

Full Version: Any way to get list of broken links to images and files?
Just wondered

Chris

(2018-09-09, 03:04 PM)Max_B Wrote: Just wondered if there was any way to spider psience to gather broken links to files...

I've noticed broken links in old posts on SF, to files I've uploaded to my Wordpress account, and assume now that Wordpress may be simply deleting files that they have received a takedown notice about... even if the notice is invalid... I'm certainly not getting any notification this is happening... but image files have certainly disappeared from my wordpress media directory, but it's impossible for me to say which files?

For my own website, I use a very old version of Xenu's Link Sleuth, which runs on a PC:
https://en.wikipedia.org/wiki/Xenu%27s_Link_Sleuth

It certainly finds a lot of broken links, though I think it doesn't necessarily do so if there is a forwarding mechanism in place.
I use KLinkStatus, which runs on Linux.
Max, I did a little parsing for you. Ran this query on the PQ database:

Code:
SELECT message FROM mybb_posts WHERE message REGEXP 'thinkingdeeper';

That pulled out all of the posts containing links to your website. I then put together a little script from some existing code I have, which extracted all of the URLs from those posts, sorted them, and removed duplicates. Here's the final list (there seems to be a minor bug in the script, as the first URL looks incomplete; sorry about that). Hopefully this helps!

https://thinkingdeeper.files.wordpress.
https://thinkingdeeper.files.wordpress.c...-time4.jpg
https://thinkingdeeper.files.wordpress.c...eption.jpg
https://thinkingdeeper.files.wordpress.c...anisms.pdf
https://thinkingdeeper.files.wordpress.c...ments2.pdf
https://thinkingdeeper.files.wordpress.c...ments3.pdf
https://thinkingdeeper.files.wordpress.c...n_dead.png
https://thinkingdeeper.files.wordpress.c..._dead2.png
https://thinkingdeeper.files.wordpress.c....jpg?w=689
https://thinkingdeeper.files.wordpress.c...erence.jpg
https://thinkingdeeper.files.wordpress.c...l_box1.jpg
https://thinkingdeeper.files.wordpress.c...aware1.jpg
https://thinkingdeeper.files.wordpress.c...ware21.jpg
https://thinkingdeeper.files.wordpress.c...eyson2.pdf
https://thinkingdeeper.files.wordpress.c...tions3.jpg
https://thinkingdeeper.files.wordpress.c...er_fun.jpg
https://thinkingdeeper.files.wordpress.c...r_fun2.jpg
https://thinkingdeeper.files.wordpress.c...r_fun3.jpg
https://thinkingdeeper.files.wordpress.c...engths.jpg
https://thinkingdeeper.files.wordpress.c...=584&h=403
https://thinkingdeeper.files.wordpress.c..._binhi.png
https://thinkingdeeper.files.wordpress.c...=584&h=123
https://thinkingdeeper.files.wordpress.c...i_page.jpg
https://thinkingdeeper.files.wordpress.c...6_x_16.png
https://thinkingdeeper.files.wordpress.c...2_x_32.png
https://thinkingdeeper.files.wordpress.c...6_x_96.png
https://thinkingdeeper.files.wordpress.c...rg1974.pdf
https://thinkingdeeper.files.wordpress.c...oblems.jpg
https://thinkingdeeper.files.wordpress.c...9/rngs.png
https://thinkingdeeper.files.wordpress.c...d-2d_1.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_2.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_3.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_4.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_5.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_7.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_8.jpg?
https://thinkingdeeper.files.wordpress.c...2017_1.png
https://thinkingdeeper.files.wordpress.c...2017_2.png
https://thinkingdeeper.files.wordpress.c...2017_3.png
https://thinkingdeeper.files.wordpress.c...2017_4.png
https://thinkingdeeper.files.wordpress.c...2017_5.png
https://thinkingdeeper.files.wordpress.c...usion1.jpg
https://thinkingdeeper.files.wordpress.c...usion2.jpg
https://thinkingdeeper.files.wordpress.c...review.pdf
https://thinkingdeeper.files.wordpress.c...usness.pdf
https://thinkingdeeper.files.wordpress.c...3a59c.jpeg
https://thinkingdeeper.files.wordpress.c...fe2014.pdf
https://thinkingdeeper.files.wordpress.c...anasia.jpg
https://thinkingdeeper.files.wordpress.c...ntrose.jpg
https://thinkingdeeper.files.wordpress.c...demand.gif
https://thinkingdeeper.files.wordpress.c...y_1913.jpg
https://thinkingdeeper.files.wordpress.c...1913_2.jpg
https://thinkingdeeper.files.wordpress.c...field1.jpg
https://thinkingdeeper.files.wordpress.c...field2.jpg
https://thinkingdeeper.files.wordpress.c..._crash.jpg
https://thinkingdeeper.files.wordpress.c...shes_2.jpg
https://thinkingdeeper.files.wordpress.c...shes_3.jpg
https://thinkingdeeper.files.wordpress.c...ourier.png
https://thinkingdeeper.wordpress.com/201...xperiment/
https://thinkingdeeper.wordpress.com/201...nd-memory/
https://thinkingdeeper.wordpress.com/201...-research/
https://thinkingdeeper.wordpress.com/ste...ime_drugs/
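
For anyone wanting to repeat the extract, sort, and de-duplicate step, here is a rough Python sketch. It is not the script used above (that wasn't posted); the URL pattern, the function name extract_unique_urls and the example.com post bodies are all assumptions for illustration.

```python
import re

# Assumed pattern: a URL runs from "http://" or "https://" up to the next
# whitespace, quote, or angle/square bracket. This is a simplification.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>\[\]]+""")


def extract_unique_urls(messages, must_contain=""):
    """Return a sorted list of distinct URLs found in the given post bodies."""
    found = set()  # a set drops duplicates as they are added
    for message in messages:
        for url in URL_PATTERN.findall(message):
            if must_contain in url:
                found.add(url)
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical post bodies standing in for rows returned by the query.
    posts = [
        "See [url]https://example.com/b.png[/url] and https://example.com/a.pdf",
        "Again: https://example.com/a.pdf",
    ]
    for url in extract_unique_urls(posts, must_contain="example.com"):
        print(url)
```

Feeding it the message column from the query, with must_contain set to "thinkingdeeper", should give a list along the lines of the one above.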
(2018-09-09, 03:37 PM)Laird Wrote: there seems to be a minor bug in the script as the first URL seems incomplete

Ah, turns out that's just because MyBB shortens bare URLs, so the displayed text of an auto-linkified bare URL is something like "https://thinkingdeeper.files.wordpress. ... ments2.pdf". Given the space in that text, my URL extractor understandably (not a bug) pulled out the apparent URL "https://thinkingdeeper.files.wordpress.".
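
To make the truncation concrete, here is a small Python sketch. The pattern is an assumption (any extractor that treats whitespace as the end of a URL would behave the same way), and first_url is just an illustrative name; the displayed text is the one quoted above.

```python
import re

# Assumed extractor: a URL is "http(s)://" followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")

# Displayed link text as quoted above (shortened by the forum, with spaces).
displayed = "https://thinkingdeeper.files.wordpress. ... ments2.pdf"


def first_url(text):
    """Return the first apparent URL in text, or None if there is none."""
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None


# The match stops at the first space, leaving only the truncated prefix.
print(first_url(displayed))  # prints https://thinkingdeeper.files.wordpress.
```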
(2018-09-09, 03:37 PM)Laird Wrote: Here's the final list

And just to do a totally complete job...

I ran KLinkStatus on this thread page itself, and it found no broken links amongst all of those links to your site.

Clicking a few at random seems to confirm that they're all still functional.
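
If you'd rather script the check than run a desktop tool, something like the following Python sketch could do it. To be clear, this is not how KLinkStatus or Xenu's Link Sleuth work internally; it simply sends a HEAD request per URL using the standard library and treats a missing response or a status of 400 and above as broken. The names http_status and find_broken are made up for this example, and some servers reject HEAD requests, so a flagged link is worth a second look by hand.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered with an error status (e.g. 404)
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, etc.


def find_broken(urls, get_status=http_status):
    """Return (url, status) pairs whose status is missing or >= 400."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Calling find_broken on a list of full URLs returns only the ones that look dead. The get_status parameter is there so the function can be tried out with a stand-in lookup instead of real network requests.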