Any way to get a list of broken links to images and files?


Just wondered
We shall not cease from exploration
And the end of all our exploring 
Will be to arrive where we started
And know the place for the first time.
(This post was last modified: 2020-07-06, 02:33 PM by Max_B.)
[-] The following 2 users Like Max_B's post:
  • Laird, Valmar
(2018-09-09, 03:04 PM)Max_B Wrote: Just wondered if there was any way to spider psience to gather broken links to files...

I've noticed broken links in old posts on SF, to files I've uploaded to my WordPress account, and assume now that WordPress may simply be deleting files that they have received a takedown notice about... even if the notice is invalid... I'm certainly not getting any notification that this is happening... image files have certainly disappeared from my WordPress media directory, but it's impossible for me to say which ones.

For my own website, I use a very old version of Xenu's Link Sleuth, which runs on a PC:
https://en.wikipedia.org/wiki/Xenu%27s_Link_Sleuth

It certainly finds a lot of broken links, though I think it can miss a dead link if there is a forwarding mechanism in place.
[-] The following 1 user Likes Guest's post:
  • Laird
I use KLinkStatus, which runs on Linux.
Max, I did a little parsing for you. Ran this query on the PQ database:

Code:
-- Select the body of every post that mentions the site name.
SELECT message FROM mybb_posts WHERE message REGEXP 'thinkingdeeper';


That pulled out all of the posts containing links to your website. I then put together a little script from some existing code I have, which extracted all of the URLs from those posts, sorted them, and removed duplicates. Here's the final list (there seems to be a minor bug in the script, as the first URL is incomplete; sorry about that). Hopefully this helps!

https://thinkingdeeper.files.wordpress.
https://thinkingdeeper.files.wordpress.c...-time4.jpg
https://thinkingdeeper.files.wordpress.c...eption.jpg
https://thinkingdeeper.files.wordpress.c...anisms.pdf
https://thinkingdeeper.files.wordpress.c...ments2.pdf
https://thinkingdeeper.files.wordpress.c...ments3.pdf
https://thinkingdeeper.files.wordpress.c...n_dead.png
https://thinkingdeeper.files.wordpress.c..._dead2.png
https://thinkingdeeper.files.wordpress.c....jpg?w=689
https://thinkingdeeper.files.wordpress.c...erence.jpg
https://thinkingdeeper.files.wordpress.c...l_box1.jpg
https://thinkingdeeper.files.wordpress.c...aware1.jpg
https://thinkingdeeper.files.wordpress.c...ware21.jpg
https://thinkingdeeper.files.wordpress.c...eyson2.pdf
https://thinkingdeeper.files.wordpress.c...tions3.jpg
https://thinkingdeeper.files.wordpress.c...er_fun.jpg
https://thinkingdeeper.files.wordpress.c...r_fun2.jpg
https://thinkingdeeper.files.wordpress.c...r_fun3.jpg
https://thinkingdeeper.files.wordpress.c...engths.jpg
https://thinkingdeeper.files.wordpress.c...=584&h=403
https://thinkingdeeper.files.wordpress.c..._binhi.png
https://thinkingdeeper.files.wordpress.c...=584&h=123
https://thinkingdeeper.files.wordpress.c...i_page.jpg
https://thinkingdeeper.files.wordpress.c...6_x_16.png
https://thinkingdeeper.files.wordpress.c...2_x_32.png
https://thinkingdeeper.files.wordpress.c...6_x_96.png
https://thinkingdeeper.files.wordpress.c...rg1974.pdf
https://thinkingdeeper.files.wordpress.c...oblems.jpg
https://thinkingdeeper.files.wordpress.c...9/rngs.png
https://thinkingdeeper.files.wordpress.c...d-2d_1.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_2.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_3.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_4.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_5.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_7.jpg
https://thinkingdeeper.files.wordpress.c...d-2d_8.jpg?
https://thinkingdeeper.files.wordpress.c...2017_1.png
https://thinkingdeeper.files.wordpress.c...2017_2.png
https://thinkingdeeper.files.wordpress.c...2017_3.png
https://thinkingdeeper.files.wordpress.c...2017_4.png
https://thinkingdeeper.files.wordpress.c...2017_5.png
https://thinkingdeeper.files.wordpress.c...usion1.jpg
https://thinkingdeeper.files.wordpress.c...usion2.jpg
https://thinkingdeeper.files.wordpress.c...review.pdf
https://thinkingdeeper.files.wordpress.c...usness.pdf
https://thinkingdeeper.files.wordpress.c...3a59c.jpeg
https://thinkingdeeper.files.wordpress.c...fe2014.pdf
https://thinkingdeeper.files.wordpress.c...anasia.jpg
https://thinkingdeeper.files.wordpress.c...ntrose.jpg
https://thinkingdeeper.files.wordpress.c...demand.gif
https://thinkingdeeper.files.wordpress.c...y_1913.jpg
https://thinkingdeeper.files.wordpress.c...1913_2.jpg
https://thinkingdeeper.files.wordpress.c...field1.jpg
https://thinkingdeeper.files.wordpress.c...field2.jpg
https://thinkingdeeper.files.wordpress.c..._crash.jpg
https://thinkingdeeper.files.wordpress.c...shes_2.jpg
https://thinkingdeeper.files.wordpress.c...shes_3.jpg
https://thinkingdeeper.files.wordpress.c...ourier.png
https://thinkingdeeper.wordpress.com/201...xperiment/
https://thinkingdeeper.wordpress.com/201...nd-memory/
https://thinkingdeeper.wordpress.com/201...-research/
https://thinkingdeeper.wordpress.com/ste...ime_drugs/
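
For anyone wanting to repeat this, here is a rough Python sketch of the extract, sort, and de-duplicate step. It is not the script actually used (that one isn't shown); the `URL_PATTERN` regex and the `extract_urls` name are assumptions for illustration, and the input is taken to be the post bodies returned by a query like the one above.

```python
import re

# Hypothetical pattern: a URL is assumed to run from "http(s)://" up to the
# next whitespace, quote, angle bracket, or square bracket (BBCode tag).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>\[\]]+")


def extract_urls(messages):
    """Return the sorted, de-duplicated URLs found in an iterable of post bodies."""
    found = set()
    for message in messages:
        found.update(URL_PATTERN.findall(message))
    return sorted(found)


if __name__ == "__main__":
    # Made-up sample posts, standing in for rows of the "message" column.
    sample_posts = [
        "See https://example.com/b.png and [url]https://example.com/a.pdf[/url]",
        "Again: https://example.com/b.png",
    ]
    for url in extract_urls(sample_posts):
        print(url)
```

Collecting into a set handles the duplicates, and sorting at the end gives a stable list that is easy to compare between runs.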
(This post was last modified: 2018-09-09, 03:55 PM by Laird.)
[-] The following 3 users Like Laird's post:
  • Ninshub, Max_B, Doug
(2018-09-09, 03:37 PM)Laird Wrote: there seems to be a minor bug in the script as the first URL seems incomplete

Ah, it turns out that's just because MyBB shortens bare URLs: the displayed text of an auto-linkified bare URL looks like, e.g., "https://thinkingdeeper.files.wordpress. ... ments2.pdf". Given the space in that text, my URL extractor understandably (not a bug) pulled out the apparent URL "https://thinkingdeeper.files.wordpress.".
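
The truncation is easy to reproduce. A minimal sketch, assuming the extractor treats a URL as everything up to the next whitespace (the real script isn't shown, so the regex and the `apparent_urls` name here are hypothetical):

```python
import re

# Assumed behaviour: a URL ends at the first whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def apparent_urls(text):
    """Return whatever looks like a URL in the given text."""
    return URL_PATTERN.findall(text)


# Displayed text of a shortened bare URL, as quoted in the post above.
displayed = "https://thinkingdeeper.files.wordpress. ... ments2.pdf"
print(apparent_urls(displayed))
# prints ['https://thinkingdeeper.files.wordpress.']
```

The match stops at the space that the shortening introduces, so the tail of the displayed text is never seen as part of the URL.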
(2018-09-09, 03:37 PM)Laird Wrote: Here's the final list

And just to do a totally complete job...

I ran KLinkStatus on this thread page itself, and it found no broken links amongst all of those links to your site.

Clicking a few at random seems to confirm that they're all still functional.
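
If you'd rather script the check than use a GUI tool, something along these lines would do a basic pass over a list of URLs. This is only a sketch using Python's standard library, not how KLinkStatus or Xenu work internally; the `fetch_status` and `find_broken` names are made up for illustration, and it assumes the server answers HEAD requests sensibly (some don't, in which case a GET would be needed).

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was obtained."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, malformed URL, and so on.
        return None


def find_broken(urls, fetch=fetch_status):
    """Return (url, status) pairs for links that did not answer with a 2xx status."""
    broken = []
    for url in urls:
        status = fetch(url)
        if status is None or not 200 <= status < 300:
            broken.append((url, status))
    return broken
```

Calling `find_broken(list_of_urls)` returns only the failures, so an empty list means every link responded. Note that `urlopen` follows redirects, so a link that forwards to a live page would be reported as working.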
