(2017-08-15, 11:52 PM)Laird Wrote: Max, you can, if you are concerned, right now use my forum-scraping app, FUPS, to pull your posts and only your posts from the Skeptiko Forums (unfortunately, it doesn't allow you to limit the scraping to just CD and Other Stuff). I am in the process of extending this app to pull all posts in a (sub)forum out of a XenForo forum - FUPS already supports this for phpBB - and I am well into this (not particularly difficult) process.
Please let me know if you have any problems using FUPS.
Laird,
Could you just get a copy of all of skeptiko (all users)? Would that be an easy change to your program? Could you restore/import it as a sub forum here? (Then you could delete the sub forums that Alex is retaining if you wanted.)
Do you know what the format of backed-up XenForo forums is? Is it text / XML or binary? Could Alex back up the forums in question and then let us download the backup files? (All I want is something readable and searchable.)
Can MyBB import XenForo backup files? Certain forum software can import forums from other types of forum software. Is there a migration path from XenForo to MyBB?
The first gulp from the glass of science will make you an atheist, but at the bottom of the glass God is waiting for you - Werner Heisenberg. (More at my Blog & Website)
(This post was last modified: 2017-08-16, 03:52 AM by Jim_Smith.)
(2017-08-16, 03:39 AM)Jim_Smith Wrote: Laird,
Could you just get a copy of all of skeptiko (all users)?
That wouldn't be a good way to go about it because there would be no thread continuity, just a bunch of collections of posts by individual users.
(2017-08-16, 03:39 AM)Jim_Smith Wrote: Would that be an easy change to your program?
I've almost finished the changes that will allow us to scrape all threads in a set of specified forums, so I will probably scrape CD, Other Stuff, Consciousness & Science, and Extended Consciousness & Spirituality. I actually have something functional; there are just a few bugs to iron out.
(2017-08-16, 03:39 AM)Jim_Smith Wrote: Could you restore/import it as a sub forum here?
That could be tricky. The posts will be in HTML, and I don't think MyBB (the forum software running PsienceQuest) supports raw HTML in posts. Maybe we could set some sort of static restoration up in the web root (I deliberately ran the forums under the /forums directory so that we have the option to do stuff like that).
(2017-08-16, 03:39 AM)Jim_Smith Wrote: Do you know what the format of backed-up XenForo forums is? Is it text / XML or binary?
As best I can tell, the only way to back up XenForo forums is to back up the underlying (MySQL?) database. The format would thus be SQL.
(2017-08-16, 03:39 AM)Jim_Smith Wrote: Could Alex back up the forums in question and then let us download the backup files?
I don't think it would be easy to provide a backup of just the forums in question, but it would be straightforward for him to back up the entire database and provide it to us. I PMed him asking him to do that very thing, but he didn't respond.
(2017-08-16, 03:39 AM)Jim_Smith Wrote: (All I want is something readable and searchable.)
I think I'll be able to satisfy those criteria. :-)
One thing my script doesn't handle though is avatars. It will at least retain usernames.
Is it worth just sending a webcrawler to get the stuff from Critical Discussions?
I've grabbed at least a good chunk of the Resource Thread info, at least AFAICTell, from just saving the HTML of the pages, but I notice there are some good discussions that are worth preserving in the mix.
Ah, I didn't realize that Exploration is on the chopping block too. I'll try to comb through and grab Resource threads, but I'm not sure if I can get everything just by hunting & pecking.
I'll try to see if a cheap/free webcrawler can grab everything off at least CD & SE.
Laird, if you're grabbing stuff reliably, honestly I would just get stuff from Consciousness and Science as well, though maybe wait until after the deadline for the others. No telling when any of that content might end up gone.
'Historically, we may regard materialism as a system of dogma set up to combat orthodox dogma...Accordingly we find that, as ancient orthodoxies disintegrate, materialism more and more gives way to scepticism.'
- Bertrand Russell
(This post was last modified: 2017-08-16, 05:59 AM by Sciborg_S_Patel.)
Yes, Consciousness & Science is on the chopping block as well.
This post has been deleted.
(2017-08-16, 08:55 PM)Max_B Wrote: OK, tried getting a new Skeptiko password and logging in, but even when logged in FUPS hits the same error... so it doesn't work if you're banned, even if you're logged in.
Oh, the login has to be via the FUPS app - I'm guessing you logged in via a browser? That has no effect on FUPS.
(2017-08-16, 08:55 PM)Max_B Wrote: Just so I understand clearly... you think after making changes to FUPS it will be possible for FUPS to scrape my posts from Skeptiko despite being banned? And further you expect to have those changes in place before the 21st Aug?
Yes. I have just finished updating the script to scrape entire forums, and am currently scraping the CD forum - it seems to be progressing just fine. So now I can get onto the changes I promised you so that you can scrape your posts.
(2017-08-16, 08:55 PM)Max_B Wrote: There is no pressure from me, but I'd like to understand clearly, so that I don't get to the 21st August only to discover that I've misunderstood you, and I should have just manually copied and pasted my posts by hand, because FUPS won't be able to scrape them automatically as I'm banned.
I estimate I'll have something ready for you at least within 24 hours, probably much sooner.
(2017-08-17, 02:03 AM)Laird Wrote: I estimate I'll have something ready for you at least within 24 hours, probably much sooner.
As in much, much sooner, lol.
Should be working for you now, Max. All you need to do is enter all details as before, but this time for the new setting "Extract User Username", enter "Max_B". Any problems, let me know!
Oh, but a warning: don't enter anything in the new field, "Forum IDs" - that will result in FUPS downloading not only your posts, but also the entire set of threads/posts of any forums that you specify here. I am already downloading these myself, and will make them publicly available, so there's no need for anybody else to do that too.
(And yes, I know the documentation of that new field isn't very good.)
(This post was last modified: 2017-08-17, 04:01 AM by Laird.)
This post has been deleted.
(2017-08-17, 11:40 AM)Max_B Wrote: Thanks so much Laird, you've saved all those many many hours of my research, and interesting ideas that could have been lost forever. Honestly I'm indebted to you.
Happy to have helped, Max. You've helped us too, so it's a circle of giving.
Also, just to let folks know, I'm uploading my scrapes of the Skeptiko Forums here: http://psiencequest.net/skeptiko-forum-scrapes/. So far the only one is the Critical Discussions forum. Obviously, the format (JSON) is a bit unwieldy, and I'll need to work on converting it into a bunch of HTML pages, but all the data is there. The file is huge, and is probably best opened either in Google Chrome or a text editor - Firefox seems to struggle with it, but if you have lots of memory, Firefox might manage it.
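For anyone curious what the JSON-to-HTML conversion might look like, here is a rough sketch in Python. It is not the FUPS code, and the layout of the scrape file is an assumption: it supposes a top-level list of threads, each with a "title" and a list of "posts" carrying "author", "date" and "content_html". Those field names, and the helper names slugify, render_thread, convert and load_threads, are hypothetical and would need adjusting to the real file.

```python
import html
import json
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a thread title into a safe file-name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "thread"


def render_thread(thread: dict) -> str:
    """Render one thread (hypothetical schema) as a standalone HTML page."""
    title = html.escape(thread["title"])
    parts = [
        "<!DOCTYPE html>",
        f"<html><head><meta charset='utf-8'><title>{title}</title></head><body>",
        f"<h1>{title}</h1>",
    ]
    for post in thread["posts"]:
        author = html.escape(post["author"])
        date = html.escape(post["date"])
        # Assumption: post bodies are already HTML, so they are inserted unescaped.
        parts.append(
            f"<div class='post'><p><b>{author}</b> ({date})</p>"
            f"{post['content_html']}</div>"
        )
    parts.append("</body></html>")
    return "\n".join(parts)


def convert(threads: list, out_dir: Path) -> list:
    """Write one HTML file per thread and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, thread in enumerate(threads):
        # The numeric prefix keeps file names unique even if titles repeat.
        path = out_dir / f"{index:05d}-{slugify(thread['title'])}.html"
        path.write_text(render_thread(thread), encoding="utf-8")
        written.append(path)
    return written


def load_threads(json_path: Path) -> list:
    """Load the whole scrape into memory (fine only if the file fits in RAM)."""
    return json.loads(json_path.read_text(encoding="utf-8"))
```

With a scrape saved locally as, say, scrape.json, something like convert(load_threads(Path("scrape.json")), Path("html_out")) would produce one page per thread. Since the file is huge, loading it all at once may need a fair amount of memory.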
(This post was last modified: 2017-08-17, 08:21 PM by Laird.)