Downloading Threads For Offline Browsing


Oberon
11-23-2004, 12:37 PM
Is there a download manager, or can I set up my download manager, to download the topics as well as the posts in the threads to a folder for offline use? Or do I have to download each and every web page?

Does this question make any sense? LOL

Basically, I want to click on, say, the Big Debates Forum, page 1, and download the threads listed there to a folder on my drive and view them later, rather than have to stay online and save each individual page of every thread. It's probably simple, but I can't figure out how, so any help on this is appreciated.

Gibson
11-23-2004, 12:46 PM
Look into Teleport Pro: http://www.tenmax.com
I'm not sure how it'd work, as everything in vBulletin forums is tied to MySQL databases. It MAY work, it may not :shrug:

Kraw
11-23-2004, 01:32 PM
There's a hack that lets you do that kinda stuff. Is it really worth looking into? Would anyone else use it?

Oberon
11-23-2004, 03:41 PM
Well, since they are web pages, it shouldn't be a big deal and shouldn't need a database program of any sort. I was screwing around with the Magpie tools extension for Mozilla Firefox and got it to work on another website, but didn't write down or remember the sequence of what I did. It works by incrementing or decrementing the URLs somehow, but that may depend on how a site is set up. Anyway, I thought it was cool to download the pages for all the links on the 'main page' and store them to look at later. Seems to me it saves bandwidth all the way around.

I've also been screwing with NetAnts, but can't get that to do it. Anyway, I thought somebody who knows a lot more about it than I do would be using something like this. I'll check out the links. As long as it's just downloading web pages by 'layer', like one 'level' below the current one that the main page links to, it shouldn't be a hacker thing.

On the other hand, it may be problematic on something like a forum, where there are many links on the same page; there were only three on the one I got to work.

Gibson
11-23-2004, 05:25 PM
Yeah, they're web pages, but the content is generated from fields in a database on the server. It'd be easy if a forum used static HTML, but they don't; they use PHP.

Oberon
11-24-2004, 02:21 AM
Ah, that's probably it. I've tried it on regular websites and got it to work, but when I tried it on this and a couple of other forums it would give an error message, since it can't determine the size of the download.

Manu
11-25-2004, 07:03 PM
I usually just open up like 25 tabs.

goa103
02-08-2005, 01:36 PM
To archive the whole thread you can simply save the Printable Version of this thread. That feature is available from Thread Tools, just above the first post (your post).

The above solution doesn't work for multi-page threads, as the printable version doesn't include all pages. The good thing about the printable version is that you only get the content you're interested in. To archive all pages, my advice is to use an offline browsing tool. You mentioned the Magpie Firefox extension. I think you could archive all the pages using it; I'm not sure, as I don't know much about this extension, but it seems to let you archive multiple files from a series. For example, the first page of the 49118 thread is at http://www.discussanything.com/forums/showthread.php?t=49118. The next page is at http://www.discussanything.com/forums/showthread.php?t=49118&page=2&pp=20. You should try to set up a page counter using Magpie.
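To make the "page counter" idea concrete, here is a rough Python sketch of the URL series, not anything Magpie itself does. The function name thread_page_urls and the page_count argument are mine; it simply assumes every page after the first follows the &page=N&pp=20 pattern seen above.

```python
from urllib.parse import urlencode

# Base address of the thread viewer, taken from the URLs quoted above.
BASE = "http://www.discussanything.com/forums/showthread.php"


def thread_page_urls(thread_id, page_count, per_page=20):
    """Return the URL of every page of a thread, in order.

    Assumes (as observed for thread 49118) that page 1 has no page
    parameter and later pages add page=N and pp=<posts per page>.
    """
    urls = []
    for page in range(1, page_count + 1):
        if page == 1:
            urls.append(f"{BASE}?t={thread_id}")
        else:
            query = urlencode({"t": thread_id, "page": page, "pp": per_page})
            urls.append(f"{BASE}?{query}")
    return urls


if __name__ == "__main__":
    for url in thread_page_urls(49118, 3):
        print(url)
```

You would still have to find out how many pages the thread has; the sketch only builds the list of addresses to fetch.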

You also mentioned the NetAnts download manager, but it's not made to archive a website or multiple pages. It's just a download manager, not an offline browsing tool.

If I had to archive all the pages of that thread, I would do it using the HTTrack Website Copier (http://www.httrack.com) software. This tool allows you to archive a website and specify filters: URLs to include and others to exclude.

Here is the process to archive the pages of the 49118 thread.

Step 1

http://goa103.free.fr/http/www.discussanything.com/forums/images/1.png

Enter a New project name, select a Base path for your archives and click Next.

Step 2

http://goa103.free.fr/http/www.discussanything.com/forums/images/2.png

Select Download web site(s) from the Action list. Enter the URLs you want to archive in the Web Addresses (URL) text area, or add them by clicking Add URL. Next, click Set options to set your preferences and mirror options.

Step 3

http://goa103.free.fr/http/www.discussanything.com/forums/images/3.png

From this dialog, switch to the Scan Rules tab. It lets you include and exclude links to archive. You can enter exclude or include filters manually, or have the wizards add them by clicking Exclude link(s) or Include link(s). Let's do it manually.

The first thing is to exclude all links by entering -* in the text area. - stands for exclude and * stands for any sequence of characters, so -* excludes everything, including all links. We exclude all links because we're only interested in the pages of our thread.

That's our next step. We add +*http://www.discussanything.com/forums/showthread.php?t=49118* to the text area. + stands for include. We place our link between two * characters, but you don't need the first one; only the last one matters, as our goal is to archive all links beginning with http://www.discussanything.com/forums/showthread.php?t=49118. This link is the first page of our thread, the thread itself. The next page is at http://www.discussanything.com/forums/showthread.php?t=49118&page=2&pp=20, and the third one is at http://www.discussanything.com/forums/showthread.php?t=49118&page=3&pp=20. So for the second page, the trailing * covers &page=2&pp=20, the end of the URL.
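If you want to see what those two rules let through, here is a small Python sketch that imitates them. It is not HTTrack's code: it only treats * as a wildcard, and it assumes that when several rules match a link, the last one in the list decides. The names rule_to_regex and is_allowed are made up for the example.

```python
import re

# The two scan rules entered in the text area, in the same order.
RULES = [
    "-*",
    "+*http://www.discussanything.com/forums/showthread.php?t=49118*",
]


def rule_to_regex(pattern):
    """Turn a rule body into a regex where only * is a wildcard."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts), re.DOTALL)


def is_allowed(url, rules, default=True):
    """Apply the rules in order; the last matching rule wins (assumption)."""
    allowed = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if rule_to_regex(pattern).fullmatch(url):
            allowed = (sign == "+")
    return allowed


if __name__ == "__main__":
    base = "http://www.discussanything.com/forums/"
    for url in (
        base + "showthread.php?t=49118",
        base + "showthread.php?t=49118&page=2&pp=20",
        base + "index.php",
    ):
        print(is_allowed(url, RULES), url)
```

One thing the sketch makes visible: because of the trailing *, a link such as showthread.php?t=491180 would be included too, since it begins with the same characters.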

Now click OK, Next and finally Finish. The archiving process will begin and you will get the whole thread as multiple HTML pages. In fact, you will get more than 3 pages even though the thread is made up of 3 pages. It seems the pages use links that make HTTrack believe a page is different even when it's the same one, because of the URL query. That's another story :). So you should get 6 pages instead of 3. No big deal :). You could also try to archive the printable version pages to get a clean and neat archive of the thread. And by adding some exclude filters, you could get a 3-page HTML archive, the perfect archive. I'll let you find the solution by yourself :).
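To illustrate why the same page can show up twice, here is a hedged Python sketch. I haven't checked which exact links the forum produces, so the example URLs are hypothetical; the idea is only that two addresses with the same thread number and page number are the same page, whatever else the query contains. The names page_key and unique_pages are mine.

```python
from urllib.parse import parse_qs, urlsplit


def page_key(url):
    """Reduce a showthread.php URL to (thread id, page number).

    A missing page parameter is treated as page 1; other parameters
    (such as pp) and the parameter order are ignored.
    """
    query = parse_qs(urlsplit(url).query)
    thread_id = query.get("t", [""])[0]
    page = int(query.get("page", ["1"])[0])
    return (thread_id, page)


def unique_pages(urls):
    """Keep the first URL seen for each (thread, page) pair."""
    seen = {}
    for url in urls:
        seen.setdefault(page_key(url), url)
    return list(seen.values())


if __name__ == "__main__":
    base = "http://www.discussanything.com/forums/showthread.php"
    # Hypothetical variants of the same two pages.
    links = [
        base + "?t=49118",
        base + "?t=49118&page=1&pp=20",
        base + "?t=49118&page=2&pp=20",
        base + "?t=49118&pp=20&page=2",
    ]
    for url in unique_pages(links):
        print(url)
```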

Last but not least, note that we excluded all other links, which means embedded images, avatars and other resources are not archived. But by specifying other filters, you could include them. It all depends on what you want to archive.
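HTTrack also has a command-line version that, as far as I know, takes the start URL, an output folder after -O, and the same + and - filters as arguments. Here is a small Python sketch that only builds and prints such a command without running it. The folder name and the image filters (+*.gif, +*.jpg, +*.png) are assumptions for illustration, and httrack_command is a made-up helper, so check the HTTrack documentation before relying on it.

```python
import shlex


def httrack_command(start_url, output_dir, rules):
    """Build (but do not run) an httrack command line as a list of arguments."""
    return ["httrack", start_url, "-O", output_dir, *rules]


if __name__ == "__main__":
    thread = "http://www.discussanything.com/forums/showthread.php?t=49118"
    rules = [
        "-*",                    # exclude everything first
        "+*" + thread + "*",     # then include the pages of the thread
        "+*.gif", "+*.jpg", "+*.png",  # assumed extra filters for images
    ]
    command = httrack_command(thread, "thread-49118", rules)
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Quoting matters here because the shell would otherwise try to interpret the *, ? and & characters in the filters and the URL.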

To try HTTrack is to adopt it :). Don't hesitate to post your questions about this software on the official forum (http://forum.httrack.com).

DngrMse
02-08-2005, 01:45 PM
Ow. That's a pretty hefty first post. Welcome, Noob!

Manu
02-08-2005, 07:43 PM
Welcome...
