How to grab all URLs (especially non-indexed) from Blogger, Multiply, TypePad, etc. sites?
Posted 01 December 2011 - 05:16 AM
Similar request here, except I'm looking to pull all the URLs from my energizer and other syndication-type sites: TypePad, Blogger, Posterous, etc.
These don't seem to have a sitemap to work with like the shwp does, and I can't find a way to use an Excel function either: the posts tend to look like typepad.com/posttitlehere, so there are no chronological numbers to increment.
I can do a site: scrape for them in box, but like I said, I want the non-indexed URLs so I can juice them up more, and of course a site: scrape is only going to return URLs that are already indexed.
I thought of using Link Gopher to pull all the links off each page, but doing that page by page is not practical. I also thought of using the sbox external link extractor, but that isn't feasible either.
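For what it's worth, the page-by-page link grab could in principle be scripted instead of done by hand. Below is a rough sketch only, assuming the posts are reachable by following ordinary links from the blog's home page. The names `LinkParser`, `fetch_html`, `crawl_site`, and the `max_pages` cap are made up for illustration and are not part of any tool mentioned here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page and return its HTML as text (illustrative default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_site(start_url, fetch=fetch_html, max_pages=200):
    """Breadth-first crawl that stays on start_url's host.

    Returns a sorted list of every same-host URL discovered,
    whether or not a search engine has indexed it.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:  # max_pages is an assumed safety cap
        url = queue.popleft()
        fetched += 1
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

Calling `crawl_site("http://yourblog.typepad.com/")` (placeholder address) would return the list of same-host URLs it found. Subtracting the results of a site: scrape from that list should leave the non-indexed ones, assuming both lists use the same URL form.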
Thoughts, tricks, wild ideas?