I'm looking for an open source web crawler that I can modify to fit my needs. Here is what I need it to do:
-Crawl a site's source code and look for a particular piece of code.
-If that code is found, search the same source for a second, different piece of code.
-If the first piece of code is found and the second isn't, pull the URL and append it to a txt file. If the second piece of code is also found, just continue to the next site. So basically: if the source contains this and doesn't contain that, append the URL to the text file.
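To make the rule above concrete, here is a rough Python sketch of the check I have in mind. The function names, the marker strings, and the `matches.txt` file name are all placeholders I made up for illustration; they don't come from any existing crawler.

```python
def should_record(page_source: str, required: str, excluded: str) -> bool:
    """True when the page contains `required` and does not contain `excluded`."""
    return required in page_source and excluded not in page_source


def record_if_match(url: str, page_source: str, required: str, excluded: str,
                    out_path: str = "matches.txt") -> bool:
    """Append `url` to the text file when the page passes the check."""
    if should_record(page_source, required, excluded):
        # Mode "a" appends, so URLs from earlier pages are kept.
        with open(out_path, "a", encoding="utf-8") as out_file:
            out_file.write(url + "\n")
        return True
    return False
```

This is plain substring matching on the raw page source, which I think is enough for my case. A regex or HTML parser could be swapped in if the snippets vary.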
I need it to be able to crawl a list of URLs, but I would prefer that it automatically go from site to site without a list of URLs. Either is fine.
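For the "go from site to site" part, this is roughly the behavior I'm picturing, sketched with only the Python standard library. It starts from a seed list and follows the links it finds. Everything here (`crawl`, `extract_links`, `max_pages`, the `fetch_page` parameter) is a hypothetical name of my own, and it skips things a real crawler should handle, such as robots.txt and rate limiting.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute URLs for all links found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.hrefs]


def fetch(url):
    """Download a page and return its source as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seed_urls, required, excluded, max_pages=50, fetch_page=fetch):
    """Breadth-first crawl; returns URLs containing `required` but not `excluded`."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    matches = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to load
        if required in html and excluded not in html:
            matches.append(url)
        # Queue newly discovered links so the crawl moves on by itself.
        for link in extract_links(url, html):
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return matches
```

With a fixed list of URLs, the same function works if the link-following loop is removed. The `max_pages` cap is only there so the sketch can't run forever.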
Does anyone know of an open source web crawler similar to what I need that can be customized? Thanks a lot.