I'm writing a spidering tool in PHP and I'm having some trouble with preg_match and the related regex functions.
I want to grab all the links in a document. Here's a template of a link, with the parts I need marked:
<a href="INEEDTHIS" border=0 class="bob">INEEDTHIS</a>
The problem is that I don't know how to get just the two pieces of data I need (the URL and the title text). Also, links sometimes use an image as their "title text". How do I handle that? Grabbing the whole <img> tag is fine by me, but it sometimes messes up my preg_match.
I also need all of these in two arrays, $url and $title, and the two arrays need to line up with each other, so that $url[0] and $title[0] belong to the same link.
I've been trying to get this darn regex to work for a few hours, so any help would be great!
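
To make the goal concrete, here is a minimal sketch of one way the two parallel arrays could be built with preg_match_all. It is only an illustration under some assumptions: the function name extract_links is made up for this example, PHP 7.1 or later is assumed, and the pattern only handles href values wrapped in single or double quotes. Whatever sits between <a ...> and </a> (plain text or an <img> tag) is kept as the title.

```php
<?php
// Hypothetical helper (name invented for this sketch, not from the post).
// Returns two parallel arrays: [$url, $title].
function extract_links(string $html): array
{
    $url = [];
    $title = [];

    // Group 1: the quote character around the href value.
    // Group 2: the href value itself (lazy, up to the matching quote).
    // Group 3: everything between the opening tag and </a>.
    // Flags: i = case-insensitive, s = let "." match newlines too.
    $pattern = '~<a\b[^>]*?\bhref\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)</a>~is';

    // PREG_SET_ORDER yields one entry per link, so both arrays are
    // filled in the same loop and stay aligned by index.
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $url[] = $m[2];
            $title[] = trim($m[3]);
        }
    }

    return [$url, $title];
}

// Example input: one text link and one image link.
$html = '<a href="http://example.com/a" border=0 class="bob">First</a>'
      . "<a href='/b'><img src=\"x.png\" alt=\"pic\"></a>";

[$url, $title] = extract_links($html);

// $url[$i] and $title[$i] describe the same link.
foreach ($url as $i => $u) {
    echo $u, ' => ', $title[$i], "\n";
}
```

This sketch will miss unquoted href values and can be confused by nested or malformed markup. For messy real-world pages, PHP's DOMDocument with getElementsByTagName('a') is generally a sturdier alternative to a regex.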