Parsing with PCRE

Anon

Hi all,

On a quick note, what exactly is the difference between basic regexps and PCRE?

Anyway... I'm trying to parse <a href...> tags on a page and get both the page each one links to and the title of the link out of them.

So I want to get maybe /about.html and "About" out of it, and it needs to work with images and alt tags too. It's a piece of cake if you're only looking for simple links, but with style sheets and font declarations inside links it's a pain. Here's what I have so far, which sorta works:

preg_match_all("'<a\s+href\s*=\s*[\"\']([^\"\']*)[\"\']\s?> (<[^>]+>(.*)<[^>]+></a> | .*alt\s*=\s*[\"\']([^\"\']*)[\"\']></a>)'iUx", $this->sFile, $matches, PREG_SET_ORDER);

As you can probably see, it's rather complicated. Anyone got a simpler/better/properly working way of doing it?
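
The only other idea I've had is to skip the regex and let PHP's DOM extension parse the page, then read the href and the text (or the img alt) off each link. Below is a rough sketch of what I mean, not something I'd call finished. It assumes a PHP build with the DOM extension (and PHP 7 for the type hints); the function name extractLinks and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: returns an href/title pair for every <a href> in $html.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // skip named anchors such as <a name="top">
        }
        // textContent flattens any nested <font>, <span>, <b> markup.
        $title = trim($a->textContent);
        if ($title === '') {
            // Image-only link: fall back to the alt text of the first <img>.
            $img = $a->getElementsByTagName('img')->item(0);
            $title = $img !== null ? trim($img->getAttribute('alt')) : '';
        }
        $links[] = ['href' => $a->getAttribute('href'), 'title' => $title];
    }
    return $links;
}

// Made-up sample: one styled text link and one image link.
$sample = '<p><a href="/about.html" style="color:red"><font size="2">About</font></a>'
        . '<a href="/home.html"><img src="home.png" alt="Home"></a></p>';
print_r(extractLinks($sample));
```

In my case I'd pass $this->sFile in place of $sample. The style and font stuff inside the link shouldn't matter this way, since only the text and the alt attribute get read.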

Thanks for help,
Billy