Well, if the URLs you want to match are the href attributes of <a> tags, and you want all of them, the appropriate function to use would be [man]preg_match_all[/man]:
preg_match_all(
"/<a.+?href\s*=\s*([\"']?)(.+?)\\1.*?>/is",
$string,
$matches
);
The URLs will be in $matches[2].
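To see this in action, here is a minimal sketch that runs the call above against some made-up markup. The sample HTML and its URLs are assumptions for illustration only; the pattern is unchanged.

```php
<?php
// Hypothetical sample markup, invented for this example.
$string = <<<HTML
<p>
  <a href="http://example.com/">Example</a>
  <A HREF='page.html' class="nav">Page</A>
  <a
     title="wrapped" href = "/path/to/file.php?id=3">Wrapped</a>
</p>
HTML;

preg_match_all(
    "/<a.+?href\s*=\s*([\"']?)(.+?)\\1.*?>/is",
    $string,
    $matches
);

// $matches[0] holds the full tag matches, $matches[1] the quote characters,
// and $matches[2] the captured URLs.
print_r($matches[2]);
```

With this input, $matches[2] should contain http://example.com/, page.html and /path/to/file.php?id=3, in that order.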
The expression reads:
/
Start of the regexp
<a
"<a"
.+?
some characters between <a and the next bit
href\s*=\s*
we expect to see "href=" at some point, maybe with space around the = sign.
(["']?)
Attribute values should be quoted with either " or '. We can't assume this, unfortunately. Whichever of the three possibilities occurs, though, we need to remember it.
(.+?)
The real engine room of the match: this group captures the URL itself. The ? makes the match lazy, so that we don't get carried away and match all the way to the end of the string.
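\1
That remembered quote (or lack of one), to end the attribute value. Be aware that if the value was unquoted, \1 is empty, so the lazy (.+?) stops after a single character; only quoted values are captured in full.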
.*?
Maybe some more stuff before the final
>
/is
The end of the expression. The modifier "i" is used to make the match case-insensitive (so that <A HREF will match as well as <a href), and "s" is so that . will match \n (in case the tag wraps across more than one line).
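The breakdown above can also be written into the pattern itself. The following is a sketch, not a replacement for the original: it adds the "x" modifier so that whitespace in the pattern is ignored and # starts a comment, and it swaps the delimiter to ~ and uses a nowdoc so no PHP string escaping is needed. The sample string is made up.

```php
<?php
// Same expression as above, laid out with the "x" modifier.
$pattern = <<<'REGEX'
~
<a              # literal "<a"
.+?             # some characters, as few as possible
href \s* = \s*  # "href=", with optional whitespace around the equals sign
(["']?)         # group 1: an optional opening quote, remembered for later
(.+?)           # group 2: the URL, matched lazily
\1              # whatever group 1 matched (possibly nothing)
.*?             # any remaining attributes
>               # end of the tag
~isx
REGEX;

// Hypothetical input, invented for this example.
$string = '<a href="http://example.com/">one</a> <A HREF=\'two.html\'>two</A>';

preg_match_all($pattern, $string, $matches);
print_r($matches[2]);
```

On this input, $matches[2] should hold http://example.com/ and two.html.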
There are more efficient solutions.