Hello,
I am trying to parse out URLs in links from an HTML page. I read the entire HTML page into a string called $html. The page has numerous links on it, but the only ones that interest me have an id attribute with a particular prefix.
Example:
<a id=main34 href=index.html>
<a id=main35 href=about.html>
I only want the href values, which may or may not be surrounded by single or double quotes, of anchor tags which have an id value beginning with "main".
I am trying something like
<a\sid=main\Shref\s=\s.*>
I'm not having much luck, though. Any help would be appreciated.
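
For reference, here is a minimal sketch of one pattern that seems to fit the requirement described above: anchor tags whose id begins with "main", with the href value double-quoted, single-quoted, or unquoted. It is written in Python only for illustration; the names LINK_RE and main_hrefs are hypothetical, and the pattern itself should carry over to PCRE-style engines with little change. It assumes attribute values never contain ">", so a real HTML parser would likely be more robust on messy pages.

```python
import re

# Hypothetical pattern: an <a> tag whose id starts with "main", capturing href.
LINK_RE = re.compile(
    r"""<a\b                                  # opening anchor tag, not <abbr> etc.
        (?=[^>]*\sid\s*=\s*["']?main)         # lookahead: id value begins with main
        [^>]*?\shref\s*=\s*                   # skip ahead to the href attribute
        (?:"([^"]*)"|'([^']*)'|([^\s>]+))     # double-quoted, single-quoted, or bare
    """,
    re.IGNORECASE | re.VERBOSE,
)


def main_hrefs(html):
    """Return href values of <a> tags whose id starts with "main"."""
    # Exactly one of the three capture groups takes part in each match,
    # and lastindex is the number of that group.
    return [m.group(m.lastindex) for m in LINK_RE.finditer(html)]


if __name__ == "__main__":
    html = "<a id=main34 href=index.html>\n<a id=main35 href=about.html>"
    print(main_hrefs(html))  # ['index.html', 'about.html']
```

The lookahead lets id and href appear in either order inside the tag, and the three-way alternation handles the quoting variations without a backreference.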