Hi guys,
I use the code below to replace all the relative links in an HTML file with absolute ones:
$string='
<a href="link.htm">1</a>
<a href="http://www.site.com/link.htm">2</a>
<a href="#link">3</a>
<a href="mailto:mail@example.com">4</a>
';
$link_noabs = "/<(a.*?href=\")([^http^#])([^>]+)>/";
$link_abs = '<\\1http://www.example.com/folder/\\2\\3 >';
$replace=preg_replace($link_noabs, $link_abs, $string);
The results are shown below:
<a href="http://www.example.com/folder/link.htm" target="_blank">1</a>
<a href="http://www.site.com/link.htm" target="_blank" >2</a>
<a href="#link" >3</a>
<a href="http://www.example.com/folder/mailto:mail@example.com" target="_blank" >4</a>
My problem is here:
$link_noabs = "/<(a.*?href=\")([^http^#])([^>]+)>/";
The regex above is meant to match all the links that do not begin with http or #. I need a regex that matches all the links that do not begin with http, # or mailto:, because otherwise the result looks like this:
<a href="http://www.example.com/folder/mailto:mail@example.com" target="_blank" >4</a>
I've tried the following code:
$link_noabs = "/<(a.*?href=\")([^http^#^mailto])([^>]+)>/";
but it did not work.
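For context, a character class such as [^http^#^mailto] excludes single characters (h, t, p, ^, #, m, a, i, l, o), not whole prefixes, so even "link.htm" is rejected because it starts with "l". One possible direction is a negative lookahead, (?!http|#|mailto:), which checks the prefix without consuming it. The following is only a minimal sketch, assuming the same sample input and base URL as above. It also assumes the old second group can be dropped, so the replacement uses only \\1 and \\2.

```php
<?php
// Sample input from the post; single quotes avoid escaping the inner double quotes.
$string = '
<a href="link.htm">1</a>
<a href="http://www.site.com/link.htm">2</a>
<a href="#link">3</a>
<a href="mailto:mail@example.com">4</a>
';

// (?!http|#|mailto:) is a negative lookahead: the match is abandoned when the
// href value starts with any of these prefixes. Nothing is consumed by it.
$link_noabs = '/<(a.*?href=")(?!http|#|mailto:)([^>]+)>/';

// \\1 is the tag up to and including href=", \\2 is the rest of the tag.
$link_abs = '<\\1http://www.example.com/folder/\\2 >';

$replace = preg_replace($link_noabs, $link_abs, $string);
echo $replace;
```

With this pattern only the first link should be rewritten. Links 2, 3 and 4 are left as they are. Other schemes (ftp:, javascript:, tel:) would need to be added to the lookahead in the same way.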
Please help!
Thank you,