A much looser 'quick and dirty' pattern could be:
// Sample URLs; the last three are expected to fail with this pattern.
$urls = array('http://www.site.asia/file.php', 'http://some-site.co.uk/file-01.php', 'www.site.com/file.inc.php', 'site.com/somefolder/file.php', 'subdomain.site.com/file.php');
// Optional http://, optional www., one host label, a suffix, then a single file name.
$pattern = '#^(?:http://)?(www\.)?[a-z0-9-]+\.(?:[a-z]{2}\.[a-z]{2}|[a-z]{2,4})/[\w-]+\.(php|x?html|asp)$#i';
foreach ($urls as $url) {
    $result = preg_match($pattern, $url) ? 'valid format.' : 'invalid format.';
    echo $url . ' => ' . $result . "<br />\n";
}
So with everything case insensitive:
http:// is optional.
So is www. (dot included).
Then it looks for one or more characters from a-z, 0-9 or the hyphen (truth be told, I haven't really looked into which characters are allowed in a host name, but this can be adjusted within the [a-z0-9-] character class).
Then it looks for a dot, followed by either [a-z]{2}\.[a-z]{2} (think .co.uk) or [a-z]{2,4} (anything from .ca to .com to .asia, for example).
After the slash comes the file name: [\w-]+\.(php|x?html|asp) accepts a-zA-Z0-9_- one or more times, then a dot, followed by an alternation listing some common file extensions.
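To make that breakdown easier to follow, here is a rough sketch of the same pattern written with the x (extended) flag, so each piece can carry its own comment. The ~ delimiter and the x flag are my own choices for readability (# starts a comment in extended mode, so it can't double as the delimiter); the matching rules are meant to be identical to the one-line version.

```php
<?php
// Same rules as the one-line pattern, spread out with the x flag.
// Whitespace and "# ..." comments inside the pattern are ignored by PCRE.
$pattern = '~
    ^
    (?:http://)?              # optional scheme
    (www\.)?                  # optional www. (dot included)
    [a-z0-9-]+                # one host label
    \.
    (?:
        [a-z]{2}\.[a-z]{2}    # two-part suffix such as co.uk
      | [a-z]{2,4}            # or a single suffix such as ca, com, asia
    )
    /
    [\w-]+                    # file name
    \.
    (php|x?html|asp)          # accepted extensions
    $
~xi';

var_dump(preg_match($pattern, 'http://some-site.co.uk/file-01.php'));
var_dump(preg_match($pattern, 'www.site.com/file.inc.php'));
```

The first call should report a match and the second should not, since file.inc.php has an extra dotted segment that this version does not allow.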
There are issues, of course. This pattern would not allow a file like addon.inc.php, for example. To solve this, we can alter the pattern to accept any number of dot-separated name segments before the extension:
#^(?:http://)?(www\.)?[a-z0-9-]+\.(?:[a-z]{2}\.[a-z]{2}|[a-z]{2,4})/(?:[\w-]+\.)+(php|x?html|asp)$#i
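As a quick check of that change, a small sketch using the case-insensitive form of the pattern (the sample URLs and the $samples / $results names are just made up for illustration):

```php
<?php
// Relaxed file-name part: one or more "name." segments, then the extension.
$pattern = '#^(?:http://)?(www\.)?[a-z0-9-]+\.(?:[a-z]{2}\.[a-z]{2}|[a-z]{2,4})/(?:[\w-]+\.)+(php|x?html|asp)$#i';

// Hypothetical sample URLs.
$samples = array('www.site.com/file.inc.php', 'site.com/addon.inc.php', 'site.com/file.txt');

$results = array();
foreach ($samples as $url) {
    // preg_match returns 1 on a match, 0 on no match.
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url . ' => ' . ($results[$url] ? 'valid format.' : 'invalid format.') . "\n";
}
```

The two .inc.php names should now pass, while file.txt is still rejected because txt is not in the extension list.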
I am assuming you want only specific file extensions, but if you want to be even more open ended, you can replace the final alternation with [a-z]{3,4}, which accepts any 3 or 4 letters as the extension:
#^(?:http://)?(www\.)?[a-z0-9-]+\.(?:[a-z]{2}\.[a-z]{2}|[a-z]{2,4})/(?:[\w-]+\.)+[a-z]{3,4}$#i
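To give a feel for what this open-ended version lets through, here is a minimal sketch (again, the sample URLs and variable names are only illustrative). Note that a two-letter extension such as js still fails, because [a-z]{3,4} requires at least three letters.

```php
<?php
// Open-ended variant: any 3 or 4 letters are accepted as the extension.
$pattern = '#^(?:http://)?(www\.)?[a-z0-9-]+\.(?:[a-z]{2}\.[a-z]{2}|[a-z]{2,4})/(?:[\w-]+\.)+[a-z]{3,4}$#i';

// Hypothetical sample URLs.
$samples = array('site.com/file.txt', 'site.com/photo.jpeg', 'site.com/script.js');

$results = array();
foreach ($samples as $url) {
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url . ' => ' . ($results[$url] ? 'valid format.' : 'invalid format.') . "\n";
}
```

If shorter or longer extensions matter to you, the {3,4} range is the part to widen.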
So it really depends on how strict or how open ended you would like to go. URLs are also tricky moving targets, so I hope one of these patterns (pick the one that suits you best) will work for what you have in mind.