[RESOLVED] I need more preg_match_all help

TheIceman5

bradgrafelman helped me out with a function to extract image names from HTML code, and it has been working perfectly to date:

$pattern = '/<img.*src\s*=\s*["\'](.*)["\'].*>/Usi'; 
preg_match_all($pattern, $data, $matches); 

$files = array_map('basename', $matches[1]);

Now I need more help extracting .pdf documents from HTML code. I have come up with
$pattern = '/<a href="\'["\'].>/Usi';
and it works OK, but I can't work out how to specify only .pdf documents.

What I also want to do is combine the two functions above so that I can extract images and .pdf files in one go. Is this possible? If so, can someone show me how it's done? I'm stuck.

bretticus

Try...

$pattern = '/<a.*href\s*=\s*["\'](.+\.pdf)["\'].*>/Usi';

As for combining them... don't. I suggest that because you're going to struggle to write one pattern that handles both img and a tags (src on one, href on the other). Plus you need to restrict the extension to .pdf for the links only. It's just easier to keep them separate, and running two patterns is really not much (if any) extra overhead.
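
One caveat worth checking before relying on the pattern above: under the U modifier, (.+\.pdf) keeps expanding until it finds a ".pdf" followed by a quote, so a non-PDF link that comes before a PDF link can end up inside the same capture. basename() often hides this, because it only keeps what follows the last slash. The sketch below runs the posted pattern next to a stricter variant on made-up HTML. The sample string, the $strict pattern, and the variable names are assumptions for illustration, not something from the thread.

```php
<?php
// Hypothetical sample HTML (not from the thread): a non-PDF link precedes the PDF links.
$data = '<a href="page.html">Page</a> <a href="manual.pdf">Manual</a> '
      . '<a class="doc" href=\'files/Report.PDF\'>Report</a>';

// Pattern as posted above, unchanged.
$loose = '/<a.*href\s*=\s*["\'](.+\.pdf)["\'].*>/Usi';
preg_match_all($loose, $data, $looseMatches);

// Assumed stricter variant: [^>]* cannot leave the current tag,
// and [^"\'>]+ cannot leave the quoted href value.
$strict = '/<a\b[^>]*\bhref\s*=\s*["\']([^"\'>]+\.pdf)["\'][^>]*>/i';
preg_match_all($strict, $data, $strictMatches);

// Same post-processing as in the image example: keep only the file names.
$pdfs = array_map('basename', $strictMatches[1]);

print_r($looseMatches[1]); // first capture starts at page.html, not at manual.pdf
print_r($pdfs);            // file names of the two PDF links only
```

If every PDF link in your pages has a directory in its path, the posted pattern plus basename() will usually give the same file names, so the stricter variant mainly matters for bare names like manual.pdf.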

TheIceman5

Thanks, that works. I've used array_merge to put the results in one array, and that works perfectly.
I must learn these regular expressions properly one of these days, but I don't use them often.
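
For anyone landing here later, the array_merge approach might look roughly like this. It is a minimal sketch using the two patterns from this thread. The sample HTML and the variable names ($imgMatches, $pdfMatches) are made up for the example.

```php
<?php
// Hypothetical sample HTML, not from the thread.
$data = '<img class="logo" src="images/logo.png" alt="Logo"> '
      . '<a href="docs/manual.pdf">Manual</a> '
      . '<img src=\'photos/cat.jpg\'>';

// The two patterns from the thread, kept separate as suggested.
$imgPattern = '/<img.*src\s*=\s*["\'](.*)["\'].*>/Usi';
$pdfPattern = '/<a.*href\s*=\s*["\'](.+\.pdf)["\'].*>/Usi';

preg_match_all($imgPattern, $data, $imgMatches);
preg_match_all($pdfPattern, $data, $pdfMatches);

// Merge the captured paths (images first, then PDFs) and strip the directories.
$files = array_map('basename', array_merge($imgMatches[1], $pdfMatches[1]));

print_r($files);
```

The merged array lists all images first and then all PDFs, not the order they appear in the page, since each pattern is run over the whole document on its own.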