wren9 wrote:
Hi guys,
I want to scrape email addresses from my own website,
http://czone01.com/
using a regex, while I'm still formulating my own.
Please, I don't need DOM this time; I only need a regex.
Please assume that these email addresses will change someday.
-warren
Well, if I were to scrape a site for email addresses, I would simply use a quick and dirty method (as opposed to being all fancy and complex about it), perhaps something along the lines of:
setlocale(LC_CTYPE, 'C');
$str = 'My email is: (leet-koder_koderkid@Irockthebinaries.com), as this is simply more text filler. Another email address I use is: bitwisemaster@codeslamers.de... More useless text...';
preg_match_all('#[\w-]+@[\w.-]+#', $str, $matches);
foreach($matches[0] as &$val){
$val = rtrim($val, '.'); // eliminate any period(s) at the very end of the match...
}
unset($val); // break the reference so a later use of $val can't overwrite the last match
echo '<pre>'.print_r($matches[0], true).'</pre>';
Output:
Array
(
[0] => leet-koder_koderkid@Irockthebinaries.com
[1] => bitwisemaster@codeslamers.de
)
So if I replace the $str = line in the code above with a fetch of a URL like yours, for example:
$str = file_get_contents('http://czone01.com/');
The output I get is:
Array
(
[0] => iridion_us@yahoo.com
)
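If you end up running this against more than one page, it may be handy to wrap the steps above in a function. What follows is only a rough sketch: the names extract_emails() and scrape_emails() are made up for illustration, and the duplicate removal is an extra step I'm assuming you'd want, not something the snippet above does.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of the snippet above).
function extract_emails(string $text): array
{
    // Same quick-and-dirty pattern as before.
    preg_match_all('#[\w-]+@[\w.-]+#', $text, $matches);

    // Trim any trailing period(s) picked up from sentence punctuation.
    $emails = array_map(function ($email) {
        return rtrim($email, '.');
    }, $matches[0]);

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($emails));
}

// Hypothetical wrapper: file_get_contents() returns false on failure,
// so check for that before handing the result to the regex.
function scrape_emails(string $url): array
{
    $html = @file_get_contents($url);
    return $html === false ? [] : extract_emails($html);
}

$sample = 'Mail bob@example.com. Or bob@example.com, or al-x@example.org...';
print_r(extract_emails($sample));
```

With the sample string, the duplicate bob@example.com collapses to one entry and the trailing periods are stripped. For a live page you would call scrape_emails() with the URL instead.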
This method isn't foolproof, though (nor is it meant to be). If an email address has spaces within it, or uses [at] (or variations of this) instead of @, or (dot) (or variations of that) instead of a period, then the pattern will fail. Likewise, since [\w-] doesn't include the period, a dotted local part such as first.last@example.com would only be captured from the last dot onward. It is meant to be a simple, rudimentary pattern for no-nonsense emails.
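If you did want to catch the [at] / (dot) style of obfuscation, one option is to normalize the text before matching. This is only a sketch under assumptions: deobfuscate() is a made-up helper, and it only handles bracketed "at" and "dot" in square, round, or curly brackets, nothing fancier.

```php
<?php
// Hypothetical helper: rewrite a few common obfuscations back to "@" and ".".
function deobfuscate(string $text): string
{
    // "[at]", "(at)", "{at}" with optional surrounding spaces -> "@"
    $text = preg_replace('#\s*[\[({]\s*at\s*[\])}]\s*#i', '@', $text);

    // "[dot]", "(dot)", "{dot}" with optional surrounding spaces -> "."
    return preg_replace('#\s*[\[({]\s*dot\s*[\])}]\s*#i', '.', $text);
}

$str = 'Reach me at jane [at] example (dot) com or bob(at)example.org.';

// Normalize first, then apply the same simple pattern as before.
preg_match_all('#[\w-]+@[\w.-]+#', deobfuscate($str), $matches);
$emails = array_map(function ($email) {
    return rtrim($email, '.');
}, $matches[0]);

print_r($emails);
```

Keep in mind the replacement swallows the whitespace around the brackets, so ordinary text that happens to contain "(dot)" or "(at)" gets glued together too and could produce false matches. For a quick scrape of your own site that trade-off is probably acceptable.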
EDIT - I wouldn't even post emails on sites these days (simply due to spam and whatnot). I would instead use a contact form that mails the user's email address and message to your email address, IMO. Many sites are already doing this anyway.
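For what it's worth, the receiving end of such a contact form could look roughly like this. Everything here is an assumption for illustration: the form field names (email, message), the build_contact_mail() helper, and the owner@example.com address are all placeholders, and a real form would also want spam protection of some kind.

```php
<?php
// Hypothetical helper: validate the posted fields and prepare the mail parts.
function build_contact_mail(array $post): ?array
{
    // FILTER_VALIDATE_EMAIL returns the address on success, false otherwise.
    $from    = filter_var($post['email'] ?? '', FILTER_VALIDATE_EMAIL);
    $message = trim($post['message'] ?? '');

    if ($from === false || $message === '') {
        return null; // reject invalid or empty submissions
    }

    return [
        'subject' => 'Contact form message',
        'body'    => $message,
        'headers' => ['Reply-To' => $from],
    ];
}

// Placeholder address; it stays server-side and is never printed in the page.
$siteOwner = 'owner@example.com';

$mail = build_contact_mail($_POST);
if ($mail !== null) {
    // Passing headers as an array needs PHP 7.2 or later.
    mail($siteOwner, $mail['subject'], $mail['body'], $mail['headers']);
}
```

The point is that your own address only ever lives in the server-side script, so a scraper like the one above never sees it.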