Hi guys

I want to scrape email addresses from my own website.
http://czone01.com/

using regex, while I'm formulating my own pattern.
Please, I don't need DOM this time; I only need regex.

Please assume that these email addresses will change someday.

Thank you in advance,

-warren

    Sounds straightforward then; what part do you need help with? Finding an appropriate regexp pattern or how to use it?

    As a sidenote, I found this particularly interesting:

    What do you get from my design?
    ...
    7. It will be tested and pass the W3C standard validation.

    given the circumstances. :p

      Finding an appropriate regexp pattern

        bradgrafelman, I know how to fix my website so it meets the W3C standard.

        I'm not interested in fixing my site to pass W3C validation.

        I'd rather broaden my knowledge first than work on my site.

          I just removed the W3C standard thing...

          So people will not criticize that. I really don't like fixing my site...

          I prefer to fix my clients' sites...

            bradgrafelman, can I see your website?

              If you don't want to help, never mind; I will formulate my own.

              thanks anywa...

                wren9;10916537 wrote:

                Hi guys

                I want to scrape email addresses from my own website.
                http://czone01.com/

                using regex, while I'm formulating my own pattern.
                Please, I don't need DOM this time; I only need regex.

                Please assume that these email addresses will change someday.
                -warren

                Well, if I were to scrape a site for email addresses, I would simply use a quick and dirty method (as opposed to being all fancy and complex about it), perhaps something along the lines of:

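                setlocale(LC_CTYPE, 'C'); // use the C locale so \w does not pick up locale-specific letters
                $str = 'My email is: (leet-koder_koderkid@Irockthebinaries.com), as this is simply more text filler. Another email address I use is: bitwisemaster@codeslamers.de... More useless text...';

                // Match word characters or hyphens, an @, then word characters, dots or hyphens.
                preg_match_all('#[\w-]+@[\w.-]+#', $str, $matches);
                foreach ($matches[0] as &$val) {
                    $val = rtrim($val, '.'); // eliminate any period(s) at the very end of the match
                }
                unset($val); // break the reference so a later use of $val cannot overwrite the last match
                echo '<pre>' . print_r($matches[0], true) . '</pre>';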
                

                Output:

                Array
                (
                    [0] => leet-koder_koderkid@Irockthebinaries.com
                    [1] => bitwisemaster@codeslamers.de
                )
                

                So if I replace the $str = line in the code above with a URL like yours, for example:

                $str = file_get_contents('http://czone01.com/');
                

                The output I get is:

                Array
                (
                    [0] => iridion_us@yahoo.com
                )
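
                Since the addresses on the page may change someday, it might be handy to wrap the steps above in a pair of small functions and call them whenever you need a fresh list. This is only a rough sketch: extract_emails() and scrape_emails() are names I made up for illustration, and it assumes PHP 7 or later with allow_url_fopen enabled.

                ```php
                <?php
                // Hypothetical helper: pull email-like strings out of any text,
                // using the same quick and dirty pattern as above.
                function extract_emails(string $text): array
                {
                    preg_match_all('#[\w-]+@[\w.-]+#', $text, $matches);

                    // Trim trailing periods left over from sentence punctuation.
                    $emails = array_map(function ($m) {
                        return rtrim($m, '.');
                    }, $matches[0]);

                    // Drop duplicates and reindex from 0.
                    return array_values(array_unique($emails));
                }

                // Hypothetical wrapper: fetch a page and return an empty array
                // if the request fails (file_get_contents() returns false then).
                function scrape_emails(string $url): array
                {
                    $html = @file_get_contents($url);
                    return $html === false ? [] : extract_emails($html);
                }
                ```

                Calling scrape_emails('http://czone01.com/') would then give you whatever addresses are on the page at that moment, so nothing is hard-coded.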
                

                This method isn't foolproof though (nor is it meant to be). If an email address contains spaces, or is written with [at] (or variations of this) instead of @, or with (dot) (or variations of that) instead of a period, then the pattern will fail. It is meant to be a simple, rudimentary pattern for no-nonsense emails.
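
                If you do want to catch some of those cases, one possible approach is to undo the obfuscation first and then match. Treat this as a hedged sketch, not a complete solution: extract_emails_loose() is a made-up name, the two rewrite rules only cover bracketed "at" and "dot", and they can misfire on ordinary prose that happens to contain "(at)". It also allows dots before the @, since the simple pattern above would cut john.smith@example.com down to smith@example.com.

                ```php
                <?php
                // Hypothetical helper: undo "[at]" / "(dot)" style obfuscation, then match.
                function extract_emails_loose(string $text): array
                {
                    // Rewrite " [at] ", "(at)" and similar to "@", and the "dot" forms to ".".
                    $text = preg_replace('#\s*[\[(]\s*at\s*[\])]\s*#i', '@', $text);
                    $text = preg_replace('#\s*[\[(]\s*dot\s*[\])]\s*#i', '.', $text);

                    // Same idea as the simple pattern, but dots are allowed before the "@" too.
                    preg_match_all('#[\w.-]+@[\w.-]+#', $text, $matches);

                    // Strip stray periods from both ends of each match.
                    return array_map(function ($m) {
                        return trim($m, '.');
                    }, $matches[0]);
                }

                print_r(extract_emails_loose('Write to john.smith [at] example (dot) com. Or jane@example.org.'));
                ```

                The looser you make the pattern, the more false positives you will pick up, so it is worth checking the results by eye.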

                EDIT - I wouldn't even post email addresses on sites these days (simply due to spam and whatnot). I would instead use a contact form that mails the user's email address and message to your email address, IMO. Many sites are already doing this anyway.

                  wren9;10916560 wrote:

                  I just removed the W3C standard thing...

                  So people will not criticize that. I really don't like fixing my site...

                  I prefer to fix my clients' sites...

                  Well, if you are in the business of providing web development services, I would definitely recommend that you fix your end first, as clients who are aware of the W3C validator would expect your own site to validate. Why make only your clients' sites validate and not your own?

                  Additionally, I would even go so far as to suggest that you have a look at some professional templates (think Template Monster, for example), because your site's presentation does not look professional, confident, or trustworthy, IMO. Sorry if I'm sounding harsh. The intent is not to discourage you, but rather to encourage you to examine other options if site presentation is not your forté.
                  Templates are not expensive (unless you pay for a unique version).

                    Thanks, nrg_alpha, for the suggestion and for your help.

                    But I really have no interest in fixing my own site.

                    Thanks again.
