[RESOLVED] Help rewriting eregi to epregi

dj2ball · Feb 11, 2010

Hi guys

Trying to work on a script on my WAMP server at home, but it's running PHP 5.3 and throwing deprecated errors on one of my application's functions.

function toLink($text){
        $text = html_entity_decode($text);
        $text = " ".$text;
        $text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                '<a href="\\1">\\1</a>', $text);
        $text = eregi_replace('(((f|ht){1}tps://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
                '<a href="\\1">\\1</a>', $text);
        $text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
        '\\1<a href="http://\\2">\\2</a>', $text);
        $text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
        '<a href="mailto:\\1">\\1</a>', $text);
        return $text;
}

Anyone able to help me with the syntax for rewriting this as epregi?

Thanks

bpat1434 · Feb 12, 2010

There is no function "epregi", rather it's "preg".

Basically, your eregi_replace function is now preg_replace. From a quick glance, those patterns should still work in a basic form.

You have to add delimiters to the patterns. A delimiter I like to use is the tilde ("~") since it's not seen very often. You wrap your pattern in the delimiter, and then add any flags for the pattern after the closing delimiter. Eregi does a case insensitive search, so for preg the flag is "i".

So for example, your first eregi_replace would become:

$text = preg_replace('~(((f|ht){1}tp://)[-a-z0-9@:%_+.\~#?&/=]+)~i', '<a href="\\1">\\1</a>', $text);
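
As a rough illustration of the delimiter-plus-flag syntax (the sample string and variable names are made up for this example, not taken from the original script), the same pattern can be wrapped in different delimiters, as long as the opening and closing ones match and the flags come after the closing one:

```php
<?php
// Hypothetical sample input, not from the original script.
$text = 'Download from HTTP://example.com/file';

// Same pattern, two delimiter choices; the trailing "i" makes it case insensitive.
$withTilde = preg_replace('~http://[a-z0-9./]+~i', '[link]', $text);
$withHash  = preg_replace('#http://[a-z0-9./]+#i', '[link]', $text);

// Without the "i" flag the upper-case scheme is not matched, so nothing is replaced.
$caseSensitive = preg_replace('~http://[a-z0-9./]+~', '[link]', $text);

echo $withTilde, "\n";
echo $withHash, "\n";
echo $caseSensitive, "\n";
```

The choice of delimiter only matters when that character also appears inside the pattern.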

Hope that helps.

dj2ball · Feb 13, 2010

Thanks, that does explain it somewhat, although it now throws up this error:

Warning: preg_replace() [function.preg-replace]: Unknown modifier '#' in C:\wamp\www\include\search.php on line 309

based on this code:

$text = preg_replace('~(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)~i',
                '<a href="\\1">\\1</a>', $text);
        $text = preg_replace('~(((f|ht){1}tps://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)~i',
                '<a href="\\1">\\1</a>', $text);
        $text = preg_replace('~([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)~i',
        '\\1<a href="http://\\2">\\2</a>', $text);
        $text = preg_replace('~([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})~i',
        '<a href="mailto:\\1">\\1</a>', $text);

halojoy · Feb 13, 2010

I think putting a backslash in front of the ~ inside the character class might solve this.
The error comes from the fact that you added ~ as the delimiter, so the unescaped ~ in the pattern ends it early.

There is no longer an error. But I am not sure the regex will work.

$text = preg_replace('~(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)~i',
                '<a href="\\1">\\1</a>', $text); 
//change to
$text = preg_replace('~(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.\~#?&//=]+)~i',
                '<a href="\\1">\\1</a>', $text);
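
To see the difference in isolation, here is a small sketch with a shortened pattern and a made-up input string (both are assumptions for the demo). The unescaped ~ ends the pattern early, so PHP reads the rest as modifiers and preg_replace returns null; the escaped version compiles and matches:

```php
<?php
// Hypothetical input containing a "~" in the URL.
$text = 'See http://example.com/~user/ for details';

// Unescaped "~" inside the character class closes the pattern early;
// everything after it is treated as modifiers and rejected.
// The @ only silences the warning so the return value can be inspected.
$broken = @preg_replace('~(http://[-a-z0-9.~/]+)~i', '<a href="\\1">\\1</a>', $text);
var_dump($broken); // NULL, because the pattern failed to compile

// Escaping the "~" keeps it as a literal character in the class.
$fixed = preg_replace('~(http://[-a-z0-9.\~/]+)~i', '<a href="\\1">\\1</a>', $text);
echo $fixed, "\n";
```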

dj2ball · Feb 13, 2010

Superb, thank you!

nrg_alpha · Feb 14, 2010

When using characters within the pattern that are being used as delimiters, typically they must be escaped (there are oddball circumstances that don't require this, but it's not likely you'll run into such situations, so we won't go there).

Also note:

A quantifier like {1} is pointless, so (f|ht){1} is exactly the same as simply (f|ht).
Since you are using the i modifier after the closing delimiter (which makes the match case insensitive), you don't need to specify both a-z and A-Z in your pattern. The i modifier already takes care of this.
There are a few unnecessary nested capturing groups. Since the goal is simply to replace the whole match, we can eliminate the groups that serve no purpose and make the alternation groups non-capturing (by using ?: at the start, inside the parentheses).

So your current pattern could become something like:

'~(?:f|ht)tp://[-a-z0-9@:%_+.\~#?&/=]+~i' // use \\0 instead of \\1 in your replacement argument
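
A quick sketch of that simplified pattern in use, with \\0 standing for the whole match (the input string and the $linked variable are made up for the example):

```php
<?php
// Simplified pattern: non-capturing alternation, no {1}, no A-Z, "~" escaped.
$pattern = '~(?:f|ht)tp://[-a-z0-9@:%_+.\~#?&/=]+~i';

// Hypothetical input text.
$text = 'See http://example.com/page?id=1 and FTP://files.example.com/pub for details.';

// \\0 refers to the entire matched text, so no capturing group is needed.
$linked = preg_replace($pattern, '<a href="\\0">\\0</a>', $text);
echo $linked, "\n";
```

Note that . is part of the character class, so a URL immediately followed by a full stop would swallow the full stop too.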

If you use the following prior to your preg_replace line:

setlocale(LC_CTYPE, 'C');

this lets you condense the pattern even further. The character class shorthand \w (word character) is equal to a-zA-Z0-9_ (without that snippet, \w might include more characters, depending on your locale), so your pattern could be simplified somewhat:

'~(?:f|ht)tp://[-\w@:%+.\~#?&/=]+~i'
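
Put together, that might look like the sketch below. The input string and the $linked variable are assumptions for illustration only:

```php
<?php
// Force the "C" locale for character classification so \w means [A-Za-z0-9_].
setlocale(LC_CTYPE, 'C');

// \w replaces the a-z0-9_ part of the character class.
$pattern = '~(?:f|ht)tp://[-\w@:%+.\~#?&/=]+~i';

// Hypothetical input text.
$text = 'Docs at http://example.com/user_guide?page=2 (hypothetical URL)';

$linked = preg_replace($pattern, '<a href="\\0">\\0</a>', $text);
echo $linked, "\n";
```

The i modifier is still needed here so that the scheme (http, FTP and so on) matches in any case.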

Granted, despite all of this, any pattern in this thread will a) require the link to start with ftp or http, and b) accept malformed links like http://&&&&???###..........===== , but depending on the situation, this may not be an issue.
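
For reference, here is one way the whole toLink() function from the first post might look after applying the points above. This is only a sketch, not a tested drop-in: merging the http/ftp and https/ftps calls into one pattern with tps? and escaping the dot after www are my own assumptions about what the original intended, and the same caveats about malformed links still apply.

```php
<?php
// Sketch of the original toLink() rewritten with preg_replace().
function toLink($text)
{
    $text = html_entity_decode($text);
    $text = ' ' . $text;

    // ftp://, http://, ftps:// and https:// URLs; \\0 is the whole match.
    $text = preg_replace('~(?:f|ht)tps?://[-a-z0-9@:%_+.\~#?&/=]+~i',
        '<a href="\\0">\\0</a>', $text);

    // Bare www. addresses preceded by whitespace or a bracket.
    // Assumption: the dot after www is escaped so it only matches a literal dot.
    $text = preg_replace('~([[:space:]()[{}])(www\.[-a-z0-9@:%_+.\~#?&/=]+)~i',
        '\\1<a href="http://\\2">\\2</a>', $text);

    // E-mail addresses; the inner group is non-capturing.
    $text = preg_replace('~[_.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}~i',
        '<a href="mailto:\\0">\\0</a>', $text);

    return $text;
}

// Hypothetical usage.
echo toLink('Visit www.example.com or mail bob@example.com'), "\n";
```

As in the original, the returned string keeps the leading space that is prepended so the www pattern can match at the very start of the text.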

halojoy · Feb 14, 2010

Yes, I figured it out.
Had not thought of it before.
By trying a backslash and running it, I saw the errors disappear.

So, in fact it is a bit like with strings:
"djdhdhdh " fhfhfhf" .....will not work in PHP
"djdhdhdh \" fhfhfhf" .....will work

Thanks for your detailed info, nrg_alpha
Feels good to have you and others around,
if we need some really good Regex stuff. Which happens quite often
