Hello,

I'm trying to figure out if I've run into a bug or a string size limit in preg_replace using PHP 5.0.4 on Debian Linux.

I've got a large string of approximately 13770 chars that runs through a preg_replace and causes a segmentation fault. However, when I knock off a handful of characters to get it below a specific size, the preg_replace seems to work fine.

I've set up the same test on Mac OS X running PHP 5.0.4 and it seems to work fine and does not throw the seg fault.

Has anyone else run into anything like this before?

Any help or insight would be greatly appreciated.

John


Here's the code I'm using to test and debug the preg_replace function:

// Note chars removed due to posting size restrictions.
$str = "abc@@[ ]|[@dt[#1 food retailer in the
Netherlands and Schuitema, a Dutch retailer 
and food distributor (73%-owned). Other 
interests include online food retailer Peapod and
 a 60% stake in Scandinavian food seller ICA 
AB.]dt@]|[]@@ @@[ ]|[@dt[
<p>James Miller, ex-CEO of U.S. Foodservice, 
filed a lawsuit against his former employer in 
February 2004 for failing to pay severance and 
retirement benefits and $10 million in 
damages. Four former senior executives of U.S. 
Foodservice have been charged with fraud for 
artificially inflating the company's earnings by 
some $800 million from 2000 to 2002. Miller 
has not been charged.</p>
<p>Royal Ahold plans to move its corporate 
headquarters from Zaandam to Amsterdam by 
the end of 2005.</p>]dt@]|[]@@       @@[ 
]|[@dt[Albert Heijn and his wife took over his 
father's grocery store in Ootzaan, Netherlands, 
in 1887. By the end of WWI, the company had 
50 Albert Heijn grocery stores in Holland, and at
 WWII's end it had almost 250 stores. In 1948 
the company went public.]dt@]|[]@@                 
1";
echo "\nstrlen: ".strlen($str)."\n";
$str = preg_replace("/@@\[((.|\n|\r\n)*?)\]@@/e","strlen('$1')-(substr_count('$1','\"')+1)",$str);
echo "\n\n".$str."\n\n";

    Hello, does it segfault on any string of this size? I do not know what the cause is, but that is the first thing I would check.

      Hmmm...good question.

      Apparently it does not. I passed through a larger string that does not contain the conditions to do the preg_replace and no seg fault.

      However, if I wrap the string in the surrounding @@[ ]|[@dt[ ... ]dt@]|[]@@ strings to match the preg_replace condition I get the seg fault.

      Also, when I bump the string char count up over 20000, I now get the seg fault on the Mac as well.

      Any thoughts?

      thanks,
      John

        Here's something interesting...

        When I remove the explicit match on the ".|\n|\r\n" chars in the reg exp and match on any character using just "." like:

        $str = preg_replace("/@@\[((.)*?)\]@@/e","strlen('$1')-(substr_count('$1','\"')+1)",$str);

        the seg fault seems to go away.

        Does anyone know if "." will match on the end of line chars in preg_replace so that they don't have to be explicit in the code?

        Thanks,
        John
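On the question of whether "." matches end-of-line chars: by default it does not match a newline, but the "s" modifier (PCRE_DOTALL) makes it match every character, newlines included. A minimal sketch to check this, with a short made-up sample string standing in for the real data:

```php
<?php
// Made-up sample: one @@[ ... ]@@ block that spans two lines.
$subject = "@@[line one\nline two]@@";

// Without the "s" modifier, "." stops at the newline, so the block is not matched.
$withoutS = preg_match('/@@\[(.*?)\]@@/', $subject);

// With the "s" modifier (PCRE_DOTALL), "." also matches "\n" and "\r".
$withS = preg_match('/@@\[(.*?)\]@@/s', $subject);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```

If that holds for the real data too, the seg fault may only seem to go away with the plain "." pattern because it no longer matches the multi-line blocks at all.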

          Maybe you're hitting catastrophic backtracking, triggered by the amount of memory used when a large amount of input goes in. I had a batch of that going on and managed to fix it by restructuring the patterns the way that page suggests.
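One possible restructuring, as a sketch rather than a confirmed fix: drop the alternation inside the repeated group, "(.|\n|\r\n)*?", and use a single lazy ".*?" with the "s" modifier instead, so the engine does not have to try several alternatives for every character. The sketch also swaps the "e" modifier for preg_replace_callback, since "e" was deprecated in PHP 5.5 and removed in PHP 7. The function name block_length and the sample string are made up for illustration. The callback returns the plain length of each block and leaves out the original quote-count adjustment, on the assumption that it was there to compensate for the escaping that "e" applies to backreferences.

```php
<?php
// Hypothetical helper: returns the length of the text captured
// between "@@[" and "]@@" as a string.
function block_length($matches)
{
    return (string) strlen($matches[1]);
}

// Made-up sample in the same shape as the original data.
$str = "abc@@[ ]|[@dt[first\nblock]dt@]|[]@@ and @@[second]@@ end";

// ".*?" with the "s" modifier replaces "(.|\n|\r\n)*?".
$result = preg_replace_callback('/@@\[(.*?)\]@@/s', 'block_length', $str);

echo $result, "\n"; // abc26 and 6 end
```

A named function is used instead of a closure so that the sketch also runs on PHP versions older than 5.3.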
