file parsing/ regular expressions

Anon

I have a big fat file to parse. The file looks like this:

-- new bolivia --
<title>story a</title>
<author>kl an</author>
<story>
klan2000-10-26.txt
</story>
<email>my@php.net</email>
-- new china --
<title>story s</title>
<author>op ag</author>
<story>
opag2000-10-25.txt
</story>
<email>my2@php.net</email>

Now, I am getting all the elements between the tags using the following code:
preg_match_all ("|<[^>]+>(.*)</[^>]+>|U", $lines_string, $out, PREG_PATTERN_ORDER);
$number_items = count($out[0]);

The $out[0] array now contains the following (for the first record):
$out[0][0] = <title>story a</title>
$out[0][1] = <author>kl an</author>
$out[0][2] = <email>my@php.net</email>

NOTICE THAT <story>..</story> was not included; its opening tag, content and closing tag are on different lines. I don't know why <story> is missing from the $out array.

Anyone know how to extract <story>? My guess is that the regular expression is failing on the newline character! So what I am looking for is a way to extract the element from the following:
<story>
element
</story>

AND NOT
<story>element</story>

Notice the line breaks.

Any solutions/ Regular expressions out there?
nilesh :-)

Anon

Use eregi_replace() on $lines_string to remove the newlines before calling preg_match_all().
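
For what it's worth, the reason <story> is skipped is that "." in a PCRE pattern does not match a newline by default. A possible alternative to stripping the newlines is to add the "s" (dotall) modifier next to "U". The snippet below is only a minimal sketch: the sample string is a shortened, made-up version of the file in the question, and $values is a name introduced here for illustration. Note also that eregi_replace() has been removed from PHP as of version 7, so preg_replace() or str_replace() would be needed for the stripping approach on current versions.

```php
<?php
// Shortened, hypothetical sample modelled on the file in the question.
$lines_string = "<title>story a</title>\n<story>\nklan2000-10-26.txt\n</story>\n";

// U = ungreedy (as in the original pattern); s = let "." match newlines too.
preg_match_all("|<[^>]+>(.*)</[^>]+>|Us", $lines_string, $out, PREG_PATTERN_ORDER);

// The story value keeps its surrounding newlines, so trim each capture.
$values = array_map('trim', $out[1]);

print_r($values);
```

With this approach the line breaks inside a multi-line story stay intact, which may or may not be what you want.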

Anon

Thanks, that pointed the way to my solution:
1> strip all the \r\n
2> feed this new string to preg_match_all
3> extract the <items>

Here's how:
$text = str_replace("\r\n","",$lines_string);
preg_match_all ("|<ITEM>(.*)</ITEM>|U", $text, $out, PREG_PATTERN_ORDER);
$nitems = count($out[0]);

$out[1][$i] (for $i from 0 to $nitems - 1) then holds all the extracted elements.

cheers,
nilesh :-)

Note: stripping \r\n is for files from Windows; on Unix I believe \n is enough.
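
If the file might come from either platform, one way to avoid choosing is to strip "\r" and "\n" separately. This is only a sketch under assumptions: the sample string below is made up (one record with Windows line endings, one with Unix), and <story> stands in for the <ITEM> placeholder used above.

```php
<?php
// Hypothetical sample: first record uses \r\n, second uses \n.
$lines_string = "<story>\r\nklan2000-10-26.txt\r\n</story>\n"
              . "<story>\nopag2000-10-25.txt\n</story>\n";

// Remove every CR and LF, whichever platform wrote the file.
$text = str_replace(array("\r", "\n"), "", $lines_string);

// Same ungreedy pattern as above, with <story> as the tag of interest.
preg_match_all("|<story>(.*)</story>|U", $text, $out, PREG_PATTERN_ORDER);
$nitems = count($out[1]);

for ($i = 0; $i < $nitems; $i++) {
    echo $out[1][$i], "\n";
}
```

Keep in mind that removing the newlines also joins any text that was spread over several lines inside a tag, so this suits single-line values like the file names here.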