I'm using preg_split to split HTML text composed of multiple paragraphs into an array of separate paragraphs, minus the paragraph tags. The way I'm doing it feels clunky (though it DOES work) - I figure there must be a more elegant approach. I tried preg_match_all but I ended up with an array of arrays - not sure why but I must have done something wrong.
Here's how I have been doing it:
$this->text = preg_split("/<p>/", $text, -1, PREG_SPLIT_NO_EMPTY);
foreach ($this->text as $k => $v)
{
    $this->text[$k] = preg_replace("/<\/p>/", '', $v); // get rid of end "p" tags
}
This was my attempt with preg_match_all:
$pattern = "/<p>.*<\/p>/";
preg_match_all($pattern, $text, $this->text);
But instead of netting me an array where each element was a paragraph, it gave me an array with one element, and THAT element was an array where each element was a paragraph, i.e. it nested everything one extra level deep. Also, I'd love to strip off the <p> tags in the process. I tried parentheses around the "contents", i.e. (.*), but that just gave me another nested array element. Obviously I don't understand preg_match_all.
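For reference, the nesting described above looks like the documented behaviour of preg_match_all under its default PREG_PATTERN_ORDER flag: index 0 of the result holds every full match, and each parenthesized group adds one more sub-array (index 1, 2, ...). A minimal sketch, assuming made-up sample text and a non-greedy pattern in place of the greedy .* used above:

```php
<?php
// Sample input is made up for illustration.
$text = "<p>First paragraph.</p>\n<p>Second paragraph.</p>";

// Non-greedy (.*?) so each match stops at the nearest </p>;
// the "s" modifier lets "." match newlines inside a paragraph.
$pattern = '/<p>(.*?)<\/p>/s';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches, tags included;
// $matches[1] holds what the first parenthesized group captured.
$paragraphs = $matches[1];

print_r($paragraphs);
```

Under those assumptions, $matches[1] would be the flat list of paragraph contents without the tags, so picking that one index out of the result may be all that is needed.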
I'm thinking preg_split is the way to go, but I wish it were possible to strip off the end </p> tags at the same time.
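One possibility, sketched here under the assumption that the tags are always a bare <p> or </p> with no attributes (as in the patterns above), is to split on either tag in a single pattern. The sample text is made up, and the surrounding \s* is an extra guess to swallow whitespace between paragraphs so it does not come back as its own element:

```php
<?php
// Sample input is made up for illustration.
$text = "<p>First paragraph.</p>\n<p>Second paragraph.</p>";

// "<\/?p>" matches both <p> and </p>; the \s* on each side also
// consumes whitespace touching a tag, so it is trimmed from the pieces.
$paragraphs = preg_split('/\s*<\/?p>\s*/', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($paragraphs);
```

PREG_SPLIT_NO_EMPTY then drops the empty pieces left between a closing tag and the next opening tag, which would make the separate preg_replace loop unnecessary.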
Any tips?