I need to extract all <p> tags from a site in Italian.
function getTextBetweenTags1($tag, $html, $strict=0)
{
    /*** a new dom object ***/
    $dom = new DOMDocument;
    /*** discard white space (set before loading so it can take effect) ***/
    $dom->preserveWhiteSpace = false;
    /*** load the html into the object ***/
    if($strict==1)
    {
        $dom->loadXML($html);
    }
    else
    {
        $dom->loadHTML($html);
    }
    /*** the tag by its tag name ***/
    $content = $dom->getElementsByTagName($tag);
    /*** the array to return ***/
    $out = array();
    foreach ($content as $item)
    {
        /*** add node value to the out array ***/
        $out[] = $item->nodeValue;
    }
    /*** return the results ***/
    return $out;
}
<?php
$content = getTextBetweenTags1('li', $html);
foreach( $content as $item )
{
echo $item.'.';
}
?>
My problem is that it does not recognize accented characters.
"con un certo miglioramento del testo e, cosa più importante, con le firme dei sottoscrittori. scusate la ripetizione. che però, dicevano gli antichi, iuvat. un caro saluto. "
I need your help.
Regards, Cristian
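
A possible cause, offered as a guess rather than a confirmed diagnosis: when the markup carries no charset declaration, loadHTML() tends to read the bytes as ISO-8859-1, so UTF-8 letters such as "ù" and "ò" come out garbled. The sketch below assumes the page really is UTF-8 and prepends a meta charset hint before parsing. The function name getTextBetweenTagsUtf8 and the sample string are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical variant of getTextBetweenTags1() (name invented for this sketch).
// Assumption: $html is UTF-8 encoded.
function getTextBetweenTagsUtf8($tag, $html)
{
    $dom = new DOMDocument();
    // Collect parser warnings about sloppy markup instead of printing them
    $previous = libxml_use_internal_errors(true);
    // Prepend a charset declaration so the parser decodes the input as UTF-8
    $hint = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $dom->loadHTML($hint . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Gather the text content of every matching element
    $out = array();
    foreach ($dom->getElementsByTagName($tag) as $item) {
        $out[] = $item->nodeValue;
    }
    return $out;
}

// Sample input invented for the example
$html = '<p>cosa più importante</p><p>che però</p>';
$result = getTextBetweenTagsUtf8('p', $html);
foreach ($result as $item) {
    echo $item . "\n";
}
```

If the site is actually served as ISO-8859-1 or Windows-1252, the hint would need to name that charset instead, so it is worth checking the page's Content-Type header first.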