Hello,
I am parsing the following XML file with SimpleXML. The file parses successfully, but when the parsed content (the text of the Headline tags) is stored in a variable, the entity codes are converted into the characters they represent (for example, the code for an em dash becomes the literal — character). I want the entity codes used in the XML to be left as they are rather than replaced by their characters, because I need to apply further transformations to the parsed text.
If anybody knows how to parse XML so that entity codes are not replaced by their characters, please reply.
I am attaching the XML and the PHP script.
XML file name: sample.xml
<?xml version="1.0" encoding="UTF-8"?>
<Document>
<Headline>
<p>—Headline Text.</p>
<p>—Creator Name</p>
</Headline>
</Document>
The Headline paragraphs contain entity codes, such as the one for —. I am using the following PHP script to parse the XML file above.
testing.php
--------------
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<title>Testing</title>
</head>
<body>
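<?php
$fileName = "sample.xml";
$filePath = $fileName;
// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_file($filePath);
if ($xml === false) {
    // Parsing failed: print the details of each libxml error.
    foreach (libxml_get_errors() as $error) {
        echo "<br />" . $error->file;
        echo "<br />" . $error->line;
        echo "<br />" . $error->column;
        echo "<br />" . $error->message;
    }
    libxml_clear_errors();
} else {
    // Initialize before appending to avoid an undefined-variable notice.
    $headlineText = "";
    foreach ($xml->Headline->p as $para) {
        // Casting $para to a string yields the decoded text, so entities appear as characters here.
        $headlineText .= "<p>" . $para . "</p>";
    }
    error_log($headlineText);
}
?>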
</body>
</html>
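One workaround that might be worth trying is sketched below. It is only a sketch under assumptions: it assumes the entity codes are numeric character references such as `&#8212;` (the exact code in sample.xml is not shown above), and `protectNumericEntities` is a hypothetical helper name, not part of SimpleXML. The idea is to escape the leading ampersand of each reference in the raw XML before parsing, so the parser returns the reference as literal text instead of decoding it.

```php
<?php
// Hypothetical helper: escape the "&" of numeric character references
// (decimal like &#8212; or hex like &#x2014;) so the XML parser treats
// them as plain text rather than decoding them.
function protectNumericEntities(string $rawXml): string
{
    return preg_replace('/&(#[0-9]+;|#x[0-9a-fA-F]+;)/', '&amp;$1', $rawXml);
}

// Inline stand-in for sample.xml; the &#8212; reference is an assumption.
$rawXml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<Document><Headline>'
    . '<p>&#8212;Headline Text.</p>'
    . '<p>&#8212;Creator Name</p>'
    . '</Headline></Document>';

libxml_use_internal_errors(true);
$xml = simplexml_load_string(protectNumericEntities($rawXml));

$headlineText = '';
if ($xml !== false) {
    foreach ($xml->Headline->p as $para) {
        // The string value now contains the reference itself, e.g. "&#8212;Headline Text."
        $headlineText .= '<p>' . $para . '</p>';
    }
}
echo $headlineText;
```

With a real file, the raw text could be read with `file_get_contents($filePath)` and passed through the same helper before `simplexml_load_string`. Note that this simple pattern also rewrites references that appear inside CDATA sections or comments, where they were already literal text, so it may need refinement for such documents.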