I've successfully parsed a NITF-formatted XML file using the PEAR package XML_NITF. Once it's parsed, I take the pieces I need and insert them into a database. Here is the code I have for this part:
// Include the XML_NITF class
include 'XML/NITF.php';
// Initialize a new XML_NITF object
$nitf = new XML_NITF();
// Parse the source XML file(s)
$nitf->setInputFile("filename.xml");
$nitf->parse();
/**
* Grab all information needed
* Headline, Author, Document-ID, and the body of the article
*
*/
$headline = trim($nitf->getHeadline(), " ");
$author = trim($nitf->getByline(), " ");
$docid = $nitf->getDocData('doc-id');
$body = $nitf->getContent();
// Wrap paragraph tags around each part of the getContent() array
function paragraph_tag_wrap(&$p) {
    $p = "<p>$p</p>";
}
array_walk($body, 'paragraph_tag_wrap');
// Join the wrapped paragraphs into a single string
$content = implode('', $body);
// Connect to database
$db = "dbname";
$user = "username";
$pw = "password";
$mysqli = new mysqli("localhost", $user, $pw, $db);
if (mysqli_connect_errno()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}
// Escape every value that goes into the query, not just $content
$docid = $mysqli->real_escape_string($docid);
$headline = $mysqli->real_escape_string($headline);
$content = $mysqli->real_escape_string($content);
if ($mysqli->query("INSERT into tablename (id, headline, body) VALUES ('$docid', '$headline', '$content')")) {
    printf("%d row inserted.\n", $mysqli->affected_rows);
} else {
    printf("Error: %s\n", $mysqli->error);
}
$mysqli->close();
What I need to do now is open and read a directory full of .xml files (that are formatted the same way), parse each one, and finally insert what I have selected into its own row in my database. So in the end, each article will have its own row in the table.
So far I've tried opening and reading the directory containing all of the files using this bit of code (which works):
$dir = "./dirname";
$dir_array = array();
if (is_dir($dir)) {
    if ($dh = opendir($dir)) {
        while (($filename = readdir($dh)) !== false) {
            // Skip the current- and parent-directory entries
            if ($filename != "." && $filename != "..") {
                $dir_array[] = $filename;
            }
        }
        closedir($dh);
    }
}
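I suspect the same listing could be written more compactly with glob(), which would also skip anything that isn't an .xml file and return each name with the directory prefix attached. A minimal sketch, where list_xml_files() is just a helper name I made up:

```php
<?php
// Hypothetical helper: return the paths of all .xml files directly inside $dir.
function list_xml_files($dir) {
    // glob() returns paths that include the directory prefix
    $files = glob(rtrim($dir, '/') . '/*.xml');
    // glob() can return false on error; normalize that to an empty array
    return $files === false ? array() : $files;
}

$dir_array = list_xml_files("./dirname");
```

With this version each entry in $dir_array is a path such as ./dirname/article.xml rather than a bare filename.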
Once the directory has been read, I tried putting all of the code that parses a single file and inserts it into the database inside a for loop. I ended up with something like this:
for ($counter; $counter <= count($dir_array); $counter++) {
$nitf[$counter] =& new XML_NITF();
$nitf[$counter]->setInputFile("\"" . $dir_array[$counter] . "\"");
$nitf[$counter]->parse();
$headline = trim($nitf[$counter]->getHeadline(), " ");
$author = trim($nitf[$counter]->getByline(), " ");
$docid = $nitf[$counter]->getDocData($sProperty = 'doc-id');
...and so on.
Obviously this doesn't work. Could anyone point me in the right direction or tell me what I'm doing wrong? I'd really appreciate it.
Thank you.
Edit: I just realized that the XML_NITF PEAR extension has a reset() method that allows you to reset the parser so you can use one parser instance to parse multiple XML documents. However, I can't wrap my brain around how to implement this in what I want to do.
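My rough guess at how reset() might fit in is below. This is only a sketch under assumptions: parse_articles() is a name I invented, $parser is meant to be a single XML_NITF instance, error checking on setInputFile() and parse() is left out, and I don't know whether reset() also clears the data XML_NITF collected from the previous file.

```php
<?php
// Sketch only: parse_articles() is a hypothetical helper, not part of XML_NITF.
// $parser is expected to be one XML_NITF instance; $files is a list of paths.
function parse_articles($parser, array $files) {
    $articles = array();
    foreach ($files as $file) {
        // Reset before each document so the same instance can be reused
        $parser->reset();
        $parser->setInputFile($file);
        $parser->parse();

        // Wrap each paragraph returned by getContent() in <p> tags
        $body = '';
        foreach ($parser->getContent() as $p) {
            $body .= "<p>$p</p>";
        }

        // Collect one entry per article; each would become one INSERT
        $articles[] = array(
            'docid'    => $parser->getDocData('doc-id'),
            'headline' => trim($parser->getHeadline()),
            'body'     => $body,
        );
    }
    return $articles;
}
```

The idea would be to call it as parse_articles(new XML_NITF(), $paths), where $paths holds the full path to each file, and then loop over the result to insert one row per article.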