I am trying to use DOMDocument's loadHTMLFile method to do some work with RSS feeds, but I have run into some difficulties.
The following code takes a news feed and prints out the titles in a list.
$dom = new DOMDocument();
$url = 'http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/uk/rss.xml';
$dom->loadHTMLFile($url);
echo '<h1>BBC UK News Headlines</h1>';
echo '<ul>';
$items = $dom->getElementsByTagName("item");
foreach ($items as $item) {
    $titles = $item->getElementsByTagName('title');
    foreach ($titles as $title) {
        $titleText = $title->firstChild->data;
    }
    $links = $item->getElementsByTagName('link');
    foreach ($links as $link) {
        $linkLoc = $link->firstChild->data;
        echo $linkLoc;
    }
    echo '<li><a href="' . $linkLoc . '">' . $titleText . '</a></li>';
}
echo '</ul><br/><br/>';
The problem is that it will not pull out the links. I can pull out any other data from the feed, just not the links.
To investigate further, I added the following line near the top of my code:
echo $dom->saveHtml();
When I look at the source of the output, the XML for the feed is intact except for one major error: each of the links is missing its closing </link> tag, which is why I cannot pull out the information.
Why is this? Is it something really simple that I am missing? I have now tried this on three news feeds: Yahoo, Google and the BBC.
Any help will be greatly appreciated. Thanks in advance.
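For comparison, here is a minimal sketch of the same loop run through the XML parser instead of the HTML parser. My working assumption, not a confirmed diagnosis, is that the HTML parser treats <link> as an empty element (as it is in HTML), so the URL text ends up outside the element and the closing tag disappears. The inline feed and the example.com URLs are made up so the snippet runs without network access; for a real feed, $dom->load($url) would be the equivalent call.

```php
<?php
// Hypothetical two-item feed standing in for the real RSS; not real data.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
// Parse as XML rather than HTML, so <link> is an ordinary element with text content.
// For a remote feed, $dom->load($url) would replace this call.
$dom->loadXML($xml);

// Collect one title/link pair per <item>.
$headlines = [];
foreach ($dom->getElementsByTagName('item') as $item) {
    $title = $item->getElementsByTagName('title')->item(0);
    $link  = $item->getElementsByTagName('link')->item(0);
    $headlines[] = [
        'title' => $title ? $title->textContent : '',
        'link'  => $link ? $link->textContent : '',
    ];
}

// Print the list, escaping values before placing them in HTML.
echo "<ul>\n";
foreach ($headlines as $h) {
    echo '<li><a href="' . htmlspecialchars($h['link']) . '">'
        . htmlspecialchars($h['title']) . "</a></li>\n";
}
echo "</ul>\n";
```

Taking only the first <title> and <link> inside each <item> with item(0) also removes the need for the inner foreach loops.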