I would do something similar to:
read the desired page(s) with file_get_contents()
find the portion of the page to extract with a preg_match() (or preg_replace()) pattern, e.g. everything between two comments, then possibly explode() to break the string into single posts, depending on what content you're looking to extract
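A rough sketch of those first two steps (the URL and the comment markers here are made up, you'd swap in whatever brackets the content on the real page):

```php
<?php
// Hypothetical URL -- replace with the page you actually want to scrape.
$html = file_get_contents('http://example.com/posts.html');

// Capture everything between two HTML comments that bracket the content.
// The /s modifier lets . match newlines so the pattern can span lines.
if (preg_match('/<!-- posts start -->(.*?)<!-- posts end -->/s', $html, $m)) {
    // Split the captured block into individual posts on a
    // hypothetical per-post divider comment.
    $posts = explode('<!-- post -->', $m[1]);
}
```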
fopen() & fread() or file_get_contents() to get the current XML contents
check that the content is not already present in the XML with strstr() or a similar string-search function
open the file for reading and writing with fopen(), fread(), and fwrite(), or use one of PHP's XML extensions (SimpleXML or DOMDocument) to keep the document well-formed.
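Something like this for the dedupe-and-append steps, assuming feed.xml already exists and has an RSS-style &lt;channel&gt; element (that structure is an assumption, adjust to your actual feed):

```php
<?php
$feedFile = 'feed.xml';
$xmlText  = file_get_contents($feedFile);
$xml      = simplexml_load_string($xmlText);

foreach ($posts as $post) {   // $posts comes from the scraping step
    // Naive duplicate check: skip posts already present verbatim.
    // (Won't match if the stored copy was entity-escaped differently.)
    if (strstr($xmlText, $post) !== false) {
        continue;
    }
    $item = $xml->channel->addChild('item');
    // Property assignment escapes entities for us, unlike raw fwrite().
    $item->description = $post;
}

// Write the whole document back in one go; SimpleXML keeps it well-formed.
file_put_contents($feedFile, $xml->asXML());
```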
Or possibly, take the site's contents, put them in a database, and generate the XML from that.
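The database route could look something like this, using SQLite via PDO (the table name and layout here are just for illustration). A UNIQUE column takes care of the duplicate check for you:

```php
<?php
// Made-up table: one row per post, UNIQUE so re-inserts are ignored.
$db = new PDO('sqlite:posts.db');
$db->exec('CREATE TABLE IF NOT EXISTS posts
           (id INTEGER PRIMARY KEY, body TEXT UNIQUE)');

// INSERT OR IGNORE skips rows that would violate the UNIQUE constraint.
$stmt = $db->prepare('INSERT OR IGNORE INTO posts (body) VALUES (?)');
foreach ($posts as $post) {   // $posts from the scraping step
    $stmt->execute([$post]);
}

// Regenerate the whole feed from the table each time.
$xml = new SimpleXMLElement('<feed/>');
foreach ($db->query('SELECT body FROM posts') as $row) {
    $item = $xml->addChild('item');
    $item->body = $row['body'];   // assignment handles entity escaping
}
file_put_contents('feed.xml', $xml->asXML());
```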
I've never done anything like this, and that's just how I (an amateur) would go about it. Obviously you're going to run into some problems, one being: how do you regulate the updates of the XML feed? That's a lot of processing to execute every time it's requested.
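One simple way around that would be to cache the generated file and only rebuild it when it goes stale, e.g. by checking its modification time (the interval and the rebuild_feed() function are hypothetical placeholders for the steps above):

```php
<?php
$feedFile = 'feed.xml';
$maxAge   = 3600;   // rebuild at most once an hour -- arbitrary choice

// Only regenerate if the cached copy is missing or older than $maxAge.
if (!file_exists($feedFile) || time() - filemtime($feedFile) > $maxAge) {
    rebuild_feed($feedFile);   // hypothetical wrapper around the scrape/write steps
}

// Serve the cached file either way.
header('Content-Type: application/xml');
readfile($feedFile);
```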
I'm sure you'll find lots of helpful XML functions in the XML section on php.net.