I'm creating a PHP program that will grab information from several external web pages. The structure of these pages may change over time.
It will extract the URL and description of each item from lists found on those pages.
What I was thinking of doing was creating a table for each web page that contains:
main_start
main_end
url_start
url_end
desc_start
desc_end
... so that these markers can be updated in the db when the structure of an external page changes.
My idea is to grab the entire HTML, and then cut out everything outside "main_start" and "main_end".
Then I will parse the remaining HTML, using "url_start" ... "desc_end" to find each URL and its description.
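Roughly, this is a minimal sketch of what I have in mind. The function names `between` and `extractLinks`, the sample HTML, and the marker values are all made up for illustration; in the real program the markers would be loaded from the db and the HTML fetched from the external page.

```php
<?php
// Hypothetical helper: returns the text found between $start and $end,
// searching from $offset. On success, $offset is moved to just past $end.
// Returns null if either marker cannot be found.
function between(string $html, string $start, string $end, int &$offset): ?string
{
    $from = strpos($html, $start, $offset);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    $offset = $to + strlen($end);
    return substr($html, $from, $to - $from);
}

// Hypothetical function: cuts the page down to the main block, then
// repeatedly pulls out a URL followed by its description.
function extractLinks(string $html, array $m): array
{
    $pos = 0;
    $main = between($html, $m['main_start'], $m['main_end'], $pos);
    if ($main === null) {
        return []; // main markers not found, the page structure probably changed
    }

    $links = [];
    $pos = 0;
    while (true) {
        $url = between($main, $m['url_start'], $m['url_end'], $pos);
        if ($url === null) {
            break;
        }
        $desc = between($main, $m['desc_start'], $m['desc_end'], $pos);
        if ($desc === null) {
            break;
        }
        $links[] = ['url' => $url, 'desc' => trim($desc)];
    }
    return $links;
}

// Made-up marker values standing in for one record from the db.
$markers = [
    'main_start' => '<ul id="links">',
    'main_end'   => '</ul>',
    'url_start'  => '<a href="',
    'url_end'    => '"',
    'desc_start' => '>',
    'desc_end'   => '</a>',
];

// Made-up sample page.
$html = '<html><body><ul id="links">'
      . '<li><a href="http://example.com/a">First</a></li>'
      . '<li><a href="http://example.com/b">Second</a></li>'
      . '</ul></body></html>';

$links = extractLinks($html, $markers);
print_r($links);
```

An empty result would be my signal that the markers for that page need updating.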
Does this seem like a good approach? Can anyone here think of a better approach?