Hi there - good morning
I need to parse the result pages of this site:
http://www.bildung-lsa.de/bildungsland/schulen_und_hochschulen.html#art5511
I need to parse all the HTML pages in order to get the results, e.g. like this one here:
Adresse:
Friedrich-Schiller-Gymnasium Calbe
39240 Calbe
Große Angergasse 10
Homepage: http://www.gym-calbe.info
Telefon: 039291/2560
Telefax: 039291/78874
E-Mail: kontakt@gym-schiller-calbe.bildung-lsa.de
This could be done with a SAX-style (event-driven) parser or something similar.
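Just to illustrate the event-driven idea, here is a minimal sketch in Python (not Perl), using only the standard-library html.parser module. The HTML fragment, the class name SchoolTextParser and the function parse_fields are made up for illustration; I have not checked the real markup of the result pages, so the handling of <br> as a line break is only an assumption.

```python
from html.parser import HTMLParser

# Hypothetical fragment; the real result pages may be marked up differently.
SAMPLE_HTML = """
<div class="school">
  <b>Adresse:</b><br>
  Friedrich-Schiller-Gymnasium Calbe<br>
  39240 Calbe<br>
  Große Angergasse 10<br>
  Homepage: <a href="http://www.gym-calbe.info">http://www.gym-calbe.info</a><br>
  Telefon: 039291/2560<br>
  Telefax: 039291/78874<br>
  E-Mail: kontakt@gym-schiller-calbe.bildung-lsa.de
</div>
"""

# Labels taken from the example record in the post.
LABELS = ("Homepage", "Telefon", "Telefax", "E-Mail")


class SchoolTextParser(HTMLParser):
    """Collects visible text, treating each <br> as a line break."""

    def __init__(self):
        super().__init__()
        self.lines = []      # finished text lines
        self._current = []   # text pieces of the line being built

    def _flush(self):
        # Join the pieces and collapse runs of whitespace.
        text = " ".join("".join(self._current).split())
        if text:
            self.lines.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self._flush()

    def handle_data(self, data):
        self._current.append(data)

    def close(self):
        super().close()
        self._flush()  # keep the text after the last <br>


def parse_fields(html):
    """Return a dict with the labelled fields plus the address lines."""
    parser = SchoolTextParser()
    parser.feed(html)
    parser.close()
    record = {"Adresse": []}
    for line in parser.lines:
        label, sep, value = line.partition(":")
        if sep and label in LABELS:
            record[label] = value.strip()
        elif line != "Adresse:":
            record["Adresse"].append(line)
    return record


record = parse_fields(SAMPLE_HTML)
print(record["Telefon"])
```

The unlabelled lines (school name, postcode and town, street) end up in the "Adresse" list in page order, and everything with a known label becomes its own key.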
We can do this with Perl too, for example with HTML::TreeBuilder::XPath.
So, working with TreeBuilder, I have to identify the XPath expressions for the result pages.
I will try to come up with some examples; I guess that would be a good way. Working out the paths for one page would be good preparation for parsing all the pages.
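To show what I mean by working out the paths on one page first, here is a rough sketch in Python (again not Perl). It uses xml.etree.ElementTree, which only understands a small subset of XPath and needs well-formed markup. The table layout, the class name "school", the ROW_PATH expression and the function extract_contacts are all assumptions of mine, not the real structure of the site. In Perl the equivalent expressions would presumably be passed to the findnodes and findvalue methods of HTML::TreeBuilder::XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment; the real pages may look quite different.
SAMPLE_XHTML = """
<html><body>
  <table class="school">
    <tr><td>Homepage:</td><td>http://www.gym-calbe.info</td></tr>
    <tr><td>Telefon:</td><td>039291/2560</td></tr>
    <tr><td>Telefax:</td><td>039291/78874</td></tr>
    <tr><td>E-Mail:</td><td>kontakt@gym-schiller-calbe.bildung-lsa.de</td></tr>
  </table>
</body></html>
"""

# Assumed path to the rows of the result table.
ROW_PATH = ".//table[@class='school']/tr"


def extract_contacts(xhtml):
    """Map each label cell (first td) to its value cell (second td)."""
    root = ET.fromstring(xhtml)
    contacts = {}
    for row in root.findall(ROW_PATH):
        label = row.findtext("td[1]", default="").strip().rstrip(":")
        value = row.findtext("td[2]", default="").strip()
        if label:
            contacts[label] = value
    return contacts


contacts = extract_contacts(SAMPLE_XHTML)
for label, value in contacts.items():
    print(label, "=>", value)
```

Once the path works for one page, the same function could be run in a loop over all the downloaded result pages, provided they share the same layout.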
I look forward to any ideas and help!
bernhard