I want to write a script that creates metadata for an HTML page and also collects all the links the page contains. The script works like an indexer: it gathers the necessary keywords and also extracts the value of each HREF="value" attribute.
Could anybody give me a chunk of code showing how I can extract the HREF values contained in an HTML page?
Example:
<HEAD><TITLE>TITLE</TITLE></HEAD>
<BODY>
<A HREF = "http://mail.yahoo.com">Yahoo!! Mail</A><BR>
<A HREF = "aboutme.html">About Me</A>
<A HREF = "http://www.infoseek.com">Infoseek</A>
Hello, this is a sample PHP scripts
</BODY>
The script should produce this output:
----------output----------------------
Hello, this is a sample PHP scripts
Links contained:
1. Yahoo!! Mail [http://mail.yahoo.com]
2. About Me [aboutme.html]
3. Infoseek [http://www.infoseek.com]
----------end of output ---------------
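For illustration, here is a minimal sketch of one way such a script might look. It assumes PHP 7 or later with the bundled DOM extension (DOMDocument and DOMXPath); the helper names loadPage(), extractLinks() and extractBodyText() are made up for this example, and the page is embedded as a string rather than fetched from a file or URL. It is meant as a starting point, not a finished indexer.

```php
<?php
// Sketch only: relies on PHP's bundled DOM extension (DOMDocument, DOMXPath).
// loadPage(), extractLinks() and extractBodyText() are hypothetical helpers.

// Parse an HTML string into a DOM tree, tolerating sloppy markup.
function loadPage(string $html): DOMDocument
{
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    return $doc;
}

// Return every <A> element that has an HREF, as ['text' => ..., 'href' => ...].
function extractLinks(DOMDocument $doc): array
{
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $links[] = [
                'text' => trim($a->textContent),
                'href' => $a->getAttribute('href'),
            ];
        }
    }
    return $links;
}

// Return the body text that is not part of any link.
function extractBodyText(DOMDocument $doc): string
{
    $xpath = new DOMXPath($doc);
    $parts = [];
    foreach ($xpath->query('//body//text()[not(ancestor::a)]') as $node) {
        $text = trim($node->nodeValue);
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    return implode(' ', $parts);
}

// Essentially the sample page from the example.
$html = <<<'PAGE'
<HEAD><TITLE>TITLE</TITLE></HEAD>
<BODY>
<A HREF = "http://mail.yahoo.com">Yahoo!! Mail</A><BR>
<A HREF = "aboutme.html">About Me</A>
<A HREF = "http://www.infoseek.com">Infoseek</A>
Hello, this is a sample PHP scripts
</BODY>
PAGE;

$doc = loadPage($html);
echo extractBodyText($doc), "\n";
echo "Links contained:\n";
foreach (extractLinks($doc) as $i => $link) {
    printf("%d. %s [%s]\n", $i + 1, $link['text'], $link['href']);
}
```

Using a DOM parser rather than a regular expression means the spaces around the = sign and the upper-case tag names in the sample need no special handling. A regular expression could also work for simple pages, but it tends to break on unusual quoting or attributes split across lines.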
More power to all!!
Jun