Lots of sites use another website's API, like Google's, or a third party like www.atomz.com.
But if you want to make your own, set up three tables:
table_Words
ID,word,metaphone
table_Link
ID,word_ID,page_ID
table_Pages
ID,page,title,descrip
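For what it's worth, here is a rough sketch of those three tables using Python's built-in sqlite3 module. The table and column names come from the list above; the column types, the UNIQUE constraints, and the in-memory database are my own assumptions, so adjust them for whatever database you actually use.

```python
import sqlite3

# Column names follow the three tables listed above.
# Types and constraints are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS table_Words (
    ID        INTEGER PRIMARY KEY,
    word      TEXT NOT NULL UNIQUE,   -- one row per distinct word
    metaphone TEXT                    -- phonetic key, filled in separately
);
CREATE TABLE IF NOT EXISTS table_Pages (
    ID      INTEGER PRIMARY KEY,
    page    TEXT NOT NULL UNIQUE,     -- the URL
    title   TEXT,
    descrip TEXT
);
CREATE TABLE IF NOT EXISTS table_Link (
    ID      INTEGER PRIMARY KEY,
    word_ID INTEGER NOT NULL REFERENCES table_Words(ID),
    page_ID INTEGER NOT NULL REFERENCES table_Pages(ID),
    UNIQUE (word_ID, page_ID)         -- one link per word/page pair
);
"""


def create_schema(conn):
    """Create the three tables if they do not exist yet."""
    conn.executescript(SCHEMA)


# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
create_schema(conn)
```

The UNIQUE constraints are there so the indexing step can skip words and links that already exist instead of checking by hand.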
Feed a little script every URL you want to index. Have it first strip the HTML, then insert words into your word table (or don't, if the word is already there), the page into your page table, and links between the two into your link table.
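A minimal sketch of that indexing step, again in Python with sqlite3. Everything here beyond the table and column names is an assumption: the helper names (TextExtractor, strip_html, index_page) are made up, words are taken to be lowercase runs of letters and digits, and the HTML is passed in as a string rather than fetched from the URL. The metaphone column is left empty because Python's standard library has no metaphone function.

```python
import re
import sqlite3
from html.parser import HTMLParser

# Compact version of the three tables; types and UNIQUE constraints are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS table_Words (ID INTEGER PRIMARY KEY, word TEXT UNIQUE, metaphone TEXT);
CREATE TABLE IF NOT EXISTS table_Pages (ID INTEGER PRIMARY KEY, page TEXT UNIQUE, title TEXT, descrip TEXT);
CREATE TABLE IF NOT EXISTS table_Link  (ID INTEGER PRIMARY KEY, word_ID INTEGER, page_ID INTEGER,
                                        UNIQUE (word_ID, page_ID));
"""


class TextExtractor(HTMLParser):
    """Collects text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_html(html):
    """Return the text of an HTML document with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def index_page(conn, url, html, title="", descrip=""):
    """Store the page, its distinct words, and the links between them."""
    words = set(re.findall(r"[a-z0-9]+", strip_html(html).lower()))

    # INSERT OR IGNORE relies on the UNIQUE constraints to skip existing rows.
    conn.execute(
        "INSERT OR IGNORE INTO table_Pages (page, title, descrip) VALUES (?, ?, ?)",
        (url, title, descrip),
    )
    page_id = conn.execute(
        "SELECT ID FROM table_Pages WHERE page = ?", (url,)
    ).fetchone()[0]

    for word in words:
        conn.execute("INSERT OR IGNORE INTO table_Words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT ID FROM table_Words WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO table_Link (word_ID, page_ID) VALUES (?, ?)",
            (word_id, page_id),
        )
    conn.commit()
    return page_id
```

To try it, open a connection, run `conn.executescript(SCHEMA)`, and call `index_page(conn, url, html)` once per page. Fetching the HTML for each URL is left out on purpose.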
When someone runs a search, find the word_ID for the search word, then find the links with that word_ID, then select all the pages those link rows point to.
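The lookup could be sketched like this, assuming a single-word search and the same three tables in SQLite. The search function name and the sample rows are made up for illustration, and the last two steps (links, then pages) are folded into one JOIN.

```python
import sqlite3


def search(conn, word):
    """Return (page, title, descrip) rows for every page linked to the word."""
    # Step 1: find the word_ID.
    row = conn.execute(
        "SELECT ID FROM table_Words WHERE word = ?", (word.lower(),)
    ).fetchone()
    if row is None:
        return []
    # Steps 2 and 3: follow the link rows for that word_ID to their pages.
    return conn.execute(
        """
        SELECT p.page, p.title, p.descrip
        FROM table_Link AS l
        JOIN table_Pages AS p ON p.ID = l.page_ID
        WHERE l.word_ID = ?
        ORDER BY p.page
        """,
        (row[0],),
    ).fetchall()


# Tiny in-memory database with made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_Words (ID INTEGER PRIMARY KEY, word TEXT UNIQUE, metaphone TEXT);
CREATE TABLE table_Pages (ID INTEGER PRIMARY KEY, page TEXT UNIQUE, title TEXT, descrip TEXT);
CREATE TABLE table_Link  (ID INTEGER PRIMARY KEY, word_ID INTEGER, page_ID INTEGER);
INSERT INTO table_Words (ID, word) VALUES (1, 'hello'), (2, 'world');
INSERT INTO table_Pages (ID, page, title, descrip) VALUES
    (1, 'http://example.com/a', 'Page A', 'first sample page'),
    (2, 'http://example.com/b', 'Page B', 'second sample page');
INSERT INTO table_Link (word_ID, page_ID) VALUES (1, 1), (1, 2), (2, 1);
""")

for page, title, descrip in search(conn, "hello"):
    print(page, title, descrip)
```

For multi-word searches you would repeat the lookup per word and intersect or merge the page lists, depending on whether you want AND or OR behaviour.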