I own and operate a review website with a few thousand products. To make the site more user-friendly, I would like people to be able to find a product even if they don't spell its name correctly.
Instead of just automatically selecting products that their search was close to, I would like to try to create something like Google's "Did you mean: XXXX" search correction, so that the user knows they spelled the name incorrectly.
I'm new to fuzzy searching, but this is my plan so far...
Start off by creating a list of unique words taken from product names (they include manufacturer names) and storing them in a table.
For example:
This...
Ford Focus
Ford Explorer
Mercedes Benz SLK
Would become...
Ford
Focus
Explorer
Mercedes
Benz
SLK
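A rough sketch of that step in PHP, assuming the product names are already loaded into an array (the function name `buildWordList` is just a placeholder of mine, and in practice the result would be written to the table rather than kept in memory):

```php
<?php
// Hypothetical helper: collect the unique words from a list of product names.
function buildWordList(array $productNames): array
{
    $words = [];
    foreach ($productNames as $name) {
        // Split on runs of whitespace or dashes, dropping empty pieces.
        foreach (preg_split('/[\s\-]+/', $name, -1, PREG_SPLIT_NO_EMPTY) as $word) {
            // Key by the lowercase form so "Ford" and "ford" count once,
            // but keep the first spelling seen for display.
            $key = strtolower($word);
            if (!isset($words[$key])) {
                $words[$key] = $word;
            }
        }
    }
    return array_values($words);
}

$wordList = buildWordList(['Ford Focus', 'Ford Explorer', 'Mercedes Benz SLK']);
// $wordList is ['Ford', 'Focus', 'Explorer', 'Mercedes', 'Benz', 'SLK']
```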
When a user searches, their query will be split into words on spaces and dashes. Each word from the query will be compared against the table of words compiled from the product names. If the word exactly matches a word in the table, it is fine as is; otherwise, we will try to find a word in the table that it is close to...
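The exact-match part might look roughly like this. It is only a sketch: an in-memory array stands in for the table, the comparison is case-insensitive (my assumption), and `findUnknownWords` is a made-up name.

```php
<?php
// Hypothetical helper: split a query into words and return the ones that
// have no exact (case-insensitive) match in the word list.
function findUnknownWords(string $query, array $wordList): array
{
    // Flip the lowercased list so each lookup is a cheap isset() check.
    $known = array_flip(array_map('strtolower', $wordList));
    $unknown = [];
    foreach (preg_split('/[\s\-]+/', $query, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (!isset($known[strtolower($word)])) {
            $unknown[] = $word;
        }
    }
    return $unknown;
}

$wordList = ['Ford', 'Focus', 'Explorer', 'Mercedes', 'Benz', 'SLK'];
$unknown = findUnknownWords('ford focas', $wordList);
// $unknown is ['focas'], so only that word needs a fuzzy lookup
```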
Finding that close word is where I am unsure what to use. I've played around with soundex, similar_text, metaphone, and levenshtein a little, but I am still not sure which would be best in this case. I've even read that people use two of the functions together (e.g. levenshtein and metaphone) for the best results.
http://www.php.net/similar_text
http://www.php.net/levenshtein
http://www.php.net/metaphone
http://www.php.net/soundex
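To show the kind of combination I mean, here is a sketch that ranks candidates by levenshtein distance on the spelling and uses the levenshtein distance between metaphone keys only to break ties. The name `suggestWord`, the cutoff of 2, and the tie-breaking rule are all my own guesses, not something I have tested on real data:

```php
<?php
// Hypothetical helper: suggest the closest known word for a misspelled one,
// or null when nothing is close enough. $maxDistance = 2 is an arbitrary
// cutoff chosen for illustration, not a tuned value.
function suggestWord(string $input, array $wordList, int $maxDistance = 2): ?string
{
    $inputLower = strtolower($input);
    $inputSound = metaphone($input);
    $best = null;
    $bestSpelling = null;
    $bestSound = null;

    foreach ($wordList as $candidate) {
        // Edit distance between the spellings, ignoring case.
        $spelling = levenshtein($inputLower, strtolower($candidate));
        if ($spelling > $maxDistance) {
            continue; // too far away to be a plausible typo
        }
        // Edit distance between the phonetic keys, used only as a tie-breaker.
        $sound = levenshtein($inputSound, metaphone($candidate));
        if ($best === null
            || $spelling < $bestSpelling
            || ($spelling === $bestSpelling && $sound < $bestSound)) {
            $best = $candidate;
            $bestSpelling = $spelling;
            $bestSound = $sound;
        }
    }
    return $best;
}

$wordList = ['Ford', 'Focus', 'Explorer', 'Mercedes', 'Benz', 'SLK'];
$suggestion = suggestWord('focas', $wordList);
if ($suggestion !== null) {
    echo "Did you mean: $suggestion\n";
}
```

This compares the input against every word in the list, which is probably fine for a few thousand products but is something I would want to confirm.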
Any opinions or tips?
Thanks!