Right, it's not all that tidy, but here's the gist of it.
We've got a table holding a list of keywords, with this structure:
CREATE TABLE w_fuzzy_search (
fuzzy_id INT(10) NOT NULL PRIMARY KEY AUTO_INCREMENT,
keyword varchar(127) NOT NULL,
a int(2) NOT NULL DEFAULT 0,
b int(2) NOT NULL DEFAULT 0,
c int(2) NOT NULL DEFAULT 0,
d int(2) NOT NULL DEFAULT 0,
e int(2) NOT NULL DEFAULT 0,
f int(2) NOT NULL DEFAULT 0,
g int(2) NOT NULL DEFAULT 0,
h int(2) NOT NULL DEFAULT 0,
i int(2) NOT NULL DEFAULT 0,
j int(2) NOT NULL DEFAULT 0,
k int(2) NOT NULL DEFAULT 0,
l int(2) NOT NULL DEFAULT 0,
m int(2) NOT NULL DEFAULT 0,
n int(2) NOT NULL DEFAULT 0,
o int(2) NOT NULL DEFAULT 0,
p int(2) NOT NULL DEFAULT 0,
q int(2) NOT NULL DEFAULT 0,
r int(2) NOT NULL DEFAULT 0,
s int(2) NOT NULL DEFAULT 0,
t int(2) NOT NULL DEFAULT 0,
u int(2) NOT NULL DEFAULT 0,
v int(2) NOT NULL DEFAULT 0,
w int(2) NOT NULL DEFAULT 0,
x int(2) NOT NULL DEFAULT 0,
y int(2) NOT NULL DEFAULT 0,
z int(2) NOT NULL DEFAULT 0
);
This gets populated so that the number of occurrences of each letter in a word goes in the appropriate column.
For example, to insert the keyword pineapple we would do:
INSERT INTO w_fuzzy_search (keyword,a,e,i,l,n,p) VALUES ('pineapple', '1', '2', '1', '1', '1', '3');
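As a rough sketch of how those per-letter counts could be worked out in PHP (this is not code from the original setup, and the function names letter_counts and build_insert are made up for illustration), something along these lines should do. It builds a statement with ? placeholders, on the assumption that you run it through a prepared statement in mysqli or PDO.

```php
<?php
// Hypothetical helper: count how often each letter a-z occurs in a word.
function letter_counts($word) {
    // Keep only a-z so digits, spaces and punctuation never become column names.
    $letters = preg_replace('/[^a-z]/', '', strtolower($word));
    $counts = array();
    // count_chars(..., 1) returns byte value => count, only for bytes that occur.
    foreach (count_chars($letters, 1) as $byte => $n) {
        $counts[chr($byte)] = $n;
    }
    return $counts;
}

// Hypothetical helper: build a parameterised INSERT for one keyword.
// Returns array($sql, $params) for use with a prepared statement.
function build_insert($keyword) {
    $counts = letter_counts($keyword);
    $columns = array_merge(array('keyword'), array_keys($counts));
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));
    $sql = 'INSERT INTO w_fuzzy_search (' . implode(', ', $columns) . ')'
         . ' VALUES (' . $placeholders . ')';
    $params = array_merge(array($keyword), array_values($counts));
    return array($sql, $params);
}

list($sql, $params) = build_insert('pineapple');
// $sql:    INSERT INTO w_fuzzy_search (keyword, a, e, i, l, n, p) VALUES (?, ?, ?, ?, ?, ?, ?)
// $params: 'pineapple', 1, 2, 1, 1, 1, 3
```

Only the letters that actually occur are listed; the other columns fall back to their DEFAULT 0.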
This means that words can easily be pulled from the database depending on the number of letters they have in common with the search word. The criteria for this can be tweaked in the query. I have chosen to set it up so that any one of the letters can be missing (but only one) and there can be more letters of any type, within a word length constraint. (If you would like the code I made for this, let me know.)
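The actual query isn't shown here, so the following is only a guess at one way to express those criteria, not the query described above. It assumes "one letter missing" means that at most one of the search word's distinct letters may fall short of its required count, and it relies on MySQL evaluating a comparison to 1 or 0 so the tests can be added up. The function name build_fuzzy_query and the parameters $max_missing and $length_slack are made up for illustration.

```php
<?php
// Hypothetical sketch: build the SELECT that finds candidate keywords.
function build_fuzzy_query($searchword, $max_missing = 1, $length_slack = 2) {
    // Only a-z ever reaches the SQL, so the column names are safe to inline.
    $letters = preg_replace('/[^a-z]/', '', strtolower($searchword));
    $tests = array();
    foreach (count_chars($letters, 1) as $byte => $n) {
        // ">=" lets the keyword contain more of a letter than the search word.
        $tests[] = '(' . chr($byte) . ' >= ' . $n . ')';
    }
    if (!$tests) {
        return null; // no usable letters, nothing to search for
    }
    // Number of letter tests that must pass.
    $needed = max(count($tests) - $max_missing, 0);
    $len = strlen($letters);
    return 'SELECT keyword FROM w_fuzzy_search'
         . ' WHERE (' . implode(' + ', $tests) . ') >= ' . $needed
         . ' AND (CHAR_LENGTH(keyword) BETWEEN ' . max($len - $length_slack, 1)
         . ' AND ' . ($len + $length_slack) . ')';
}

$sql = build_fuzzy_query('pinapple');
```

For the misspelling pinapple this asks for at least 5 of the 6 letter tests (a, e, i, l, n, p) to pass and for keywords between 6 and 10 characters long, which pineapple satisfies.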
Then I wanted the first letter to match, so I used this little loop:
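<?php
// Note: the mysql_* functions were removed in PHP 7; on current versions use mysqli or PDO.
$result = mysql_query($sql);
$result_array = array();
// Keep only the keywords whose first letter matches the search word's.
while ($row = mysql_fetch_array($result)) {
    if (substr($row['keyword'], 0, 1) == substr($searchword, 0, 1)) {
        $result_array[] = $row['keyword'];
    }
}
?>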
Then we order the results by how close their length is to that of the search word:
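<?php
// Collect the words in order of length difference: first those the same length
// as the search word ($a = 0), then those one character off, and so on.
$size_array = array();
for ($a = 0; count($size_array) < count($result_array); $a++) {
    for ($i = 0; $i < count($result_array); $i++) {
        if (abs(strlen($searchword) - strlen($result_array[$i])) == $a) {
            $size_array[] = $result_array[$i];
        }
    }
}
?>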
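If you'd rather not rescan the list once per length difference, the same ordering can be had from a single sort. This is only a sketch of an alternative, and order_by_length_closeness is a made-up name. Ties are broken by original position, so it doesn't depend on the sort being stable.

```php
<?php
// Hypothetical helper: order $words by how close their length is to $searchword's.
function order_by_length_closeness($searchword, array $words) {
    $target = strlen($searchword);
    $rows = array();
    foreach (array_values($words) as $index => $word) {
        // Remember the length difference and the original position of each word.
        $rows[] = array(abs($target - strlen($word)), $index, $word);
    }
    // Sort by length difference; ties keep their original relative order.
    usort($rows, function ($x, $y) {
        if ($x[0] != $y[0]) {
            return $x[0] < $y[0] ? -1 : 1;
        }
        return $x[1] < $y[1] ? -1 : 1; // indexes are unique, so never equal
    });
    $ordered = array();
    foreach ($rows as $row) {
        $ordered[] = $row[2];
    }
    return $ordered;
}

$size_array = order_by_length_closeness(
    'pinapple',
    array('pine', 'pineapple', 'pinnacle', 'pineapples')
);
// $size_array is now: pinnacle, pineapple, pineapples, pine
```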
and then by the letter positioning from the beginning of the string:
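<?php
// remove_doubles() is a helper that is not shown in this post.
$result_copy = $result_array; // untouched copy, needed for the scoring step
$order_array = array();
// Try the longest shared prefix first, then progressively shorter ones.
for ($i = strlen($searchword); $i > 0; $i--) {
    for ($j = 0; $j < count($result_array); $j++) {
        if ($result_array[$j] !== false
            && substr(remove_doubles($searchword), 0, $i) == substr(remove_doubles($result_array[$j]), 0, $i)) {
            $order_array[] = $result_array[$j];
            $result_array[$j] = false; // mark as placed so it is not added again
        }
    }
}
?>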
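The remove_doubles() helper isn't included above. Going by the name, my guess is that it collapses runs of a repeated letter into a single letter, so that a doubled or dropped letter doesn't throw the prefix comparison off. A hypothetical stand-in under that assumption:

```php
<?php
// Hypothetical stand-in for remove_doubles(); the real helper is not shown,
// so the behaviour here (collapsing adjacent repeats) is an assumption.
function remove_doubles($word) {
    // Replace any run of the same character with a single copy of it.
    return preg_replace('/(.)\1+/', '$1', (string) $word);
}

// remove_doubles('pineapple') gives 'pineaple'
// remove_doubles('pinnacle')  gives 'pinacle'
```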
Now we have to score the two arrays and combine them into one array.
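<?php
// Build the scores array: one entry per keyword, starting at 0.
$scores = array();
for ($i = 0; $i < count($result_copy); $i++) {
    $scores[$result_copy[$i]] = 0;
}
// Score each keyword as the average of its positions in the two orderings.
foreach ($scores as $key => $value) {
    for ($i = 0; $i < count($size_array); $i++) {
        if ($size_array[$i] == $key) $scores[$key] = $i;
    }
    for ($i = 0; $i < count($order_array); $i++) {
        if ($order_array[$i] == $key) $scores[$key] = ($scores[$key] + $i) / 2;
    }
}
asort($scores); // lowest score first, i.e. best match first
?>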
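For what it's worth, the same averaging can be written without the nested loops by flipping each ordered list into a keyword => position lookup. This is only a sketch of an equivalent approach, assuming the keywords are unique; combine_ranks is a made-up name.

```php
<?php
// Hypothetical helper: average each keyword's position in the two orderings.
function combine_ranks(array $size_array, array $order_array) {
    $size_rank = array_flip($size_array);   // keyword => position by length closeness
    $order_rank = array_flip($order_array); // keyword => position by shared prefix
    $scores = array();
    foreach ($size_rank as $keyword => $rank) {
        // A keyword missing from the second list keeps its first position as score.
        $scores[$keyword] = isset($order_rank[$keyword])
            ? ($rank + $order_rank[$keyword]) / 2
            : $rank;
    }
    asort($scores); // lowest score first, i.e. best match first
    return $scores;
}

$scores = combine_ranks(
    array('pinnacle', 'pineapple', 'pine'),
    array('pineapple', 'pine', 'pinnacle')
);
// pineapple scores 0.5, pinnacle 1 and pine 1.5, so pineapple comes out on top
```

Adding another sort method would then just mean flipping one more list and dividing by three instead of two, or weighting the positions however you like.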
You can add different initial sort methods (I was thinking about letter positioning from the end of the string) and just play around with the scoring loop at the end to get the scoring right.
HTH
Rob