Hi,

I tried to read some posts that discuss ranking, but was unable to find so

I want to write custom ranking functionality for full-text search. Let me explain the logic with two cases:
1. Search word : "PHP"
2. Search word: "PHP and AJAX"

For the 1st search word I received 20 results. I can count the number of times the search word occurs in each result and rank them by that count, and I can also give priority to some database columns.

For the 2nd search phrase I found 5 results ("and" is treated as logical AND). The number of times each word occurs in each of the 5 results is shown below (assumption: the customer has no option to set a priority for the keywords):
    PHP  AJAX
a.  5    1
b.  4    1
c.  3    3
d.  2    2
e.  1    4

So now I need to rank these 5 results.

Can anyone share the logic for a proper ranking approach?

Thank you

    It really depends on what you want to promote in such a ranking system.
    If it is the word count, just add the two counts. However, you can assume that if one word appears much more frequently than the other, the result may not be the best match for both words. I'd introduce an "inequality handicap": the smaller count expressed as a percentage of the larger one. Then multiply the sum by this coefficient. Thus (3+3) x 100% = 6 would be more relevant than (5+1) x 20% = 1.2.
    This can be extended to more than a pair of words (for example, the inverse of the variance from the Gaussian bell curve could be the multiplier, which would promote more even word counts).
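
    As a rough sketch of the handicap idea applied to the five results from the question (the function names are made up for illustration, and using 1 / (1 + variance) rather than a bare inverse is my own assumption to avoid dividing by zero when all counts are equal):

```python
from statistics import pvariance

def balance_score(counts):
    """Sum of the word counts, scaled by smallest/largest count."""
    if not counts or min(counts) == 0:
        return 0.0  # with logical AND, a missing word means no match
    return sum(counts) * min(counts) / max(counts)

def variance_score(counts):
    """Variant for any number of words: penalize uneven counts.

    Assumption: 1 / (1 + variance) is used as the multiplier so that
    perfectly even counts (variance 0) keep their full sum.
    """
    if not counts or min(counts) == 0:
        return 0.0
    return sum(counts) / (1.0 + pvariance(counts))

# (PHP count, AJAX count) for results a..e from the question
results = {"a": (5, 1), "b": (4, 1), "c": (3, 3), "d": (2, 2), "e": (1, 4)}

# Highest score first; ties keep their original order (sorted is stable)
ranked = sorted(results, key=lambda k: balance_score(results[k]), reverse=True)

for key in ranked:
    print(key, results[key], balance_score(results[key]))
```

    With these numbers c and d come out on top, while b and e tie, so some tie-breaker (for example the plain sum) would still be needed.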

    Each logical OR unit should have an equal share in the final score.

      Thank you Wilku for your suggestion 🙂

      I also thought of some sort of Gaussian and Bayesian approaches, with a focus on the mean, the variance, and the number of occurrences of each word.

      I am also thinking of:

      1. Prioritizing some of the database columns.
      2. A threshold value for the rank, so that results scoring below it are not shown in the search results.
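
      A minimal sketch of how these two ideas might fit together. The column names, the weights, and the threshold value below are all hypothetical placeholders, not values from any real schema:

```python
# Hypothetical column weights: a hit in the title counts three times as much
COLUMN_WEIGHTS = {"title": 3.0, "body": 1.0}

# Hypothetical cutoff: results scoring below this are not shown
MIN_SCORE = 2.0

def weighted_score(hits_by_column):
    """Sum of per-column hit counts, each multiplied by its column weight."""
    return sum(
        COLUMN_WEIGHTS.get(column, 1.0) * hits
        for column, hits in hits_by_column.items()
    )

def filter_and_rank(rows):
    """Return row ids ordered by score, dropping rows below MIN_SCORE."""
    scored = [(weighted_score(hits), row_id) for row_id, hits in rows.items()]
    kept = [(score, row_id) for score, row_id in scored if score >= MIN_SCORE]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [row_id for _, row_id in kept]

# Example: hit counts per column for three made-up rows
rows = {
    "x": {"title": 1, "body": 0},
    "y": {"title": 0, "body": 1},
    "z": {"title": 2, "body": 4},
}
print(filter_and_rank(rows))
```

      Here row y scores 1.0 and falls below the cutoff, so only z and x would be shown.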

      Can anybody give comments and suggestions on my current thinking, and on other options as well?
