mika79;10990336 wrote:
I advised him to CONSIDER the use of full-text search as he is searching for a quick and simple solution.
It seems I confused this thread with another one involving words such as DVD, and so I missed the point of his question. So, as far as this goes, I agree with you that full text indexing would do the trick.
However, I would still advise you (you being the original poster from this point on) to consider either a secondary layer for full text indexing, e.g. Sphinx, or migrating to a DB that both implements full text indexing and is ACID compliant. MySQL's InnoDB engine is ACID compliant, or at least can be with proper db settings, but has no full text indexing (in versions before 5.6), and MySQL's MyISAM engine supports full text indexing but is not ACID compliant.
Either way, once you have a full text index on whatever text data you want to search in you can do something like
SELECT MATCH(searched_field) AGAINST ('words to appear in text') AS relevance
FROM tbl
ORDER BY relevance DESC
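In case it helps, creating the index itself might look roughly like this. It is only a sketch: tbl and searched_field are the placeholder names from the query above, the index name ft_searched_field is made up, and the table is assumed to use an engine that supports FULLTEXT (MyISAM, as discussed).

```sql
-- Sketch only: ft_searched_field is a hypothetical index name.
-- Assumes tbl uses a storage engine with FULLTEXT support (e.g. MyISAM).
ALTER TABLE tbl
  ADD FULLTEXT INDEX ft_searched_field (searched_field);
```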
As for temporary tables, what you refer to would most likely be represented by a permanent table with temporary data, since temporary tables only have session duration (where session means the connection to the database, which is closed when the HTTP connection is closed). As such, a temporary table would not allow you to retain data between two consecutive page requests from the same user.
Anyway, you may or may not gain from such an approach, assuming you display paginated data to users and they keep flipping a lot in the paginated set. What you'd do is retrieve all ids in a search and store these along with the user's id or the user's session_id (PHP session this time), so that a user can be linked to his particular search. You may even opt to store the search text as such, in case the user repeats the same search after a new one.
Hmm, one thing that struck me right now though, is that this may actually still be quicker, even without subsequent requests for the paginated set. Ordering the results in the direct way, that is directly when performing the search, would mean that a filesort is needed. Doing it like this, all that is needed is to retrieve the data and insert it into another table. Once you retrieve the proper subset of that data, the index on that table can be used for ordering purposes, thus (most likely) eliminating the need for a filesort.
$query =
"INSERT INTO user_search (id, user_id, relevance, search)
SELECT id, $user_id, MATCH(searched_field) AGAINST ('text being searched for') AS relevance, 'text being searched for'
FROM searched_table
HAVING relevance > 0 -- or some appropriate threshold
-- no ordering needed here, it would mean a filesort
";
And then, on each request for pages in this paginated set, you'd retrieve the data from the search table and join it against the original table using the ids. But it does take some time to insert this data into the search table, so if users usually just search once, this approach takes more time. However, if they request pages from one search several times, the overhead for saving the search results will be regained by the following queries being faster.
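$offset = $page * $limit; // assumes $page is counted from 0
$query =
"SELECT st.stuff
FROM user_search us
INNER JOIN searched_table st ON us.id = st.id
WHERE us.user_id = $user_id -- only this user's stored search
AND us.search = 'text being searched for'
ORDER BY us.relevance DESC
LIMIT $offset, $limit";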
So in the end, whether you gain or lose depends on user behaviour.
The table used to store the temp data should probably contain an index on all four fields (if you include the "search text" in the table, otherwise three), in this order: (user_id, search, relevance, id). I'm not really certain having id in there will help, but that would be easy for you to check.
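A possible definition for that table, just as a sketch: the column names come from the INSERT above, but the column types, the length of the search column and the index name idx_user_search are my assumptions, so adjust them to your own data.

```sql
-- Sketch only: types, lengths and the index name are assumptions.
CREATE TABLE user_search (
  id        INT UNSIGNED NOT NULL,   -- id of the row in searched_table
  user_id   INT UNSIGNED NOT NULL,   -- or a CHAR column if you store session_id
  relevance FLOAT        NOT NULL,   -- score returned by MATCH ... AGAINST
  search    VARCHAR(100) NOT NULL,   -- the search text
  INDEX idx_user_search (user_id, search, relevance, id)
);
```

With the equality columns (user_id, search) first and relevance after them, the page query can filter and order on the same index. Running EXPLAIN on the page query should tell you whether "Using filesort" is gone, and whether keeping id in the index makes any difference.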
Also note that if you don't require user accounts and login, so that you have no user id to use and have to resort to session_id, these session ids may be reused in the future. If you also stored data about when a specific session id last accessed a search, you could clean up data with a cron script whenever a particular search hasn't been accessed for a set amount of time.
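Roughly, the clean-up could be done like this. Again a sketch: the last_accessed column is something I'm adding for illustration, the user id 42 stands in for whatever id or session_id you use, and the one hour limit is arbitrary.

```sql
-- Sketch only: last_accessed is a hypothetical extra column.
ALTER TABLE user_search
  ADD COLUMN last_accessed TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;

-- On every page request, mark the stored search as recently used.
UPDATE user_search
SET last_accessed = NOW()
WHERE user_id = 42
  AND search = 'text being searched for';

-- From the cron script: drop searches not accessed for an hour.
DELETE FROM user_search
WHERE last_accessed < NOW() - INTERVAL 1 HOUR;
```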