i have a really strange problem getting a mysql query to work.
i have set up a site indexing script for a search-engine, that spiders a whole webpage, reads all html-code, extracts the 'plain text', breaks
the text in single words, and puts every unique word into a db-table called 'words' (Cols: Word, WordID).
At the same time, a gather information about the documents, which results in a table called 'documents' (Cols: DocumentID, DocumentPath, DocumentTitle).
The third step is getting the relation between the 'words' and the 'documents' which results ind a table called 'wordsindocuments'.
Well, the indexer works fine and i get these tables nicely filled with data: 'words' contains about 6000 entries, 'documents' about 540, and 'wordsindocuments' about 56000.
Now i can easily (and fast) find documents containing special words:
query:
SELECT t0.documentID
FROM wordsindocuments t0
WHERE WordID='361'
(WordID is corresponding to the searchword)
for multi-word-searches, i just use a dynamic querystring, depending on the number of words:
SELECT STRAIGHT_JOIN t0.documentTitle, t0.DocumentPath
FROM documents t0, wordsindocuments t1, wordsindocuments t2
WHERE t0.documentID=t1.DocumentID
AND t1.WordID='361'
AND t1.DocumentID=t2.DocumentID
AND t2.WordID='362'
(this is for two word ...)
running this query on my development-system (Win 2k / IIS / PHP 4.3.0 / MySQL 3.23.49-nt ), there are no problems and the query is fast.
well, easy going.
but running this multi-word-searches on the target web server (FreeBSD 4.7-STABLE, PHP 4.2.3, MySQL 3.23.54-log) results in an mysq-error:
(german!) 'Die Ausführung des SELECT würde zu viele Datensätze untersuchen und wahrscheinlich sehr lange daueren. Bitte WHERE überprüfen und SET OPTION SQL_BIG_SELECTS=1 verwenden, sofern SELECT ok ist'
free translated: The execution of SELECT would examine to many data records and maybe take very long. Check WHERE-statement and SET OPTION SQL_BIG_SELECTS=1, if SELECT-statement is ok'
setting this option results in the server not responding (to me) for several minutes, even if i reload the page. it appears to be dead or locked.
but know, that the servers' setup is very restrictiv, eg. PHP as server process is not allowed to write to files, and thins like that.
i tried to change the select statement into something with LEFT JOIN, but JOIN seem to take much longer, than the query i use.
so, at last, i have really no idea where to look for the problem.
is it my mistake? a can it be a configuration fault on the server?
maybe (i don't know) the database needs to use a physical file on the server HD (instead of memory / RAM) to process 'big' queries? (i thought, that this is not a big database, big is much bigger for me .... but maybe i'm am wrong?!)
i'd be happy for any ideas now.
and thanks a lot