As far as escaping for REGEXP goes: since (I'm assuming) you don't want users writing arbitrary regular expressions as their search terms, you'd need to quote anything your DBMS considers having significance in regular expressions (although you probably aren't interested in any of them: just keep whitespace, numberlike, and letterlike characters, and maybe the hyphen and there wouldn't be anything to disrupt a regexp).
If only there were some function like preg_quote for escaping keywords to be fed into an SQL REGEXP expression. I expect this would be a fairly involved problem to solve in any general way -- especially considering that hyphens are meaningful within square brackets but not outside of the square brackets. I'm reminded of the complicated url validation issue, which is what led me to the decision of stripping all but numbers & letters & spaces (and maybe hyphens too).
Weedpacket As far as my soundex suggestion goes, even without that there is still the literal index lookup idea...
I remember seeing this approach used by PHPBB back in the day and, while it seems useful and fairly clever, I expect that there's a fair amount of effort involved both initially and ongoing to munge the various tables' columns and generate the index tables and update these indexes when the data changes, etc. I cannot help but think that to do so would be reinventing the wheel.
I looked into MySQL Natural Language Full-Text Searches and it's pretty exciting. PostgreSQL has something similar.
With MySQL, I can simply feed the user's original search string into my SQL query and it handles all the detail:
SELECT id_code, title, description, MATCH(description) AGAINST ('video games' IN NATURAL LANGUAGE MODE) AS score FROM other_data_table ORDER by score DESC
I've only just done a cursory inspection of the search results but it would appear that this natural language search, like my suggested approach, ignores punctuation. This is hardly scientific or comprehensive, but this search yields essentially the same search results:
SELECT id_code, title, description, MATCH(description) AGAINST ('[video]--! * ., ., ., games\\\\ ' IN NATURAL LANGUAGE MODE) AS score FROM other_data_table ORDER by score DESC
The scores returned in the second gobbledygook punctuation search are slightly lower (maybe 2-5%) than in the first query, but the ids and titles returned are the same and in the same order. I feel that, to some degree, this vindicates my earlier decision to strip out the punctuation.
It is noteworthy that I must add a fulltext index on the columns to be searched, but it's a lot easier to run one SQL command or click a link in phpmyadmin than to create my own indexing scheme. Additionally, I can apparently feed the user's search query right into the query without massaging the keywords and constructing numerous queries of my own. This is a considerable improvement over my code which has 4 queries per keyword.