First off, if it's not PHP code, don't use php tags. Use html tags (if it happens to be html code) otherwise code tags. And please strip away all nonsensical `around identifiers. It just makes the code very very hard to read.
Some thoughts on your table
CREATE TABLE IF NOT EXISTS profiles (
id int(11) NOT NULL auto_increment,
-- data type is INTEGER
-- Yet you default it to a string literal which has to be converted to int
gender tinyint(3) NOT NULL default '0',
-- If you do note accept null dates, why would you default it to something
-- equally useless? Otherwise you might as well let it be null.
birthdate date NOT NULL default '0000-00-00',
PRIMARY KEY (id),
-- Put the birthdate field before gender - it will generally be more
-- restricting.
-- Having the primary key (whihc uniquely identifies each row) as the last
-- part of an index is completely pointless. First you use gender
-- to point out half the records (assuming equal female:male ratio),
-- and then you get the remaining matches on birthdates, say 25%, and finally
-- you also provide an id which points out a single record!?
-- In other words, id here is pointless, so just remove it.
-- And rename the index into something non-confusing, such as
KEY gender_bdate (birthdate, gender)
-- And then, if you _also_ need to perform searches on nothing but gender,
-- add a third index.
-- KEY gender(gender)
)
-- Eek! This db engine is NOT ACID compliant, which means that you have
-- no guarantee that data entered will remain available, be in an acceptable state
-- etc. MyISAM should NOT be used.
-- ENGINE=InnoDB
-- Now you can make use of transactions, foreign key constraints etc.
-- The ONLY thing the other engine provides you with is full text indexing and
-- MATCH(stuff) AGAINST(other stuff)
-- But there are efficient solutions to work around that problem, such as Sphinx,
-- which, iirc is at http://sphinxsearch.com/
-- Another solution is to use a DBMS which provides you with both. PostgreSQL
-- is also free.
ENGINE = InnoDB
-- charset=latin1! Why aren't you using utf-8?
DEFAULT CHARSET=latin1
Even here you are supplying a string literal in a comparison against an integer field. There are several circumstances that doing this, or the reverse and providing an int when searching against a string data type will NOT be using an index. In short: never ever do this. Keep track of what data type a field is, and handle your data accordingly. If it is an integer, you never ever put quotes around whatever you send that way. If it's a text field for some reason containing a numeric value, you never ever provide it with an integer. You provide it with a numeric string literal.
lpa;10990125 wrote:
explain SELECT id FROM profiles where gender='1' AND birthdate <= DATE_SUB(NOW(), INTERVAL 20 YEAR) AND
-> birthdate >= DATE_SUB(NOW(), INTERVAL 30+1 YEAR) order by id desc;
+----+-------------+----------+------+---------------+---------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | profiles | ref | gender | gender | 1 | const | 3104 | Using where; Using filesort |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-----------------------------+
As you can see, Index is not enough and "filesort" is involved in this query.
How do you arrive at the conclusion that index is not "enough", and what do you mean by "enough". Still, like I said before, I believe that it would make more sense to have gender after birthdate in the index, since the birthdate will generally be more restrictive.
Do note that it says "fileSORT". Why do you think the result would need sorting? Yes, drop the ORDER BY clause, and the filesort will go away. Sorting may still be OK to do (sorting those 3k rows shouldn't take that long), but at least make sure the sorting is of any use to the user. They won't care if they get the results back ordered by id, since the ids mean nothing to an end user.
They might however like having them ordered in lots of other ways: by age, by name, by signup date.
Either way, if you need data ordered, I'd say that you should usually do this in SQL. But, if your DB is the bottleneck and is NOT running on the same maching as your web server(s), then you could of course move the sorting to PHP. But the point is, if you NEED to sort data, you will have to sort it somewhere... And if the sorting is pointless, don't do it.
Also, why are you selecting ids only? Would it not make more sense to select id, name, birthdate, profile photo url and possibly (part of) profile presentation text. That way, you have everything needed to present the end user with a list of matches.
I can see one reason for selecting ids only though, and that is if you present the user with paginated sets of the search result.. You could sort them once, store the ids in session or memchache, and then just grab the 10, 20, 50 or 100 ids the user are currently requesting to view using [man]array_slice[/man] (or is it _splice), implode the resulting part of the array and select ... WHERE id IN (imploded array here).
lpa;10990125 wrote:
"filesort" and "using temporary" is very bad.
Well, if I'm not misstaken, the reason that you get "filesort" here is that you do not search for a specific date, but rather a range of dates, while the birth date is the second key part, and you order by id, the third key part. This ordering would use index instead of filesort if you search for gender=1 AND birthdate='exact birthdate'. Either way, see order by optimization for how this works and possible ways of improving it.
But yes, filesort can be very bad. It especially depends on the number of rows and the size (bytes) of each row. But your db also has ini settings you can change (there's a threshold for when writing entire rows while sorting and when using just key and row pointer, and you could use multiple physical discs (not partitions) for file sorting which are used in round robin fashion.
lpa;10990125 wrote:
So, I want to get rid of "using temporary", "using filesort" in all my queries but it's not possiblem if multicolumn index contains a mix of different data types.
Read the documentation. Mix of data types in indices has nothing to do with using file sort.