am I wrong in thinking that this would probably hurt performance instead of help it by joining 2 extra tables?
Bulletin design help
jasondavis wrote:am I wrong in thinking that this would probably hurt performance instead of help it by joining 2 extra tables?
if you put "Explain" before SELECT, it will describe how the query will work and in what order, to give you an idea of performance.
Here is the explain of a long query using a sub-select:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------+--------+---------------------+---------+---------+-------------------------+------+---------------------------------+
| 1 | PRIMARY | f | ALL | PRIMARY | NULL | NULL | NULL | 1133 | Using temporary; Using filesort |
| 1 | PRIMARY | u | eq_ref | PRIMARY,usergroupid | PRIMARY | 4 | vbulletin.f.userid | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | usergroup | eq_ref | PRIMARY | PRIMARY | 2 | vbulletin.u.usergroupid | 1 | |
Time was about 0.01 seconds.
If it were written as a join, there wouldn't be a need for a temporary table or subquery, which would improve performance.
Compare to a Join table to get similar information:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE user const PRIMARY PRIMARY 4 const 1
1 SIMPLE userfield const PRIMARY PRIMARY 4 const 1
1 SIMPLE usertextfield const PRIMARY PRIMARY 4 const 1
1 SIMPLE language system PRIMARY 1
Time Taken: 0.00056 seconds
very good point with the EXPLAIN syntax. I fixed your tags to use [code] tags instead of [pre] since there is no [pre] tag on this forum.
I made a new lookup table to test here are my results
the lookup table seems to slow things down
//Version 1 (20 total, Query took 0.0485 sec)
//-------------------------------------------------
SELECT auto_id, user_id, bulletin, subject, color, fb.submit_date, spam
FROM friend_bulletin AS fb
LEFT JOIN friend_friend AS ff ON fb.user_id = ff.userid
WHERE (
ff.status =1
AND ff.friendid =13
AND fb.spam = '1'
)
OR fb.user_id =13
ORDER BY fb.submit_date DESC
LIMIT 0 , 20
//Version 2 (20 total, Query took 0.0643 sec) USING LOOK Up TABLE
//-------------------------------------------------
SELECT fb.auto_id, fb.user_id, fb.bulletin, fb.subject, fb.color, fb.submit_date, fb.spam
FROM friend_bulletin_lookup AS lu
LEFT JOIN friend_friend AS ff
ON lu.f_id = ff.autoid
LEFT JOIN friend_bulletin AS fb
ON lu.b_id = fb.auto_id
WHERE
(
ff.status =1
AND ff.friendid =13
AND fb.spam = '1'
)
OR fb.user_id =13
ORDER BY fb.submit_date DESC
LIMIT 0 , 20
//Version 3 Showing rows 0 - 19 (20 total, Query took 0.0254 sec)
//-------------------------------------------------
SELECT auto_id, user_id, bulletin, subject, color, fb.submit_date, spam
FROM friend_bulletin AS fb
WHERE (user_id IN (SELECT userid FROM friend_friend WHERE friendid = 13 AND status =1) OR user_id = 13)
AND spam = '1'
ORDER BY submit_date DESC
LIMIT 0 , 20
Is the lookup table indexed?
yes
Well, if your tables are small (like less than a few thousand rows) it may be slower to join; however, once the tables grow huge, it should be faster.
Try running the EXPLAIN statement for the query and see what that shoots out at you.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE lu ALL NULL NULL NULL NULL 3008 Using temporary; Using filesort
1 SIMPLE ff eq_ref PRIMARY PRIMARY 4 friend_main.lu.f_id 1
1 SIMPLE fb eq_ref PRIMARY,user_id PRIMARY 4 friend_main.lu.b_id 1 Using where
i will try it when the DB gets larger I guess
I guy wgho runs a big social network around 1 million members tells me that he uses a different technique, he says he uses hash to look up bulletins entries much faster, has anyone heard of anything like this?
jasondavis wrote:I guy wgho runs a big social network around 1 million members tells me that he uses a different technique,
he says he uses hash to look up bulletins entries much faster,
has anyone heard of anything like this?
No, I havent.
But this wikipedia describes a bit:
http://en.wikipedia.org/wiki/Hash_function
In short:
A larger data can be hashed, to create a smaller fingerprint,
that make a unique identification of this data.
to make an example
how to use hash to speed up
say you create a new table 'userpass_hash', where each row have:
'hashvalue', 'userid'
each hashvalue is created by 2 values from 'users' table:
$hashvalue = hash( username[userid]+password[userid] );
someone submits a login: uname, pword
you create a
$lookfor_hash = hash( uname+pword );
Then if not $lookfor_hash is found in 'userpass_hash' table, then is FALSE login.
If is found, you get the 'userid'
and can select from 'users' table.
that makes a little sense to me but I dont see how this guy could use hash for fast bulletins lookup, It would seem to me no matter what I would have to search whole entire buleltin table to see if the user id that posted that bulletin is in my friend list table
jasondavis wrote:It would seem to me no matter what I would have to search whole entire buleltin table
to see if the user id that posted that bulletin is in my friend list table
Yes, in this case you are right.
It would not be possible to use some hashing.
Not what I can see either.