Such tables are used for many to many relationships.

Say we have people, and favorite movies. We have a table with all the people, and a table with all the movies, and another table that connects them together. I'm guessing that's what's happening in these tables.

    Exactly. These are many-to-many relationships between tables. If any record in A can have multiple records referenced in B and vice versa, then a table in the middle is the only way to maintain the relationships. This is what's formally called normalization (part of the 3rd form IIRC).

    Also be aware that multi-column keys can be created on these ID tables, improving the speed of your many-to-many joins.

      Also, you wind up with really odd looking queries when you have to find things that aren't related. What would be a simple left join on two tables becomes a bit ugly with three or four, and you've got a half dozen or more, they get ugly SUPER fast. Here's one I was working on today...

      select * from (
          select
              m.hostname,
              't' as yn
          from machines m
          join mtou on
          ( mtou.machineid=m.id)
          join users u on
          (u.id=mtou.userid AND mtou.machineid=m.id)
          where username='$username'
          union
          select m.hostname,
              'f' as yn
          from machines m
          where hostname
          not in (
              select
                  m.hostname
              from machines m
                  m.hostname
              from machines m
              join mtou on
              ( mtou.machineid=m.id)
              join users u on
              (u.id=mtou.userid AND mtou.machineid=m.id)
              where username='$username'
              )
          ) as a
      order by 2 desc,1;
      

      Just in case you're wondering, it finds all the machines that ARE linked to a user, and all the machines that aren't and returns the set fo I can build an html form for editing that relationship. woohoo! (took me about 10 minutes to come up with it, and that was fast, normally it's more like 30)

        sxooter -

        Just looking it over quickly, it seems that something like this would give you what you need?

        select m.hostname, count(u.*) 
            from machines m
            inner join mtou on  ( mtou.machineid=m.id)
            left join users u on (u.id=mtou.userid and u.username='$username')
            group by m.hostname
        

          It'll need a case statement where the count is, but I'll give it a whirl and see which plans out as faster.

          Since I'm only dealing with dozens of entrie in each table it's not real important to have the most efficient plan just now. The largest either table would grow to would be ~ 100 or so entries...

            Nope, that query won't work, as it only produces half the output I need. The problem is that with an inner join on machines to mtou, I only get the ones that have entries.

            I do think there's a simpler left join ... where is null thingie I could work out here though. But for now it works... 🙂

              It's pretty cool in Rails, you just put this in your data class:

              has_and_belongs_to_many :table2

              Then it automatically figures out the many-to-many table (if it doesn't, you can explicity tell it which table), then it's all accessible via:

              @record.table1

              Where table1 is the one-to-many table.

                That's impressive. Do you have any benchmarks on how fast it is? I find that sometimes the stuff like that that's so easy to use can be a bear on performance, but as long as I can get under the hood and fix it it's ok.

                  I don't have any benchmarks, but since switching from PHP and Apache to Rails, Lighttpd, and SCGI, performance has been much better - average server load went from about 4 to 0.75. I'd bet that Rails' caching ability accounts for a large part of this (I cache many parts of pages that are the result of queries).

                  With that code in Rails, it'll do the queries behind the scenes. So if you have a page that lists fifty different records, that'd be at least fifty different queries. But Rails has "eager loading", which just means it'll create joins and it'll just be one query. Eager loading is accomplished as so:

                  has_and_belongs_to_many :table2, :include=>'tableX'

                    now, you realize you're going to make me go play with this on my spare time over Xmas weekend, right?

                    dastardly dan... 🙂

                      Heh, I like Rails soooooo much better than PHP. Ruby is pretty nice too. And Lighty. And PostgreSQL (which the Rails community definitely seems to prefer over MySQL). It's taken me about a month and a half to get to the point where I feel fairly proficient.

                        Interesting thread over on Slashdot today about Ruby vs Java (and of course the Pythonistas got involved). They all have a place, I never liked Java but there must be something good about it because plenty of people swear by it.

                        Sometimes I suspect the people who support it the strongest don't know any other languages though. Ruby On Rails is worth looking at, if for no other reason than to expand your mind.

                          Looking at the original table structures and relationships there seems to be too much corelations and dependencies between the tables. While this concept is beneficial for organization. It is usualy not best for real world environment.
                          Remember the rule of thumb of 80/20. Meaning 80% percent of the time database are being READ and 20% being WRITEN. So because of this I believe in designing an infrastructure to accomodate faster and easier READ access.

                          Andrew

                            Write a Reply...