Hello There!

I am in need of some serious sql help. I am working on my PhD and have a mysql table of data. One of my fields in the table is screen resolution "res" it contains hundreds of values like:
300x200
1024x600
1024x800
1024X768

When doing linear regression on this field, I have to "dummy code" the values. with a categorical value like so:
1
2
3
4

I've added a field to my table called resolutionCategory "resCat" What I would like to do is create a query that assigns a number to each value and then inserts it into the resolution category field like so:

res ------------ resCat


300x200 | 1
1024x600| 2
1024x800| 3
1024x768| 4

Anyone have any ideas? I'd hate to to to code these by hand as there are about 900 unique values and thousands of rows in the table.

Thanks in advance for your assistance!

Regards
WebRuss

    you should add a row for res_id witch should then be set to primarary key and auto_increment so if you do add a new data with will get a new id.

    then you will call the ids and filter the id for your desired value.

      Okay I figured out a way to do it in php, but I was hoping to do it on one query.

      //get distinct screen resolutions
      $sql = "SELECT DISTINCT(`resolution`) FROM `visits` ORDER BY `resolution`";
      // Perform Query
      $result1 = mysql_query($sql) or die(mysql_error());
      //set first category number to 1
      $i=1;
      //loop through unique resolutions
      while ($rowRes = mysql_fetch_assoc($result1)) {
          //create sql to update resCategory
          $sql = "UPDATE `visits` SET resolutionCategory = ".$i." WHERE `resolution` = '".$rowRes["resolution"]."'";    
      //exec query mysql_query($sql); //increment category number $i++; }

        Create a new table having all the columns from the original table, plus an additional id column (auto_increment primary key) and INSERT SELET.

        On the other hand, if it's a one time thing, does it really matter wether you use 1 or 1000 queries?

          Actually you might be better off by making a pivot table out of this. Can be done in SQL, or easier in Access or Excel. I'll dig up the SQL code if you're interested.

          The point is: if you need dummy variables for regression analyses you should have one variable for each category with values 0/1 (then include only n-1 of those variables, etc. ). When treating it as a single variable with 1,2,3,4 you implicitly make a lot of assumptions about said variable that conflict with the nature of categorical data.

          regards

          Bjom

            your treatise is interesting. I visualize resolutions being a ratio between w/h, therefore if

            q= w/h

            then

            1024x768 and
            2048x1536

            would have the same q value

            now imagine smart phones which are tall and skinny, the ratio would be < 1.0

            if you were to swing a 1-length vector anchored on upper-left and sloping down and left at 45 deg. (equal width and height), any angle above (less than 45 degrees) = wide and anything below= tall, then you have a number value that might be usable in presentation calculations - I don't know how but just a thought, and also a way to categorize these values.

            Any I'd enjoy hearing your thoughts on this and a copy of your PhD thesis too 🙂

            Best,
            Samuel Fullman

              Write a Reply...