to camelCase hell and back

sneakyimp · Jun 7, 2011

I'm working on a new project and hoping to establish some magic methods in my classes to facilitate the getting/setting of private or protected values without forcing my devs to define a getter/setter for each and every property of each and every class. We're going to have to write a bunch of VO and DAO classes for our various database tables and writing and maintaining all those getters and setters would be a lot of work.

Based on this discussion, I'm hoping to use under_scores for $variable_names and camelCase for methodNames. Naturally, the need arises in this context to convert $some_variable_name automatically into setSomeVariableName and getSomeVariableName. Additionally, we must convert them back from camelCase to underscores to get the name of the underlying property (i.e., to know that getSomeVariableName should retrieve the value of $some_variable_name).

I think I've got the basics covered fairly well with these two methods, but I realize that there might be trouble with an underscore name such as some_x_var, where one word has only a single character, and also with camelCase names containing uppercase acronyms, like someXVar.

toCamelCase routine turns "some_x_var" into "SomeXVar"
fromCamelCase routine turns "SomeXVar" into "some_xvar"

I think my functions should handle multi-letter acronyms pretty well, but the single-letter words present a problem.

If anyone could recommend a better algorithm, I would appreciate it. I'm also wondering if this is such a good idea. Perhaps there's a way to warn when there are ambiguous conversions?

   /**
    * A function to convert underscore-delimited varnames
    * to CamelCase.  NOTE: it does not leave the first
    * word lowercase
    * @param string $str
    * @return string
    */
   public static function toCamelCase($str) {
      $string_parts = preg_split('/_+/', $str);

      if (!is_array($string_parts) || (sizeof($string_parts) < 1)){
         throw new Exception("Unable to split the input string");
      }
      foreach($string_parts as $key => $string_part){
         $string_parts[$key] = ucfirst(strtolower($string_part));
      }
      return implode('', $string_parts);
   } // toCamelCase()

   /**
    * A function to convert camelCase varnames
    * to underscore-delimited ones.
    * @param string $str
    * @return string
    */
   public static function fromCamelCase($str) {
      $matches = NULL;
      if (preg_match_all('/(^|[A-Z])+([a-z]|$)*/', $str, $matches)){
         $words = $matches[0];
         $words_clean = array();
         foreach($words as $key => $word){
            if (strlen($word) > 0)
               $words_clean[] = strtolower($word);
         }
         return implode('_', $words_clean);
      } else {
         return strtolower($str);
      }

   } // fromCamelCase()
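
One way to warn about ambiguous conversions, as asked above, might be a round-trip check: convert each property name to CamelCase and back, and flag any name that does not come back unchanged. This is only a sketch. The class name NameCheck and the method isRoundTripSafe() are made up for illustration, and the two converters are simplified stand-ins for the functions above, not replacements for them.

```php
<?php
// Hypothetical helper; NameCheck and isRoundTripSafe() are illustrative names.
class NameCheck
{
   // under_score -> CamelCase (first word is capitalised too).
   public static function toCamelCase($str)
   {
      $parts = preg_split('/_+/', $str);
      return implode('', array_map('ucfirst', array_map('strtolower', $parts)));
   }

   // CamelCase -> under_score: a run of capitals plus the lowercase
   // letters/digits that follow it is treated as one word.
   public static function fromCamelCase($str)
   {
      preg_match_all('/[A-Z]+[a-z0-9]*|[a-z0-9]+/', $str, $matches);
      return strtolower(implode('_', $matches[0]));
   }

   // True when converting there and back returns the original name.
   public static function isRoundTripSafe($property)
   {
      return self::fromCamelCase(self::toCamelCase($property)) === $property;
   }
}

foreach (array('some_variable_name', 'some_x_var') as $name) {
   if (!NameCheck::isRoundTripSafe($name)) {
      echo "Ambiguous property name: $name\n";
   }
}
```

A class generator could run a check like this over every column name and refuse (or at least complain) before writing out a class whose accessors cannot be mapped back to the right property.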

NogDog · Jun 7, 2011

Maybe I'm being more obtuse than normal, but why? Two separate thoughts:

1. If you use the "magic" __get() and/or __set() methods, you can just use the variable name; you don't need to convert it to a method name.
2. If it's truly private, why do you want/need getters/setters anyway? There are valid reasons, such as only having a getter but no setter so as to emulate a public but read-only variable, or because you want to enforce some sort of validation when setting what otherwise would be a public variable -- but do you truly need to "define a getter/setter for each and every property of each and every class"?

NogDog · Jun 7, 2011

PS: For classes directly tied to DB tables, I generally use a single class variable that is an array for the DB columns. Then I can use __set()/__get() to only access that array:

<?php
class SomeTable
{
   private $data = array(
      'id'  => null,
      'name'=> null,
      'date'=> null
   );
   public function __set($key, $value)
   {
      if (array_key_exists($key, $this->data)) {
         $this->data[$key] = $value;
      } else {
         throw new Exception("Invalid key '$key'");
      }
   }
   public function __get($key)
   {
      if (array_key_exists($key, $this->data)) {
         return $this->data[$key];
      } else {
         throw new Exception("Invalid key '$key'");
      }
   }
}

$test = new SomeTable();
$test->name = 'NogDog';
echo $test->name;

That makes it simple to set/get table columns while not messing with other class variables, for which you would then write explicit get/set methods only if needed.

sneakyimp · Jun 7, 2011

NogDog;10981766 wrote:
Maybe I'm being more obtuse than normal, but why? Two separate thoughts:

1. If you use the "magic" __get() and/or __set() methods, you can just use the variable name; you don't need to convert it to a method name.

2. If it's truly private, why do you want/need getters/setters anyway? There are valid reasons, such as only having a getter but no setter so as to emulate a public but read-only variable, or because you want to enforce some sort of validation when setting what otherwise would be a public variable -- but do you truly need to "define a getter/setter for each and every property of each and every class"?

1) If someone has bothered to define a getter or setter, I want to force the use of it if anyone tries to set these values directly.

2) The reason for making it private is to protect it from direct modification without the getter or setter being applied.

I realize that I could define my classes manually and make properties public that don't need getters or setters, but I'm auto-generating these classes from a PHP script in the desperate hope that I can create VO/Model classes for all my 40-50 database tables (hundreds of input fields) quickly so I can meet a very unreasonable deadline.
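
Roughly what I have in mind, as an untested sketch rather than the real generated code: the class names BaseVO and UserVO, the helper accessorName(), and the properties are all invented for this example. The properties are protected rather than private so that the base class can reach them.

```php
<?php
// Hypothetical base class; all names here are illustrative only.
abstract class BaseVO
{
   // "email_address" + "set" -> "setEmailAddress"
   private static function accessorName($prefix, $property)
   {
      return $prefix . str_replace(' ', '', ucwords(str_replace('_', ' ', $property)));
   }

   public function __set($name, $value)
   {
      $setter = self::accessorName('set', $name);
      if (method_exists($this, $setter)) {
         $this->$setter($value);            // a hand-written setter always wins
      } elseif (property_exists($this, $name)) {
         $this->$name = $value;             // no setter defined: plain assignment
      } else {
         throw new Exception("Invalid property '$name'");
      }
   }

   public function __get($name)
   {
      $getter = self::accessorName('get', $name);
      if (method_exists($this, $getter)) {
         return $this->$getter();
      } elseif (property_exists($this, $name)) {
         return $this->$name;
      }
      throw new Exception("Invalid property '$name'");
   }
}

class UserVO extends BaseVO
{
   protected $user_name;
   protected $email_address;

   // Only this property needs special handling, so only it gets a setter.
   public function setEmailAddress($value)
   {
      $this->email_address = strtolower(trim($value));
   }
}

$user = new UserVO();
$user->user_name = 'sneakyimp';
$user->email_address = '  Someone@Example.COM ';
echo $user->user_name, "\n", $user->email_address, "\n";
```

Direct assignment from outside the class goes through __set(), so the setter cannot be bypassed, while properties without a setter still behave like plain variables.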

sneakyimp · Jun 7, 2011

NogDog;10981767 wrote:
PS: For classes directly tied to DB tables, I generally use a single class variable that is an array for the DB columns. Then I can use __set()/__get() to only access that array:

I had considered this but was hoping to gain the advantage of code hinting that comes with defining actual class variables. If you pass everything around in arrays, you sort of have to either memorize your database tables' columns or you have to keep looking them up. It was my feeling that I could help speed things up on this project if my devs, using Eclipse, were to have the code hinting and autocomplete windows popping up the names of the VO/Model properties.

I'm doing more or less what you recommend. What makes it difficult is that we want to use underscores for vars and camelCase for methods. I'm starting to have second thoughts about that.
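
A possible middle ground, sketched here without having tried it on this project: keep the array-backed class and describe the magic properties with phpDocumentor-style @property tags in the class docblock. Many editors read those tags for autocomplete; whether a particular Eclipse/PDT version does is something to verify. The class and column names below are invented for the example.

```php
<?php
/**
 * Hypothetical generated class. The @property tags document the
 * magic properties so that an editor can offer them as completions.
 *
 * @property int    $id
 * @property string $user_name
 * @property string $created_date
 */
class UserTable
{
   // One array holds all column values, keyed by column name.
   private $data = array(
      'id'           => null,
      'user_name'    => null,
      'created_date' => null
   );

   public function __set($key, $value)
   {
      if (!array_key_exists($key, $this->data)) {
         throw new Exception("Invalid key '$key'");
      }
      $this->data[$key] = $value;
   }

   public function __get($key)
   {
      if (!array_key_exists($key, $this->data)) {
         throw new Exception("Invalid key '$key'");
      }
      return $this->data[$key];
   }
}

$row = new UserTable();
$row->user_name = 'NogDog';
echo $row->user_name, "\n";
```

Since the classes are generated by a script anyway, the docblock could be generated from the same column list as the $data array.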

NogDog · Jun 7, 2011

sneakyimp;10981772 wrote:
I had considered this but was hoping to gain the advantage of code hinting that comes with defining actual class variables. If you pass everything around in arrays, you sort of have to either memorize your database tables' columns or you have to keep looking them up. It was my feeling that I could help speed things up on this project if my devs, using Eclipse, were to have the code hinting and autocomplete windows popping up the names of the VO/Model properties.

I'm doing more or less what you recommend. What makes it difficult is that we want to use underscores for vars and camelCase for methods. I'm starting to have second thoughts about that.

Yep, sounds like a good argument for not using different naming conventions for variables and methods in this case. (I've always just used camelCase for both and never really considered a need for them to be different, so I guess I'm just used to it.)

NogDog · Jun 7, 2011

I decided to play around with it a bit, and came up with this. I think you would want to make it part of your naming standard that even acronyms be initial cap only, e.g. $userId, not $userID.

<?php
class Camel
{
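   public static function camel2under($str)
   {
      // Put "_" before every capital except a leading one, then lowercase it.
      // (The /e modifier was removed in PHP 7, so a callback does the work.)
      $regexp = '#(?<!^)[A-Z]#';
      return preg_replace_callback($regexp, function ($m) {
         return '_' . strtolower($m[0]);
      }, $str);
   }
   public static function under2camel($str)
   {
      // Drop each "_" and uppercase the character that follows it.
      $regexp = '#_(.)#';
      return preg_replace_callback($regexp, function ($m) {
         return strtoupper($m[1]);
      }, $str);
   }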
}

$name =  'thisIsATest';
$newName = Camel::camel2under($name);
$oldName = Camel::under2camel($newName);
echo "<pre>$name\n$newName\n$oldName</pre>";

Output:

thisIsATest
this_is_a_test
thisIsATest
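
To show why the initial-cap-only rule for acronyms matters, here is a small illustration. The camel2under() function below is a standalone stand-in written for this example (callback-based), not necessarily identical to the method above.

```php
<?php
// Stand-in converter for illustration: "_" before each non-leading capital.
function camel2under($str)
{
   return preg_replace_callback('#(?<!^)[A-Z]#', function ($m) {
      return '_' . strtolower($m[0]);
   }, $str);
}

foreach (array('userId', 'userID', 'parseXMLFile') as $name) {
   echo $name, ' -> ', camel2under($name), "\n";
}
// userId -> user_id
// userID -> user_i_d
// parseXMLFile -> parse_x_m_l_file
```

Every capital becomes its own word, so an all-caps acronym is split letter by letter, which is unlikely to match any real column name.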

Bjom · Jun 8, 2011

I'm a big fan of using a standard that has been defined already over making up a new one. For PHP I find the Zend Coding Standards quite useful.

Variables and functions are all camelCased according to that standard and I can't see a reason for the underscore_hell. Why not keep it simple?

Just my humble thoughts on that.

regards

Bjom

Weedpacket · Jun 8, 2011

sneakyimp wrote:
I'm doing more or less what you recommend. What makes it difficult is that we want to use underscores for vars and camelCase for methods. I'm starting to have second thoughts about that.

cf.

NogDog wrote:
Yep, sounds like a good argument for not using different naming conventions for variables and methods in this case.

Ironically, if you're going to have underscores for one and camel case for the other, it would make a bit more sense to have it the other way around - underscores for methods, camel case for variables; method names are case-insensitive, so using camel case for them has no significance.
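
A quick illustration of the difference (the class Demo and its members are made up for the example):

```php
<?php
// Hypothetical class used only to demonstrate case handling.
class Demo
{
   public $someValue = 1;

   public function getSomeValue()
   {
      return $this->someValue;
   }
}

$demo = new Demo();
echo $demo->getSomeValue(), "\n";                 // 1
echo $demo->GETSOMEVALUE(), "\n";                 // 1, same method
var_dump(method_exists($demo, 'getsomevalue'));   // bool(true)
var_dump(property_exists($demo, 'somevalue'));    // bool(false): property names are case-sensitive
```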

sneakyimp · Jun 8, 2011

Bjom;10981803 wrote:
I'm a big fan of using a standard that has been defined already over making up a new one. For PHP I find the Zend Coding Standards quite useful.

Variables and functions are all camelCased according to that standard and I can't see a reason for the underscore_hell. Why not keep it simple?

Just my humble thoughts on that.

Worthy stuff. Thanks for the link. Interestingly, Zend naming conventions are all camelCase -- and the only use of underscores is to prepend an underscore to private or protected variables and methods. The scheme I'm trying to implement would not be permitted in this naming convention because one would not be able to determine which variable to set should someone call an inaccessible setter method like setSomeVarName. Would I need to set _someVarName or someVarName?
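
To make the ambiguity concrete, here is a rough sketch. The function candidateProperties() and the class Example are invented for illustration and are not part of any Zend code.

```php
<?php
// Hypothetical helper: list the properties that an accessor name such as
// "setSomeVarName" could refer to when private/protected members carry
// a leading underscore.
function candidateProperties($object, $method)
{
   $base = lcfirst(substr($method, 3));   // "setSomeVarName" -> "someVarName"
   $found = array();
   foreach (array($base, '_' . $base) as $candidate) {
      if (property_exists($object, $candidate)) {
         $found[] = $candidate;
      }
   }
   return $found;
}

class Example
{
   public $someVarName;
   protected $_someVarName;
   protected $_otherName;
}

print_r(candidateProperties(new Example(), 'setSomeVarName')); // two matches: ambiguous
print_r(candidateProperties(new Example(), 'setOtherName'));   // one match: unambiguous
```

A magic __call() handler could refuse to guess whenever more than one candidate comes back.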

Weedpacket, thanks for that input. Good stuff as always. I did cf before and had adopted laserlight's approach because it seemed to offer the benefit of easy visual distinction between variables and methods. I'm now wondering if my plan to create my abstract class with these __set and __get and __call methods might be a little too clever. That'll teach me to read a book.

NogDog · Jun 8, 2011

I've always considered the "()" at the end of a method/function name to be a pretty good visual indicator.
