Hi All,
I'm offering this functiion to the community but also requesting your help; if this function has flaws let me know.
When parsing an SQL query, I want to know what fields, functions and tables are being called. Trouble is, those same keywords will also be found inside quotes, for example:
INSERT INTO queryTable SET querystring="INSERT INTO queryTable SET query=\"get me out of here!\"", date=NOW(), functionname='NOW() is a function'
The function below solves this by converting all characters inside strings into a (rarely) used character, þ (ascii character 254), the above query would become:
INSERT INTO queryTable SET querystring="þþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþ",
date=NOW(), functionname='þþþþþþþþþþ'
(you can also pass your own character as a second variable)
The output string is exactly as long as the input string, but you can see it's easy to safely split the query now by keywords like SET, FROM, ORDER BY, etc., and then to get field names and functions.
This function could be applied to php code to see what functions are being called, although it would probably need modification. PLEASE LET ME KNOW IF THE FUNCTION CAN BE BROKEN!
Sincerely,
Sam Fullman
P.S. Just realized one way you could break this is with / comments like this /, they should be preg_matched and preg_replaced I think
/**-----------------------------------------
MySQL Rule inside of strings: (applies for a singe quote wrapper at least), a backslash is required before the same (single) quote, however a backslash is NOT required before the opposite (double) quote. However, if a backslash IS present before the double quote it will be ignored (treated as an escape character for the double quote).
Rather a strange convention, in essence there are two ways to declare the opposite quote type in a string.
The same thing is true in inverse with a double quote wrapper.
-------------------------------------------*/
function string_region_replace($x, $f="þ"){
/*-------------------
this function replaces all characters within a string with þ
(that's character 254, odds are it will NOT be used in most applications)
we then can safely and easily parse field names and functions in the query
---------------------*/
$op['"']='"';
$op["'"]="'";
if(!strlen($x))return false;
for($i=1;$i<=strlen($x);$i++){
$c=substr($x,$i-1,1);
if($mode){
if($c==$mode && $lag=='\\\\' && $dbllag!=='\\\\'){
//escaped string
$str.=$f;
}else if($c==$mode){
$mode='';
$str.=$c;
}else{
$str.=$f;
}
}else{
$mode=$op[$c]; //blank if not a quote
$str.=$c;
}
$dbllag=(strlen($lag)?$lag:'');
$lag=$c;
}
return $str;
}
/**
$string= 'INSERT INTO table set field=\'Nat King Cole sang \\\'Unforgettable, That"s what your are..\\\' and I"d like to get a CD of that\', field2="5"';
echo strlen($string) . "<br>";
echo $string2=string_region_replace($string);
echo "<br>";
echo strlen($string2) . "<br>";
**/