Aden1;10986761 wrote:
$surname=$_POST['surname'];
if ((strstr($surname, "'")) AND (!get_magic_quotes_gpc()))
$surname=addslashes($surname);
I am also wondering whether this approach is secure (if I can get it to work)
or whether I should be trying a different way to deal with the apostrophe issue.
addslashes should NOT be used to sanitize data for use with databases. You should, when applicable, use an escape function written for use with your particular DBMS and DB driver, such as mysql_real_escape_string or mysqli_escape_string for mysql. Should no such function exist, you'd have to write your own.
The reason is that addslashes adds slashes to characters that needs escaping in PHP strings, whereas a DBMS-specific escaping function not only escapes strings for use with that DBMS, but also particularly for the settings for your database, which is why they require a live connection to the database. For example, in MySQL, identifiers are normally delimited by the non-SQL-standards compliant backtick `, while if you configure your database to run in strict mode it uses the SQL-standards compliant double quote ". I am however not certain if strict mode makes MySQL escape single and double quotes according to the SQL standards specification by using single and double quotes, i.e. "" and '' respectively, or if it continues to use \" and \' respectively. But, that's another upside of using the function tailored to the task - you don't have to care about this when executing the queries through PHP.
Aden1;10986761 wrote:
However when it comes to using a surname containing an apostrope
as part of a search query I have a problem.
$surname=stripslashes($surname);
The reason is that you've failed to understand how escaping characters works. The way escaped characters are handled is generic, so it doesn't matter if you look at it through php, sql or other languages. The only thing that (possibly) changes is what characters needs escaping and how they are escaped. So let's have a look at escaping strings in php.
Look at the output and my comments and then compare that to the code.
Escaping characters to provide them with special meaning
# Differnce between characters that get a special meaning and those that don't.
$ds = "\a\t\c";
# I'm applying utf8_decode to stick to iso-8859-1 encoding which encodes one byte
# per non-escaped character. Escape sequences however, consists of the escape
# character, in this case backslash \, and the escaped character(s), together acting as
# a single character (or instruction for the computer)
$ds = utf8_decode($ds);
echo '<pre>'.$ds.'</pre>';
echo 'strlen: ' . strlen($ds);
As such, the output is
\a \c
strlen: 5
Do note the strlen of 5 as opposed to the 6 bytes in the string. This even works if you index into the string for each separate character
echo '<br/><pre>';
for ($i = 0; $i < strlen($ds); ++$i)
{
echo $ds[$i];
}
echo '</pre>';
The backslash character \ is used to escape characters with special meaning, or provide characters with a special meaning. For example, the character t, when escaped by \ represents a horizontal tab, whereas \a and \c has no such meanings - in PHP. \a might however represent the ALERT character (ASCII value 7) in other languages, such as C++.
The backslash character is escaped as any other character, i.e. using backslash. As such, \ becomes \
$ds = "\\a\\t\\c";
$ds = utf8_decode($ds);
echo '<pre>'.$ds.'</pre>';
echo 'strlen: ' . strlen($ds);
echo '<br/><pre>';
for ($i = 0; $i < strlen($ds); ++$i)
{
echo $ds[$i];
}
echo '</pre>';
Output
\a\t\c
strlen: 6
\a\t\c
Escaping is done ONCE. That is, not recursively or iteratively. As such, each \ in the string is treated like a \ AFTER escaping. This turns the output into \a\t\c at which point no further escaping is performed.
The string now shows '\a\t\c', but it does NOT contain those 6 bytes, since the OUTPUT would then be excatly what it was in the first example. It contains 9 bytes, while the string contains 6 characters.
Escaping characters to remove special meaning
# Normally, a double quote terminates a double quoted string literal,
# but by escaping a double quote we strip it of its special meaning
$ds = "(\")";
$ds = utf8_decode($ds);
echo '<pre>'.$ds.'</pre>';
echo 'strlen: ' . strlen($ds);
output
(")
strlen: 3
Should you however place a double backslash before the double quote inside the string literal, that double quote would instead terminate the string since \" is the escaped character \ followed by a non-escaped ".
Also note the difference between double and single quoted strings in php: Escape sequences differ between the two!
$ds = '\a\t\c';
echo '<pre>'.$ds.'</pre>';
$ds = '\\a\\t\\c';
echo '<pre>'.$ds.'</pre>';
For example, \ no longer escapes t, but it still escapes \, so \ is an escaped \, while \t is non-escaped \ followed by non-escaped t, and \t is, like before, escaped \ followed by non-escaped t.
Aden1;10986761 wrote:
I don't understand why the addslashes function is working for entering the
data but the stripslashes function is not working to retrieve the data.
So, with these things in mind, let's look at escaping delimiting characters. The string you want to insert is (unquoted and unescaped)
O'Donell
And since string literals in SQL is delimited by single quotes, you have to escape the single quote WHEN INSERTING the string. According to SQL standards, a single quote is escaped by a single quote (like the backslash is escaping backslash in php), but MySQL ordinarily uses \ to escape single quotes
-- SQL standards compliant
'O''Donell
-- MySQL usually would have it like this
'O\'Donell
But do note that these two escaped strings are exactly the same strings again! The difference in which character is used to escape other characters is the only thing that differs, and the escape character is only there to enable you to provide a string that could otherwise not be represented. The two strings are identical, as long as one is in a DBMS using ' to escape ' and the other is in a DBMS using \ to escape '. And that string literal is: O'Donell.
In other words, if you now look at what is in the database, SELECT ... FROM ... WHERE ..., you would see the string O'Donell.
Now, you say you want to search for this string, that is WHERE somefield = O'Donell.
But wait, O'Donell is still a string literal and still needs to be delimited by single quotes. That means
... WHERE name = 'O'Donell';
Except that ' still is a string delimiter so the string literal is now 'O' followed by "Donell'", which gives the error message:
You have an error in your SQL syntax; syntax to use near 'Donnell''
since Donell means nothing to your database: it isn't a field identifier or reserved keyword, and even if it was a field identifier, after name='O' you would need some other keyword like AND or OR. And even if you had tried inserting 'O' AND stuff' the last ' in there would be the starting delimiter for a string, thus screwing up the rest of the statement.
The solution is once again to escape the ' inside the string literal, i.e.
-- standards compliant
... WHERE name = 'O''Donell';
-- MySQL's usual way of handling this
... WHERE name = 'O\'Donell';
and both statements tells their respective database (with differing escape characters used) to search for the string: O'Donell
So, you still need to escape your strings, not "unescape" them, which even could be straight out dangerous. For example, consider what happens if you unescape
'O\' [HARMFUL STUFF HERE]'
That would then turn into
'O' [HARMFUL STUFF HERE]
where the harmful stuff is now outside the string literal where it will be possible to execute sql statements.