I want to do some simple validation, i.e. check that the posted data came from the page I think it did. I know I could use HTTP_REFERER but I read this can be unreliable. Is there a better way to do this, or am I wrong and HTTP_REFERER is perfectly fine?
What's more reliable than HTTP_REFERER?
As they say in the security world, it depends on what your attack vector is.
For example, if you are concerned about some random web site putting a form on their web page and posting to your web page, then HTTP_REFERER might be fine because most users at that other web site aren't going to be interested in taking the time to spoof the HTTP_REFERER (if they even know how).
On the other hand, if you are concerned about ONE hacker making a copy of your form, putting it on their local computer, altering the form (changing the hidden variables, for example) and posting it to your site, then no, HTTP_REFERER will not be sufficient, because someone sophisticated enough to do that can easily spoof the REFERER and it will only be a tiny speedbump for them. (And this is a really good reason why you always need to validate the incoming data.)
One technique you can use is this: When you build the form, pick a large random number and put it in a hidden field in the form. Then write that number to a database. When you process the form, check to see if that number is in the database. If yes, then process the form AND delete the number from the database so that it can't be used again. This will prevent the first problem but not the second one.
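Here is a rough sketch of that one-time number idea. It is written in Python rather than PHP just to show the logic, and the in-memory set is only a stand-in for the database table. The function names are made up for illustration.

```python
import secrets

# Stand-in for the database table of numbers that were handed out.
# Assumption: a real site would use a table with a unique index and an expiry.
issued_tokens = set()


def build_form_token():
    """Pick a large random value, remember it, and return it for the hidden field."""
    token = secrets.token_hex(16)  # 128 random bits, hex encoded
    issued_tokens.add(token)
    return token


def accept_form_token(token):
    """Return True only the first time an issued token comes back."""
    if token in issued_tokens:
        issued_tokens.remove(token)  # delete it so it can't be used again
        return True
    return False
```

You would call build_form_token() when you build the form and accept_form_token() when you process the post. A token you never issued, or one that has already been used, is refused.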
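Another technique you can use is to put an md5 hash of the data in the form alongside your hidden fields (strictly speaking, md5 is a hash, not encryption), mixing in a secret string that only your server knows. When the form comes back, you recompute the hash and compare. Without the secret, the hacker can't alter the values of the hidden fields without you noticing. The secret matters: a plain md5 of the value alone can be recomputed by anyone.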
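A rough sketch of protecting a hidden field this way, again in Python just to show the logic. Note that I've swapped in a keyed hash (HMAC-SHA256) rather than bare md5, because a plain hash of the value can be recomputed by anyone who edits the field. SECRET_KEY and the function names are assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: a secret known only to the server and never sent to the browser.
SECRET_KEY = b"change-me-to-a-long-random-value"


def sign_field(value):
    """Return a hex signature to ship in a second hidden field next to the value."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def field_is_untampered(value, signature):
    """Recompute the signature for the posted value and compare in constant time."""
    expected = sign_field(value)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The value itself stays readable in view source. What this gives you is detection: if the value is changed, the signature no longer matches and you can refuse the post.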
If you are concerned about someone hitting your form too often (let's say you have a mortgage calculator or something and they are trying to put the burden on your server to do the calculations), then you might have to check for multiple visits from the same IP. You might put a limit of 10 per hour and no more than two in any given minute or something like that.
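Something like the following would enforce those example limits (10 per hour, no more than two in any minute). It is a Python sketch with the hit log kept in memory; a real site would more likely keep it in a database or cache, and the names are made up.

```python
import time
from collections import defaultdict, deque

MAX_PER_HOUR = 10
MAX_PER_MINUTE = 2

# IP address -> timestamps of accepted requests within the last hour.
# Assumption: in-memory storage, which is lost when the process restarts.
_hits = defaultdict(deque)


def allow_request(ip, now=None):
    """Return True and record the hit if this IP is still under both limits."""
    now = time.time() if now is None else now
    hits = _hits[ip]
    # Forget anything that is an hour old or more.
    while hits and now - hits[0] >= 3600:
        hits.popleft()
    last_minute = sum(1 for t in hits if now - t < 60)
    if len(hits) >= MAX_PER_HOUR or last_minute >= MAX_PER_MINUTE:
        return False
    hits.append(now)
    return True
```

Keep in mind that several visitors can share one IP (an office or a mobile carrier, for example), so limits that are too tight will block legitimate users.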
I had one client that asked customers to call a phone number where they gave out a password that could be used to submit the form twice. Once the password was used twice, it was deleted from the database and couldn't be used again.
The solution to your problem depends on how people are abusing your form.
Lord, that's a lot of info! Thank you. In a minute I shall re-read and properly digest all that.
I just wanted to use it as a simple security measure.
Since the form puts data into my database, I thought it might be sensible to make sure I know where the data comes from.
Because I read HTTP_REFERER was "unreliable" I wondered if that might mean that a http referer page sometimes might not show up, even if the request had come from my own site? Could that happen? Or not?
Yes, it's true. The HTTP_REFERER is supplied by the web browser (not your web site) - and the web browser might choose to not include that data. There are plugins for Firefox that allow the user to specify whatever value they want to be supplied as the REFERER.
For basic, low level, security you could check to see if it DOES have a value and if so, is the domain something other than yours. This way, if the referrer is blank, you let them through but if the referrer is from www.we_are_pirates.com then you can simply exit(); instead of processing the form.
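The logic of that check looks roughly like this. It is a Python sketch rather than PHP, and MY_HOSTS is a placeholder for whatever hostnames your own form is served from.

```python
from urllib.parse import urlparse

# Assumption: the hostnames your own form is served from.
MY_HOSTS = {"www.example.com", "example.com"}


def referer_is_acceptable(referer):
    """Let a blank referer through; reject one that names another host."""
    if not referer:
        return True  # browser sent nothing, so give the benefit of the doubt
    try:
        host = urlparse(referer).hostname
    except ValueError:
        return False  # present but malformed, so treat it as foreign
    return host in MY_HOSTS
```

With this, referer_is_acceptable("http://www.we_are_pirates.com/form.php") comes back False and you would exit instead of processing the form, while an empty referer is still allowed.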
That's a good idea...
Are there any other low-level security measures you'd recommend, just in case the browser doesn't send that variable?
Hmm.. I could do the random number thing I guess... maybe that's a bit much.
I'm not trying to build a fort.. I just want to make it hard enough that anyone who thinks it might be fun to have a go at my site for a laugh gets bored before they get anywhere.
..basically if I can think of a way to "hack" my own sites I get worried. lol.
Most forms I write don't have any checks because frankly, I don't care where the data comes from. If you want to contact us to do business with us, I don't care if your data was posted from our site or someone else's!!
There are no general "this is the secure way" techniques because it's easier to prevent specific problems that you either (A) already have or (B) are certain you are going to have.
It's good of you to be thinking in terms of security but you can really only target specific types of attacks (or abuses).
I suppose it doesn't matter terribly. All the PHP page does is post the data from the form fields to a database; it isn't secure information or anything.
OK, one final question: is there any way to pass a variable in a hidden form field without it showing if someone views your source code? Or err.. some other way to pass a hidden var from the sending page to the processing one without any trace of it visible in source code?
If you really have to put the data in the form to pass it to the processing page, the best you can do is encrypt it and then decrypt it when it gets there. This way, if they view source, they will just see that the field is gibberish. It also makes it hard for them to alter it in any meaningful way, although you should still validate what you get after decrypting.
But in most cases, you don't have to put the data in the form when all you're trying to do is get it back from them. It's better to put it in a database and then put a unique id in the form. When the form gets posted, you take that unique id and look up the secret info in your database. And if you use their session id as the key, then you don't even have to embed the key in the form.
So on submit I would post some data to the database.. and then redirect to the processing page.. using... meta redirect? Can't use headers unless I start output buffering and eugh.
No. You were asking about a way "to pass a hidden var from the sending page to the processing one without any trace of it visible in source code".
Let's say you have three pages.
one.php which is a form that asks their gender and passes the result to...
two.php which is a form with lots of fields AND passes the gender to...
three.php which is your processing page.
Now, for whatever reason (there are lots of good reasons), you want to pass gender to the 3rd page. You could create a hidden field on the 2nd page but someone could see the field if they do a view source.
On page 2, when you've received the gender, you do three things. First, you pick a random unique id number (uid). Second, you write the uid and the gender to a database. Third, you display a form and you create a hidden field with the uid. If someone does a view source, all they will see is the hidden field uid=123456789 and they will have no idea that you're using that to pass the gender. The form on page 2 also has a bunch of other fields.
Then, on page 3, you take the uid that got passed in and you use that to look up the gender in the database. You can use that and all the other fields in the table to do whatever processing you are going to do.
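To make the page 2 / page 3 steps concrete, here is a rough Python sketch. The dictionary stands in for the database table and the function names are invented for illustration.

```python
import secrets

# Stand-in for the database table: uid -> the values you want to keep hidden.
pending = {}


def page_two_store(gender):
    """Page 2: pick a random uid, store the gender under it, return the uid."""
    uid = secrets.token_hex(16)  # this is all that goes in the hidden field
    pending[uid] = {"gender": gender}
    return uid


def page_three_lookup(uid):
    """Page 3: use the posted uid to fetch the stored row, or None if unknown."""
    return pending.get(uid)
```

The hidden field only ever contains the uid, so view source reveals nothing about the gender. A uid that you never issued simply finds no row, which you should treat as an invalid post.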
Sessions are another way to accomplish this. You assign the secret information to a session variable on page 2 and when the user gets to the 3rd page, you read the value from the session variable. This way, if the user does a view source on the form on page 2, the data won't be there because it's being stored server side.
No need for browser tricks like onSubmit or meta refresh.