A little recap about http, and what and where these "headers" are.
Here is a web client, being run by Alice. A "web client" is something that sends http requests to a web server and receives the results. Often it's a browser, but the fact is that most clients are not browsers at all but automated programs requesting documents for who knows what purpose.
Here is a web server, being run by Barry. A "web server" is a program that sits around waiting for http requests to come in, and responds as it sees fit when they do. By far the most widely-used web server out there at the moment is Apache.
Alice's client sends Barry's server an http request. The server looks it over, reads it, finds out what is wanted, and sends a response back (even if that response is Access Denied).
What do these http requests and responses look like? Both kinds of message have the same format, and that format is what defines the HTTP Protocol.
A typical request
GET http://nz.php.net/faq HTTP/1.1
Host: nz.php.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007
Accept: text/html;q=0.9,text/xml,...[truncated to save space]
Accept-Language: en-nz,en-gb;q=0.8,en-au;q=0.6,en;q=0.4,en-us;q=0.2
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Cookie: LAST_LANG=en; COUNTRY=NZL%2C122.37.216.106; LAST_SEARCH=quickref%2Cheader
If-Modified-Since: Mon, 08 Dec 2003 13:12:44 GMT
A typical response (actually, the response to the above request)
HTTP/1.x 200 OK
Date: Tue, 09 Dec 2003 02:02:45 GMT
Server: Apache/1.3.28 (Unix) mod_ssl/2.8.15 OpenSSL/0.9.7c
Content-Language: en
Status: 200 OK
Last-Modified: Tue, 09 Dec 2003 13:13:13 GMT
Vary: Cookie
Content-Type: text/html;charset=ISO-8859-1
Proxy-Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>PHP: FAQ: Frequently Asked Questions - Manual</title>
[...etc.etc.etc...]
The message starts with a line stating what version of http is being used, and also what sort of message this is - whether it's from a client making a GET or POST request, or a server sending back a 200 OK response (the ideal) or a 404 Not Found, or (especially from IIS) a 500 Internal Server Error, or whatever. The line ends with \r\n (carriage return/line feed).
Following that line are some more lines of text, all ending with \r\n, that describe the content (if any) that follows, a little bit about the program sending the message, its likes and dislikes and its capabilities, and some other bits and pieces intended to ensure that the server can send the most appropriate response back to the client.
Then there is a blank line. That is, \r\n on a line by itself.
Then comes the content of the message.
For GET requests (by far the most common), this is empty. For form POST requests it contains the contents of the form (usually URL-encoded, but when files are being uploaded one uses multipart/form-data, where the form contents are MIME-formatted and each file's bytes are included as they are). There's also a HEAD request (which works exactly like GET except that the response contains only the headers and none of the content) and a PUT request (which is supported by PHP, but I don't know anyone who actually uses it, whether in PHP or not).
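To make the form POST case a bit more concrete, here is a small sketch of a URL-encoded form body as it sits in a request. The field names, the /comment.php path and the www.example.com host are all made up for illustration; http_build_query() is PHP's own function for producing this encoding.

```php
<?php
// Hypothetical form fields; names and values are made up for this example.
$fields = [
    'username' => 'alice',
    'comment'  => 'headers & content',
];

// http_build_query() URL-encodes each name and value and joins the pairs with "&".
$body = http_build_query($fields);

// The request as a client would send it: request line, headers,
// a blank line, then the form contents as the message content.
$request = "POST /comment.php HTTP/1.1\r\n"
         . "Host: www.example.com\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . strlen($body) . "\r\n"
         . "\r\n"
         . $body;

echo $request, "\n";
```

Note how the space and the ampersand inside the comment are escaped, so that the server can tell them apart from the "&" that separates the fields.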
For responses, the content depends on what type of response it is: a 200 response contains the document that was requested, a 404 might be empty or it might contain an error page, a 302 Moved Temporarily is usually empty,...
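The layout described above (first line, header lines, blank line, content) can be sketched as a small parser. This is only an illustration of the format: parse_http_message() is a hypothetical helper, not part of PHP, and it ignores complications such as repeated headers.

```php
<?php
/**
 * Split a raw http message into its first line, headers and content.
 * A simplified sketch: repeated or malformed headers are not handled.
 */
function parse_http_message(string $raw): array
{
    // The first blank line (\r\n on a line by itself) ends the headers.
    $parts   = explode("\r\n\r\n", $raw, 2);
    $content = $parts[1] ?? '';

    $lines     = explode("\r\n", $parts[0]);
    $startLine = array_shift($lines);

    $headers = [];
    foreach ($lines as $line) {
        // Each header line has the form "Name: value".
        [$name, $value] = explode(':', $line, 2) + [1 => ''];
        $headers[trim($name)] = trim($value);
    }

    return ['start' => $startLine, 'headers' => $headers, 'content' => $content];
}

$raw = "HTTP/1.1 200 OK\r\n"
     . "Content-Type: text/html\r\n"
     . "\r\n"
     . "<html><body>Hello</body></html>";

$message = parse_http_message($raw);
echo $message['start'], "\n";
echo $message['headers']['Content-Type'], "\n";
echo $message['content'], "\n";
```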
Some of you have probably programmed in PHP and used its header-manipulation functions (header() is the obvious one, but setcookie() and session_start() add headers as well) without knowing any of this. That's because PHP has always known how http works and handled it for you. But it meant that when you got an error along the lines of "Cannot modify header information - headers already sent by (output started at foo.php:7) in foo.php on line 15" you had no idea what it meant. Here's what it meant:
The most important thing to see about http messages is that you have some lines of text - the headers - then a blank line, and finally some content. The headers come first; that's why they're called "headers". Those headers are always there, even if they're just the bare minimum:
HTTP/1.1 200 OK
Content-Type: text/html
PHP is in the business of building http responses, both the headers and the content. If you're sending a PNG image, you write header("Content-Type: image/png"); and PHP will replace its default text/html Content-Type header with one that says that the following document is a PNG file. Say "setcookie(blahblah);" and PHP will add a Set-Cookie: header. You can stick pretty much any header you like into a response - if the client doesn't understand it then it will be ignored.
But you have to have set all your headers before you start sending any content to the client. Naturally, because the headers have to go first. So as soon as you output any content - HTML, image file, even a single blank line - PHP takes all the headers it has collected so far, fires them off, and then starts sending that content. Once that's happened it's too late to send any other headers with that response. If you try you get the error message this post is about.
That message will tell you where in the script you were adding the (late) header, and it will tell you where in the script PHP decided to send all the headers it had collected and start outputting content.
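PHP can also tell you the same two facts before the error happens. The sketch below uses headers_sent(), which reports whether the headers have gone out and, if so, in which file and on which line the output started. try_header() is a hypothetical helper written for this example, and the X-Too-Late header is made up.

```php
<?php
/**
 * Add a header if that is still possible.
 * Returns null on success, or "file:line" of the place where output
 * started if the headers have already been sent.
 * (try_header() is a hypothetical helper, not part of PHP.)
 */
function try_header(string $header): ?string
{
    if (headers_sent($file, $line)) {
        return "$file:$line";
    }
    header($header);
    return null;
}

// Header work first, while nothing has been output yet...
$early = try_header('Content-Type: text/html; charset=ISO-8859-1');

// ...then the content. Without output buffering, this is the point
// where PHP fires off the collected headers.
echo "<html><body>Hello</body></html>\n";

// Normally too late by now: instead of the error we get the location back.
$late = try_header('X-Too-Late: yes');
if ($late !== null) {
    echo "Header refused, output started at $late\n";
}
```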
Often, it's just a stray blank line at the top of the page that you can just delete and all will be well. Sometimes, though, it can be a little bit hairier. There are three main solutions, which I list in decreasing order of preference:
1. Plan your scripts so that you get all the header-related stuff out of the way before you start outputting any content. In some cases (such as when you're redirecting the client by sending a Location: header) you don't need to send any content at all, and by doing so you're just slowing your site down.
2. Use ob_start() and ob_end_flush(). These functions have their uses. ob_start() turns on an "output buffer", and any content that would have been sent to the client is instead stored in this buffer until ob_end_flush() is called (or the script ends), at which point everything that has been buffered is output. There are three main problems with this. First, by delaying sending output you're slowing your page's response; Apache at least can be sending content to the client, and the user will be getting stuff on the screen, even while the script is still running (and that offers opportunities for clever little programming tricks in itself). Second, PHP has to hold on to the contents of that buffer instead of just outputting it and forgetting it (the output buffer is intended to allow PHP to reprocess pages it has already generated); memory has to be allocated to store the buffered page, and in a heavily-loaded environment this can get quite messy, bogging the memory routines down. Third, it's just plain clumsy.
3. Switch on output buffering in php.ini. Not entirely sure why this option would be a good idea; it's option 2 applied to every page whether it's needed there or not (I understand this is how ASP works). Not only that, but your scripts then depend on the hosting server having the same ini setting.
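For what it's worth, the output-buffering approach looks roughly like this in practice. It is a minimal sketch only; the X-Example header and the page content are made up.

```php
<?php
// Start collecting output in memory instead of sending it to the client.
ob_start();

// This goes into the buffer, so the headers have not been sent yet.
echo "<p>Some page content</p>\n";

// Still allowed, because no content has actually left PHP.
header('X-Example: set-after-output');

// A copy of what PHP is currently holding on to for us.
$buffered = ob_get_contents();

// Send the headers first, then everything that was buffered.
ob_end_flush();
```

The php.ini variant is the same mechanism switched on for every script through the output_buffering directive, which is why it carries the same costs.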
There is one more disadvantage to output buffering, and this one is also a security vulnerability. All too often I see code for "password-protected" pages that works like this:
1. Turn on output buffering
2. Start sending page content
3. Check the user is logged in, or that they've entered a correct username and password
4. If they haven't, send a redirect header (header("Location:....")) back to the login page
The problem is that along with the 302 header that tells the client which URL to go to instead, everything in the buffer gets sent too, as the content of the response. So if there was anything in that buffered content that the user shouldn't have seen, you still sent it to them anyway! Sure, most clients would ignore it and just follow the redirect, but they don't have to, and anyone with half a mind to snoop is perfectly capable of grabbing it and having a look.
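A safer ordering is to do the login check before any content is produced, and to stop the script as soon as the redirect header has been set. The following is only a sketch under some assumptions: the 'user_id' session key, the /login.php URL and both function names are made up for the example.

```php
<?php
/**
 * Decide where to send a visitor who is not logged in.
 * Returns null when the visitor may see the page.
 * (The 'user_id' key and /login.php are assumptions for this sketch.)
 */
function login_redirect_for(array $session): ?string
{
    return isset($session['user_id']) ? null : '/login.php';
}

function show_members_page(array $session): void
{
    $target = login_redirect_for($session);
    if ($target !== null) {
        // If buffering happens to be on, throw away anything collected so far.
        while (ob_get_level() > 0) {
            ob_end_clean();
        }
        header('Location: ' . $target);
        exit; // stop here so no protected content is ever generated
    }

    // Only now start producing the protected page.
    echo "<h1>Members only</h1>\n";
}

// In a real page this would be session_start() followed by
// show_members_page($_SESSION); a made-up session is used here.
show_members_page(['user_id' => 42]);
```

Because the check comes first, there is nothing in the response for a snooping client to read other than the redirect itself.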