So this is how it's done...


Post by NeoThermic » Tue, 07 2006 Feb 18:24:08

When it comes to XHTML, most people make the huge mistake of sending it as text/html. There's a good reason why you shouldn't do that.

Thus, most people who realise this try to do some fun stuff with the HTTP Accept header. Unfortunately, 99% of those people don't do it right.

An example from this page:

// Naive check: serves XHTML whenever the type appears anywhere in Accept.
if ( stristr($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml") ) {
  header("Content-Type: application/xhtml+xml");
} else {
  header("Content-Type: text/html");
}

You might ask what is wrong with that. Well, the answer is simple. As it stands, the code quoted isn't RFC-compliant. See, you can't just look for application/xhtml+xml in the header, because a client whose Accept header lists application/xhtml+xml with a lower q-value than text/html will still get served XHTML.

It matters because such an Accept header clearly states a preference for text/html. Yet with the code quoted above, that is ignored and XHTML is sent.
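
To make the problem concrete, here is a small sketch. The Accept header in it is a hypothetical one chosen for illustration, not one taken from a real browser; the check is the same substring test as in the quoted snippet.

```php
<?php
// Hypothetical Accept header: XHTML is acceptable, but text/html is preferred.
$accept = "text/html;q=0.9,application/xhtml+xml;q=0.5";

// The same substring test as the quoted snippet, applied to that header.
$served = stristr($accept, "application/xhtml+xml")
    ? "application/xhtml+xml"
    : "text/html";

echo $served, "\n"; // prints "application/xhtml+xml"
```

The client asked for text/html at q=0.9 and XHTML only at q=0.5, yet the substring test picks XHTML.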

So, what do you have to do to get it right? Well, it's simple. You must take the q-values into account.

The RFC is rather light on what you should do to work through the q-values, but it states that:

  • q-values, if specified, must be a decimal number between 0 and 1.
  • If a q-value is not explicitly specified, it is taken to be 1.0.
  • If two q-values match, preference is given to the most specific media type.
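
As a rough sketch of the first two rules, the snippet below splits an Accept header into media ranges and their q-values. `parseAccept` is a hypothetical helper name, the regular expression reflects my reading of the RFC's q-value grammar (0 to 1, at most three decimals), and leaving a malformed q-value at the default is an assumption rather than something the RFC mandates. The third rule (specificity) is not handled here.

```php
<?php
/**
 * Split an Accept header into a map of media range => q-value.
 * Hypothetical helper, for illustration only.
 */
function parseAccept(string $header): array
{
    $result = [];
    foreach (explode(",", $header) as $item) {
        // "type/subtype;param=value;..." -> ["type/subtype", "param=value", ...]
        $parts = array_map("trim", explode(";", $item));
        $type = strtolower(array_shift($parts));
        if ($type === "") {
            continue; // skip empty list entries
        }
        $q = 1.0; // no q-value given: taken to be 1.0
        foreach ($parts as $param) {
            // A q-value is a number from 0 to 1 with up to three decimals.
            if (preg_match('/^q\s*=\s*(0(?:\.\d{0,3})?|1(?:\.0{0,3})?)$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $result[$type] = $q;
    }
    return $result;
}

$parsed = parseAccept("text/html, application/xhtml+xml;q=0.8, */*;q=0.1");
print_r($parsed);
```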

Given that, I've come up with this code:

What it does is read the HTTP Accept header and parse it to search for application/xhtml+xml. If it finds it, it tries to read the q-value. It then does the same for text/html, and finally compares the two to see which to serve.

The RFC, however, doesn't specify what to do with invalid q-values. In my code they are ignored entirely if they do not match the expected format. Thus if the q-value was 0.5c, it would be treated as if it wasn't specified.
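
The approach described above could look roughly like the following. This is a sketch under stated assumptions, not the original code: `qValueFor` and `chooseContentType` are made-up names, and serving XHTML when the two q-values tie is my own choice.

```php
<?php
/**
 * Return the q-value for $type in $accept, or null if the type is not listed.
 * A missing or malformed q-value counts as 1.0. Hypothetical helper.
 */
function qValueFor(string $accept, string $type): ?float
{
    foreach (explode(",", $accept) as $item) {
        $parts = array_map("trim", explode(";", $item));
        if (strcasecmp($parts[0], $type) !== 0) {
            continue; // not the media type we are looking for
        }
        foreach (array_slice($parts, 1) as $param) {
            if (preg_match('/^q=(0(?:\.\d{0,3})?|1(?:\.0{0,3})?)$/i', $param, $m)) {
                return (float) $m[1];
            }
        }
        return 1.0; // no q-value, or one that failed validation (e.g. "q=0.5c")
    }
    return null;
}

/** Pick the Content-Type to serve. Hypothetical name. */
function chooseContentType(string $accept): string
{
    $xhtml = qValueFor($accept, "application/xhtml+xml");
    $html  = qValueFor($accept, "text/html");
    // Serve XHTML only if it is listed, not refused (q=0), and at least
    // as preferred as text/html (ties go to XHTML: an assumption).
    if ($xhtml !== null && $xhtml > 0 && $xhtml >= ($html ?? 0.0)) {
        return "application/xhtml+xml";
    }
    return "text/html";
}

echo chooseContentType("text/html;q=0.9,application/xhtml+xml;q=0.5"), "\n";  // text/html
echo chooseContentType("application/xhtml+xml;q=0.5c,text/html;q=0.9"), "\n"; // application/xhtml+xml
```

In a real page you would pass in `$_SERVER["HTTP_ACCEPT"] ?? ""` and send the result with `header("Content-Type: ...")`.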

You're free to use this code as you wish, although do note one thing: the W3C validator doesn't send an Accept header. If you print it out, you'll get nothing, and thus the script will assume that text/html is to be supplied. You're more than welcome to add a quick check on the User-Agent to find the W3C validator and send it what you wish.
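
A quick check of that kind might look like this. It is only a sketch: `isW3cValidator` is a hypothetical helper, matching on the substring "W3C_Validator" is an assumption about the validator's User-Agent string, and sending it XHTML is just one possible choice.

```php
<?php
/**
 * Guess whether a request comes from the W3C validator.
 * The "W3C_Validator" substring is an assumption about its User-Agent.
 */
function isW3cValidator(string $userAgent): bool
{
    return stripos($userAgent, "W3C_Validator") !== false;
}

// Fall back to the User-Agent only when no Accept header was sent.
$accept    = $_SERVER["HTTP_ACCEPT"] ?? "";
$userAgent = $_SERVER["HTTP_USER_AGENT"] ?? "";

if ($accept === "" && isW3cValidator($userAgent)) {
    $contentType = "application/xhtml+xml";
} else {
    $contentType = "text/html"; // placeholder for the normal q-value negotiation
}
// header("Content-Type: " . $contentType);
```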

Finally, it's been pointed out to me that Firefox lies in its Accept header. As per Mozilla's own FAQ:

The preference for application/xhtml+xml was added to the Accept header in order to enable the serving of MathML to both Mozilla and IE
If your document mixes MathML with XHTML, you should use application/xhtml+xml.
However, if you are using the usual HTML features (no MathML) and are serving your content as text/html to other browsers, there is no need to serve application/xhtml+xml to Mozilla.

My code above assumes that the Accept header sent by the browser is to be trusted, and acts on it as per RFC 2068 Section 14.1. I don't wish to try to second-guess the browser in order to work out whether its Accept header can be trusted.
