Last week I visited the PHPUK2010 conference in London, and I had a great time with fellow developers and web addicts. I’m not gonna go into detail about every single talk because it’s probably easier and more interesting to head over to slideshare.net or phpconference.co.uk and check them out for yourself. Just let me say that it was a very well organised conference with a load of great speakers at a superb location. So great job, PHP London!

Having said that, there was one specific talk I was really looking forward to. Lorna Mitchell of iBuildings was going to present “Best practices in web service design”. I’m currently working on a REST web service and some aspects of it still raise questions. For instance, unlike SOAP, REST does not really have a description language, a language with a vocabulary that can describe the service. So how do you deal with that? Another thing is the output format. A web service can offer a variety of formats. JSON and XML are probably the most popular ones, but in the case of XML, would you define your own tags, or would you rather pick XHTML?

It’s the latter issue I’d like to focus on in this blog post.

Lorna’s talk was really interesting, and she obviously has a lot of experience with building web services. It’s also great and inspiring to hear someone talk as passionately as she did. Unfortunately she didn’t mention anything about the XML/XHTML question. Very understandable of course, as her time was limited and there are so many aspects to this subject.

So afterwards I went to see her to check what her opinion was about this, and I must say I was a little bit surprised and confused by her answer. Her reasoning was that no markup should be used in the output of a web service, so one should definitely use general XML.

I can’t really agree with that argument, because XML is a markup language as well. So then I was thinking that maybe I misunderstood and she actually meant there is no need to use a language that has ‘styled’ tags, but that wouldn’t make much sense either, because robots that consume web services don’t apply styles to tags, only browsers do. Or maybe she meant that (X)HTML results in too many tags compared to custom XML, but even that could easily be proven incorrect.
Unfortunately, since I had a couple of other questions to ask her and other people were waiting with more questions, I couldn’t continue the discussion.

To me XHTML is first and foremost XML, though with a predefined and limited tag set, and I can’t see many arguments against using it. On the contrary, this is why I think XHTML is in fact the better format:

  • XHTML tags, although predefined and limited, will most likely fit all your needs for structuring your data. If the whole web is built with that limited set of tags, you could expect it to be sufficient for your service.
  • The “HyperText” feature of XHTML, or in other words the possibility to link content together, could actually be very useful for web services as well (see code sample below).
  • XHTML tags are semantic, and since every developer knows their meaning they are easy to read and interpret. You can easily use a <dl> for key-value pairs, a <ul> with <li> items to represent a list, or for example a <span> if nothing else fits. A “class” attribute can be used to give additional meaning.
  • As Lorna mentioned in her talk, documentation is extremely important. When using XHTML, you can just open a web service in your browser and actually see how it works. The browser will know exactly how to render the responses. And if the service follows a resource-oriented architecture (ROA) approach, you can even browse from one resource or service to another by clicking around. The web service would almost become its own documentation.
  • If every web service used the same (XHTML) tags, it would save a lot of developer work in terms of parsing the response.
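To illustrate the parsing point, here is a minimal sketch in Python of a generic reader that turns any <dl> into key-value pairs. It assumes the response is well-formed XHTML in the standard XHTML namespace and that each <dt> is followed by a single text-only <dd>. The names dl_to_dict and parse_user are made up for this example, and the sample document is a shortened, hypothetical response rather than real service output.

```python
import xml.etree.ElementTree as ET

# Standard XHTML namespace, in the "{uri}tag" form ElementTree uses.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def dl_to_dict(dl):
    """Pair each <dt> with the <dd> that follows it (hypothetical helper)."""
    result = {}
    key = None
    for child in dl:
        if child.tag == XHTML_NS + "dt":
            key = (child.text or "").strip()
        elif child.tag == XHTML_NS + "dd" and key is not None:
            result[key] = (child.text or "").strip()
            key = None
    return result


# Shortened, made-up response used only for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<dl class="user">
<dt>id</dt> <dd>1401881</dd>
<dt>screen_name</dt> <dd>dougw</dd>
<dt>location</dt> <dd>San Francisco, CA</dd>
</dl>
</body>
</html>"""


def parse_user(document):
    """Find the <dl class="user"> element and read it as a dictionary."""
    root = ET.fromstring(document)
    dl = root.find(f".//{XHTML_NS}dl[@class='user']")
    return dl_to_dict(dl)


user = parse_user(SAMPLE)
print(user["screen_name"])
```

The same dl_to_dict function would work unchanged for any other service that exposes its data as a definition list; only the class name used to locate the list would differ.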

To finish my point, let me just give you an example of a service response in both XML and XHTML, and judge for yourself. I took an extract of a Twitter response containing some user information:

XML (actual Twitter response):

<?xml version="1.0" encoding="UTF-8"?>
<user>
<id>1401881</id>
<name>Doug Williams</name>
<screen_name>dougw</screen_name>
<location>San Francisco, CA</location>
<description>Twitter API Support. Internet, greed, users, dougw and opportunities are my passions.</description>
<profile_image_url>http://s3.amazonaws.com/twitter_production/profile_images/59648642/avatar_normal.png</profile_image_url>
<url>http://www.igudo.com</url>
<followers_count>1031</followers_count>
<friends_count>293</friends_count>
<favourites_count>0</favourites_count>
<statuses_count>3390</statuses_count>
</user>

XHTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>User 1401881 - Doug Williams</title>
</head>
<body>
<dl class="user">
<dt>id</dt> <dd>1401881</dd>
<dt>name</dt> <dd>Doug Williams</dd>
<dt>screen_name</dt> <dd>dougw</dd>
<dt>location</dt> <dd>San Francisco, CA</dd>
<dt>description</dt>
<dd>Twitter API Support. Internet, greed, users, dougw and opportunities are my passions.</dd>
<dt>followers</dt> <dd>1031</dd>
<dt>friends</dt> <dd>293</dd>
<dt>favourites</dt> <dd>0</dd>
<dt>statuses</dt> <dd>3390</dd>
<dt>more</dt>
<dd>
<ul class="more">
<li><a href="http://api.twitter.com/1/user/dougw/statuses">tweets</a></li>
<li><a href="http://api.twitter.com/1/user/dougw/friends">friends</a></li>
<li><a href="http://api.twitter.com/1/user/dougw/followers">followers</a></li>
</ul>
</dd>
</dl>
<form id="searchUsers" method="get" action="">
<p>
Search for other users:
<input id="term" name="q" />
<input type="submit" />
</p>
</form>
</body>
</html>

I added some hyperlinks to tie different but related services together, so the XHTML version is a bit longer. But is it more complicated? Is there more overhead because of markup? Would it be more difficult to parse? And just try rendering both versions in your browser and tell me which one teaches you the most.
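To make the hyperlinking question a bit more concrete, here is a rough sketch of how a client could discover related resources from such a response without knowing any URLs in advance. It is written in Python and assumes a well-formed XHTML document that groups its links in a <ul class="more">, as in my sample. The function name discover_links and the base URL are assumptions for this illustration, and the link targets are the ones I made up for the sample, not necessarily real Twitter endpoints.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Prefix mapping so the XPath expression can refer to XHTML elements.
NS = {"x": "http://www.w3.org/1999/xhtml"}

# Trimmed-down version of the XHTML sample, links only.
RESPONSE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<ul class="more">
<li><a href="http://api.twitter.com/1/user/dougw/statuses">tweets</a></li>
<li><a href="http://api.twitter.com/1/user/dougw/friends">friends</a></li>
<li><a href="http://api.twitter.com/1/user/dougw/followers">followers</a></li>
</ul>
</body>
</html>"""


def discover_links(document, base_url):
    """Map each link label to an absolute URL (hypothetical helper)."""
    root = ET.fromstring(document)
    links = {}
    for anchor in root.iterfind(".//x:ul[@class='more']/x:li/x:a", NS):
        label = (anchor.text or "").strip()
        # urljoin leaves absolute hrefs alone and resolves relative ones.
        links[label] = urljoin(base_url, anchor.get("href"))
    return links


# The base URL is an assumption; it only matters for relative hrefs.
links = discover_links(RESPONSE, "http://api.twitter.com/1/user/dougw")
print(links["friends"])
```

A robot using this approach only needs the entry point of the service; everything else it can reach by following the labelled links, much like a person clicking around in a browser.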

If you’d like to read more on this matter, and especially the ROA approach for RESTful services, I would recommend “RESTful Web Services” by Leonard Richardson and Sam Ruby (O’Reilly). The idea behind the book is “web services are web sites for robots”, which is a really interesting way of looking at it.