Note: the title should make sense by the end of the post…
Kevin Dangoor’s recent announcement of TurboGears has resulted in a dramatic increase of interest in Kid. Kid was first announced on November 30, 2004 as a Pythonic XML-based Templating Language. I remember thinking I was going to do a series of articles on why I wanted this specific combination of features in a library. I never did and Kid progressed into its current form, growing a small community along the way.
Although it’s now used primarily for HTML templating, that wasn’t the initial goal of the project. What I really wanted to do was to illustrate a different way of thinking about “Web Services”.
Not what you were expecting, eh?
But it’s true - Kid was supposed to be a simple device that I would use to start a narrative exploring a variety of topics related to building distributed systems atop the web (i.e. “Web Services”). I had decided that the direction being taken by the industry mainstream was incorrect and that web services would languish until they became more like the existing, working, proven, web.
There was a lot of talk about “Web Services Infrastructure” and framework and that talk continues today. The assumption by nearly everyone was (and still is) that web services would require a whole new set of tools and a whole new paradigm. SOAP/WS was still very much RPC-oriented and so the focus was on language bindings, typing, discovery, and the like. Web Services programming had almost no resemblance to existing web programming.
This has only accelerated in the time since I was heavily involved. Now it seems the industry is calling for some mysteriously unspecified SOA toolkit, ESB, or some other insignificant combination of alphanumerics to save Enterprise IT.
I was in the middle of the situation a year ago or so and it troubled me deeply. Long days and nights at work were spent combining technologies like SOAP, WSDL, WS-Security, and BPML with massive Java infrastructure. I ate it all up and knew it like the back of my hand. It is not impossible for someone dedicating 8-12 hours a day to understanding this stuff to love it. I loved it. It’s all very intriguing for someone with my particular constitution.
But so is masturbation.
I’m not sure what happened. I guess I had what alcoholics refer to as “a moment of clarity”. I distinctly remember going over to grab a co-worker for lunch. He was on the phone and so I was talking to some guy in an adjacent cube who did “services work”. The services people were developers but were considered a different department than product development. They dealt with specific customer needs and were tasked with using our platform to craft actual real solutions for actual real business problems. Their world was much different from that of product development, whose main customers were marketing and the executive team.
Anyway, I noticed a very large stack of printed material on this guy’s desk, entitled “Introduction to Web Services”. Development had collected a set of introductory materials that were to be distributed to all of the services people in the field and, I assume, eventually to external developers using our platform. I had a small part in compiling these materials and had reviewed them at various stages in electronic form.
The hard copy put me on my ass. (I say that metaphorically, but had it hit me with any significant velocity it very well could have literally put me on my ass.) I picked up the tome (an “Introduction”, mind), slapped it on the desk of my co-worker, who was now off the phone, and whispered, “this isn’t going to work.”
The services guy doesn’t love this stuff - at all. I’m sure he was completely capable of digesting all of it had he the time and interest of, say, someone in my position, but he doesn’t. He has a customer that wants a bunch of systems to talk to each other and they all have very large and amazingly different framework and infrastructure. More framework and infrastructure is the last thing he needs. The services guy was a wake-up call.
At the time this was happening I had been nursing a long-time interest in true web architecture. This didn’t have anything to do with work. A very long time ago I threw together a little download tool thingy that was capable of resuming failed downloads (with 14.4 kbps modems this was a big deal and none of the browsers had native support for resuming). This required that I read bits of RFC 2616 and implement a basic HTTP client. I remember being amazed at some of the capabilities of HTTP because I had assumed that it was mostly a simple file transfer protocol (closer to FTP than, say, CORBA). The spec ended up staying with me and I continued to explore different ways people were using the web and HTTP to do new and exciting things.
At some point, I became convinced that existing, basic web architecture solved many of the problems we were seeing with Web Services adoption. The problem wasn’t that WS was incapable of solving the technology issues (it’s quite adequate), the problem was that it was incapable of solving the social issues. It far exceeded the threshold of acceptable complexity for true adoption by this large community of people whose primary goal is to solve business problems.
The web, on the other hand, was invented to solve the same basic integration problems businesses are experiencing now. Tim Berners-Lee’s dilemma is not so different from our own: a bunch of systems with a bunch of closed data formats and processing capabilities that should be universally accessible. Berners-Lee realized early on that solving this problem would require, above all else, a virus: something that was so simple and lightweight that it would be hard not to adopt. And that’s what the web is: a virus. Just like C and UNIX, Windows, Visual Basic, and a slew of other technologies that kept, as a primary requirement, the ability for real people to solve real problems.
I shifted my thinking drastically and tried to imagine how the integration problems we were seeing would be solved using existing web architecture, which we know to have the traits necessary for mass adoption. I should mention here that by “web architecture”, I mean W3C TAG Web Architecture but also all of the tools and techniques that have evolved above and below it. The servers, proxies, template languages, mod_rewrite, sessions, cookies, load balancing, liberal feed parsers, virtual hosts, MVC, MultiViews/content negotiation, monitoring systems, automated testing tools, dynamic languages, view source, and on and on. All of these little tricks and techniques add up to an extremely powerful and yet fairly simple and understandable toolkit for building distributed systems. What’s more, an unmatched number of people understand how to build these systems using all variations of platform and language.
Which brings us to Kid as “Web Services Infrastructure”. The concept is simple: for whatever reason, template languages (PHP, ASP, JSP, CFML, ERB, Cheetah, Tapestry, Velocity, etc.) have become a fundamental tool for web development and their usefulness is in no way limited to presentational data (HTML). Templates are simple, templates are cool. You throw some junk in there and look at the result. If the result isn’t right, you tweak your template until it does look right. There are no layers of magic to get in your way. When something doesn’t come out right, you change the template. There’s no “management” or “container” involved.
There’s no rule that says templates must only be used to generate HTML. Indeed, many of the RSS and Atom feeds in the wild are generated from some form of template. They are never automatically-generated-behind-the-scenes using language bindings and are very rarely generated using some kind of DOM/SAX API.
RSS is the most successful web services data format in existence (after
HTML, of course ;). Successful web services in the future are likely to
resemble it. Is there a use-case for RSS or Atom in WS-* land? There are
thousands of pages of spec text and no one could throw together a simple
use-case for the most successful web service in existence? That’s
The point I’m trying to make is that template based web services are a reality and that we should be thinking about making incremental improvements to the general model to facilitate more machine-centric data formats instead of creating whole new paradigms.
There are a variety of really important issues with using templates for general purpose web services programming, most having to do with the first part of Postel’s Law:
“Be conservative in what you do; be liberal in what you accept from others.”
The problem with using templates to produce XML (including XHTML) is that it is exceedingly hard to be conservative in what you do. Most template engines are text based, making it easy to miss well-formedness errors. There is also a range of character encoding issues that template languages could ease but often simply ignore and sometimes make worse.
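To make the well-formedness problem concrete, here is a minimal sketch using nothing but the Python standard library. It is not Kid and not any particular template engine; `ITEM_TEMPLATE` and `is_well_formed` are names I made up for illustration. The text template happily produces output either way, and only an XML parser notices that one of the results is broken.

```python
from string import Template
import xml.etree.ElementTree as ET

# A text-based template: the engine sees characters, not XML.
# (ITEM_TEMPLATE is a hypothetical stand-in for any text template.)
ITEM_TEMPLATE = Template("<item><title>$title</title></item>")


def is_well_formed(document: str) -> bool:
    """Return True if an XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Both substitutions "work" as far as the template engine is concerned.
tame = ITEM_TEMPLATE.substitute(title="Hello world")
risky = ITEM_TEMPLATE.substitute(title="Tom & Jerry")  # bare ampersand

print(is_well_formed(tame))
print(is_well_formed(risky))
```

The template never complains about the second case; the consumer on the other end of the wire does.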
Kid is a simple attempt at building features that aid in conservativeness into the template engine. I actually considered tag-lining it The Ultra-Conservative Template Engine as a play on Mark Pilgrim’s Ultra-Liberal Feed Parser whose name comes from the second part of Postel’s law. This is, of course, the whole point of being conservative in the first place: so that we don’t need an Ultra-Liberal parser for each variation of “web service”.
I think some of these features are compelling and would like to see them pursued in other tools that are in common use for web development. For instance, one of the most important and least talked about features of XML is that it provides a reasonable system for encoding the entire range of Unicode code points in any character set, including ASCII. A template engine with a basic understanding of XML could process templates authored in UTF-8, interpolate data encoded in ISO-8859-1, and output in 7-bit ASCII, if that’s what Postel demanded. Kid does that.
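The encoding trick is easy to see with any XML-aware serializer. The sketch below uses Python's `xml.etree.ElementTree` rather than Kid, and the sample bytes are made up for illustration: a fragment authored in UTF-8, a value arriving as ISO-8859-1, and output that is pure 7-bit ASCII because characters outside ASCII are written as numeric character references.

```python
import xml.etree.ElementTree as ET

# The "template": an XML fragment authored in UTF-8 (illustrative bytes).
template_source = b"<p>Pr\xc3\xa9cis: </p>"
root = ET.fromstring(template_source)

# Data from some other system, encoded as ISO-8859-1 (illustrative bytes).
latin1_bytes = b"caf\xe9"
root.text += latin1_bytes.decode("iso-8859-1")

# Serialize as US-ASCII: anything outside ASCII becomes a character reference.
ascii_out = ET.tostring(root, encoding="us-ascii")
print(ascii_out)
```

Nothing is lost along the way: any XML parser reading the ASCII output recovers the same code points that went in.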
One of the most important features of XSLT is that well-formed templates
are guaranteed to yield well-formed output (with some well understood
exceptions). If the template runs, you know it will provide a basic
level of conservativeness. Kid does that. We’ve
requests because they would break this contract.
Most template languages require you to explicitly encode data that may contain reserved characters. Kid takes the opposite approach and assumes that all content is textual and should be encoded unless you explicitly state that something is XML (in which case it must be well-formed).
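Here is roughly what "encode by default, opt in to XML" looks like, as a sketch under my own assumptions rather than Kid's actual implementation. `TrustedXML` and `render_value` are hypothetical names; the only real library calls are `xml.sax.saxutils.escape` and the `ElementTree` parser, which is used to reject markup that isn't well-formed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


class TrustedXML:
    """Hypothetical marker for content the caller declares to be XML."""

    def __init__(self, markup: str):
        # Parse inside a throwaway wrapper element so fragments with
        # several top-level nodes are accepted. Malformed markup raises
        # ET.ParseError here, before it can reach the output.
        ET.fromstring(f"<wrapper>{markup}</wrapper>")
        self.markup = markup


def render_value(value) -> str:
    """Escape everything unless it was explicitly marked as XML."""
    if isinstance(value, TrustedXML):
        return value.markup
    return escape(str(value))


print(render_value("AT&T <rocks>"))
print(render_value(TrustedXML("<em>fine</em>")))
```

The point of the design is where the burden falls: forgetting to mark something as XML costs you some visible escaped markup, while forgetting to escape in the usual model costs you a broken document.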
There are also some interesting features around serializing the resulting
XML infoset with different variations. For example, you can author
templates in XHTML 1.0 and output in HTML 4.01. The output
serializer takes care of all the little quirks for things like empty
elements, non-escaping of
STYLE content, boolean
attributes, etc. The result is a clean authoring environment and an
ultra conservative output format. I bring this up in the context of web
services infrastructure only to show that the ability to filter template
output can be very useful when dealing with different types of user
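As a rough illustration of the serializer idea (again using Python's `xml.etree.ElementTree`, not Kid's own serializers), the same parsed tree can be written out with XML rules or with HTML rules. The sample markup is made up, and it skips the XHTML namespace to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Authored as well-formed XML: escaped STYLE content, self-closed BR.
source = (
    "<div>"
    "<style>p &gt; a { color: red }</style>"
    "<p>one<br/>two</p>"
    "</div>"
)
tree = ET.fromstring(source)

# One infoset, two serializations.
as_xml = ET.tostring(tree, encoding="unicode", method="xml")
as_html = ET.tostring(tree, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The HTML method drops the end tag for empty elements like BR and leaves STYLE content unescaped. ElementTree's HTML writer does not minimize boolean attributes, so treat this as a sketch of the approach and not a complete HTML 4.01 serializer.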
All this to say that if you’re looking for “Web Services Infrastructure” for exposing processes and information, you’re probably looking too hard. If you have a database, templating, and a web server, you have most of the infrastructure and framework required to begin exposing information from each of your systems in a proven and established way. What you want to be on the lookout for are small and specific enhancements to these existing pieces that allow you to interact with other machines in a more predictable manner or in new and different ways.
The next time someone is selling you infrastructure for Web Services, or SOA, or ESB, or whatever they’ll call it next, make sure you ask “Why?” After they tell you, make sure you understand, agree, and have the problems they propose to solve. If not, ask again. New framework and infrastructure is extremely expensive in more ways than one: you have to ramp people up on it and then deploy, manage, monitor, and support it. Sometimes new infrastructure is unavoidable but when it overlaps a large portion of your existing infrastructure, you should make sure it’s bringing back a significant return.