lesscode.org


'First they ignore you..' Archives

Many Lives — Just One You

Cat.: First they ignore you.., Web as platform
15. April 2006

You lead many lives. You’re a spouse and a parent. A soccer coach and a scout leader. A volunteer and an activist. You are an employee and an entrepreneur. Why do the systems you use not understand this? It’s like somebody out there just doesn’t care.

If you’re like me, and I know I am, you’ve got about a hundred accounts to keep track of on various systems — systems on the Internet, systems running in various corporate data-centers accessible only via VPN. You’ve got a slew of credentials to manage and no ability to leverage all that information at once. You spend your day duplicating and copying information between these systems, like some sort of freaking acolyte — serving The System — when The System should be serving you.

Meanwhile, a million Web 2.0 startups are blooming, all vying for attention-equals-users-equals-dollars. Many of those startups are trying to tap both the consumer market and the corporate one. It ain’t easy, though, since consumer buzz or even adoption does not always lead to access to fat corporate accounts.

What would it mean to you if you could use your favorite set of Web Apps all day and never have to log in twice regardless of which life you were leading moment-by-moment? How much more effective would you be if for each application, all your information in that application was available at once instead of walled off in separate accounts?

What would it mean to a Web Application provider if it were able to paint that picture for you? Would you be more loyal to that brand? Would you induce your employer, your client, your scout troop to use/purchase that brand?

I think there is a way, and if you’ll lend me a few more minutes of your precious attention, I’ll spin it out.

Imagine

The prevailing identity model in information systems arose around the enterprise system deployment model. In an enterprise system deployment, substantially all of the users of the system are employees of the enterprise. The enterprise itself is not represented at all in the model since each system serves only one enterprise.

This identity model has been carried forward to Internet-based applications. Just about all Internet-based applications offer accounts. An account represents a contract between the service provider and an individual. As such, most Internet-based applications are modeled like one big enterprise. Web Applications that offer corporate contracts offer separate identity systems for each corporate entity. At the end of the day, each of these identity systems (the “consumer”-space ones and the “corporate” ones) is separate.

Now let’s imagine a future where all applications are Internet-based. It’s easy if you try. Further, let’s imagine that the market for some key Web Applications has settled down and consolidated a bit. So instead of a hundred wiki providers, maybe there are three very popular ones. Kind of the way things were five years ago on the desktop with Word/Excel/PowerPoint/Outlook, but this time, Internet-based.

Now let’s take a look at a knowledge worker, “Jessica”, using, say, a wiki service, “WikiGood”. In the morning Jessica spends a few minutes in WikiGood writing out her plans for the day and reviewing progress against yesterday’s. Jessica has consulting engagements with two clients, “BigBoxCorp” and “NowYouSeeIt”, but she’s in luck — both of her clients have standardized on the WikiGood service and use it heavily for day-to-day operations. As the day progresses she will interact with both of these corporate wikis as well.

In the first usage instance, Jessica is working on her own behalf, and wishes to keep the content private. What happens when she decides to do some work for BigBoxCorp? Well she logs out of WikiGood and logs back in with the identity assigned to her by BigBoxCorp. No problem. Other than the fact that she has two sets of credentials for WikiGood, oh snap, hang on — three sets of credentials because she’ll also be doing work for NowYouSeeIt.

There is a proliferation of accounts and credentials, but that’s not the worst of it. The deeper problem is that each WikiGood account is an island — even though all of the accounts are Jessica’s. Her preferences, her usage history, in short everything associated with her in the system, is divided, walled off, in three separate spaces. When Jessica is logged in under her NowYouSeeIt account and issues a search, for instance, she will not find results for any of her personal wiki pages, since the system knows her only (at the moment) as jessica@nowyouseeit. What if WikiGood is using Google AdWords to present Jessica with targeted advertisements? Is she a different Jessica when she’s logged in to her personal account than when she is logged in to her BigBoxCorp account? Google would like to think not.

Relate this scenario (personal account, and two corporate accounts) to other applications you use every day like email, contact list, calendaring, meeting scheduling, blogging, idea management, note taking, project management, personal task list, IM, VOIP, search. As these applications learn more about you, and you invest more time creating information in each application it rapidly becomes unacceptable to switch systems or even to switch accounts on the same system.

People increasingly expect and need the systems they use to learn more about them and to provide enhanced service based on that knowledge. Systems architected in ignorance of the multiplicity of lives people lead force people to waste time repeating themselves (to systems) and copying information (between systems). More generally, users of these systems fail to realize the full potential benefit of the knowledge they have shared. This is not a new problem, but it certainly seems like a more pronounced missed opportunity in the age of Internet-based applications.

The situation is problematic. Users of systems would like to minimize the proliferation of credentials they must carry. They would also benefit greatly from the ability to connect their accounts in a single system in such a way that they have visibility across all their content at once, and in such a way that they wouldn’t have to spend time re-entering profile, preference and other information into each account. Corporations seek these same benefits for their workers because it makes the workers, and hence the corporation, more efficient. On the other hand, corporations still have a desire to limit access to information (e.g. to employees only), and to monitor employees’ activities (shudder), if not for pure evil purposes then for regulatory ones.

But what of the service provider’s interests? WikiGood would like to continue to offer free or cheap service to individuals and premium service to corporations. The logic goes something like “we’ll attract consumers and that will lead to corporate customers”. But from the foregoing it should be clear that WikiGood is missing the opportunity to activate those very consumer-users by failing to provide a vision for Jessica that looks substantially better than the walled-off nightmare. Users have little incentive to induce employers to switch to gmail, for instance, if that would lead to each user having two gmail accounts instead of one. That situation is not appreciably better than having a gmail account and a corporate Exchange account. Contacts are still separate. Searches still fail to find results across accounts. The tags created for one account are completely separate (and return separate results) from the tags created in another account.

Many Lives — Just One You

The key missed opportunity is for WikiGood to acknowledge that users often act as agents for other legal entities. I’m using agency in the legal sense here. First from the American Heritage Dictionary:

One empowered to act for or represent another: an author’s agent; an insurance agent.

And now from the Agency (law) entry in Wikipedia:

The Agent’s primary fiduciary duty is to be loyal to the Principal. This involves duties:

  • not to accept any new obligations that are inconsistent with the duties owed to the Principal. Agents can represent the interests of more than one Principal, conflicting or potentially conflicting, only on the basis of full and timely disclosure or where the different agencies are based on a limited form of authority to prevent a situation where the Agent’s loyalty to any one of the Principals is compromised. For this purpose, express clauses in the agreement signed by each Principal with the Agent may identify specific types or categories of activities that will not breach the duty of loyalty and so long as these exceptions are not unreasonable, they will bind the Principals.
  • not to make a private profit or unjustly enrich himself from the agency relationship.

In return, the Principal must make a full disclosure of all information relevant to the transactions that the Agent is authorized to negotiate and pay the Agent either the commission or fee as agreed, or a reasonable fee if none was agreed.

If WikiGood introduced the concept of agency into its system, it would enable Jessica to have a consistent identity over time as she worked in her own interest, or in the interest of BigBoxCorp or NowYouSeeIt, yet retain full visibility across all her content.

The core concept is that within each application, a person has the ability to act as an “agent of” some identifiable “interest” or “principal” — think “employer” or “customer” or “client”. But the person retains her identity — the system always knows the person’s identity. There is one set of authentication credentials. Knowledge expressed to the system by the person is retained by the system and associated with that person, regardless of which “principal” she represents moment-to-moment.

In this way, principals can have (purchase) rights in the system, and those principals can assign their agents limited rights in the system too. The user switches between agencies as she operates. She expresses this switch through the user interface. If you’ve used a mail client that lets you select from multiple “from” addresses when sending mail you have the basic idea, only you’d make a “working for” selection each time you changed tasks. Perhaps you could “work for” more than one principal at a time — why not?
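If that sounds abstract, a few lines of Python make it concrete. Everything here is a hypothetical illustration (the User/Principal/Session names and the grant list are mine, not any real system’s API):

    # A minimal sketch of one identity carrying several agencies.
    # Hypothetical names throughout; nothing here is a real system's API.

    class Principal:
        """A legal entity a user may act for: an employer, a client,
        or the user herself."""
        def __init__(self, name):
            self.name = name

    class User:
        """One person, one set of credentials, however many principals
        she represents."""
        def __init__(self, username):
            self.username = username
            self.grants = []              # principals this user may act for

    class Session:
        """The system always knows who the person is; only the
        'working for' selection changes."""
        def __init__(self, user):
            self.user = user
            self.acting_for = []          # moment-to-moment agency selection

        def work_for(self, *principals):
            for p in principals:
                if p not in self.user.grants:
                    raise PermissionError("no agency grant for " + p.name)
            self.acting_for = list(principals)

    # Jessica: a single identity, three agencies.
    jessica = User("jessica")
    personal = Principal("jessica (personal)")
    bigbox = Principal("BigBoxCorp")
    nowyouseeit = Principal("NowYouSeeIt")
    jessica.grants = [personal, bigbox, nowyouseeit]

    session = Session(jessica)
    session.work_for(bigbox)                  # client work in the morning
    session.work_for(personal, nowyouseeit)   # later, two hats at once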

Instead of requiring the user to do this switching, perhaps smart systems could infer agency on-the-go based on time of day or content or physical location of the user. These smart systems could prominently display their guess and the user could change it if it was wrong.

In this way, no “closed system” like VPN is required to protect a principal’s interests. The principal contracts with various service providers the terms under which its agents can act. Principals then assign rights to their agents. But principals create no identities — they just assign rights/trust levels to identities. In this way one identity can serve many principals. There may be a notion of the “public domain” principal, so an individual has the ability to do work for the “public domain” or in the public domain.
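Continuing that hypothetical sketch, access control can hang off the principal while visibility hangs off the person, so one search finally spans all of Jessica’s islands:

    # Continuing the sketch above. Content is stamped with both its
    # author (always the real person) and the principal whose interest
    # it was produced in.

    class Page:
        def __init__(self, author, principal, text):
            self.author = author
            self.principal = principal
            self.text = text

    def readable(page, session):
        """A principal's pages are open to its agents, and an author can
        always see her own work product (subject, of course, to whatever
        terms the principal contracted for)."""
        return (page.author is session.user
                or page.principal in session.user.grants)

    def search(pages, session, term):
        """One query spans everything the person may see -- no islands."""
        return [p for p in pages if readable(p, session) and term in p.text]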

What it Would Mean

The Agency-Aware Identity Model gives rise to a new business model where consumer accounts and consumer loyalty can be parlayed into corporate accounts. Each consumer-customer of a Web Service will now be highly motivated to induce each of the principals on whose behalf she works to also become (corporate) customers of that same Web Service. Corporations (principals) would retain the important security and control capabilities they have today with their enterprise identity model: a) the right to assign and revoke privileges/trust for individuals (agents), b) the right to protect the privacy of information produced by their agents, and c) the right to eavesdrop on and monitor the activities of their agents for, e.g., regulatory compliance, sexual harassment violations, EEOC requirements, etc. An individual, by agreeing to operate on behalf of a principal (for a period of time, or for a particular task), is potentially forfeiting some privacy or future access rights to her work product.

So what do you think? What would it be like to use WikiGood or your-favorite-web-app-here and never have to log out, regardless of which client/principal you were working for moment by moment? How much more effective would you be if, for each application, all your information in that application category was available at once? Would you be able to remember to click the drop-down list of agencies as you switched between tasks during the day? Would you want that switching integrated with your web SSO system, like OpenID? Do you believe that by adopting an Agency-Aware Identity Model, WikiGood could effectively turn its non-corporate customers into a rabid enterprise sales force? I do.

Half a Baby Step

Cat.: First they ignore you.., AJAX, microformats, Web as platform
02. November 2005

David Janes and friends over at microformats.org have written an inspiring greasemonkey script that will find microformat structures on web pages. They include some links to pages with hCard and xFolk content.

When content in a supported microformat is encountered on a page, the script embeds a visual indicator (one per hCard, for instance) which, when clicked, produces a context menu. The functions on the context menu depend upon the microformat. The menu for an hCard includes an “add to address book” item that will invoke Microsoft Outlook to create a contact.

Perhaps the most interesting thing, though, is the menu item to map the vCard address with Google Maps. How many times per week do you grovel around the web for some physical coordinates, only to select-copy-paste the first line of the address into the browser search box, then select-copy-paste the second line, in order to generate a map page? The “show on Google Maps” item does it all in one swell foop, taking you directly to the map!
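Under the hood it’s simple enough. Here’s a rough Python sketch of what a menu item like that presumably does (the adr field names come from hCard; the URL is an ordinary Google Maps query):

    # A rough sketch of what "show on Google Maps" presumably does:
    # flatten the card's adr fields into a single Google Maps query.
    from urllib.parse import urlencode

    def maps_url(adr):
        parts = (adr.get(k, "") for k in
                 ("street-address", "locality", "region", "postal-code"))
        query = ", ".join(p for p in parts if p)
        return "https://maps.google.com/maps?" + urlencode({"q": query})

    print(maps_url({"street-address": "1600 Amphitheatre Pkwy",
                    "locality": "Mountain View", "region": "CA"}))
    # https://maps.google.com/maps?q=1600+Amphitheatre+Pkwy%2C+Mountain+View%2C+CA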

dschud’s COinS-PMH shares with microformats the ability to embed sufficient metadata and structure into a static page to support extended processing without any need to re-invoke the web application that produced the content. But what does “extended processing” mean in this context? It means doing something with that content that the originator could not foresee. In the case of the microformats example it means a greasemonkey script. In the case of COinS-PMH the primary example is a “hacked ‘COinS Details’ sidebar”.

So repurposing content, conveniently, in ways unforeseen by its originator is what it’s all about. The greasemonkey script and the sidebar provide a glimpse of what’s in store. How much unforeseen functionality do the browser add-ons really provide, though? In terms of scalability of functionality, should we expect the plug-ins to keep growing to encompass more functionality? For instance, should the hCard script grow to support mash-ups with more web applications beyond Google Maps? How about adding a function to send the hCard holder a gift from Amazon?

The beauty of the hCard-Google-Map-mashup-script is that it didn’t require Google’s web app to understand hCard. The script-mashup did the translation. But that translation is at the heart of the scalability shortcoming. If the script has to contain mappings from each microformat to each way I want to use content conformant to that microformat, then the opportunity for explosive functionality growth is missed. If, on the other hand, the target web application understood the microformat in question, and we had a way to feed the target app conformant content, then the scalability bottleneck would be broken. The wheels would be greased.

In a previous post I proposed the notion of a web app clipboard. The basic idea was that AJAX could be used to request content from a source application. The content would land on the operating system clipboard in a simple format (the web app clipboard format) and could then be transmitted to a destination application. Now I’m thinking that the original proposal was overbold. As Ryan Tomayko intimated, there ought to be a way to marry the microformats approach with the web app clipboard. Perhaps it’s profitable to think initially in terms of the source content being scraped directly off the page, without requiring the additional request to the sourcing web app. We’ve already got that functionality in hand, and with it we could focus on the other side of the equation — where to send the content. The effect would be that we would no longer have to create both producing and consuming applications — we could start with just consuming ones. Anything that increases the likelihood that two web apps will communicate is a Good Thing. I’m with Phil Bogle and others who’ve pointed out that we need to bootstrap by finding those first two applications.

If a menu item called “copy” were added to the hCard context menu of the greasemonkey script, and that item caused the hCard to land on the operating system clipboard in a simple standard format, then it would just be a question of where the content could be pasted. Our task is then reduced to inducing a few web apps to support the paste action. A really simple first approach to paste support doesn’t even require XMLHttpRequest at all. A receiving web app could simply provide a text box wired to understand the web app clipboard format wrapping an hCard, for instance. Paste into a text box works “out of the box” with all browsers. It’s not all that elegant, but it would work.
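To show just how little machinery the receiving side needs, here’s a toy WSGI sketch of that text box in Python. The envelope format it would parse is hypothetical (the web app clipboard was only ever a proposal), so the handler just acknowledges the paste:

    # A toy "paste target": a text box that accepts a pasted payload.
    # The clipboard envelope a real app would parse is hypothetical;
    # this only shows how small the moving parts are.
    from urllib.parse import parse_qs
    from wsgiref.simple_server import make_server

    FORM = (b'<form method="POST"><textarea name="clipboard"></textarea>'
            b'<input type="submit" value="Paste"/></form>')

    def app(environ, start_response):
        if environ["REQUEST_METHOD"] == "POST":
            size = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(size).decode("utf-8")
            payload = parse_qs(body).get("clipboard", [""])[0]
            # A real app would dispatch on the declared microformat and
            # parse the hCard; here we just acknowledge receipt.
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [("got %d bytes of clipboard data" % len(payload)).encode()]
        start_response("200 OK", [("Content-Type", "text/html")])
        return [FORM]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()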

Giving microformats a place to go (destination applications) and a simple way to get there (the web app clipboard) will break a key bottleneck hindering broad interoperability.

The Zen of Microformats

Cat.: First they ignore you.., microformats
28. October 2005

For some time now, I’ve wanted to increase my understanding of microformats. If you’re unfamiliar with the term or want to understand the basic purpose of this technology better, I suggest reading Phil Windley’s Microformats: Paving the Cowpaths. I read it some time ago and was intrigued but had very little time to do further research.

I have had a chance to dive in a bit more over the past few weeks and am excited at what I’ve found. I’ve trawled the mailing list archives, spent some time on the wiki, and read what people are saying on the blogs. I have yet to spend a lot of time in the guts of the individual specifications (e.g., hCard, XOXO, hCalendar, etc.) because, frankly, the nitty-gritty is a very small portion of what’s really grabbing my interest here.

There seems to be a bit of confusion around what “microformats” actually are and I think I know why. From what I’m seeing, the term “microformats” has two separate meanings - one is obvious and one comes after interacting with the community a bit.

  1. “Microformats” are a set of specifications describing interoperable machine formats that work on the web.

  2. “Microformats” is a process for arriving at interoperable machine formats that work on the web.

In general, when someone says “microformats”, they are most likely talking about the specifications. What I’ve found after lurking on the mailing list and watching the community is that when someone very close to the project says “microformats”, they are more often talking about the process that is evolving there. This is much harder to explain but it’s definitely worth trying because the core values, the Zen, these guys are carving out have very strong parallels to those of the less code movement, I think.

Luckily, I don’t think I’ll have to do much explaining myself because there are some gems from the microformats-discuss mailing list that I think go a long way in describing what’s going on there:

Mark Pilgrim in RE: Educating Others:

The regulars on this list grok this intuitively, or at least have accepted it for so long that they’ve forgotten how to articulate it. We simply don’t care about the general case.

Some people (Scott, Lisa, others) look at this and say “what about this edge case?” or “how do you combine them?” or “I need something with rigid structure” or “how do you validate them” or whatever. And these are all obvious questions that form interesting permathreads on mailing lists around the world. And we just don’t care. Not because we’re lazy or sloppy or naive — in fact, just the opposite. Our apathy towards the edge case is born out of bitter experience. We all bear the scars of drawn-out battles over edge cases that satisfied someone’s sense of “completeness” or “aesthetics” or “perfection”, but ultimately made the common cases harder and solved no real problem.

Ryan said microformats are all about 80/20. He’s right, but unless you’ve share [sic] our common experience, he may as well be speaking in Zen koans. Most standards go like this:

  1. Solve 80% of the problem in a matter of weeks.
  2. Spend two years arguing about the last 20%. (cough Atom cough)
  3. Implement the 80% in a matter of weeks. Wonder why everything is so hard.
  4. Spend months implementing the last 20%. Realize why the first 80% was so hard. Curse a lot.
  5. Discover that the last 20% wasn’t really worth all the time spent arguing about it or implementing it.

Microformats, on the other hand, go like this:

  1. Solve the 80% in a matter of weeks.
  2. Promise (wink wink) to tackle the 20% “later”.
  3. Implement the 80% in a matter of days.
  4. Watch people use it for a few months. Maybe tweak a little here and there.
  5. Discover that the last 20% wasn’t really necessary after all. Breathe a sigh of relief that you never bothered. Move on to the next problem.

The regulars on this list have all been through the full standards cycle many times. We know about edge cases, we know about validators, we know about standards. We know. We’ve been there. We’ve all decided that this way is better. Not because it’s easier or faster or sloppier, but because it leads to a better result. Really. The fact that it happens to be easier and faster is just a karmic coincidence.

Mark’s description of the mainstream standardization process vs. that of microformats could easily be used to describe the difference in technique employed by the mainstream software industry vs. that of the less code crowd.

Ryan King in RE: Educating Others:

… we’ve proven that microformats (at least, the ones developed so far) work in practice, we just need to show that they work in theory.

The arguments in this thread are theoretical–in theory there’s no difference between theory and practice, but in practice there is.

Luke Arno in Evolution vs. Intelligent Design:

It’s evolution not “intelligent design.”

Tantek Çelik in Microformats and the Semantic Web:

Microformats essentially ask:

Can we do more (practical use and applications) with less (logical formalism, formats, namespaces, etc.)?

Tantek Çelik in Microformats and the Semantic Web, and this one’s a gem:

I hardly ever look at PDF docs.

Without even looking at the actual technical specifications, I think these quotes say a lot about microformats’ potential.

To me, what’s exciting about microformats is the same thing that’s exciting about dynamic languages, REST, F/OSS, and other seemingly unconnected technologies and concepts: they are all evolving under the fundamental principle that you cannot adequately plan/design large systems upfront and expect them to be successful. That we don’t know anything until we’ve observed it. Respect for this simple truth leads to dramatic changes in how one builds and evaluates technology and you can see this happening to great effect (and with equally great results) in each of these communities.


Web Services Infrastructure: Kid Templating

Cat.: First they ignore you..
24. September 2005

Note: the title should make sense by the end of the post…

Kevin Dangoor’s recent announcement of TurboGears has resulted in a dramatic increase in interest in Kid. Kid was first announced on November 30, 2004 as a Pythonic XML-based Templating Language. I remember thinking I was going to do a series of articles on why I wanted this specific combination of features in a library. I never did, and Kid progressed into its current form, growing a small community along the way.

Although it’s now used primarily for HTML templating, that wasn’t the initial goal of the project. What I really wanted to do was to illustrate a different way of thinking about “Web Services”.

Not what you were expecting, eh?

But it’s true - Kid was supposed to be a simple device that I would use to start a narrative exploring a variety of topics related to building distributed systems atop the web (i.e. “Web Services”). I had decided that the direction being taken by the industry mainstream was incorrect and that web services would languish until they became more like the existing, working, proven, web.

There was a lot of talk about “Web Services Infrastructure” and frameworks, and that talk continues today. The assumption by nearly everyone was (and still is) that web services would require a whole new set of tools and paradigms. SOAP/WS was still very much RPC-oriented and so the focus was on language bindings, typing, discovery, and the like. Web Services programming had almost no resemblance to existing web programming.

This has only accelerated in the time since I was heavily involved. Now it seems the industry is calling for some mysteriously unspecified SOA toolkit, ESB, or some other insignificant combination of alphanumerics to save Enterprise IT.

I was in the middle of the situation a year ago or so and it troubled me deeply. Long days and nights at work were spent combining technologies like SOAP, WSDL, WS-Security, and BPML with massive Java infrastructure. I ate it all up and knew it like the back of my hand. It is not impossible for someone dedicating 8-12 hours a day to understanding this stuff to love it. I loved it. It’s all very intriguing for someone with my particular constitution.

But so is masturbation.

I’m not sure what happened. I guess I had what alcoholics refer to as “a moment of clarity”. I distinctly remember going over to grab a co-worker for lunch. He was on the phone and so I was talking to some guy in an adjacent cube who did “services work”. The services people were developers but were considered a different department than product development. They dealt with specific customer needs and were tasked with using our platform to craft actual real solutions for actual real business problems. Their world was much different from that of product development, whose main customers were marketing and the executive team.

Anyway, I noticed a very large stack of printed material on this guy’s desk, entitled “Introduction to Web Services”. Development had collected a set of introductory materials that were to be distributed to all of the services people in the field and, I assume, eventually to external developers using our platform. I had a small part in compiling these materials and had reviewed them at various stages in electronic form.

The hard copy put me on my ass. (I say that metaphorically, but if it were to hit me with any significant velocity it very well could have literally put me on my ass.) I picked up the tome (an “Introduction”, mind), slapped it on my co-worker’s desk, who was now off the phone, and whispered, “this isn’t going to work.”

The services guy doesn’t love this stuff - at all. I’m sure he was completely capable of digesting all of it had he the time and interest of, say, someone in my position, but he doesn’t. He has a customer that wants a bunch of systems to talk to each other and they all have very large and amazingly different framework and infrastructure. More framework and infrastructure is the last thing he needs. The services guy was a wake-up call.

At the time this was happening I had been nursing a long-time interest in true web architecture. This didn’t have anything to do with work - a very long time ago I threw together a little download tool thingy that was capable of resuming failed downloads (with 14.4k modems this was a big deal and none of the browsers had native support for resuming). This required that I read bits of RFC 2616 and implement a basic HTTP client. I remember being amazed at some of the capabilities of HTTP because I had assumed that it was mostly a simple file transfer protocol (closer to FTP than, say, CORBA). The spec ended up staying with me and I continued to explore different ways people were using the web and HTTP to do new and exciting things.

At some point, I became convinced that existing, basic web architecture solved many of the problems we were seeing with Web Services adoption. The problem wasn’t that WS was incapable of solving the technology issues (it’s quite adequate), the problem was that it was incapable of solving the social issues. It far exceeded the threshold of acceptable complexity for true adoption by this large community of people whose primary goal is to solve business problems.

The web, on the other hand, was invented to solve the same basic integration problems businesses are experiencing now. Tim Berners-Lee’s dilemma is not so different than our own: a bunch of systems with a bunch of closed data formats and processing capabilities that should be universally accessible. Berners-Lee realized early on that solving this problem would require, above all else, a virus: something that was so simple and lightweight that it would be hard not to adopt. And that’s what the web is: a virus. Just like C and UNIX, Windows, Visual Basic, and a slew of other technologies that kept, as a primary requirement, the ability for real people to solve real problems.

I shifted my thinking drastically and tried to imagine how the integration problems we were seeing would be solved using existing web architecture, which we know to have the traits necessary for mass adoption. I should mention here that by “web architecture”, I mean W3C Tag Web Architecture but also all of the tools and techniques that have evolved above and below it. The servers, proxies, template languages, mod_rewrite, sessions, cookies, load balancing, liberal feed parsers, virtual hosts, MVC, MultiViews/content negotiation, monitoring systems, automated testing tools, dynamic languages, view source, and on and on. All of these little tricks and techniques add up to an extremely powerful and yet fairly simple and understandable toolkit for building distributed systems. What’s more is that there are an unmatched number of people who understand how to build these systems using all variations of platform and language.

Which brings us to Kid as “Web Services Infrastructure”. The concept is simple: for whatever reason, template languages (PHP, ASP, JSP, CFML, ERB, Cheetah, Tapestry, Velocity, etc.) have become a fundamental tool for web development and their usefulness is in no way limited to presentational data (HTML). Templates are simple, templates are cool. You throw some junk in there and look at the result. If the result isn’t right, you tweak your template until it does look right. There’s no layers of magic to get in your way. When something doesn’t come out right, you change the template. There’s no “management” or “container” involved.

There’s no rule that says templates must only be used to generate HTML. Indeed, many of the RSS and Atom feeds in the wild are generated from some form of template. They are never automatically-generated-behind-the-scenes using language bindings and are very rarely generated using some kind of DOM/SAX API.
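For instance, here’s roughly what a feed template looks like in Kid: plain well-formed XML plus a few py: attributes (the driver code below is from memory of Kid’s docs, so treat the API details as approximate):

    # An RSS feed as a Kid template: ordinary well-formed XML plus py:
    # attributes. (API details are from memory of Kid's docs of the era;
    # treat them as approximate.)
    import kid

    TEMPLATE = """\
    <rss version="2.0" xmlns:py="http://purl.org/kid/ns#">
      <channel>
        <title py:content="feed_title"/>
        <item py:for="entry in entries">
          <title py:content="entry['title']"/>
          <link py:content="entry['link']"/>
        </item>
      </channel>
    </rss>
    """

    entries = [{"title": "Hello, world", "link": "http://example.org/1"}]
    t = kid.Template(source=TEMPLATE, feed_title="lesscode", entries=entries)
    print(t.serialize(output="xml"))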

<rant> RSS is the most successful web services data format in existence (after HTML, of course ;). Successful web services in the future are likely to resemble it. Is there a use-case for RSS or Atom in WS-* land? There are thousands of pages of spec text and no one could throw together a simple use-case for the most successful web service in existence? That’s irresponsible. </rant>

The point I’m trying to make is that template based web services are a reality and that we should be thinking about making incremental improvements to the general model to facilitate more machine-centric data formats instead of creating whole new paradigms.

There are a variety of really important issues with using templates for general purpose web services programming, most having to do with the first part of Postel’s Law:

“Be conservative in what you do; be liberal in what you accept from others.”

The problem with using templates to produce XML (including XHTML) is that it is exceedingly hard to be conservative in what you do. Most template engines are text based, making it easy to miss well-formedness errors. There are also a range of character encoding issues that template languages could ease but often simply ignore and sometimes make worse.

Kid is a simple attempt at building features that aid in conservativeness into the template engine. I actually considered tag-lining it The Ultra-Conservative Template Engine as a play on Mark Pilgrim’s Ultra-Liberal Feed Parser whose name comes from the second part of Postel’s law. This is, of course, the whole point of being conservative in the first place: so that we don’t need an Ultra-Liberal parser for each variation of “web service”.

I think some of these features are compelling and would like to see them pursued in other tools that are in common use for web development. For instance, one of the most important and least talked about features of XML is that it provides a reasonable system for encoding the entire range of unicode code points in any character set, including ASCII. A template engine with a basic understanding of XML could process templates authored in utf-8, interpolate data encoded in ISO-8859-1, and output in 7-bit ASCII — if that’s what Postel demanded. Kid does that.
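Plain Python can show the underlying trick: XML character references can spell any Unicode code point in 7-bit ASCII, and Kid applies the same move during serialization.

    # The underlying trick: XML character references can spell any
    # Unicode code point in 7-bit ASCII. Plain Python shows the idea;
    # Kid applies it automatically during serialization.
    latin1 = b"caf\xe9"                               # 'café' in ISO-8859-1
    text = latin1.decode("iso-8859-1")                # to Unicode...
    print(text.encode("ascii", "xmlcharrefreplace"))  # b'caf&#233;'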

One of the most important features of XSLT is that well-formed templates are guaranteed to yield well-formed output (with some well understood exceptions). If the template runs, you know it will provide a basic level of conservativeness. Kid does that. We’ve wontfix‘d feature requests because they would break this contract.

Most template languages require you to explicitly encode data that may contain reserved characters. Kid takes the opposite approach and assumes that all content is textual and should be encoded unless you explicitly state that something is XML (in which case it must be well-formed).
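A toy illustration of that escape-by-default rule (this is the shape of the policy, not Kid’s actual API):

    # A toy illustration of the escape-by-default policy; not Kid's
    # actual API, just the shape of the rule.
    from xml.sax.saxutils import escape

    class XML(str):
        """Marker type: 'this string is well-formed XML, pass it through'."""

    def interpolate(value):
        return value if isinstance(value, XML) else escape(str(value))

    print(interpolate("AT&T <rocks>"))           # AT&amp;T &lt;rocks&gt;
    print(interpolate(XML("<em>trusted</em>")))  # <em>trusted</em>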

There are also some interesting features around serializing the resulting XML infoset in different variations. For example, you can author templates in XHTML 1.0 and output HTML 4.01. The output serializer takes care of all the little quirks for things like empty elements, non-escaping of SCRIPT and STYLE content, boolean attributes, etc. The result is a clean authoring environment and an ultra-conservative output format. I bring this up in the context of web services infrastructure only to show that the ability to filter template output can be very useful when dealing with different types of user agents.

All this to say that if you’re looking for “Web Services Infrastructure” for exposing processes and information, you’re probably looking too hard. If you have a database, templating, and a web server, you have most of the infrastructure and framework required to begin exposing information from each of your systems in a proven and established way. What you want to be on the lookout for are small and specific enhancements to these existing pieces that allow you to interact with other machines in a more predictable manner or in new and different ways.

The next time someone is selling you infrastructure for Web Services, or SOA, or ESB, or whatever they’ll call it next, make sure you ask “Why?” After they tell you, make sure you understand, agree, and actually have the problems they propose to solve. If not, ask again. New framework and infrastructure is extremely expensive in more ways than one: you have to ramp people up on it and then deploy, manage, monitor, and support it. Sometimes new infrastructure is unavoidable, but when it overlaps a large portion of your existing infrastructure, you should make sure it’s bringing back a significant return.

lesscode … more docs?

Cat.: Ruby, PHP, First they ignore you.., Rails
20. September 2005

I’ll take it as a given that if you’re reading this then you agree that, for the sake of sanity and productivity, it’s time coders gave up on roll-your-own, and moved over to modern frameworks where one can concentrate on business logic rather than request parsing (and get all those AJAX goodies for free ;-) ).

I’ve been looking on with interest for the last year and a bit, and as I’ve watched the pioneers blaze their XP, RoR, lesscode trail across the web-firmament, I’ve begun to suspect that I must have missed something. Yes, it’s powerful stuff, and yes it isn’t all smoke and mirrors - there really is “gold in them thar hills…” - but, and for me it’s a big but, we seem to be missing the big picture. Where are the map makers? Where’s the documentation for the second (or third) wave?

Self-documenting code is all very well, and having a common vocabulary of design patterns helps when discussing solutions to individual problems. But what second-wavers really need (and I include myself here - no, actually, put me down as a third-waver) are more pictures. More exposition. Road maps.

Is there a way to add XD (eXtreme Documentation?) back into the XP mix? Writing elegant code is hard, and people who do it earn the admiration they receive. But I would argue that writing good documentation is harder, and that it shouldn’t be left for the second-wavers to do.

People who’ve moved to XP have already gone through the pain barrier of

  • write the tests
  • then write the code
  • (then refactor)

and have proved that, in the long run, it means better code and less debugging, in less time. But having proved that that works, might there be some benefit in switching to:

  • write the spec
  • then write the tests
  • then write the code
  • (then refactor)
  • then write an overview!!!

Might this result in (my h-nought) more easily modified code, quicker adoption by other coders, and greater community support?

I’m genuinely interested in other people’s views on moving documentation down(?) the food-chain, so that it’s non-optional, and as integral to writing new code (and frameworks) as writing good tests. Yes, there are good auto-doc tools and methodologies out there, but right now they still seem to be seen as secondary to the process by Joe Frontiersman, and they only deal with what’s in the file, not what’s in the coder’s/architect’s head. (There are the nine diagrams of UML, yes, but who on the bleeding-wrist of open-source technology is actively using/sharing designs via UML?)

Let me know if I’ve missed a trick somewhere.

[A few thoughts for the pot: I believe that the reason the Open Source model works is that it’s based on non-coercive collaboration. But SourceForge is littered with unfinished, half-baked projects because someone didn’t think to check whether there was already a project out there that they could use. (How many PHPUnits does the community really need?) Should there be a ‘requirement’ for documentation before a project gets listed? Perhaps it’s time for ‘ideaforge’ or ‘architectureforge’?]