2005 October

Archive for October, 2005

The Zen of Microformats 13

For some time now, I’ve wanted to increase my understanding of microformats. If you’re unfamilar with the term or want to understand the basic purpose of this technology better, I suggest reading Phil Windley’s Microformats: Paving the Cowpaths. I read it some time ago and was intrigued but had very little time to do further research.

I have had a chance to dive in a bit more over the past few weeks and am excited at what I’ve found. I’ve trawled the mailing list archives, spent some time on the wiki, and read what people are saying on the blogs. I have yet to spend a lot of time in the guts of the individual specifications (e.g., hCard, XOXO, hCalendar, etc.) because, frankly, the nitty-gritty is a very small potion of what’s really grabbing my interest here.

There seems to be a bit of confussion around what “microformats” actually are and I think I know why. From what I’m seeing, the term “microformats” has two separate meanings - one is obvious and one comes after interacting with the community a bit.

“Microformats” are a set of specifications describing interoperable machine formats that work on the web.
“Microformats” is a process for arriving at interoperable machine formats that work on the web.

In general, when someone says “microformats”, they are most likely talking about the specifications. What I’ve found after lurking on the mailing list and watching the community is that when someone very close to the project says “microformats”, they are more often talking about the process that is evolving there. This is much harder to explain but it’s definitely worth trying because the core values, the Zen, these guys are carving out have very strong parallels to those of the less code movement, I think.

Luckily, I don’t think I’ll have to do much explaining myself because there are some gems from the microformats-discuss mailing list that I think go a long way in describing what’s going on there:

Mark Pilgrim in RE: Educating Others:

The regulars on this list grok this intuitively, or at least have accepted it for so long that they’ve forgotten how to articulate it. We simply don’t care about the general case.

Some people (Scott, Lisa, others) look at this and say “what about this edge case?” or “how do you combine them?” or “I need something with rigid structure” or “how do you validate them” or whatever. And these are all obvious questions that form interesting permathreads on mailing lists around the world. And we just don’t care. Not because we’re lazy or sloppy or naive — in fact, just the opposite. Our apathy towards the edge case is born out of bitter experience. We all bear the scars of drawn-out battles over edge cases that satisfied someone’s sense of “completeness” or “aesthetics” or “perfection”, but ultimately made the common cases harder and solved no real problem.

Ryan said microformats are all about 80/20. He’s right, but unless you’ve share [sic] our common experience, he may as well be speaking in Zen koans. Most standards go like this:

Solve 80% of the problem in a matter of weeks.

Spend two years arguing about the last 20%. (cough Atom cough)

Implement the 80% in a matter of weeks. Wonder why everything is so hard.

Spend months implementing the last 20%. Realize why the first 80% was so hard. Curse a lot.

Discover that the last 20% wasn’t really worth all the time spent arguing about it or implementing it.

Microformats, on the other hand, go like this:

Solve the 80% in a matter of weeks.

Promise (wink wink) to tackle the 20% “later”.

Implement the 80% in a matter of days.

Watch people use it for a few months. Maybe tweak a little here and there.

Discover that the last 20% wasn’t really necessary after all. Breathe a sigh of relief that you never bothered. Move on to the next problem.

The regulars on this list have all been through the full standards cycle many times. We know about edge cases, we know about validators, we know about standards. We know. We’ve been there. We’ve all decided that this way is better. Not because it’s easier or faster or sloppier, but because it leads to a better result. Really. The fact that it happens to be easier and faster is just a karmic coincidence.

Mark’s description of the mainstream standardization process vs. that of microformats could easily be used to describe the difference in technique employed by the mainstream software industry vs. that of the less code crowd.

Ryan King in RE: Educating Others:

… we’ve proven that microformats (at least, the ones developed so far) work in practice, we just need to show that they work in theory.

The arguments in this thread are theoretical–in theory there’s no difference between theory and practice, but in practice there is.

Luke Arno in Evolution vs. Intelligent Design:

It’s evolution not “intelligent design.”

Tantek Çelik in Microformats and the Semantic Web:

Microformats essentially ask:

Can we do more (practical use and applications) with less (logical formalism, formats, namespaces, etc.)?

Tantek Çelik in Microformats and the Semantic Web , and this one’s a gem:

I hardly ever look at PDF docs.

Without even looking at the actual technical specifications, I think these quotes say a lot about microformats’ potential.

To me, what’s exciting about microformats is the same thing that’s exciting about dynamic languages, REST, F/OSS, and other seemingly unconnected technologies and concepts: they are all evolving under the fundamental principle that you cannot adequately plan/design large systems upfront and expect them to be successful. That we don’t know anything until we’ve observed it. Respect for this simple truth leads to dramatic changes in how one builds and evaluates technology and you can see this happening to great effect (and with equally great results) in each of these communities.

More…

Simplest Possible Plugin Manager For Rails 6

Cat.: Ruby, Rails
27. October 2005

UPDATE: I ended up making some pretty massive changes. You can configure multiple plugin repositories, install, update, remove, and discover plugins. The directions for installation are still valid but you’ll need to run plugin --help to get a feel for the changes in usage.

UPDATE: The plugin manager has been included with Rails 1.0 RC4. Run script/plugin --help from a fresh Rails app for usage information.

Rails 1.0 RC1 shipped with a simple plugin system - drop a directory under vendor/plugins that contains an init.rb file to be executed at configuration time and an optional lib directory to be placed on the path. Do whatever you please from there. It’s a simple hook into the startup cycle and a much needed addition.

About 19 hours ago, David suggested that people link to their plugins from the Rails Wiki as a kind of interim solution to the problem of not having a standard means of packaging and managing these things. They did and with links to their plugins’ subversion repositories.

Here’s a simple (150 line) plugin manager.

Install it like this:

$ cd my-rails-app
$ curl http://lesscode.org/svn/rtomayko/rails/scripts/plugin > script/plugin
$ chmod +x script/plugin

Then see what plugins are available:

$ ./script/plugin
continuous_builder  http://dev.rubyonrails.com/svn/rails/plugins/continuous_builder
asset_timestamping  http://svn.aviditybytes.com/rails/plugins/asset_timestamping
enumerations_mixin  http://svn.protocool.com/rails/plugins/enumerations_mixin/trunk
calculations        http://techno-weenie.net/svn/projects/calculations/
...

Next, install stuff to your vendor/plugins directory:

$ ./script/plugin continuous_builder asset_timestamping

Here’s how it works:

Scrape the Plugin page for things that look like subversion repositories with plugins. (Yes, I’m using regular expressions. Yes, I understand the issues. No, I don’t care.)
If vendor/plugins is under subversion control, the script will modify the svn:externals property on that directory and perform an update. You can use normal subversion commands to keep the plugins up to date.
Or, if vendor/plugins is not under subversion control, the plugin is pulled via svn export.

If you want to use svn:externals, make sure you have your vendor/plugins directory under subversion’s control before installing any plugins . If your not sure, do something like this:

$ svn info vendor/plugins
foo:    (Not a versioned resource)
$ svn mkdir vendor/plugins
$ svn ci -m "adding teh plugins directory so I can use this r0x3ring plugin manager..."

This probably won’t work on Windows at the moment and assumes you have the command line subversion client utilities available (svn).

It’s useful as is, but please, make it better.

Verbal Communication 9

Cat.: Rails, Theory
26. October 2005

I’ve started writing a blurb for lesscode.org on some of the fundamental axioms of web information processing, but my pen took me down the rambling path and I’ve ended up with a longish article on my hands. So instead of clogging the lesscode.org’s bandwidth, I’ve posted it on my blog.

If you’re interested in examining certain long-standing challenges related to the web computing, and how Ruby and Rails approach the solution, you may find some meat in there.

P.S. I’ll be blunt and admit right away that I’m slamming the role of RDBMS in the web architecture, so I’m not really expecting that most people will agree with my analysis. Oh well, c’est la vie…

WASP - Easing the Switch from Java to PHP 28

Cat.: PHP, LAMP
23. October 2005

Author’s Note: I originally wrote this article on the WASP homepage. Ryan has graciously allowed me to post it here. WASP was partly inspired by lesscode.org, and hopefully it’ll make a good contribution to this community.

The year 2005, so far, has been the year of scripting languages. Across the web-application programming sector there has been a growing movement toward acceptance and general usage of dynamic languages like Ruby, Python, and PHP. Fundamentally, these languages have been present in the industry and in use by developers for a long time, and really aren’t anything new. Lately, however, due to advances in server technology, scripting language maturity, and improved development libraries, it is possible to write scalable, well architected, “enterprise” applications in less time with less code using frameworks like WASP for PHP.

Scripting languages have been used to build millions of applications on the Web, but in general have not been adopted widely by corporate developers. But more and more businesses and IT professionals are looking to these languages as a way to simplify and speed the creation of custom in-house programs, thus avoiding the now all-too-common logjam of late or overbudget applications. — CNET

It has always been faster to write applications for the web using scripting languages. PHP has long been accessible to the fledgling developer. It has been widely used for prototyping of large applications written in languages like Java simply because web designers, most often specialized in design artistry rather than computer science theory, are able to quickly grasp the syntax and embed it in their HTML code.

Interestingly enough, the same reasons why languages like PHP are so easy to learn and use are what often keeps seasoned software engineers from wanting to use them. Deemed “hacker” languages, scripting languages can be quick to write, but since they do not have many of the advanced features of compiled languages like C++ and Java, they have been prone to lax design practices, leading to code that isn’t efficient, stable, or maintainable enough for large solutions. With the correct mindset and help from structured frameworks like WASP, this no longer has to be the case.

Making use of the advancements made to PHP in version 5, web application architects can implement structure to their code in the form of tested design patterns and full-featured frameworks, like WASP. In fact, WASP was written to make the most pedantic software architect feel at home, in an effort to ease the transition for Java developers to coding in PHP.

It’s important to resist the gut reaction most people have to these statements. Most people’s perceptions of PHP are from the PHP 4 days, where “object oriented” frameworks existed, but were crippled by the loose OO implementation of the language. While PHP 5 is mostly backward compatible with PHP 4, it is almost completely different when it comes to things like abstract classes, interfaces, private and protected methods, and exceptions. Sure, you can write spagetti code in PHP 5, but if you have a well designed framework that keeps PHP code outside of your HTML and in tightly structured classes, you’re more likely to end up with code that looks and works and feels like Java.

But will using PHP confine application developers to small customers and fringe, open source communities? Not for long. The big guys are starting to catch on to this shift.

PHP, like open-source projects including Linux and Apache, now has received the blessing of major powers in the computing industry. IBM and Oracle are working on software that let PHP-powered applications pull information from their databases. — CNET

As the user base of PHP and other scripting languages continues to grow, broad support is becoming available on platforms trusted by the Fortune 500 crowd. This exposure will increase the rate of improvement to the efficiency of these languages. Early in its life, Java was highly criticized as not being scalable since it runs on a virtual machine, and therefore could never achieve the speed of C++. As Java matured, advances were made in optimizations to alleviate much of these concerns. The same sorts of advances are being made in the PHP language, and the hardware and software that drives it.

The goal of any good software development department or organization is to efficiently turn out code on-budget, and on-schedule. Until recently, platforms like Java were more likely to provide a stable, proven foundation to design and build well designed code, however by their nature they introduce a level of complexity that takes extra time to overcome. Using scripting languages like PHP tended to produce code in a faster time frame, but it was often impossible to maintain the architectural integrity necessary for building maintainable, extendable applications. With its strong design foundations, the WASP framework makes achieving all of these goals possible, providing the means to creating world class software to anyone with basic skills in PHP.

Baby Steps to Synergistic Web Apps 24

Cat.: AJAX, microformats, Web as platform
21. October 2005

The Web Standards Project holds as a fundamental truth: that document structure should be separated from presentation or visual style. This two-tier division is embodied by the XHTML standard for document structure and the CSS standard for visual style specification. Adoption of this strict separation of concerns provides revolutionary opportunities for the economical production of highly accessible, aesthetically pleasing web sites. At the same time huge strides are being made by AJAX innovators like 37signals to provide web application user interfaces on a par with those of desktop applications.

Concurrent with these breakthroughs in web user interface technology and practice, there is a an explosion of XML vocabularies, exemplified by OASIS, OPML, Microsoft Office XML, Open Office XML. A zillion XML vocabularies are blooming. These vocabulares are not presentational in nature like CSS, nor are they addressing document structure like XHTML. These vocabularies represent various other domains, commerce domains like order management, personal collaboration domains like calendar and contact management, media distribution and syndication domains, and myriad technical domains such as remote procedure call, database query, spreadsheet structural model. These vocabularies constitute a huge and growing, free, “domain model” for our world. Development, refinement and adoption of these shared, free domain models presents an unprecedented opportunity for both producers and users of information systems.

Enter web applications. Not static content-oriented websites, but honest to goodness web applications. Think email, calendaring, authoring, supply chain management, customer relationship management. These web applications are manipulating at their core, domain models. Sure, these applications have to present an interface to the user, but the bulk of what the application does is manipulate and manage domain-specific information.

While these web applications are manipulating domain-specific information they are doing precious little to expose that information in interoperable form. At best a web application may support download or upload of files. A contact management web application may for instance support bulk import or export — but where is the ability to conveniently “pick” an address from that application and use it as a “ship to” address at an online buying site?

Each web application is an island. This monolithic approach to web applications favors those product vendors who can make big bets — vendors who can tackle and deliver huge applications. Mash Ups provide a glimmer of hope — but only for developers — not really for end users.

It is not a new situation, that applications manipulate domain-specific information — information that users need to share with other applications. It has ever been so. So what’s the difference with web applications? The difference is that in the case of web applications, the world of document structure and presentation has not been bridged to the world of domain models. If Web 2.0 truly represents a new platform then perhaps we should consider whether there is anything to be learned from the older platforms in this regard. Where was presentation bridged to domain model for instance on the Macintosh, Windows and X-Windows platforms?

Remember Me — I Sat Behind You in Freshman English

On those predecessor platforms, the clipboard paradigm was one very important bridge. In the clipboard paradigm a platform-wide set of gestures is supported by all applications. Those gestures enable a user to designate content, to copy that content to a clipboard, and to later “paste” that content into a different location — either in the original application or in a separate application. The range of gestures was expanded on some platforms to include drag and drop with added semantics at the drop location to support the notion of linking in addition to the traditional copy-paste and cut-paste a.k.a. “move”.

Fundamental to the notion of clipboard is the idea that there are potentially many representations for a given piece of content. This is where the bridging between presentation and domain model takes place. The clipboard can capture many representations at once — allowing the receiving application, in concert with the user to select precisely the level of presentation versus domain meaning desired. When pasting a circuit diagram (model) into an engineering report a presentational representation may be chosen, whereas when pasting the same content into a circuit simulation tool, a domain-specific (circuit modeling) representation may be chosen.

Could the original notion of the clipboard be updated to fit the Web Application architecture? Clearly the benefits would be significant. A new model, allowing web applications to leverage one another under user control means that smaller bets and therefore more bets can be made (by product vendors). More bets means more vendors, more applications, and more functionality delivered sooner to the 2.0 Web. While less can certainly be a competitive advantage to a product vendor, that advantage is only amplified in an environment where users can combine the functions of multiple products. If cut and paste and even Compound Documents were useful in the old platforms then they’re potentially much more useful now given the explosion of open, standard XML vocabularies. The opportunity for interoperability is greater than ever. Yet here we all sit, uploading and downloading files between applications.

Some will say, “but my browser supports cut and paste”. Well yes, the browser does support a limited form of cut and paste. But without access to the domain model, the browser is incapable (without help) of providing a cut and paste capability that goes deeper than document structure. Since the browser sees only the style sheet and the document — and not the underlying domain objects, it has no way (on its own) of loading a clipboard with a RDF vCard for instance, because that RDF vCard structure is never seen in its native form by the browser at all. At best an XHTML representation of that RDF vCard is seen at the browser.

Even if the RDF vCard could make its way onto the clipboard somehow, perhaps along with an XHTML representation and maybe a styled representation as well — there is no standard for presenting the multifaceted contents of the clipboard to the receiving web application. Is there a way to bridge the gap between domain structures and document structure?

What Are We Waiting For

Hang on… doesn’t AJAX provide a way out? Doesn’t AJAX allow various gestures to be captured and acted upon? Doesn’t AJAX allow arbitrary requests and responses between the browser and a web application in response to those gestures? Well sure it does!

If a source application could somehow get a standard “web application clipboard” structure onto the desktop clipboard, and if that structure could be sent by the user to a destination web application then the opportunity would arise for users to create “on the fly” Mash Ups. Users would recapture their lost ability to bridge applications requiring various levels of meaning (remember the circuit diagram example?) Web Applications could be combined by mere mortals in ways unforeseen by the original designers.

So what do we need to do to make it happen? Do we need to form a technical committee in a standards body? Do we need to go make a pitch to the software industry Old Guard? Sounds like it could take a long time and involve a lot of risk.

No I don’t think we need to do anything so heavy weight. Who’s in charge here anyway? We don’t have to ask anyone’s permission, and we really don’t have to ask for anyone’s help. Here’s how I see it going down:

we agree on a standard, almost trivial, vocabulary for describing the clipboard itself. http://www.lesscode.org/clipboard/v0p1
ECMAscript for capturing the cut and copy gestures (in a source application), sends an XMLHttpRequest to the source web application, along with the current “selection” and a command verb — one of “identify”, “copy” or “cut”. Here “selection” is a URI ending in an XPointer identifying the XHTML designated by the user. http://www.lesscode.org/clip-source-request/v0p1
The web application returns a clipboard structure. The structure may contain multiple representation elements. The elements have a well-defined order so that the receiving application knows where the most “presentational” representations are and where the most “domain specific” ones are. Note that the source application may choose in the case of “copy” to place content in the representations either “by value” or “by reference”. The latter necessitates subsequent communication between the destination application and the source application. In the case of “cut”, the content is always returned by value. In the case of “identify” it’s always returned by reference. The “identify” verb is for future expansion to support drag and drop gestures. schema: http://www.lesscode.org/clip-source-response/v0p1 and sample instance document: http://www.lesscode.org/clip-source-response/sample-1
ECMAscript for capturing paste gesture (in destination applications), sends an XMLHttpRequest to the destination application along with the current destination selection (a URI ending in an XPointer) and the current web application clipboard structure and a command verb — either “paste” or “link”. In the case of “paste” the destination application will either paste the content directly from the clipboard (if the content is provided by value) or will acquire the content from the source application (if the content was provided by reference). http://www.lesscode.org/clip-destination-request/v0p1
The destination web application returns a status code. http://www.lesscode.org/clip-destination-response/v0p1

This basic scheme will support copy/paste and cut/paste gestures today and set the stage for drag with drop-copy, drop-move and drop-link later. By focusing first on “by value” cut/copy and paste we avoid thorny issues of distributed trust. The user is in control of the clipboard and there is no direct application to application communication required. Let Big Brains working on things like SAML tackle federated identity management issues. Addressing the clipboard now, and composite applications later constitutes walking before running — or flying. We leave the door open, however, for eventual application to application communication — both for simple efficiency, i.e. avoiding unnecessary routing content through the browser, and for composite application construction.

What Could Happen After That

Picking up what was lost with the move to the Web platform sets the stage for going beyond the predecessor platforms:

bridge desktop platform clipboards to web application clipboard - this sounds like a job for the Firefox folks!
browser gesture standards for drag and drop including copy, move, link between applications
deployment of federated trust standards to support cross-linking of content between web applications
A pragmatic approach to exposing the domain model to the browser is to use the class attribute in XHTML to attach semantic or domain meaning to what would otherwise be solely a representation of document structure. This is the approach taken by microformats.org. In fact there is even an XHTML alternative to RDF vCard called hCard. The microformats.org approach unifies document structure with domain models by merging the two. Will XHTML with class tags obviate the need for a web application clipboard protocol? Will microformats make the browser all knowing?