2005 September

Archive for September, 2005

Should Database Manage The Meaning?

Cat.: Talk, Theory
29. September 2005

I couldn’t resist jumping into the Choose a single layer of cleverness discussion that is raging on David Heinemeier Hansson’s blog. The majority of the challenges to David’s thesis were so wildly off the mark that they left me completely bewildered. What’s even more bewildering, to me at least, is that many of the misplaced comments seem to be coming from established Ruby and Rails practitioners.

Anywho, the comment that got my particular attention is quoted below; my full reply is reproduced below the quoted comment:

“Just like I’d expect my operating system to respond if I try to write to a file I don’t have permission on, I want my database server to manage the basic rules of the DATA, ie, what relates to what, and which columns should be unique. This is to prevent anything out of the ordinary from affecting the consistancy of the database. The minute you let bad data get in there is the minute any maintainability you love in your application tier goes to hell.”

Bzzzt! Right here, we have the crux of the problem.

I think the cognitive discrepancy lies in equating an RDBMS with an operating system. Nothing justifies that parallel.

If we step back and look at what an RDBMS is, we’ll no doubt conclude that, as its name suggests (i.e. Relational Database Management System), it is a system that specializes in managing data in a relational fashion. Nothing more.

Folks, it’s important to keep in mind that it manages the data, not the MEANING of the data!

And if you really need a parallel, an RDBMS is much more akin to a word processor than to an operating system.

A word processor (such as the much maligned MS Word, or a much nicer WordPress, for example) specializes in managing words. It does not specialize in managing the meaning of the words.

So who is then responsible for managing the meaning of the words? It’s the author, who else?

The same goes for Rails. Rails is the author of the data. As an author, it uses the RDBMS to manage that data in a relational fashion. But, just as we, as the authors of the words, do not expect WordPress to manage the meaning of our words, Rails does not expect the RDBMS to manage the meaning of its data.
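To make the division of labour concrete, here is a minimal sketch in plain Ruby. It is not ActiveRecord and not how Rails is implemented; `DumbStore` and `UserRegistry` are hypothetical names invented for illustration. The store keeps whatever it is handed, while the rules about what the data means (an email must be present and unique) live in the application tier.

```ruby
# A deliberately "dumb" store: it keeps rows and has no opinion about them.
class DumbStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
    row
  end
end

# Hypothetical application-tier model: the meaning of the data is decided here.
class UserRegistry
  def initialize(store)
    @store = store
  end

  # Returns an array of error messages; an empty array means the row is fine.
  def errors_for(row)
    email = row[:email].to_s.strip
    if email.empty?
      ["email can't be blank"]
    elsif @store.rows.any? { |existing| existing[:email].to_s.casecmp?(email) }
      ["email has already been taken"]
    else
      []
    end
  end

  # Inserts the row only when it passes the application's own rules.
  def register(row)
    problems = errors_for(row)
    @store.insert(row) if problems.empty?
    problems
  end
end

store = DumbStore.new
registry = UserRegistry.new(store)
registry.register(email: "fred@example.com")  # => []
registry.register(email: "FRED@example.com")  # => ["email has already been taken"]
store.insert(email: "fred@example.com")       # the store itself accepts anything
```

The last line is the scenario the quoted commenter worries about: anything that writes to the store directly bypasses the author's rules.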

As a matter of fact, it would be really terrible if those tools assumed the management of the meaning of the information that is being fed into them. Imagine typing up a letter, only to be jolted when your favorite editor refuses to take the word you’ve just typed, deeming it ‘incoherent’, or ‘not complying with certain constraints’. You’d toss that piece of junk out the window in no time.

Why should Rails developers be any different? Why should we tolerate the RDBMS’s opinions on our data? We’re the masters, the RDBMS is the servant; it should shut up and serve. End of discussion.

As for the ongoing ‘the sky is falling’ discussion about what if some other device accesses the RDBMS, it’s the same dilemma as ‘what if some other person accesses our document, and starts changing it?’ There are ways to manage that. Yes, we’re always exposed, always vulnerable to all kinds of attacks, but that’s how life is. You should start getting used to it by now.

Lesscode using Domain Specific Languages (DSL)

Cat.: Talk
28. September 2005

Having spent some time in the VS2005 environment, I can say the following about DSLs:

A domain-specific language (DSL) is a language designed to be useful for a specific task in a fixed problem domain, in contrast to a general-purpose language. DSLs are gaining popularity in the field of software engineering to enhance productivity, maintainability, and reusability of software artifacts, and enable expression and validation of concepts at the level of abstraction of the problem domain.

Using domain-specific languages, one can build customized modeling tools and essentially define a new modeling language and implement it very simply. For example, a specialized language may be used to describe a user interface, a business process, a database, or the flow of information, and then used to generate code from those descriptions.

I built a (small) DSL for modeling application integration scenarios, which is always an issue in the IT business world. First, I defined all of the specific domain model terms used in application integration scenarios, such as XML messages, source and destination applications, XSL maps, business units, protocols, business rules, etc. Then I described the designer definitions that make up the visualization tool. Once the metadata is defined in a supplied Visual Studio template, you build the solution and another instance of Visual Studio fires up with your visual designer implemented.

You then use the visual designer you just created to draw/model the application integration scenario, and when you run this solution, it generates the code for the solution.

Of course, I have left out a lot of detail and it was not easy the first couple of times. I have left out a set of code generators, which take a domain model definition and a designer definition as input, and produce code that implements both of the components as output. The code generators also validate the domain model and designer definition, and raise errors and warnings accordingly. But eventually, you get the hang of it.
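The define, validate, generate cycle described above can be sketched in a few lines of Ruby. To be clear, this is a textual internal DSL, not the visual designer that the Microsoft DSL Tools produce, and every name in it (`IntegrationScenario`, `application`, `route`, the generated `transform(...)` line) is made up for illustration.

```ruby
# Hypothetical textual DSL for describing an application integration scenario.
class IntegrationScenario
  attr_reader :name, :routes

  # Evaluates the block against a fresh scenario so that `application` and
  # `route` read like keywords of a small language.
  def self.define(name, &block)
    scenario = new(name)
    scenario.instance_eval(&block)
    scenario
  end

  def initialize(name)
    @name = name
    @applications = []
    @routes = []
  end

  # DSL keyword: declare an application taking part in the scenario.
  def application(app_name)
    @applications << app_name
  end

  # DSL keyword: route a message between two applications via an XSL map.
  def route(message, from:, to:, map:)
    @routes << { message: message, from: from, to: to, map: map }
  end

  # Validation step: every route must reference declared applications.
  def validation_errors
    @routes.flat_map do |r|
      [r[:from], r[:to]]
        .reject { |app| @applications.include?(app) }
        .map { |app| "route #{r[:message]}: unknown application #{app}" }
    end
  end

  # Generation step: emit one line of pretend glue code per route.
  def generate
    errors = validation_errors
    raise ArgumentError, errors.join("; ") unless errors.empty?

    @routes.map do |r|
      "transform(#{r[:message]}, using: #{r[:map]}).deliver(#{r[:from]} -> #{r[:to]})"
    end
  end
end

scenario = IntegrationScenario.define("order sync") do
  application :crm
  application :billing
  route :order_xml, from: :crm, to: :billing, map: "order_to_invoice.xsl"
end

puts scenario.generate
```

A route that names an undeclared application makes `generate` raise instead of emitting code, which mirrors the generators' validate-then-generate behaviour in spirit only.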

Think of it this way: a DSL is a tool for building tools :-) For example, anyone who has used the Class Designer in Visual Studio is using a visual designer produced with a DSL, specifically designed for building classes.

The process has been quite the learning experience for me and has proven to be very enlightening. I am an old guy, been writing code for 15 years. Quite frankly, I don’t give a hoot about which programming language I use cause I see them all as being the same, some better than others, but still hand crafting code. I don’t want to hand craft code anymore – every time I get involved in a software development project I a) have done it before and b) oh man, this is going to take 6 months of grinding it out. In other words, it’s gonna hurt.

I see DSLs as an evolution in our software industry toward higher level abstractions, in the way of visual designers, so that I can spend time “designing” the solution and have most of the infrastructure code generated for me, code that I would otherwise have to grind out by hand.

If anyone is interested in more information about DSLs and doesn’t mind reading it from Microsoft, download the DSL Toolkit and have a read of the documents at: Microsoft Domain-Specific Language (DSL) Tools

Lesscode Is Not About Quantity

Cat.: Talk
28. September 2005

A number of people who are making the painful transition to lesscode seem to misunderstand the underlying philosophy of the movement. For some reason, some of them get hung up on the quantity (read: number of lines of code). But, by doing that, I’m afraid they are completely missing the boat.

Let me state it in no uncertain terms: lesscode is not about the number of lines of code. In other words, we can revisit some legacy app and rewrite it so that in the end it has dramatically fewer lines of code, and still not produce an app that would qualify as being ‘lesscode’ worthy.

So if not the number of lines of code metric, what then would qualify an app as being a bona fide lesscode product?

The thing that often gets overlooked when we’re talking about the lesscode discipline is that there is a qualitative aspect to it. Yes, the most obvious factor that hits us upon inspecting a typical lesscode app is the drastically reduced number of lines of code, but that phenomenon is merely the outcome of some deeper, less visible causes and conditions underlying the discipline. We need to now expose and examine those causes and conditions (and stop fixating on mere symptoms):

Je ne sais quoi (“I know not what”) – the indefinable quality. Christopher Alexander calls it Quality Without A Name. This sounds like a bailout answer; however, it’s actually very true. Lesscode is qualified by something that is utterly satisfying, yet no one can put their finger on it.

Language choice – the language we choose for writing the code determines how we think about designing and programming. Choosing to write code in Assembler or in COBOL would not be very conducive to producing the lesscode app that would be utterly satisfying.

Smart Servant – if the platform/language of our choice is very needy, very fussy and expects us to exert exorbitant amounts of effort just to keep it from throwing a fit any time something changes, that situation will not be very conducive to producing a lesscode worthy product.

What we really need if we are to practice the discipline of lesscode is a Smart Servant product. What that means is that such a product must be very non-intrusive and very supportive of our flaky short term memory. The only realistic way to develop a truly lesscode app is to work in an environment that minimizes the cognitive friction.

Fun with Fixtures

Cat.: Ruby, Rails
25. September 2005

My favorite part of Agile Web Development with Rails is the section on testing. I’ve found the framework around testing included with Rails to be a wonderful blend of simplicity and power, and this chapter in the book is the perfect complement. It’s based largely on the core Ruby Test::Unit module but adds some important features on top.

One of those features is Fixtures. Fixtures provide a simple, YAML-based file format for storing database state that should be loaded before test runs. The database is wiped and the fixtures are loaded before each individual test executes, providing consistent state for tests. See the section on fixtures in A Guide to Testing the Rails for more information. Here’s an example fixture from said section:

# low & behold!  I am a YAML comment!
david:
 id: 1 
 name: David Heinemeier Hansson 
 birthday: 1979-10-15 
 profession: Systems development

steve:
 id: 2
 name: Steve Ross Kellock
 birthday: 1974-09-27
 profession: guy with keyboard

The cool thing is that the top level fixture names (in this case, “david” and “steve”) become instance variables in your test case, allowing you to access fixture data in a very intuitive way from tests. The result of this is test code that reads like a story and is often almost humorous.
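For the curious, the name-to-instance-variable trick is easy to imitate. The following is only a sketch of the idea and not Rails’ actual fixture loader: `FakeTestCase` is a hypothetical stand-in, nothing touches a database, and the birthday fields are left out to keep the YAML to plain strings and integers.

```ruby
require "yaml"

FIXTURE_YAML = <<~YAML
  david:
    id: 1
    name: David Heinemeier Hansson
    profession: Systems development
  steve:
    id: 2
    name: Steve Ross Kellock
    profession: guy with keyboard
YAML

# Hypothetical stand-in for a test case: each top level fixture name becomes
# an instance variable holding that record's attributes as a Hash.
class FakeTestCase
  def load_fixtures(yaml_text)
    YAML.safe_load(yaml_text).each do |fixture_name, attributes|
      instance_variable_set("@#{fixture_name}", attributes)
    end
  end

  # Reads like the tests in the guide: @david is simply there.
  def david_name
    @david["name"]
  end
end

test_case = FakeTestCase.new
test_case.load_fixtures(FIXTURE_YAML)
puts test_case.david_name
```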

The book includes a sidebar with a little David head lamenting the importance of “Picking Good Fixture Names”:

Just like the names of variables in general, you want to keep the names of fixtures as self-explanatory as possible. This increases the readability of the tests when you’re asserting that @valid_order_for_fred is indeed Fred’s valid order. It also makes it a lot easier to remember which fixture you’re supposed to test against without having to look up p1 or order4. The more fixtures you get, the more important it is to pick good fixture names. So, starting early keeps you happy later.

But what to do with fixtures that can’t easily get a self-explanatory name like @valid_order_for_fred? Pick natural names that you have an easier time associating to a role. For example, instead of using order1, use christmas_order. Instead of customer1, use fred. Once you get into the habit of natural names, you’ll soon be weaving a nice little story about how fred is paying for his christmas_order with his invalid_credit_card first, then paying his valid_credit_card, and finally choosing to ship it all off to aunt_mary.

Association-based stories are key to remembering large worlds of fixtures with ease.

Taking this advice, I started in on testing a part of an application I’m working on now. The result is worth posting in its entirety (hint: it gets interesting towards the middle):

require File.dirname(__FILE__) + '/../test_helper'

class EnrolleeTest < Test::Unit::TestCase
  fixtures :coverage_types, :plans, :coverages, :plan_levels, :ppo_options,
           :elections, :enrollees, :enrollments, :tpas, :tpas_users, :users,
           :roles

  def setup
    @joe = Enrollee.find(@joe_the_policy_holder.id)
    @rita = Enrollee.find(@rita_the_spouse.id)
    @billy = Enrollee.find(@billy_the_dependent.id)
    @alice = Enrollee.find(@alice_the_dependent.id)
    @the_family = [@joe, @rita, @billy, @alice]
    @enrollment = Enrollment.find(@test_enrollment.id)
  end

  
  def test_common_attrs
    @the_family.each do |enrollee|
      assert_kind_of Enrollee, enrollee
      assert_equal @enrollment.id, enrollee.enrollment_id
      assert_equal @enrollment, enrollee.enrollment
      assert_equal @joe.id, enrollee.policy_holder_id
      assert_equal @joe, enrollee.policy_holder
    end
  end

  def test_get_gender
    assert_equal :male, @joe.gender, "Joe is a male"
    assert @joe.male?, "Joe is a male"
    assert_equal :female, @rita.gender, "Rita is a female"
    assert @rita.female?, "Rita is a female"
    assert !@billy.female?, "Billy is not a female"
    assert !@alice.male?, "Alice is not a male"
  end

  def test_set_gender
    [:female, 'F', 2].each do |tok|
      @joe.gender = tok
      assert_equal :female, @joe.gender, "Joe is now a female"
      assert @joe.save, "Joe could not be saved after sex change"
      @joe.reload
      assert_equal :female, @joe.gender, "Joe is now a female"
    end

    [:male, 'M', 1].each do |tok|
      @rita.gender = tok
      assert_equal :male, @rita.gender, "Rita is now a male"
      assert @rita.save, "Rita could not be saved after sex change"
      @rita.reload
      assert_equal :male, @rita.gender, "Rita is now a male"
    end

    # make sure that we can set it to nil
    @joe.gender = nil
    assert_equal nil, @joe.gender
    assert @joe.save(false), "Joe couldn’t be saved after neutering…"
    @joe.reload
    assert_equal nil, @joe.gender

    @joe.gender = ''
    assert_equal nil, @joe.gender
    assert @joe.save(false), "Joe couldn’t be saved after neutering…"
    @joe.reload
    assert_equal nil, @joe.gender
  end

  def test_marital_status
    assert_equal :married, @joe.marital_status
    assert @joe.married?, "Joe is married"
    assert !@joe.single?, "Joe is not single"
    assert @billy.single?, "Billy is single"
    assert !@billy.married?, "Billy is not married"
    assert_equal :single, @billy.marital_status
  end

  def test_set_marital_status
    [:married, 2].each do |tok|
      @billy.marital_status = tok
      assert_equal :married, @billy.marital_status
      assert @billy.save, "Billy could not be saved after he got married"
      @billy.reload
      assert_equal :married, @billy.marital_status
    end

    [:single, 1].each do |tok|
      @rita.marital_status = tok
      assert_equal :single, @rita.marital_status
      assert @rita.save, "Rita could not be saved after her divorce"
      @rita.reload
      assert_equal :single, @rita.marital_status
    end
  end
end
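
The tests above imply a setter that accepts several tokens for the same value (:female, 'F' and 2, for instance) and treats both nil and the empty string as “no value”. The real Enrollee model isn’t shown in this post, so what follows is a guess at the shape of such a setter in plain Ruby, without ActiveRecord. `SketchEnrollee` and `GENDER_TOKENS` are hypothetical names.

```ruby
# Hypothetical plain-Ruby stand-in; the post's actual Enrollee model is not shown.
class SketchEnrollee
  # Every accepted token maps onto one canonical symbol.
  GENDER_TOKENS = {
    :male => :male, "M" => :male, 1 => :male,
    :female => :female, "F" => :female, 2 => :female
  }.freeze

  attr_reader :gender

  # Accepts a symbol, a one-letter code, or a numeric code. nil and ""
  # both clear the value; anything unrecognised raises ArgumentError.
  def gender=(token)
    if token.nil? || token == ""
      @gender = nil
    else
      @gender = GENDER_TOKENS.fetch(token) do
        raise ArgumentError, "unknown gender token: #{token.inspect}"
      end
    end
  end

  def male?
    gender == :male
  end

  def female?
    gender == :female
  end
end

joe = SketchEnrollee.new
joe.gender = "F"
puts joe.female?
```

The marital status accessors exercised in the tests could follow the same lookup-table pattern.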

Web Services Infrastructure: Kid Templating

Cat.: First they ignore you..
24. September 2005

Note: the title should make sense by the end of the post…

Kevin Dangoor’s recent announcement of TurboGears has resulted in a dramatic increase of interest in Kid. Kid was first announced on November 30, 2004 as a Pythonic XML-based Templating Language. I remember thinking I was going to do a series of articles on why I wanted this specific combination of features in a library. I never did and Kid progressed into its current form, growing a small community along the way.

Although it’s now used primarily for HTML templating, that wasn’t the initial goal of the project. What I really wanted to do was to illustrate a different way of thinking about “Web Services”.

Not what you were expecting, eh?

But it’s true - Kid was supposed to be a simple device that I would use to start a narrative exploring a variety of topics related to building distributed systems atop the web (i.e. “Web Services”). I had decided that the direction being taken by the industry mainstream was incorrect and that web services would languish until they became more like the existing, working, proven, web.

There was a lot of talk about “Web Services Infrastructure” and framework, and that talk continues today. The assumption by nearly everyone was (and still is) that web services would require a whole new set of tools and a whole new paradigm. SOAP/WS was still very much RPC oriented and so the focus was on language bindings, typing, discovery, and the like. Web Services programming had almost no resemblance to existing web programming.

This has only accelerated in the time since I was heavily involved. Now it seems the industry is calling for some mysteriously unspecified SOA toolkit, ESB, or some other insignificant combination of alphanumerics to save Enterprise IT.

I was in the middle of the situation a year ago or so and it troubled me deeply. Long days and nights at work were spent combining technologies like SOAP, WSDL, WS-Security, and BPML with massive Java infrastructure. I ate it all up and knew it like the back of my hand. It is not impossible for someone dedicating 8-12 hours a day to understanding this stuff to love it. I loved it. It’s all very intriguing for someone with my particular constitution.

But so is masturbation.

I’m not sure what happened. I guess I had what alcoholics refer to as “a moment of clarity”. I distinctly remember going over to grab a co-worker for lunch. He was on the phone and so I was talking to some guy in an adjacent cube who did “services work”. The services people were developers but were considered a different department than product development. They dealt with specific customer needs and were tasked with using our platform to craft actual real solutions for actual real business problems. Their world was much different from product development, whose main customers were marketing and the executive team.

Anyway, I noticed a very large stack of printed material on this guy’s desk, entitled “Introduction to Web Services”. Development had collected a set of introductory materials that were to be distributed to all of the services people in the field and, I assume, eventually to external developers using our platform. I had a small part in compiling these materials and had reviewed them at various stages in electronic form.

The hard copy put me on my ass. (I say that metaphorically but if it were to hit me with any significant velocity it very well could have literally put me on my ass.) I picked up the tome (an “Introduction”, mind), slapped it on my co-worker’s desk, who was now off the phone, and whispered, “this isn’t going to work.”

The services guy doesn’t love this stuff - at all. I’m sure he was completely capable of digesting all of it had he the time and interest of, say, someone in my position, but he doesn’t. He has a customer that wants a bunch of systems to talk to each other and they all have very large and amazingly different framework and infrastructure. More framework and infrastructure is the last thing he needs. The services guy was a wake-up call.

At the time this was happening I had been nursing a long-time interest in true web architecture. This didn’t have anything to do with work - a very long time ago I threw together a little download tool thingy that was capable of resuming failed downloads (with 14.4k modems this was a big deal and none of the browsers had native support for resuming). This required that I read bits of RFC 2616 and implement a basic HTTP client. I remember being amazed at some of the capabilities of HTTP because I had assumed that it was mostly a simple file transfer protocol (closer to FTP than, say, CORBA). The spec ended up staying with me and I continued to explore different ways people were using the web and HTTP to do new and exciting things.

At some point, I became convinced that existing, basic web architecture solved many of the problems we were seeing with Web Services adoption. The problem wasn’t that WS was incapable of solving the technology issues (it’s quite adequate), the problem was that it was incapable of solving the social issues. It far exceeded the threshold of acceptable complexity for true adoption by this large community of people whose primary goal is to solve business problems.

The web, on the other hand, was invented to solve the same basic integration problems businesses are experiencing now. Tim Berners-Lee’s dilemma is not so different from our own: a bunch of systems with a bunch of closed data formats and processing capabilities that should be universally accessible. Berners-Lee realized early on that solving this problem would require, above all else, a virus: something that was so simple and lightweight that it would be hard not to adopt. And that’s what the web is: a virus. Just like C and UNIX, Windows, Visual Basic, and a slew of other technologies that kept, as a primary requirement, the ability for real people to solve real problems.

I shifted my thinking drastically and tried to imagine how the integration problems we were seeing would be solved using existing web architecture, which we know to have the traits necessary for mass adoption. I should mention here that by “web architecture”, I mean W3C TAG Web Architecture but also all of the tools and techniques that have evolved above and below it. The servers, proxies, template languages, mod_rewrite, sessions, cookies, load balancing, liberal feed parsers, virtual hosts, MVC, MultiViews/content negotiation, monitoring systems, automated testing tools, dynamic languages, view source, and on and on. All of these little tricks and techniques add up to an extremely powerful and yet fairly simple and understandable toolkit for building distributed systems. What’s more, there is an unmatched number of people who understand how to build these systems using all variations of platform and language.

Which brings us to Kid as “Web Services Infrastructure”. The concept is simple: for whatever reason, template languages (PHP, ASP, JSP, CFML, ERB, Cheetah, Tapestry, Velocity, etc.) have become a fundamental tool for web development and their usefulness is in no way limited to presentational data (HTML). Templates are simple, templates are cool. You throw some junk in there and look at the result. If the result isn’t right, you tweak your template until it does look right. There are no layers of magic to get in your way. When something doesn’t come out right, you change the template. There’s no “management” or “container” involved.

There’s no rule that says templates must only be used to generate HTML. Indeed, many of the RSS and Atom feeds in the wild are generated from some form of template. They are never automatically-generated-behind-the-scenes using language bindings and are very rarely generated using some kind of DOM/SAX API.
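
As a rough illustration of a feed coming out of an ordinary template, here is a sketch using ERB from Ruby’s standard library (ERB is one of the text-based engines listed earlier; this is not Kid). The feed title, links and the `render_feed` helper are invented for the example, and escaping has to be requested explicitly on every interpolation.

```ruby
require "erb"

# A text template for a tiny RSS 2.0 feed. Every interpolated value is passed
# through ERB::Util.html_escape by hand, because ERB itself escapes nothing.
FEED_TEMPLATE = ERB.new(<<~RSS)
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title><%= ERB::Util.html_escape(title) %></title>
      <link><%= ERB::Util.html_escape(link) %></link>
      <description><%= ERB::Util.html_escape(description) %></description>
      <% items.each do |item| %>
      <item>
        <title><%= ERB::Util.html_escape(item[:title]) %></title>
        <link><%= ERB::Util.html_escape(item[:link]) %></link>
      </item>
      <% end %>
    </channel>
  </rss>
RSS

# Hypothetical helper: fills the template from plain Ruby values.
def render_feed(title:, link:, description:, items:)
  FEED_TEMPLATE.result_with_hash(
    title: title, link: link, description: description, items: items
  )
end

puts render_feed(
  title: "Fish & Chips Weekly",
  link: "http://example.com/",
  description: "An example feed",
  items: [{ title: "Is 3 < 4?", link: "http://example.com/posts/1" }]
)
```

Forget one `html_escape` call and the feed is no longer well-formed the first time a title contains an ampersand, and nothing in the template engine will notice.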

<rant> RSS is the most successful web services data format in existence (after HTML, of course ;). Successful web services in the future are likely to resemble it. Is there a use-case for RSS or Atom in WS-* land? There are thousands of pages of spec text and no one could throw together a simple use-case for the most successful web service in existence? That’s irresponsible. </rant>

The point I’m trying to make is that template based web services are a reality and that we should be thinking about making incremental improvements to the general model to facilitate more machine-centric data formats instead of creating whole new paradigms.

There are a variety of really important issues with using templates for general purpose web services programming, most having to do with the first part of Postel’s Law:

“Be conservative in what you do; be liberal in what you accept from others.”

The problem with using templates to produce XML (including XHTML) is that it is exceedingly hard to be conservative in what you do. Most template engines are text based, making it easy to miss well-formedness errors. There are also a range of character encoding issues that template languages could ease but often simply ignore and sometimes make worse.

Kid is a simple attempt at building features that aid in conservativeness into the template engine. I actually considered tag-lining it The Ultra-Conservative Template Engine as a play on Mark Pilgrim’s Ultra-Liberal Feed Parser whose name comes from the second part of Postel’s law. This is, of course, the whole point of being conservative in the first place: so that we don’t need an Ultra-Liberal parser for each variation of “web service”.

I think some of these features are compelling and would like to see them pursued in other tools that are in common use for web development. For instance, one of the most important and least talked about features of XML is that it provides a reasonable system for encoding the entire range of Unicode code points in any character encoding, including ASCII. A template engine with a basic understanding of XML could process templates authored in utf-8, interpolate data encoded in ISO-8859-1, and output in 7-bit ASCII, if that’s what Postel demanded. Kid does that.

One of the most important features of XSLT is that well-formed templates are guaranteed to yield well-formed output (with some well understood exceptions). If the template runs, you know it will provide a basic level of conservativeness. Kid does that. We’ve wontfix’d feature requests because they would break this contract.
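
Here is roughly what that pipeline looks like when done by hand in Ruby. This is not Kid’s implementation, just a sketch of the underlying XML trick: characters that the output encoding cannot represent are written as numeric character references. The template text, the sample data and `render_ascii` are all made up for the example, and markup escaping of the data is deliberately left out to keep the focus on encodings.

```ruby
# Template authored in UTF-8 (written with \u escapes so this file stays ASCII).
TEMPLATE = "<greeting>Gr\u00FC\u00DFe, %s</greeting>".freeze

# Data arriving as ISO-8859-1 bytes: "caf" followed by 0xE9 (e with acute).
LATIN1_DATA = [99, 97, 102, 0xE9].pack("C*").force_encoding("ISO-8859-1").freeze

# Transcode the data to the template's encoding, merge, then serialize as
# 7-bit ASCII, replacing anything unrepresentable with &#NNN; references.
def render_ascii(template, data)
  merged = format(template, data.encode("UTF-8"))
  merged.encode("US-ASCII", fallback: ->(char) { format("&#%d;", char.ord) })
end

puts render_ascii(TEMPLATE, LATIN1_DATA)
# prints: <greeting>Gr&#252;&#223;e, caf&#233;</greeting>
```

Any XML parser reading that output sees exactly the same characters as it would in a UTF-8 serialization.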

Most template languages require you to explicitly encode data that may contain reserved characters. Kid takes the opposite approach and assumes that all content is textual and should be encoded unless you explicitly state that something is XML (in which case it must be well-formed).
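
The escape-by-default rule fits in a dozen lines. The sketch below is written in Ruby rather than Python and is not Kid’s code: `RawXml`, `interpolate` and `paragraph` are hypothetical names, and unlike Kid it does not verify that content declared as XML is actually well-formed.

```ruby
# Marker type: content the author has explicitly declared to be XML.
RawXml = Struct.new(:markup)

XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Everything is treated as text and escaped unless it is wrapped in RawXml.
def interpolate(value)
  return value.markup if value.is_a?(RawXml)

  value.to_s.gsub(/[&<>"]/, XML_ESCAPES)
end

# A one-element "template" that interpolates its content.
def paragraph(content)
  "<p>#{interpolate(content)}</p>"
end

puts paragraph("Fish & Chips <cheap>")
# prints: <p>Fish &amp; Chips &lt;cheap&gt;</p>
puts paragraph(RawXml.new("<em>Fish &amp; Chips</em>"))
# prints: <p><em>Fish &amp; Chips</em></p>
```

The point of the design is that forgetting something fails safe: unmarked data can show up over-escaped, but it cannot break the document.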

There are also some interesting features around serializing the resulting XML infoset with different variations. For example, you can author templates in XHTML 1.0 and output in HTML 4.01. The output serializer takes care of all the little quirks for things like empty elements, non-escaping of SCRIPT and STYLE content, boolean attributes, etc. The result is a clean authoring environment and an ultra conservative output format. I bring this up in the context of web services infrastructure only to show that the ability to filter template output can be very useful when dealing with different types of user agents.

All this to say that if you’re looking for “Web Services Infrastructure” for exposing processes and information, you’re probably looking too hard. If you have a database, templating, and a web server, you have most of the infrastructure and framework required to begin exposing information from each of your systems in a proven and established way. What you want to be on the lookout for are small and specific enhancements to these existing pieces that allow you to interact with other machines in a more predictable manner or in new and different ways.

The next time someone is selling you infrastructure for Web Services, or SOA, or ESB, or whatever they’ll call it next, make sure you ask “Why?” After they tell you, make sure you understand, agree, and have the problems they propose to solve. If not, ask again. New framework and infrastructure is extremely expensive in more ways than one: you have to ramp people up on it and then deploy, manage, monitor, and support it. Sometimes new infrastructure is unavoidable but when it overlaps a large portion of your existing infrastructure, you should make sure it’s bringing back a significant return.