lesscode.org


'Theory' Archives

Code is Model

Cat.: Theory
19. October 2005

There is a great post by Harry Pierson over at his DevHawk blog asking “what can we learn from looking at the success of mainstream text-based programming languages to help us in the development of higher abstraction modeling languages that are actually useful.”

As I have written elsewhere, I am a huge fan of raising the level of abstraction to deal with complexity in our software world. This is part of what I call the industrialization of software. No, I don't mean making programming fully automatic; it will never be that way, because software is too large and complex for that. However, I am probably as frustrated in my quest to make software development a predictable and repeatable process as Mini-Microsoft is in his quest to make Microsoft a leaner, meaner machine. It seemingly ain't going to happen overnight and may not happen in my lifetime!

John Walker figured out the industrialization of the engineering design world when his invention, AutoCAD, hit the market in 1982. He said, “if you can't model it, you can't build it.” Damn right! In 2005, the software industry still has not figured this out. We are still bashing away with the stone-age equivalent of hammers and chisels, whereas the engineering design world has AutoCAD to describe incredibly large and complex building structures, airplanes, engines, electronic circuit diagrams and just about everything else you can think of, using blueprint models that are actually meaningful. People use these blueprints to turn models into the real-world constructs that you and I use every day. How about that cell phone? Or your iPod? Or your car? Or that plane you just flew in on? Or this?

So what's up with our software world? Why do we refuse to adopt the successes of other industries, like the engineering design world, and leverage them in our software world? What are we afraid of? Why are we still stuck using low-level programming languages that we toil with at all hours of the day and night to produce inferior software products? If we were designing and constructing commodity cars by hand, and even fabricating the tools used to build the cars by hand, we would be laughed out of the industry. How come software development seems to be different? Where is our AutoCAD for software development?

This may seem like a bit of a rant, or maybe I have unrealistic expectations about the maturity level of our software industry. However, I still can't believe that yours truly, after 15 years in the software development business, still has to manually add an imports or using statement every time I add a reference to an assembly in Visual Studio. What the ?? Not that I am picking on Visual Studio; I happen to think it is an excellent IDE, and it is being joined by Software Factories, Domain Specific Languages, Guidance Automation Toolkits, and so on. While these tools and processes are definitely raising the level of abstraction in dealing with software complexity and advancing the industrialization of software, it still does not seem like enough to me. We still lack standards like those in the electronics design world, where plug-and-play integrated circuits can be ordered from a catalog, and where standard diagram symbols describe everything electronic so that anyone trained in the industry can glean, in moments, what a circuit diagram is saying, with no ambiguity. Are we there yet in our software world?

Moreover, we still have a world of programmers who refuse to even consider models as first-class software artifacts; they consider them pretty pictures. I know we have had issues in the past with CASE tools, with earlier versions of UML, and with other code generation tools. But I have worked with software developers who won't even give it a thought: they immediately get out their favorite source code editor and start writing code. So much for design. And it seems the younger they are, the more I see this behavior. The same goes for the old-school guys who have become “code crafters”: they take their craft extremely seriously and are affronted at the very thought of using a “modeling” tool. I am not sure I understand either group's motivation for this behavior. Where did it come from? How come I (and a few others) don't have this behavior?

What’s my point? I don’t have one – I am just pondering, out loud, what it is going to take to bring the industrialization of software into reality – and how long?

New Take On Scalability

Cat.: Theory
08. October 2005

Apologies for dragging this corpse into the discussion again, but a recent development prompted me to try and share this with others.

Whenever I give presentations on Ruby and Rails, the number one question invariably pops up from the audience: will it scale?

At first, I allowed myself to fall prey and drop down into the detailed and quite meaningless discussion. But then I changed tactics and began countering the question with: “Scale to what?”

Amazingly, most people don't know the answer to that question. They just throw out abstract answers, like “thousands of simultaneous requests,” etc. But pretty much no one can supply a real-life example that is more concrete than Yahoo or Amazon or Google. In other words, it seems that not too many people are working on super-busy web applications.

If that’s the case, the scalability question is, in most situations, quite moot.

But the real change in the scalability landscape is now dawning with the advent of the rich internet applications. Here is how it goes:

Originally, when we made the transition from fat clients in the client/server world to dumb terminals (i.e. HTML documents rendered in the browser) in the web 1.0 world, we effectively sucked all the life out of the clients. The clients got demoted to braindead appliances for rendering HTML.

But someone had to continue doing the clients' work. That someone was the app server. The central role of app servers (such as WebLogic, WebSphere, etc.) was impersonating the numerous clients out there. The app servers had to do an incredible amount of work impersonating clients that were known only through their requests, peppered with some cookies.

This situation placed an incredible strain on the app servers. A whole multi-million (or is it billion?) dollar industry sprang up around those beasts, and new careers were forged around WebLogic, WebSphere, JBoss, and similar servers.

Given how huge the burden of impersonating numerous clients is on the app server, the question of scalability became one of the central issues.

However, now that the clients are finally reclaiming their state, and the app servers no longer have to impersonate each and every client, the whole scalability issue becomes meaningless. Let the clients do all the processing that governs their behavior. Suddenly, app servers are left with very little to do. And as the clients happily hum along, taking care of their own state, all that the back-end resources need to worry about is the centralized business logic. And in that arena, there are very few scalability issues.
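To make the contrast concrete, here is a rough, Rails-flavored sketch; the controllers and the Pricing class are made up for illustration, not taken from any real application. In the first action the app server babysits the client's state in a session; in the second, the rich client keeps its own state and only asks the server to run the centralized business logic:

# Web 1.0 style: the app server impersonates the client and keeps its state alive.
class CartController < ApplicationController
  def add_item
    session[:cart] ||= []            # server-side session holds the client's state
    session[:cart] << params[:sku]   # every click is a round trip plus a session lookup
    redirect_to :action => 'show'
  end
end

# Rich-client style: the browser keeps the cart; the server only prices what it is sent.
class OrdersController < ApplicationController
  def price
    skus = params[:skus] || []                     # the client ships its state with the request
    render :text => Pricing.total_for(skus).to_s   # Pricing is a hypothetical domain object
  end
end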

So, now this whole topic is even more of a moot point. It's time we gave it a decent burial and moved on to greener pastures.

Yes, But What About The Legacy?

Cat.: Theory
03. October 2005

Once again, I need to branch out into a new thread, for the benefit of bringing to everyone’s attention the importance of the legacy stuff. Here is what Kevin Smith wrote in response to my post Shared Data And Mobile Data:

Alex, I completely agree with you, but one big obstacle is legacy stuff: Legacy apps which already have data in rdbms’s. Legacy code. Legacy coding patterns. Legacy frameworks. Traditionally, exchanging data has been quite difficult.

True. I read somewhere that the data collected from the Moon back in the late sixties is unusable today, because we lost the hardware that could replay it. Urban legend?

Beyond that, if you expose an rdbms today, people can immediately generate useful and productive reports from it. They can query it. With a custom app that is “willing to” exchange data, there’s a steeper curve for other apps, and unless you have pre-exposed all your data through published api’s, you’ll have to keep extending your code to meet the data exchange needs of other apps. So with today’s technologies and patterns, there are some immediate benefits of the “data is shared” model.

This is also true. However, since legacy systems are preponderantly bureaucratic, they should be viewed as what they are: huge bureaucracies. And we should treat and approach them as such.

What do we mean by that? Well, just as in the real world, bureaucratic systems need to be approached prudently, or else nothing ever gets done.

If I want to renew my passport, for example, there is presently no other way for me to do it but to approach the bureaucratic system that is called the Government Passport Office. And that system is very finicky (as most of you probably know first hand). So, it is quite clear to me that I need to play by their rules if I were to ever get a valid passport.

Now, in order for me to engage in a dialog with the Passport Office, I need to learn their lingo. In other words, I need to get all those forms, study them, make sure I go to the authorized photographer who will make me a kosher passport photo with a neat little dated stamp at the back. I also need to find me a judge, or a high school principal, or a university professor, etc., someone from those professions who claims to know me for more than 5 years. That person will be my guarantor. And on and on….

Once I’ve completed all my legwork, I submit my passport application to the office.

At no point during this process is the Passport Office going to open up its bowels and let me query its state. I can only engage in a conversation with it through a very tight protocol.

We must approach software legacy systems in the same fashion. Treat them as first class citizens (which they are), and learn their lingo. Not naively expect that we could put our diving suit on and jump in and start querying and interrogating the bowels of a first class citizen.

There is (or should be) a protocol specifying how we can talk to the first class citizens, or bureaucratic systems. If we learn how to compose the complex messages they expect us to give them, we can get exactly what we’re looking for from them.

I see absolutely no need for us to assume that the information we're looking to get from these bureaucratic systems is stored in an RDBMS. Consequently, SQL should not necessarily be our lingua franca when conversing with legacy systems.
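As a purely illustrative sketch of what “learning the lingo” might look like in code (the endpoint URL, the message format, and the field names are all invented here), the conversation is a composed message handed over a published protocol, not a query against the system's tables:

require 'net/http'
require 'uri'

# Compose the message the legacy system expects, in its own lingo...
request_body = <<XML
<PassportRenewalRequest>
  <Applicant id="12345"/>
  <Guarantor profession="judge" yearsKnown="6"/>
</PassportRenewalRequest>
XML

# ...and hand it over through the published protocol, not through the system's database.
uri = URI.parse('http://legacy.example.com/passport-office/renewals')
response = Net::HTTP.start(uri.host, uri.port) do |http|
  http.post(uri.path, request_body, 'Content-Type' => 'text/xml')
end
puts response.body   # the office answers in its own terms, too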

Many developers are so immersed in rdbms’s that they don’t even realize that the concepts of data storage and data exchange are separable. I figure we’re at least a few years away from the point that your “data is mobile” idea is widely accepted by mainstream (especially corporate) developers.

I'm not sure that I agree. I see a big push towards SOA, which is exactly what I've described above. Mainstream businesses have realized that only by placing their legacy systems in the role of first-class citizens do they have a chance to play in the 21st-century economy.

Should Database Manage The Meaning?

Cat.: Talk, Theory
29. September 2005

I couldn't resist jumping into the Choose a single layer of cleverness discussion that is raging on David Heinemeier Hansson's blog. The majority of the challenges to David's thesis were so wildly off the mark that they left me completely bewildered. What's even more bewildering, to me at least, is that many of the misplaced comments seem to be coming from established Ruby and Rails practitioners.

Anywho, the comment that got my particular attention is quoted below; my full reply is reproduced below the quoted comment:

“Just like I’d expect my operating system to respond if I try to write to a file I don’t have permission on, I want my database server to manage the basic rules of the DATA, ie, what relates to what, and which columns should be unique. This is to prevent anything out of the ordinary from affecting the consistancy of the database. The minute you let bad data get in there is the minute any maintainability you love in your application tier goes to hell.”

Bzzzt! Right here, we have the crux of the problem.

I think the cognitive discrepancy lies in equating an RDBMS with an operating system. Nothing justifies that parallel.

If we step back and look at what an RDBMS is, we'll no doubt be able to conclude that, as its name suggests (Relational Database Management System), it is a system that specializes in managing data in a relational fashion. Nothing more.

Folks, it’s important to keep in mind that it manages the data, not the MEANING of the data!

And if you really need a parallel, an RDBMS is much more akin to a word processor than to an operating system.

A word processor (such as the much maligned MS Word, or a much nicer WordPress, for example) specializes in managing words. It does not specialize in managing the meaning of the words.

So who is then responsible for managing the meaning of the words? It’s the author, who else?

The same goes for Rails. Rails is the author of the data. As an author, it uses the RDBMS to manage that data in a relational fashion. But just as we, as the authors of the words, do not expect WordPress to manage the meaning of our words, Rails does not expect the RDBMS to manage the meaning of its data.
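As a minimal sketch of what keeping that responsibility in the application layer looks like in Rails (the Account model and its fields are invented for illustration), the meaning of the data is declared right next to the rest of the domain logic:

# The application layer, not the database, states what the data means.
class Account < ActiveRecord::Base
  validates_presence_of   :email
  validates_uniqueness_of :email   # the "meaning" lives with the author of the data
end

account = Account.new(:email => nil)
account.save                 # => false; Rails, as the author, rejects meaningless data
account.errors.on(:email)    # => "can't be blank" (Rails 1.x-era API)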

As a matter of fact, it would be really terrible if those tools assumed the management of the meaning of the information being fed into them. Imagine typing up a letter, only to be jolted when your favorite editor refuses to take the word you've just typed, deeming it ‘incoherent’ or ‘not complying with certain constraints’. You'd toss that piece of junk out the window in no time.

Why should Rails developers be any different? Why should we tolerate RDBMS opinions on our data? We’re the masters, RDBMS is the servant, it should shut up and serve. End of discussion.

As for the ongoing ‘the sky is falling’ discussion about what if some other device accesses the RDBMS, it’s the same dilemma as ‘what if some other person accesses our document, and starts changing it?’ There are ways to manage that. Yes, we’re always exposed, always vulnerable to all kinds of attacks, but that’s how life is. You should start getting used to it by now.

Code Snippets and Systems of Ends

Cat.: Theory
30. August 2005

This post is a bit all-over-the-place, sorry.

I've stumbled across Code Snippets at least five times in the past couple of days. It's basically del.icio.us for small pieces of code. Each snippet gets a title, a description, the actual code, a set of tags, and, most importantly, a URL. The result is obviously great search-engine indexability because, as I said, I've happened upon it at least five times now through basic Google usage.

What’s interesting is that, as far as I can tell, they’ve placed NO limitations on what type of snippets can be posted. There’s a quick bash two-liner for Automatically adding a bunch of stuff to CVS next to a Generic XHTML Template next to a Python snippet for generating midi tones on a Series 60 cell.

Considering this from a more abstract level, you might call this a demonstration of a few bits of theory laid out by Doc Searls and David Weinberger in World of Ends. Here we have two ends (Google and Code Snippets) that work well together due to a common understanding of what's desirable in the larger system they both operate in. The value of each end seems to increase with each new end that it touches. I think this basically follows Metcalfe's Law, which states that “the value of a network equals approximately the square of the number of users of the system (n²).” Only “users” in this context can mean “other systems.” In this case, Code Snippets enhances Google and Google enhances Code Snippets. You might also say that Code Snippets gets more value from Google than Google gets from Code Snippets, and that the actual value each obtains is close to that predicted by Metcalfe (or not).
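As a back-of-the-envelope illustration of the “approximately the square” part (just arithmetic, not anything from World of Ends itself), the number of possible connections among n ends is n(n-1)/2, which grows on the order of n²:

# Possible pairwise connections among n ends grow roughly as n squared.
[2, 10, 100].each do |n|
  puts "#{n} ends: #{n * (n - 1) / 2} possible connections (n^2 = #{n * n})"
end
# prints 1, 45, and 4950 connections respectively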

Anyway, the enhanced searchability this style of organization facilitates got me thinking about the quality of metadata at del.icio.us proper… Did you know that Joshua disallows access to / for all robots, and that Google is not spidering the amazing set of collaborative metadata available there? Having the Googlebot run through del.icio.us on a regular basis would be insanely expensive for del.icio.us; Technorati did it for about two weeks and was then cut off, if I remember correctly.

But why shouldn't Google/Yahoo/whoever purchase the bandwidth and other resources necessary to run their spiders over del.icio.us? Even if you did pay, spidering probably wouldn't be the best method of getting at the meat of del.icio.us. If I were a search engine, I would try to convince Joshua to lease me a basically persistent stream from this RSS 1.0 feed. With the quality of data in that stream, you should be able to put something together that improves your search results dramatically.
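As a hedged sketch of the consumer side (the feed path is the /rss resource that the robots.txt below allows; everything else is illustrative, and a truly “persistent stream” would of course be something better than polling), Ruby's standard RSS parser is enough to pull the shared metadata out:

require 'net/http'
require 'rss'

# Fetch the public RSS 1.0 feed and pull out the shared metadata:
# every item carries a link and a title worth indexing.
xml  = Net::HTTP.get(URI.parse('http://del.icio.us/rss'))
feed = RSS::Parser.parse(xml, false)   # false: skip strict validation
feed.items.each do |item|
  puts "#{item.title} -> #{item.link}"
end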

Finally, it’s interesting to note that “/rss” is the only resource robots are allowed to access:

http://del.icio.us/robots.txt:

User-agent: *
Disallow: /
Allow: /rss

If del.icio.us were so inclined, I think it’s reasonable to believe that they could start picking up revenue by leasing high quality access to that URL.