Thursday, April 30, 2009

I don't think that word (web) means what you think it means

I recently watched a Q&A session with Tim Bray on the future of the web from QCon San Francisco 2008. I enjoyed the talk; if you have not seen it already, it is well worth a view. One of the responses to the talk, by Flex architect and consultant Yakov Fain, raised some interesting points that I would like to discuss.

"you can create RIA that support Back button - just decide what application view (no page) to show when the user hits the Back button."

A disadvantage of RIAs (Rich Internet Applications) is their lack of interoperation with the web: you lose a lot of the web's shared semantic meaning. The site Mr. Fain gave as an example, mbusa.com, actually provides an excellent illustration of this weakness. You achieve back button functionality by leveraging URL fragment identifiers; however, in doing so you are hijacking the agreed semantic meaning of fragment identifiers. "For HTML, the fragment ID is an SGML ID of an element within the HTML object." Fragment identifiers are meant to identify elements within a document, not separate documents, which is what mbusa.com uses them for.
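To make the point concrete, here is a minimal Python sketch using only the standard library's urllib.parse. The URLs are hypothetical stand-ins for RIA "views" (they are not real mbusa.com addresses). It illustrates that once the fragment is stripped, which is what a client does before making the HTTP request, both views name the same resource, whereas a path-based scheme gives each document its own URL.

```python
from urllib.parse import urldefrag, urlsplit

# Hypothetical RIA "views" that differ only by fragment identifier.
view_a = "http://example.com/app#/models/coupe"
view_b = "http://example.com/app#/models/sedan"

# The fragment is resolved on the client and is not part of the HTTP request,
# so anything fetching either URL asks the server for the same resource.
resource_a, fragment_a = urldefrag(view_a)
resource_b, fragment_b = urldefrag(view_b)

print(resource_a == resource_b)  # same resource once the fragment is dropped
print(fragment_a, fragment_b)    # the only part that distinguishes the views

# A path-based (more RESTful) alternative: each document has its own URL.
rest_a = urlsplit("http://example.com/models/coupe")
rest_b = urlsplit("http://example.com/models/sedan")
print(rest_a.path != rest_b.path)
```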

A direct result of abandoning this agreed semantic meaning is that the content on those "views" will not be indexed by search engines like Google. You can demonstrate this to yourself very easily; let's take the Mercedes CL550 coupe as an example.

A search on Google for "mercedes CL550 2009 coupe" does not return a single result pointing to your RIA, or even to a Mercedes site, in the first 5 pages (I didn't bother checking results after the 5th page).

Now compare that to a search for "honda civic 2009 coupe": the first result is Honda's Civic coupe page.

The RIA approach of utilizing fragment identifiers is inherently limited, and this becomes obvious when contrasted with a more RESTful approach that doesn't hijack the agreed semantics of URLs.

Mr Fain goes on to say:

"There’s a bunch of interdependent rules for each car model that should enable/disable UI controls depending on the user’s selection. For example, if you ordered white leather interior, you can’t have yellow exterior. We’ve implemented the entire rule engine on the client, which makes the entire system a lot more responsive."

Many other e-commerce and product sites have similar rules around products; however, they do not give up being indexed by a search engine for responsiveness. For me as an online consumer, responsiveness has not been a pain point or something I long for when using sites like Newegg, Amazon, or Apple. However, if their content were no longer indexable by Google, I would feel that pain and would be less likely to purchase from them. It seems like a pretty shitty trade-off, doesn't it?

Mr. Fain goes on to attack PHP and Rails via the Twitter straw man:
"Mr. Bray believes that the direction of Web applications is moving form J2EE to PHP and Rails. He didn’t make it clear for what kind of applications though. Is he talking about applications like Twitter that can go down several times a day and people will keep using it because they don’t have any better choice? ... But if you take any application that handles your money (Banking, eCommerce, auctions) I’d rather stick to tried and true J2EE on the server"

I think one should be careful in stereotyping a language/platform by the performance problems of one application; you may find yourself mistaken more often than not. Shopify.com is a great example of a Rails e-commerce application. Mr. Fain seems to forget that one of the big advantages of the web is that I can use whatever I want on my server and my clients don't care.

"I believe that Web moves to a VM-based stateful clients with fast communication lines between the client and the server moving tons of strongly-typed data back and forth."

I disagree with this statement, and here's why. As my very knowledgeable colleague Jim Webber put it, the web is built on three principles that conventional thinking teaches are bad: it is dynamic/late bound, it is text based, and it is built on polling. These are not weaknesses of the web; rather, they are its strengths. You should bear in mind that the web is not some system that we were just handed and have to live with. Rather, the web is what resulted from a form of technological natural selection; it obliterated other more statically typed protocols like DCOM, CORBA and Gopher, not vice versa.

Monday, April 20, 2009

What are your wastes?

Taiichi Ohno's book Toyota Production System starts with a conversation between Norman Bodek and Taiichi Ohno. Mr. Bodek asks Mr. Ohno where Toyota is today; by now they must have reduced all work-in-process inventory, enabling them to chip away at all the problems. "What is Toyota doing now?" he asked. Taiichi Ohno's reply is simple but brilliant. "All we are doing is looking at the time line, from the moment the customer gives us an order to the point when we collect the cash. And we are reducing that time line by removing the non-value-added wastes."
For the sake of this post I would translate this time line to the world of software by replacing the order with "Feature Conceived", although I think you could make compelling arguments to expand the time line to an earlier point in time, such as "customer demand".

Fundamentally, the Toyota production system is based on the elimination of waste, and in my experience the most productive teams I have been on followed this practice of constantly removing waste. Becoming such a team is not a destination but a constant journey; as Kent Beck says in Extreme Programming Explained, "Perfect is a verb, not an adjective". A helpful tool to enable this continuous improvement is the five whys, for example:

Why did the server go down?
The wrong privileges were set when the application was deployed.
Why were the wrong privileges set?
The wrong account was used to push to production.
Why was the wrong account used?
Bob was sick so Fred pushed the deploy.
Why does Bob's account have to push the build?
He has always done the deployments.
Why has Bob always done the deployments?
No one ever took the time to automate the deploy to production.

The five whys are by no means perfect; Wikipedia provides a list of good criticisms of the methodology. I have seen many of the anti-patterns they describe occur. I have also seen a team apply techniques like the five whys to the point where we got our release cycle down from two to four weeks to one day. Retrospectives can also be helpful in identifying and eliminating waste; however, again it is important to stress that there are no silver bullets and you should not limit your improvements to scheduled meetings. Keep in mind: "It is said that improvement is eternal and infinite." - Taiichi Ohno

There are issues in drawing parallels between software and manufacturing; one is a far more creative process than the other. However, the advantages that result from eliminating wastes can be realized for both. One should bear in mind that Just-in-time was far more heavily influenced by American supermarkets than by automotive manufacturers. That being said, software and manufacturing are only analogous, so the kinds and sources of waste may differ between the two. What would you identify as the wastes in your organization? There are some common places you can look: Do you rely on manual testing? Are there manual steps in your deployment? Do you spend a lot of time merging code branches? Do you develop large feature sets only to find they are unused? How long does your code sit in source control before it goes out to production? Do you have fat requirement documents that quickly go out of date?

Think you have no wastes? Think you can't get working software into the hands of your users any faster? If so, let me leave you with this quote:

"No one has more trouble than one who says that he has no trouble." -Taiichi Ohno

Wednesday, April 15, 2009

Magellan 0.1.3 Gem Released

So what is magellan and what does it do? Magellan is a web testing tool that embraces the discoverable nature of the web.

What does that mean practically? Simply put, it is a web crawler written in Ruby that has two Rake tasks built around it:

The first task will explore sites by following //script[@src], //img[@src] and //a[@href] references and look for documents that return HTTP status codes of 4xx or 5xx. The second task lets you specify a URL pattern and an expected link to look for if the current URL matches that pattern. For example, you can say product pages should contain a link to /sizing.html or that all pages should contain a link to /about_us.html.
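The two checks can be sketched roughly as follows. This is an illustrative Python sketch of the idea, not magellan's actual Ruby implementation or its Rake task interface. The function names, the in-memory SITE dictionary and the shop.test URLs are all hypothetical, and the fetch function is injected so the example runs without network access. A real crawler would also want to stay on the site's own host, which this sketch does not enforce.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag/attribute pairs to follow, mirroring //script[@src], //img[@src], //a[@href].
FOLLOWED = {"script": "src", "img": "src", "a": "href"}


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the followed tag/attribute pairs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = FOLLOWED.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def crawl(start_url, fetch):
    """Breadth-first crawl; fetch(url) must return (status_code, html_text)."""
    seen, queue = {start_url}, [start_url]
    statuses, links_by_page = {}, {}
    while queue:
        url = queue.pop(0)
        status, html = fetch(url)
        statuses[url] = status
        links_by_page[url] = extract_links(url, html)
        for link in links_by_page[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return statuses, links_by_page


def broken(statuses):
    """First check: URLs that answered with a 4xx or 5xx status."""
    return sorted(url for url, code in statuses.items() if 400 <= code < 600)


def missing_expected(links_by_page, url_pattern, expected_link):
    """Second check: pages matching url_pattern that lack expected_link."""
    return sorted(
        url for url, links in links_by_page.items()
        if re.search(url_pattern, url) and expected_link not in links
    )


# Hypothetical in-memory site standing in for real HTTP requests.
SITE = {
    "http://shop.test/": (200, '<a href="/products/1">One</a>'
                               '<a href="/products/2">Two</a>'
                               '<img src="/logo.png">'),
    "http://shop.test/products/1": (200, '<a href="/sizing.html">Sizing</a>'),
    "http://shop.test/products/2": (200, '<a href="/">Home</a>'),
    "http://shop.test/sizing.html": (200, ""),
}


def fake_fetch(url):
    # Anything not in SITE is treated as a missing document.
    return SITE.get(url, (404, ""))


statuses, links_by_page = crawl("http://shop.test/", fake_fetch)
print(broken(statuses))
print(missing_expected(links_by_page, r"/products/", "http://shop.test/sizing.html"))
```

In this made-up site the first check reports the missing logo image, and the second reports the one product page that does not link to the sizing page.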

Can magellan help you?

I see magellan being able to help two groups of people. The first is those who have low test coverage and would like an easy way to get started in testing their web application.

The second group I see magellan being able to help is those moving towards/practicing continuous deployment. Magellan can supplement your existing tests/continuous integration process with exploratory testing to find any broken links/missing documents, or verify the interconnectedness of your resources.

Does magellan replace Selenium or Watir?
Magellan is not meant to eliminate the need for higher level acceptance tests. Frameworks like Selenium or Watir will remain a key part of any healthy suite of tests. However, any browser-based testing framework will involve more moving parts than may be necessary to test part of a web application. As a result, these tests will always be slower and have more potential points of failure than alternatives without them. Magellan will let you focus your higher level acceptance testing on the key parts of the business and the integration of your JavaScript with the browser.

Interested in giving it a go?

Because magellan leverages the agreed semantics of the web to crawl your site, getting started with it could not be easier. You can find install instructions and examples at: github

Your feedback is welcome at: rubyforge