ranjan sakalley

{thoughts on software delivery}

Monday, October 24, 2011

Introducing automation to an existing large team

A reader on my article at infoq asked an interesting question - introducing automation to a big project which has been worked on for some time. I plan to write more posts on this topic, have a lot of thoughts, but here's my immediate answer -

A very strong actionable technique that I have seen work well is that you create a small/minimal smoke test suite for you larger app. You can decide on what comprises a suite of smoke test -
  • Basic user actions - some 4-5 use cases without which a user cannot use your application. (for e.g. on an eCommerce site, my smoke tests would comprise of login, user registration, search, end-to-end purchase and payment).
  • Integration points - choose must-have integrations - like payment gateway (same eCommerce motif), oauth for authentication against Facebook or some such. Write smoke tests that touch these integrations to know if every build that you process is verified for these.
  • Attack a small part/feature of the application. You can choose the most important, the most buggy etc. Write tests only for this part.
  •  Concentrate on the currently "under-development" work, which will give the team immediate value in terms of coverage. This is a bit counter-intuitive as there would be a lot of changes while you work. But I have seen this work wonders at times, and your team starts worrying about test maintenance from the first step on.
You can choose more ways according to your project, better still, mix and match. But make sure you choose the most important 5-10 and not 50-100 test cases for your smoke suite. Your smoke suite shouldn't take much time to run aim for < 5 minutes. The correct way to work with these would be to make these smoke tests as part of your CI process so that the team get immediate benefits out of the smoke tests.
 

Your team will start seeing value in automating regression scenarios with this smoke suite in place. From then on, you can expand your suite from smoke upwards, adding more complex or more detailed scenarios. Another thing to note on big complex applications - aim for small tests, get quick wins. Then combine to make bigger ones. This will keep your team involved and constantly working towards covering every possible nook and keep being excited about it. But do make sure everything is part of CI, because if its not, then people lose faith.

Finally, as ever, keep showing value, make sure the stakeholders understand the importance, you can do this by sending them reports, and talk about gains etc. If your team buys in with the idea, there's no stopping.

Thursday, August 25, 2011

Why is test automation the backbone of Continuous Delivery?

Software testing and verification needs a careful and diligent process of impersonating an end user, trying various usages and input scenarios, comparing and asserting expected behaviours. Directly, the words "careful and diligent" invoke the idea of letting a computer program do the job. Automating certain programmable aspects of your test suite thus can help software delivery massively. In most of the projects that I have worked on, there were aspects of testing which could be automated, and then there were some that couldn't. Nonetheless, my teams could rely heavily on our automation suite when we had one, and spend our energies testing aspects of the application we could not cover with automated functional tests. Also, automating tests helped us immensely to meet customer demands for quick changes, and subsequently reaching a stage where every build, even ones with very small changes went out tested and verified from our stable. As Jez rightly says in his excellent text about Continuous Delivery, automated tests "take delivery teams beyond basic continuous integration" and on to the path of continuous delivery. In fact, I believe they are of such paramount importance, that to prepare yourself for continuous delivery, you must invest in automation. In this text, I explain why I believe so.

How much does it cost to take one small change to production?

As the complexity of software grows, the amount of effort verifying changes as well as features already built grows, at least linearly. This means that testing time is directly proportional to the number of test cases needed to verify correctness. Thus, adding new features means that testing either increases the time it takes a team to deliver software from the time development is complete, or it adds cost of delivery if the team adds more testers to cover the increased work (assuming all testing tasks are independent of each other). A lot of teams, and I have worked with some, tackle this by keeping a pool of testers working on "regression" suites throughout the length of a release verifying if new changes break already built functionality. This is not only costly, its ineffective, slow and error prone.

Automating test scenarios where you can lets you cut this time/money it takes to verify if a user's interaction with the application works as designed. At this point, let us assume that a reasonable number of your test scenarios can be automated, say 50%, as this is often the least bound in software projects.If your team can and does automate this set to a certain number of repeatable tests, it frees up people to concentrate more on immediate changes. Also, lets suppose that it takes as much as 3 hours to run your tests (it should take as less as possible, less than 20 minutes even). This directly impacts the amount of time it takes to push a build out to customers. Increasing the number of automated tests, and also investing in getting the test-run time down, your agility and ability to respond increases massively, while also reducing the cost. I explain this with some very simple numbers (taking an average case) below -

Team A -
1. Number of scenarios to test - 500 and growing.
2. Number of minutes to setup environment for a build - 10 minutes.
3. Number of minutes to test one scenario - 10 minutes.
4. Number of testers in your team  - 5.
5. Assume that there are no blockers.

If you were to have no automated tests, the amount of time it would take to test one single check-in (in minutes) -    
      10 + (500*10)/5 = 1010 minutes.

This is close to 2 working days (standard 8 hours each). Not only is this costly, it means that if developers get feedback 2 days later. This kind of a setup further encourages mini-waterfalls in your iteration.

Team B -
Same as Team A, but we've automated 50% (250 test cases) of our suite. Also, assume that running these 250 test cases take a whopping 3 hours to complete.
Now, the amount of time it would take to test one single check-in (in minutes) -

      task 1 (manual) -   10 + (250*10)/5 = 510 minutes.
      task 2 (automated) - 10 + 180 minutes.

This is close to 1 working day. This is not ideal, but just to prove the fact about reduced cost, we turned around the build one day before. We halved the cost of testing. We also covered 50% of our cases in 3 hours, and that  Now to a more ideal and achievable case -

Team C -
Same as Team B, but we threw in some good hardware to run the tests faster (say 20 minutes), and automated a good 80% of our tests (10% cannot be automated and 10% is new functionality).
Now, the amount of time it would take to test one single check-in (in minutes) -

     task 1 (manual) - 10 + (100*10)/5 = 210 minutes.
     task 2 (automated) - 10 + 20 minutes = 30 minutes.

So in effect, we cover 80% of our tests in 30 minutes, and overall take 3.5 hours to turn around a build. Moreover, the probability of finding a blocker earlier, by covering 80% of our cases in 30 minutes, means that we can suspend further manual testing if we need to. Our costs are lower, we get feedback faster. This changes the game a bit, doesn't it.

Impossibility of verification on time

Team A that I mentioned above would need 50 testers to certify a build in less than 2 hours. That cost is not surprisingly unattractive to customers. In most cases, it is almost impossible to turn around a build from development to delivery within a day, without automation. I say almost impossible as this would prove to be extremely costly in cases where it is. So, if assuming that my team doesn't automate and hasn't got an infinite amount of money, every time a developer on the team checks-in one line of code, our time to verify a build completely increases by hours and days. This discourages a manager to schedule running these tests every time on every build, which consequently decreases the quality coverage for builds, and the amount of time bugs stay in the system. It also, in some cases I have experienced, dis-incentivizes frequent checking in of code, which is not healthy.

Early and often feedback


One of the most important aspects of automation is the quick feedback that a team gets from a build process. Every check-in is tested without prejudice, and the team gets a report card as soon as it can. Getting quicker feedback means that less code gets built on top of buggy code, which in turn increases the credibility of software. To extend the example of teams A, B and C above -

For Team A - the probability of finding a blocker on day one is 1/2. Which basically means that there is a good risk of finding a bug on the second day of testing, which completely lays the first days of work to waste. That blocker would need to be fixed, and all the tests need to be re-verified. The worst case is that a bug is found after 2 days of an inclement line of code getting checked-in.

For Team B - the worst case is that you find a blocker during the last few hours of the day. This is still much better than for Team A. Better still, as 50% of test cases are automated, the chance of finding a blocker within 3 hours is very high (50%). This quick feedback lets you find and fix issues faster, and therefore respond to customer requests very quickly.

For Team C - the best case of all 3. The worst scenario is that Team C will know after 3 hours if they checked-in a blocker. As 80% of test cases are automated, by 20 minutes, they would know that they made a mistake. They have come a long way from where Team A is, 20 minutes is way better than 2 days.


Opportunity cost

Economists use an apt term - opportunity cost to define what is lost if one choice amongst many is taken. The opportunity cost of re-verifying tedious test cases build after build is the loss of time spent on exploratory testing. More often than not a bug leads to many, but by concentrating on manual scenarios, and while catching up to do so, testers hardly find any time to create new scenarios and follow up on issues. Not only this, it is imperative that by concentrating on regression tests all the time, testers spend proportionately less time on newer features, where there is a higher probability of bugs to be found. By automating as much as possible, a team can free up testers to be more creative and explore an application from the "human angle" and thus increase the depth of coverage and quality. On projects I have worked on, whenever we have had automated tests aiding manual testing I have noticed better and in-depth testing which has results in better quality.

Another disadvantage is that manual testing involved tedious re-verification of the same cases day after day. Even if they are creative to distribute tests to different people every day, the cycle would inadvertently repeat after a short period of time. Testers have less time to be creative, and therefore their jobs less gratifying. Testers are creative beings and their forte is to act as end-users and find new ways to test and break an application, not in repeating a set process time after time. The opportunity cost in terms of keeping and satisfying the best testers around is enormous without automation.

Error prone human behaviour

Believe it or not, even the best of us are prone to making mistakes doing our day to day jobs. Given how good or bad we are it, the probability of making a mistake while working is higher or lower, but mostly a number greater than 0. It is important to keep this risk in mind while ascertaining the quality of a build. Indeed, human errors are generally behind most bugs that we see in software applications that we see, error during development and testing. Computers are extremely efficient doing repetitive tasks - they are diligent and careful, which makes automation a risk mitigation strategy.

Tests as executable documentation

Test scenarios provide an excellent source of knowledge about the state of an application. Manual test results provide a good view of what an application can do for an end user, and also tell the development team about quirky components in their code. There are two components to documenting test results - showing what an application can do, and upon failures, documenting what fails and how, so its easy to manage application abnormalities. If testers are diligent and make sure they keep their documentation up to date (another overhead for them), it is possible to know the state of play through a glance at test results. The amount of work increases drastically with failures, as testers then need to document each step, take screenshots, maybe even videos of crash situations. Adding the time spent on these increase the cost of making changes, in fact in a way the added cost dis-incentivizes documenting the state with every release.

With automated tests, and by choosing the right tools, the process of documenting the state of an application becomes a very low cost affair. Automation testing tools provide a very good way of executing tests, collating results in categories, publishing results to a web page, and also let you visualize test result data to monitor progress and get relevant feedback from test result data. With tools like Twist, Concordian, Cucumber and the lot, it becomes really easy to show your test results, even authoring, to your customers and this reduces the losses in translation with the added benefit of customer getting more involved in application development. Upon failures, a multitude of testing tools automate the process of taking screenshots, even videos, to document failures and errors in a more meaningful way. Results could be mailed to people, much better served as RSS feeds per build to people who are interested.

Technology facing tests

Testing non-functional aspects of an application - like testing application performance upon a user action, testing latency over a network and its effect on an end-users interaction with the application etc. have traditionally been partially automated (although very early during my work life I have sat with a stop watch in my hand to test performance, low-fi but effective). It is easy to take advantage of automated tests and reuse them to tests such non-functional tests. For example, running an automated functional test over a number of times can tell you average performance of an action on your web-page. The model is easy to set up, put a number of your automated functional tests inside a chosen framework that lets you setup and probe non-functional properties while the tests are run. Testing and monitoring aspects like role-based security, effects of latency, query performance etc. can all be automated by re-using an existing set of automated tests, an added benefit.

Conclusion

On your journey to Continuous Delivery, small and big, you would have to take many steps. My understanding and suggestion would be to start small with a good investment on a robust automation suite, give it your best people, cultivate habits in your team that respect tests and results, build this backbone first, and then off you would be. Have a smooth ride.










Friday, June 17, 2011

On durability

When I got married back in 2009, I bought a bunch of consumer durables - refrigerator, washing machine, a microwave oven and more. Although it was a big dent in my pocket, I chose to go for premium vendors. I was confident that my decision to do this was a good long term solution. However, during the last few months, I am having trouble with these automatons. To an extent that I have come to believe that the at least one person is evil in this realm. For the record, I am not against things breaking, not least when they belong to someone else. But these things cost me money, a lot of it. I picked up the phone, yelled eventually and got some service - eventually. We got things fixed for some time, and then again after some days I could notice people eyeing a couple of stains on my whites. Too bad, I called them again. Long story short, I was induced to buy an extended warranty and a service agreement with a vendor, and then again with another. I was fairly doubtful that people would believe me when I told them I was wearing beige these days. Not surprisingly, these agreements cost more money. I was sad for a while. Then the naive businessman in me, for a moment admired the business-model - build cheap, sell maintenance. I was pissed. Understanding the nuances of how one is getting shafted is not a funny moment. Its like a daemon, back of your mind but keeps following a schedule to pop up once in a while when you are down. But all this is not about me. Its about all of us. And it is, because I had another thought.

As software service providers, independent consultants, vendors and what not, how much time and money do we spend on creating a durable product. Personally, I believe this whole outsourcing business gets a bad name because of the aforementioned quackery and bad designs that follow. Most businesses end up paying multiples of the wads they paid for building, for maintenance. I am a bit thick, but eventually I get the pain of customers worrying about all this. If maintenance contracts are a pain for me, they are a pain for them. It is inevitable that a flaw will crop up once in a while, but designing your revenue model, as a vendor, based on tomfoolery like this is downright risky. The infinite improbability drive might just go faster by a couple of light years/second when you think of building a 100 year company this way. But frankly, a lot can be done to ensure you produce something durable. Its not all that tough if you are not lazy. There are three parts to it. Good design, great quality assurance and a pinch of foresight.

In what measures you need a good design, I would leave to your intelligence. But you need some good developers, and couple of great ones. If someone is still using ie6, chances are that they never thought long term. Spy on their file system - if their folders are modular, chances are that they think long term. They might componentise your system, and thus increase the shelf-life. When you say Facebook, do they start talking about data architectures, or do they send you a friend request? Ask them what they think of Twitter. If you get a tweet mention, thats bad. If they start telling you stories about architectural evolution of Twitter, it might be a good sign. Do they debate ruby/python over lunch, or are they worried about Monica Bellucci's next motion picture. Well actually there's nothing wrong worrying about Monica Bellucci. But you get the picture. I will summarize anyways. You don't need just a good architect. You need good developers passionate about craftsmanship. And not more than two of the cowboy variety. Do they know how to crash the JVM?

Great quality assurance. Books have been written about this. But the thing with these books is that its all kind of like religious books. Theoretical and dogmatic. And even when some people follow them by rote, they are living in the past. You need good QAs, diligent QAs and then some people who just can't think straight. And a couple of people pissed off at those developers as well. Innovation in this stream of work is highly underrated - do not make that mistake and hire a lateral thinker if you don't have one already. You need to test things when they are checked-in, not three months hence. Building a good automation framework might be a good idea. Releasing software every once in a while has been proven to have quite a good effect on software quality. Think not about 2 times a year, but 2 times a month. And then even earlier. It doesn't need to be on a consumer box, it just has to be releasable. People should be worried about those kind of things, at least more than they are about their next trip to the park. Are they proud of your product?

Foresight. Difficult. Being progressive is important to build durable products. I cannot explain much about why it seems intuitive to me. Whatever little I can, in a short sentence is - if you don't know what's the future of computing, longevity of your software becomes a probability - fairly low number, and not certainty - fairly close to it at least. Someone in the team should be reading blogs. Others should be reading some books. About technology.

Enough said. Life's too short to keep squabbling about incompetence. Build better products.

Wednesday, February 16, 2011

subversion keychain issue

I had been ignoring the annoyance of subversion not remembering my username and password on my new mac (basically svn not pulling in saved credentials from KeyChain). This meant that I had to keep appending all svn commands with --username every time I wrote one and then type in my password, got more annoying because my office LDAP username is different from my local user name. Well my fingers gave up after 2 days and I decided to fix it today. 
Seems like the .subversion folder was created by root (need to check if brew did this), and was accessible to root only (see below). The auth folder is where subversion keeps user names to pick up the respective password off KeyChain. Wasn't accessible to me, hence the issue.



This fixed it

Monday, January 3, 2011

Who should write functional tests?

Functional testing code is more often than not treated as a second class citizen. Delivery teams tend to ignore problems with test code over a period of time, and worry more about test results. This leads to poor code quality and bad test architecture, which in turn hurts the maintainability of a test suite. Its this negative feedback cycle that a team should be worried about. In my opinion, treating test code as responsibly as we treat functional code fixes this.

Understand the importance - The greatest benefit of a robust automation suite is the ability to deliver quickly, even continuously, with a known state of software quality. Given that business today demands quicker delivery of software to production, it seems quite imperative that teams take the one big block of code that enables them to do so very seriously.

Complexity - Lets face it, writing an effective and maintainable automation suite is not an easy job. Just like writing production code, it needs thoughtful design and care. Test infrastructure design is an ongoing process, and just like functional architecture it needs continuous re-design based on functional and non-functional changes. An ill-designed suite is a waste of time and money (no surprises there), it also influences decision makers incorrectly. You should be asking (some of) your best people to work on it.

Expertise - There are two parts to a maintainable functional suite, the framework (test infrastructure) which defines a DSL, and then the tests written using that DSL. The drawing below shows what I mean by this division.



  Dividing coding responsibilities based on the lines above is quite necessary. Developers are your experts when it comes to writing frameworks and maintaining them. Testers are really good at writing intentions and maintaining functional sanity. In fact with this virtual ownership division, testers become customers for developers, which distributes responsibility quite elegantly.

Ownership - If developers don't write functional tests, its imperative that they are not super-motivated to fix them when they fail as new features are added/amended. Having different owners of different parts of code not only increases cycle time, it also reduces the quality of feedback that the developers get from automation.

Work definition and process - When we write functional code, we tend to have well-written narratives, which have acceptance criteria, and are treated well. When writing the framework for the automation suite, or a new functional test, such narratives should be well defined and the same workflow should be maintained. Any work done on functional tests should not become a secondary process for the team. Such treatment de-prioritizes work on functional tests at the outset itself, and should therefore be avoided.

Conclusion - My conclusion here is that people who are skilled to write production code should design and write test infrastructure/frameworks, while analysts should write test specifications, as they are best placed to do these. And also, work on functional tests (stories etc.) should be treated as mainstream functional work.

Monday, December 6, 2010

Why teams lose faith in their functional test suite

In my opinion, there are three main reasons why a functional automation suite loses its value (or the respect that a delivery team should pay it). This leads to a variety of problems, but I will save that for a later post.

The reasons for me are -

1. Non-deterministic tests - run the same test again (without changing the code) and the test gives different results.

This, in my opinion is the least attractive aspect, a true demotivator for a delivery team to believe in its functional test suite. One of the reasons I have seen non-deterministic failures is when the tests run on hardware that's performance is non-deterministic, for example VMs which share hardware resources, and hence are dependent on how the other VMs on the same host are performing during the test run. Sometimes, tests are non-deterministic when they access external systems which are non-deterministic. Unless testing the robustness of the system under test, stubbing out such external dependencies have worked out well. Another reason is just bad tests, like tests which depend on the time which they are running at (unless functionally driven to), tests which have time-outs which are not well researched and more. Sometimes the tools used to drive tests are non-deterministic (we found a couple of them driving Silverlight tests) which makes it really difficult to believe in the results of the tests you write.
All non-deterministic tests can and should be fixed, but I have seen the effort to fix these being directly proportional to the amount of customer involvement/value to delivery. People also cite them as "random failures" and ignore them. I believe that every developer who calls a program "random" loses respect by a notch everytime they call it so.

2. Lot of failures in a functional suite - if there are 50 out of 100 tests failing, the team's belief dwindles.

The first question that you ask is how did we end up here? This has a lot to do with the broken window theory. If all tests pass and the next check-in fails some tests, the team immediately works to fix them. If a regular bunch keep failing and are thus ignored, any new failures will become part of the ignored tests. Again, this can be fixed. Reporting and analysing the failed tests, and constant effort improves the state and builds a positive cycle. However, as before, the intent to improve the state depends on how valuable the tests are for delivery/customer.

3. Time taken to run the suite once - feedback comes in very late.

If a single run tests a lot of check-ins, it becomes really hard to analyse/debug the failures. If this is coupled with 1 and 2 above, it results in a total loss of faith, as fixing takes a lot of time and frustration levels go up. So even if this might not be a primary reason, coupled with the ones above, it deteriorates the dependency the delivery team has on automated functional tests to certify a good build. One way to fix performance is to distribute tests to a grid infrastructure, where different nodes are responsible for a subset of tests, thus parallelizing the execution and cutting down run time.

The factors listed above can all be fixed, with investigation and investment. My experience has been that the intent to spend time and money on fixing these issues is directly proportional to how much a customer believes in them, and asks the team to be accountable for. Teams can also increase their own accountability by exposing results and analysis. If, for example, a delivery team publishes test results as part of release notes, or publishes/exposes the results in some other way to customers, this creates awareness as well as increases accountability of its work on the suite. That for me is the optimal solution.

Thanks - to Martin Fowler, Manish Kumar and Chethan V for helping me crystallize my thoughts on this.


Thursday, September 10, 2009

Lean as a toolkit towards agility

Over the last few years, agile software development processes like XP and Scrum have gained a lot of prominence, and a majority of organizations have tried one of these one time or another. I mention two here - Scrum is an agile management methodology, XP more of agile engineering practices, because on various fora, you will find debates on one over the other, and sometimes hopefully one mixed with the other. Both the methodologies are practical extensions to the agile manifesto, concrete processes that supplement a philosophy. Unfortunately, there seems to be a majority which assumes complete agility is achieved when practicing one of these methodologies or a mix of them. Blindly adopting a methodology is no different from following older so called non-agile practices. Granted there might be immediate visible improvements with better communication and development techniques, this ends up as just a linear function of the ability to change for the better. Given time, if self organized teams do not retrospect and improve on past performances, they hit roadblocks as before, may be just a little further, but they do. The last line of the manifesto -"At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly" is a statement that is very important, yet ignored, or practiced weakly.

Indeed, what the final tenet on the manifesto encourages a team to do is make sure it is learning and improving its processes over time. Scrum or XP are just basic frameworks which facilitate an initial shift, and then the team must make sure that they choose the right processes under given circumstances to be effective and inclined towards a delivery. During such a process a team might decide to break one or more basic tenets of the framework they choose to follow, if it improves their effectiveness and delivery. For example, YAGNI is a fundamental tenet of XP, but a team might choose to tweak it if they find it necessary to design upfront on certain tasks. Theorists and fanboys will shout foul, but realists and seasoned practitioners will understand the needs to do this. (Having said this, the team must at all times be congnizant of the risks taken when making these decisions, and observe changes closely for a few iterations.) These tenets are not commandments to follow staunchly, they are guidelines to follow a philosophy of continuous improvement to deliver the best value to a customer. They provide a meta-plan, or a planning approach, which is not any more important than responding to change. This realization itself facilitates better localized processes. The argument above is an example of improvement within a process. The shift towards learning from Lean Manufacturing is an excellent example of how not just a team, but the community as a whole looks around for synergies in different workplaces to improve business processes and deliver more value.

The commercialization and marketing of XP or Scrum as a development practice, sadly has not done justice to Agile as a philosophy. As more and more corporations try to join the bandwagon, I have noticed frequent claims of agility subsequent to following an XP or Scrum approach. While adapting to XP and Scrum practices is not wrong, it does benefit a team to do so, but if the team does not understand the underlying philosophy behind the practices, it fails to see a fundamental dimension of growth. By looking at just the practices within the methodology for improvement and not without, it ignores tools and practices that might help itself to improve on a larger scale. Any failures in such a premise becomes then a failure of the process followed, and management decisions being made to repeal processes. Such superficial following is thus harmful for the corporation as it misses out on a good philosophy as it chooses to discard it, based on an implementation fault.

Lean software development is the new kid on the block. Lean manufacturing and Agile Development are very aligned towards each other, for starters both are people centric and concentrate on reducing wastes. As the hype to be Lean increases, similar superficial dabs will increase, some will succeed and some will fail. Should such failures count as failures of the Lean model? Should teams stop striving to be Lean when they figure out that they would not be able to stop the line, or follow another much proclaimed Lean tenet? Shouldn't teams be looking at Lean practices as empowering tools and not laws? By educating itself towards a mindset and not just manifest of it(read Lean), wouldn't a team benefit more?

In fact what a team should figure out is that XP, Scrum and now Lean are just toolkits that facilitate aligned and effective delivery of customer demands. Stand-ups, TDD, Scrum of Scrum, Frequent releases are all a part of a greater toolkit that facilitate agility, and a team should pick and choose what is most effective for itself. Lean, like XP is a set of practices teams have evolved over a period of time to be more effective. It has evolved over prior practices like Taylorism and Fordism at Toyota. The point to note is that Toyota did not invent or create a lean process out of nowhere. They optimized based on a given set of circumstances and experimented new approaches towards manufacturing. That is an instance of pure agility. The current state of Lean is just a product of agility, and thus provides a good toolkit, just like XP and Scrum and other agile manifests do.

By treating these manifests as toolkits, we open up our perspective and become more responsible to better delivery, by following them blindly, we close our eyes and wait for good things to happen, just because we are following a process. Indeed, true agile teams improve continuously. Upon hitting roadblocks, they come up with solutions which are measurable in terms of effectiveness. They break dogma and blind faith to increase effectiveness and thus deliver more for the same dollar spent than the last iteration. And when do they stop looking for improvements? Never.

Tuesday, February 10, 2009

Is your Functional suite done right?

On the last two projects that I have worked on, both being fairly sized in terms of people (40+), I have seen enormous effort being spent on functional testing. The effort, though not completely wasted, hasn’t yielded proportional gains in terms of quality improvements and quicker feedback on a higher integration level. The following list tries to address issues and my take on fixing them.

Separation of Concerns
Functional suites suffer most from a lack of clear directive on what they are written for. Adding view tests (testing windows UI/html output) which cannot be tested by your regular unit tests to your functional suite is a recipe for disaster. View tests do not belong in the function suite. Unfortunately such tests form almost half of the suite. Coupling these not only increases the run time of a suite, it also mandates that the same testing tool is used for both these sets. Think of an ASP.Net website you are developing. A view test suite can be written using the lightening fast NUnitASP toolset, because you wouldn’t need to attack cross browser compatibility issues and integration between your user interface and services, and you can write your functional suite with Selenium or your favorite browser based testing tool. Also, view tests should be a part of the tests that a pair runs before they check-in, while all functional tests might not (teams generally decide on a subset as smoke), so dividing your tests judiciously between view tests and functional tests is of utmost importance.

Learning: There should be a clear divide between regression functional tests and view tests. Before adding a new test, ask yourself whether you want to test just the view or integration and functionality.


Performance
Functional tests need to be extremely fast. If not, then scalable with or without parallelization. The functional suite has more often than not the longest feedback cycle on any project. The longer running time forces teams to not include this feedback in their primary builds. A functional suite that takes 12 hours to run is not going to help you a lot with continuous integration. It’s not just the wait that hurts, it’s the lack of granularity at the individual check-in level on how new development affects existing functionality.
If the testing tools you are using are not extremely fast, you must look at creating parallel streams of tests to be run. Continuous Integration tools like Cruise with agent based architectures allow you to divide tests in separate meaningful suites and parallelize their run. For browser based applications, tools like Selenium Grid help you do that a level lower than CI (if you use selenium for your tests), and are an excellent choice for distributing your tests across multiple machines. But before you jump onto this, your suite must be ready to be distributed. A classic example of bad design is when you login with one user’s credentials in all your tests. Most sites/applications don’t allow this and hence you cannot parallelize such tests (this one is easy enough to fix, but the team should be on the lookout for such issues). An idea that I am trying to push within ThoughtWorks is to have a small SeleniumRC cloud, which any developer can use to run their tests, hosting the server on their machine (this can be taken further to have a SETI like “use hardware when idle” approach, but I will be happy with 10 regular boxes dedicated right now). This would make the feedback cycle shorter and probably help include regression tests as a primary test target rather than a nightly build target.

Learning: Design with parallelization in mind. Keep the suite run-time within a decided time span (like say 20 minutes) and keep analyzing + refactoring the suite to keep this below the limit.

Ownership
Having all your functional tests in one suite makes it very difficult to divide tests into regression tests and specifications. When this happens, developers and analysts are all involved in writing functional tests which is a mess because developers want refactorable tests, written in their favorite language and analysts want an easy way to specify tests and run, preferably a tool with recording support, which developers don’t like because they are fragile, and hence analysts are forced into complicated object oriented functional suites which takes them away from their primary concern. Fixing tests broken by new functionality is another ownership issue between developers and analysts.
Ideally, the analysis team and the quality team should specify new behavior for a release using tests written with their choice of tools, and the development team should work towards making these tests green. This suite may not be monitored for failure, but must be monitored for progress. This suite doesn’t need to be highly performant and can be run per iteration. The development team on the other hand should have its own refactorable regression suite, based on the specifications. Failures in this suite must be monitored closely, and this suite should be written with quick feedback in mind.
Ideals aside, it’s much easier to manage the quality process of a team this way because all suites have their dedicated owners with concrete responsibilities on who runs what, and what runs when. It is easier to track progress, and people who write code are responsible for fixing tests they break by adding new functionality, not just leave them by to be attended to by someone else.
Learning: Do not hesitate to use different tools for different goals for suites within a team. Have a specification suite and a regression suite to define clear responsibilities.


The most wonderful thing about functional automation is its fun and easy to do. To write an effective suite, one must treat it as one treats the product it’s supposed to test, with clear design goals and responsibilities, an eye on performance every now and then, and effective reporting on progress. Don’t leave it as the dark side of the project, but as the release mechanism that gives you confidence.