Usability: When too much science can be bad for you

Science is the best way we know for generating reliable knowledge and our modern technological society wouldn't exist without it. But are there situations where science hinders rather than helps us when developing new technology? Perhaps surprisingly, Microsoft's real-time strategy game 'Age of Empires II' has played a part in suggesting there are.

Inspecting one of many

Dennis Wixon of Microsoft believes scientific thinking has taken us down a blind alley. He has argued that scientists have got their priorities all wrong when they try to devise better ways of checking if new software is easy to use.

According to Wixon, the thing about scientists is they want to do science, and science is all about having ways of collecting knowledge based on strong evidence. Give them questions like "how many problems are there in this design that make it harder to use?" and they have reliable ways of finding out.

To answer that kind of question, you would, as a scientist, get lots of people to help evaluate your new software. That is important as the more people who use it, the more potential problems you will find. You would also include men and women, people of different ages and educational backgrounds, and people from different cultures. That is really important as otherwise any answer you get might only be true of a particular group (e.g. you might just show that the software is easy to use for men who studied computer science at university, while 'normal' people might still really struggle). Only by testing it with a wide range of people from different groups can you be sure it is usable by those groups. Similarly, each person should use the software many times: if they only use it once or twice, the mistakes they make might just be due to unfamiliarity and not be long-term problems at all.

The trouble is that coming up with an exhaustive list of faults is not in practice that helpful. What really matters is how many problems can be fixed by the time the product is released. When you present the long list of faults to the programmers, they might not believe these really are problems. They may even see such a long list as an attack on their programming skills and so discount it: they are the programmers and they know best! Worse, though, by the time you have collected this long list of problems it may well be too late to fix them anyway, as the design will have changed several times since then, or the product may even already be on sale!

So where does that leave the usability consultant? Does it mean that they should give up? Or perhaps just find as many problems as they can and hope some get fixed? Well no. Wixon argues we were asking the wrong question in the first place. When developing a usable product to a budget and deadline, the question is not in practice about "How many problems can a method find?". The important question is "How many problems can you identify and fix in the time available?" What you need from a practical usability method is that it convinces the development team that there are problems to be fixed and that they do then fix them.

"How many problems can you identify and fix in the time available?"

The kinds of methods that actually get used tend not to follow the scientific method. Instead they are what are called 'discount usability methods'. They aim to be quick and easy, and above all persuasive. Wixon himself advocates his own such quick and dirty method called RITE (Rapid Iterative Testing and Evaluation). In this approach, you do user testing but with only a couple of users at a time. "What!!!", cry the scientists...but remember we aren't trying to do science but to improve a product. The key thing in RITE is that the whole development team are there to watch each usability test. They are told to keep in mind three questions: what problems are seen, can they be explained, and can they be fixed before the next test?

When users struggle, the team see exactly what is happening. Seeing a real person struggle can be much more persuasive than just having an entry in a list saying that people couldn't find the power button. Some problems may not be worth fixing, and for others it may be better to wait and see if other users have the same problem. However, if the development team are there watching, they can decide how important each problem is to fix compared to the others seen, and also how easy it would be. Any problems seen that the team agree are worth fixing and can be fixed are fixed...before the next round of user testing is done.

That is actually a big advantage. Problems that can be fixed are fixed quickly. That also means that the later users definitely won't encounter those problems. They will have an experience closer to that of using the final product and are more likely to come across other problems that were being masked by the ones fixed. The development team can also get quick feedback in the next round of testing as to whether their fixes helped. If not, they can try again. Better still, at whatever point you run out of time, you have the most usable software you could have produced by then.
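To make the shape of that loop concrete, here is a toy sketch in Python. It is only an illustration of the idea described above, not Wixon's actual procedure. All the names in it (Problem, run_rite, seen_per_round) are made up for this example, and it assumes, very crudely, that problems sit in a queue and each round of testing only reveals the next couple, as a stand-in for earlier problems masking later ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """A usability problem, as judged by the watching team (hypothetical model)."""
    name: str
    worth_fixing: bool  # the team agree it matters enough to fix
    fixable_now: bool   # it can be fixed before the next test


def run_rite(problems, rounds, seen_per_round=2):
    """Toy RITE-style loop: test, decide, fix, then test again.

    Assumption: each round reveals only the next `seen_per_round`
    problems in the queue. Returns (fixed, deferred, unseen) name lists.
    """
    remaining = list(problems)
    fixed, deferred = [], []
    for _ in range(rounds):
        if not remaining:
            break  # nothing left to find
        # A couple of users try the current design while the team watch.
        seen = remaining[:seen_per_round]
        remaining = remaining[seen_per_round:]
        for problem in seen:
            if problem.worth_fixing and problem.fixable_now:
                # Fixed before the next round, so later users never meet it.
                fixed.append(problem.name)
            else:
                # Noted, but left alone for now.
                deferred.append(problem.name)
    unseen = [p.name for p in remaining]
    return fixed, deferred, unseen


# Example with invented problems: time runs out after two rounds.
problems = [
    Problem("hidden power button", True, True),
    Problem("confusing menu label", True, True),
    Problem("slow start-up", True, False),
    Problem("odd icon colour", False, True),
    Problem("unclear tutorial goal", True, True),
]
fixed, deferred, unseen = run_rite(problems, rounds=2)
print("fixed:", fixed)
print("deferred:", deferred)
print("never seen:", unseen)
```

The point of the sketch is the stopping behaviour: whenever the rounds run out, everything in the fixed list has already gone into the product, rather than sitting in a report waiting to be believed.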

So where does 'Age of Empires II' come in? It was one of the early industrial-scale case studies of RITE: Wixon's team used the method to develop an engaging tutorial for newcomers to the game. While one case study does not prove a method, the resulting game gained glowing reviews, was praised by people who used it and won a string of awards. One way or another they had ended up with a great tutorial game, and all without following the scientist's method.