At TestBash Brighton 2016 one of the superb talks was “Building the Right Thing” with Lisa Crispin and Emma Armstrong, and it started with a very basic task: “make something that flies through the air”. My first thought on hearing this was, “I’ll scrunch up the paper and throw it; that would be funny”, quickly followed by, “but I’m sure they don’t mean that, and anyway I can make superb paper planes, which would be more fun”.
I made a plane ready to fly and then we were told that the requirement was one of Minimum Viable Product (I hate this phrase but that’s a separate blog post!) so my initial thought would have been spot on. Gutted.
I’d completely overthought the task at hand.
This year at TestBash Brighton, in their excellent mentoring workshop, Shey Crompton and Nicola Sedgwick tasked us with something similar; this time the task was to build an actual “paper plane”, so I got to make a great paper plane and be on task for the challenge. Win win!
This time I knew that, as it was a mentoring workshop, there would likely be a challenge to teach and learn what we created, so I decided to design my plane in as simple a way as I could. I based it on a simple dart shape rather than a complex wing design, the idea being that if I threw it hard enough it would fly a nice long way anyway, with the added benefit of being easy to teach. As predicted it flew across the room and happy was I!
We were then told to describe the process of building our planes to a partner (nice to know my intuition was right on that).
I was also confident that I could easily describe the process for my partner to replicate my plane…
Except that what is simple for me isn’t necessarily simple for others. I understand what my own mind can process, understand and retain easily, but I can’t know that of someone else, especially someone I’d never met until that moment. My partner did successfully follow my instructions, although it was far from as easy as I’d envisioned – I’d have been better off making an even simpler, less detailed plane, trading off flight capability for reproduction capability.
(K)eep (I)t (S)imple (S)tupid
As testers we often can’t see the wood for the trees. We look at a system or a website or an app to be tested and immediately our mind fills with the potential things we can subject the system to in order to test it. Sometimes our mind is exactly on task and the intended tests are precisely what’s needed. Other times though we over-think it and end up wasting time in convoluted tests or bug investigations that could have been much simpler and allowed us to move on to other important tests.
We also try to make judgements and estimations of others’ capabilities, whether it’s in our bug reports, the test scripts (ugh!) clients demand, or in meetings we have with our team.
Let’s look at some practical examples:
Example #1 – “I’m going to create a tool for test data creation”
Except that an account takes 2 minutes to create and the project only has 2 days of test time, so in reality you’d have spent around two-thirds of that time just creating the data manually for the duration of the project.
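Back-of-the-envelope sums like this are worth actually writing down before building anything. A minimal sketch of the break-even maths (all figures here are illustrative assumptions, not from any real project):

```python
# Rough break-even sums for "build a tool vs create test data manually".
# Every figure below is an illustrative assumption.

minutes_per_account = 2          # manual account creation time
accounts_needed = 120            # accounts needed across the project
tool_build_minutes = 6 * 60      # say the tool takes ~6 hours to write

manual_total = minutes_per_account * accounts_needed   # 240 minutes
print(f"Manual: {manual_total} min, tool build: {tool_build_minutes} min")
# With these numbers the tool only pays off if it gets reused on
# future projects - on this one, manual creation wins.
```

Plugging in your own project's numbers takes thirty seconds and can save a wasted day.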
Example #2 – “The main banner image is missing on one specific browser”
You’re testing a website and your baseline is Google Chrome. You’re happy with your coverage on the baseline and you move on to test on other browsers. Firefox passes with flying colours, Edge is happy and so is Internet Explorer 11. Safari has a couple of minor issues that you report and the next browser on your hit list is Internet Explorer 10.
You boot the clean image test machine, clear down cookies and cache and load the page but the main banner is missing.
A quick investigation verifies that the image is declared correctly in the code and it exists on the server when the URL is added straight into the navigation bar of the browser so it seems like a rendering issue.
Better bug it!
Except that you reckon you can track down the exact cause of the problem and provide the developer with a load more info than a simple “Banner image on Internet Explorer 10 does not display” bug report.
You end up spending a further hour figuring out the full problem (the CMS uses a handler that’s declaring a different mime-type than the image actually had) and you report the full details. How great is that?
Wait… you used an entire test session to investigate that? So now there’s another charter that needs to be dropped or postponed?
Maybe just bugging it after 2 mins of investigation and letting the developers investigate the actual cause was a better plan even if you were providing far less information with that bug report.
Overthought it again (and, ironically, thought nothing about the other charters impacted by over-thinking this one thing).
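As an aside, the kind of declared-versus-actual mime-type mismatch described above is easy to spot mechanically. A minimal sketch (the magic-byte table only covers two common image formats, and the sample values are my own illustration):

```python
# Spot a mismatch between a server's declared mime-type and the actual
# file bytes. Only a couple of common image signatures are covered here.
MAGIC = {
    b"\xff\xd8\xff": "image/jpeg",   # JPEG files start with FF D8 FF
    b"\x89PNG": "image/png",         # PNG files start with 89 50 4E 47
}

def sniffed_type(data: bytes) -> str:
    """Guess the real type from the leading bytes of the file."""
    for magic, mime in MAGIC.items():
        if data.startswith(magic):
            return mime
    return "application/octet-stream"

def mismatch(declared: str, data: bytes) -> bool:
    """True when the declared Content-Type disagrees with the bytes."""
    return declared != sniffed_type(data)

# A JPEG served by a handler declaring "image/png" would be flagged:
print(mismatch("image/png", b"\xff\xd8\xff\xe0"))  # True
```

Whether running such a check is worth a whole session is, of course, exactly the judgement call the example above is about.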
So you see, as testers it’s very easy for us to get carried away with our work, over-think the task at hand, and produce less value towards the long-term project targets in favour of short-term value to that single task.
Try to be mindful of Pareto’s Principle – you’ll get 80% of the output from 20% of the input and most of the time 80% is more than adequate. If you know you can investigate further perhaps add that offer into the bug and discuss it at the next scrum or with your project manager and allow them to decide.
Recently I did a 99 second talk at TestBash in Brighton in which I compared golf to testing:
Now I know golf is a rather dull subject to many people who may have never played it so I’ll try not to bore you all too much. 😝
The thing with golf, as anyone who has ever watched it on TV knows, is that you only ever need a single golf club to play at the highest level… Rory McIlroy, Sergio Garcia, Seve Ballesteros, Tiger Woods… they all only ever used one club! Let me explain…
- Those guys tee off using a driver to hit the ball as far as possible. BOOM and down the fairway the ball flies.
- Then they use the driver again to punch the ball down the fairway or onto the green.
- And if they hit it into the rough or a greenside bunker, guess what club they reach for? That’s right, the driver!
- Then they’re a foot from the pin where just a nice little tap in will win the hole. What club do they use?
Except that actually nobody uses one club all the time because at the very best professional level in golf it would be inefficient (and for the vast majority of weekend warriors it makes the game completely inaccessible and not even remotely enjoyable.)
Every golf club has a purpose and an ideal use case; some clubs have multiple purposes or several use cases but no club fits all the purposes and all the use cases and yet I’ve met testers who apply that “one tool fits all” mentality to their testing daily!
I’ve met testers who use one set of browser developer tools and never try others (even berating others’ choices without trying them!), and I’ve met testers who have settled on one way of tracking their exploratory test coverage and never look into other possibilities:
- Is a notepad doc sufficient, useful and logical for others to read?
- Would Rapid Reporter give the notes more structure?
- Is the time trade-off worth the benefit to YOUR context?
- Should you be tracking coverage in a spreadsheet because your client is a 3rd party who requires a visual representation of the testing done?
- Does your tool integrate directly with your issue tracker? Should it?
Do people in YOUR context regularly evaluate your testing tools and techniques to make suggestions on improvements or do you sit quietly and not question the status quo?
For a long time I myself had one single choice of proxy software to analyse what’s being sent from, and fired back to, a test app. I knew what I could do with it, so I never looked into others – what was the point when I knew my chosen app well?
Sometimes tools can be used for purposes other than the intended one, much like someone may choose to “bump and run” a golf ball from the fringe of the green with a club normally intended for long shots off the fairway – for that specific shot it’s a great option. For that specific test you want to do, perhaps cURL + jq is a superb option to pull in some JSON and reorder it for comparison, but for the rest of your testing there may be little value in those tools.
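The same reorder-for-comparison trick can be sketched in a few lines of Python, which also shows why it helps (the payloads below are made up; in practice they might come from cURL or an HTTP client):

```python
# Normalise two JSON payloads into a canonical form so they can be
# compared or diffed regardless of key order.
import json

def canonical(payload: str) -> str:
    # sort_keys gives a stable key order no matter how the server
    # happened to serialise the response
    return json.dumps(json.loads(payload), sort_keys=True, indent=2)

a = '{"name": "Ash", "id": 1}'
b = '{"id": 1, "name": "Ash"}'
print(canonical(a) == canonical(b))  # True - same data, different order
```

The point stands either way: the tool is great for that one comparison task, and mostly idle the rest of the time.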
As testers we should strive to read about tools, try them out, make notes on how we might best use them in our daily testing work, and then maybe leave each tool alone until the task at hand DEMANDS it!
Maybe that will put us in a far less biased position when it comes to using the right tool for the job and it will expose us to more tools, better tools and ultimately make us more efficient in our day to day work.
The driver is not the best club for all shots all the time.
EDIT: This post was mid-draft when I was reminded by @NicolaSedgwick on Twitter of @PaulHolland_TWN‘s #TestBash #99SecondTalk about not attacking people with a different understanding of terminology than you. I’m pretty sure there are improvements yet to be made on it but for the sake of remaining relevant I’m publishing it now. 🙂
In life there are misnomers everywhere:
- “Coffee beans” – they’re actually coffee “seeds”
- “Dr Spock” (Star Trek) is actually Mr Spock
- Panama hats were designed in Ecuador
- “Personal PIN Number” – PIN is an abbreviation for “Personal Identification Number” so in effect you’re saying “Personal Personal Identification Number Number”
- “Wherefore art thou Romeo,” is often understood to mean “where are you Romeo?” When in actual fact “wherefore” means “why” – Juliet is asking why he is a Romeo, not where he is.
- “Peanuts are nuts” – They’re not, they’re legumes as they grow under the ground, not on trees.
- “A tomato is a vegetable” – It’s a fruit.
- Koala Bears are not bears.
- “Slider” is sometimes used to reference a gallery/slideshow/carousel function
- “Checkbox” is frequently used to reference a radio button. Checkboxes though are multiple-selection “AND/OR” elements whereas radio buttons are single choice buttons.
- “Drop down” has been known to reference an expandable <div> because the panel drops down to display content
- “Quality Assurance” is regularly used to reference “Testing”
And so on.
So let’s use an example misnomer and discuss the potential states of it.
We’re working on a website that is early in development and has had no technical team members on it. There is a header section <div> that expands when clicked to reveal additional content.
The project manager and project sponsors have always referred to the <div> as a “header dropdown” as from their perspective that’s what it is; the header “drops down” when clicked.
With misnomers there are several states of understanding:
- Misnomer is accepted as being factually correct and not questioned. This often occurs on projects where the beginning of the project starts with substantial planning from non-technical team members or those with less domain knowledge.
- Misnomer is known to be factually incorrect but the person understands the context and sees no reason to correct the misnomer or hold a discussion about it. Using our example misnomer, a developer may start working on the project and as there is no other way to implement the required functionality (it is not achievable via an actual “drop down” element) there really is only one possible thing it can be right now – an expandable <div>, so why waste everyone’s time holding a conversation about it?
- Misnomer is understood to be factually incorrect. The person feels it is necessary to correct the originator of the misnomer with their own understanding of the misnomer. OK so let’s say a tester joins the project and decides that in order to make sure everyone is using the correct terminology and avoid any ambiguity the current “header dropdown” misnomer should be corrected. This will ensure that any future functionality that may be added to the header is not confused with the misnomer, say for example a “language” dropdown is added. There is no room for discussion here; it must be corrected.
- Misnomer is known to be factually incorrect. A conversation takes place with the project facilitator to gauge any potential risk associated with the misnomer. An agreement is reached whereby the most efficient solution is implemented for the misnomer. Using our example let’s say the tester joins but instead of taking a “must have” approach they explain to the project facilitator that there are ambiguity risks with using the terminology but that those risks may be extremely low depending on the development plan for the project. For example if there will never be anything added to the header of the site that resembles a dropdown then referring to the expandable <div> as a “dropdown” will have little impact. The project manager explains that once the site development is complete it will be handed over to a 3rd party and there will be no maintenance period and no future development on the project for the development team. The project is also only a 3 month project and there are no team member changes planned that could cause confusion with terminology. The “header dropdown” misnomer is then deemed to be unimportant and everyone is happy to use the misnomer rather than set about re-educating the team.
Inadvertent use of misnomers allows quicker communication in groups where there is a joint understanding of the domain being discussed and the context of the misnomer itself by all parties but herein lies the risk of not correcting or discussing a misnomer; there is an inevitable assumption that everyone has similar domain knowledge and understands the terminology used equally, hence there is no need to discuss or correct it.
Using terminology or descriptions which are even slightly ambiguous to those with different domain knowledge can create issues though.
From my perspective what we need to strive for is the establishment of a common “language” on projects that allow the participants the ability to discuss things in the most efficient manner possible adding as little ambiguity into the discussion as possible, not to pick apart people’s differing descriptions of the same thing.
I test software. I perform quality analysis. If you say I am “QA’ing” your software and I understand the task to be “testing” the software then there’s little value in me trying to redefine your vocabulary or use of the English language. If anything it will make everything even less efficient as there will be a period of readjustment before the new vocabulary is “normal” in day to day work.
The approach I normally take is to reiterate the required task using the correct terminology, for example:
“We’ve added Search functionality to the header dropdown on the staging environment. Can you exploratory test it this morning please?”
“Oh cool, yep absolutely! Exploratory test the Search function in the expandable header <div> – got it. I’ll start right after this test session and feed back as soon as possible.”
Now, having said all of the above, I do think there’s a huge benefit in establishing a common vocabulary within an industry or community where our “domain” is the industry itself. But that’s a whole new blog post… 😉
Once upon a time, in a company (thankfully) far far away, I had an idea for a blog post I titled Testilocks and the Three Cares, in which I planned to detail the extremes of testing and what could be considered “just right”.
Recently the excellent Katrina Clokie blogged about the testing pendulum, which is in a similar vein, although truth be told her analogy is a lot more fluid than mine and allows for the reality of testing, where every level between too deep and too shallow would be explored. Mine was only going to cover three individual states summarising the extremes – a lazy blog post in all truthfulness, so perhaps that’s why it disappeared from my mind until it was blown away by Katrina’s excellent post – http://katrinatester.blogspot.com.au/2016/12/the-testing-pendulum-finding-balance-in.html
“too hot” was going to be “too deep”
“too cold” was going to be “too shallow”
“Just right” didn’t need a comparison. 😉
In any case there’s little value in posting details of that phantom blog post beyond the ones above. Instead I’ll skip forward to a short conversation Katrina and I had on Twitter where I said there would be times I would deliberately deviate from the ideal testing position on the pendulum Katrina describes.
For me everything is context, and as a micro-term contract tester my daily testing is fairly different to that of the majority of testers out there, I think. I tend to work on projects that range from half a day to three months at the extremes, but that are generally around three days long. That exposes me to everything from iOS apps to Android apps to responsive websites to API verification to standard “we built this and we’d like it tested” web design.
Because of the micro nature of the contracts I quite often have to risk-analyse the project, prioritise specific functionality over other functionality based on designs, user stories, quizzing product owners etc., and concentrate on those areas first, sometimes to the detriment of other functionality I’d like to have covered in an ideal world – and still get the project tested to a good-enough standard within what are normally extremely tight constraints.
In the vast majority of cases the choice to deviate from more or less ideal testing depth is driven by time or budget constraints the client imposes.
Potential candidates for “deliberately too deep” testing on functions might include:
- The website is brand new, has been built in-house by a client’s dev team, and is based on no known CMS. It needs an extensive look at layout and how specific objects render and affect others, as well as CMS-to-front-end functionality, which I’d normally explain to the client we don’t test.
- The site is live and they’ve added a key new section detailing terms, conditions, business details etc and they’re worried about public perception if this info is incorrect. They’ve placed a higher value on this than the site’s actual functionality.
In the majority of “too deep” scenarios we have plenty of time and plenty of budget to thoroughly test many functions to a much higher level than would normally be suggested. The client has peace of mind that we’ve done an extremely thorough job of the testing and tested much more than we feel is necessary.
And for “deliberately too shallow”:
- It’s an existing website where page templates have been tested extensively already so layout is lowered in priority and testing will be shallower than ideal.
- The app is live and they’ve added a single new feature. They have limited budget/time and want that feature tested thoroughly, but are confident nothing else will be affected and are happy to ignore any advice from us about bringing other correlated or key functions into focus.
- The website is a framework and the client has said they are aware of content issues and are not fussed as they will update content at a later date. I then lower content’s priority and test at a shallower (or non-existent if needed) level.
In almost all cases shallower testing occurs because my focus is on more business-critical functionality or an area that the client has expressed concern about. It’s inevitably a calculated risk but a necessary one.
Now by no means is this an exhaustive list but I’m hoping it will be a step toward explaining my mindset when certain functions are deprioritised into shallow testing or others are prioritised enough to go deeper than I’d ideally like.
If I had full control over everything that came in through the door I’d also aim for the Goldilocks zone.
Comments always welcome. 🙂
Recently at an event I overheard a conversation between two testers in which one tester was describing some examples he’d apply to an input field.
As the conversation went on there were a lot of “tests” (which I’d call “checks”) that I found to be basically pointless. It got me thinking about how, as professional testers, we automatically use our experience, knowledge and heuristics to decide on the best approach and the best set of tests/checks to run – anywhere from applying every test known to man to running no tests at all.
The tester in question had used an example of:
“So in the First Name field you might try a vowel, a consonant, two consonants, two vowels, a vowel and a consonant etc”
To quote the phenomenal Michael Bolton (@michaelbolton) – “In traditional parlance, domain testing involves identifying equivalence classes— sets of things that we expect the program to (mis)treat in the same way” and to my mind from a pure data perspective, whether a consonant or vowel is used to check an input field’s validation or submission makes no difference.
If I was looking at physical size, letter widths etc., that’s a different story completely, as would be checking Unicode characters, which are most definitely not equivalent to non-Unicode characters. But to my mind the tests I overheard are simply a waste of time that could be better spent testing something that could reveal valuable information.
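That partitioning idea can be sketched quickly; the classes and sample values below are my own illustration, not an exhaustive model of name-field validation:

```python
# Equivalence-class thinking for a "First Name" field: pick one
# representative per class rather than many values the program should
# treat identically. Classes here are illustrative, not exhaustive.
equivalence_classes = {
    "plain ascii letters": "Ash",      # vowel vs consonant: same class
    "empty string": "",
    "unicode letters": "Zoë",          # NOT equivalent to plain ascii
    "very long input": "a" * 256,
    "markup / injection chars": "<b>a</b>",
}

# One check per class instead of "a vowel, two vowels, a consonant..."
for label, sample in equivalence_classes.items():
    print(f"{label}: {sample[:20]!r}")
```

Five representatives covering five genuinely different behaviours beat twenty values the program handles identically.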
Sometimes testing isn’t just about what we DO test/check, it’s about what we DON’T as well…
A few weeks ago there was a problem with our phone system in the office; whenever the main phone rang and the call was picked up from a different phone, the main phone registered the call as “missed”.
There was a discussion about how nobody was answering calls except for the office manager, and various people who had been in the office at the time were told off, even though everybody insisted that no calls had been missed.
A discussion took place the following morning where the following phrase came up:
“It’s a simple phone system so it can’t be broken; it has to be the testers not picking up incoming calls!”
As you’d imagine this assertion immediately got my Testy Senses™ tingling so I set out to prove the opposite; that the phone system can indeed have a defect and quite possibly did (or our domain knowledge of the phone system needs improving to understand why it’s behaving in a way that seems unnatural).
My internal response to the assertion was instant, powerful and absolutely 100% necessary for me to investigate immediately – this couldn’t wait and I would stay late that day if I needed to!
I time-boxed 15 minutes to investigate the issue, started straight away, and did indeed find a bug with the system, as expected.
After the fact I realised that the intensity of the reaction I’d had (which made me want to investigate desperately) was caused by the assertion itself – it was a statement of believing absolute fact with no evidence to back it up and my experience as a veteran tester immediately disagreed with it being “absolute”.
And that was the moment my “red rag” heuristic was born:
“The red rag heuristic” – having an almost uncontrollable urge to charge straight into a situation with no pre-planning and no thought for the consequences, or for the effect on other things that require your time or skill.