Are You Making It, or Testing It?
Illustration of happy productive beavers generated by Dall-E. No dam beavers were harmed in the creation of this image.

I had a conversation with a colleague the other day, and I mentioned the line that exists somewhere between making something and testing something. This was in relation to machine learning, where there is a clear advantage, maybe even an obligation, for testers to develop an intimate understanding of statistical models and data sampling methods in order to appropriately test an ML model or a system using one. But there is also a strange transformation along this path, where the person shifts from trying to learn about the system’s behavior by testing it to trying to build the system. I worry not so much about the shift as about leaving that exploration behind such that nobody is doing it.

It is kind of a fuzzy idea. The line is more of a zone and an artificial philosophical boundary meant to separate two activities whose only distinction might be intent. That makes it difficult to really describe or explain.

But it did remind me of an experience I had with integration testing. The problem is not entirely the same, but it is similar.

Several years ago, test managers in the Office team were concerned there was not enough cross-application integration testing happening. There had been a discussion, and word of it got to me. I had an idea for an approach, so I proposed a cross-team initiative.

The approach was this:

1. Work on your own or with a partner to solve a problem a real customer would have with Office.

2. The problem must involve three or more of the Office client applications.

3. While attempting to build the solution, whenever you come across an impediment that keeps you from solving the problem in a straightforward way, report the bugs and move on to another problem.

I wish this had been a blazing success. I wish we had reported tons of bugs with this approach. Something else happened.

The first part of the initiative was a success. We did not have immediate approval to work on the problem, so I knew we needed to produce fruit right away. I had been given some solid testers, even if only part of their time. My directive to them was to come up with as many bugs in new areas as they could by combining two or more of the Office products together in their tests. We needed to buy credibility by demonstrating the capacity to find problems, so we wanted output quickly.

One of the testers chose a two-pronged strategy based on the bug list. He chose areas that either had very high bug counts in the last release (meaning the feature was hard for developers to get right) or near-zero bug counts in the last release (meaning the area was likely under-examined). He then combined those areas together into usage scenarios. Bugs flowed.
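The selection heuristic can be sketched in a few lines. The feature areas, bug counts, and thresholds below are hypothetical stand-ins for the real bug database; the point is only the shape of the idea: keep the extremes, drop the middle, and pair what remains into cross-feature scenarios.

```python
# Sketch of the two-pronged area-selection heuristic (illustrative data only).
from itertools import combinations

# Bug counts per feature area from the last release (hypothetical numbers).
last_release_bug_counts = {
    "mail_merge": 212,   # very high count: hard for developers to get right
    "tables": 187,
    "smart_tags": 3,     # near zero: likely under-examined
    "autosave": 1,
    "charts": 54,        # middling areas are skipped by this heuristic
}

HIGH, LOW = 100, 5  # thresholds chosen for illustration

# Keep only the extremes: suspiciously buggy or suspiciously quiet.
candidates = [
    area for area, bugs in last_release_bug_counts.items()
    if bugs >= HIGH or bugs <= LOW
]

# Pair the selected areas into cross-feature usage scenarios.
scenarios = list(combinations(candidates, 2))
for a, b in scenarios:
    print(f"scenario: exercise {a} together with {b}")
```

Pairing the two extremes is what makes the strategy productive: a fragile feature interacting with an under-examined one is exactly the kind of combination nobody has tested yet.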

Another tester went after the clipboard. Copy/Cut/Paste between apps had been covered well in prior versions. This tester’s suspicion, though, was that if one were to copy or cut from product 1, paste into product 2, and then perform some other action on the content, we would see bugs not previously reported. This effort was likewise productive, as weird little oddities popped out when trying to get the secondary actions in the different client applications to handle the data that had just been pasted in.

Our first week or so of bug finding gave us more time to go to the second phase. This did not fare as well.

The first problem had nothing to do with the methodology and everything to do with cross-team dynamics. At the outset, I asked the team participants how much of their time they could put in. Not one said 100%. The best I got was 50%, though most answers were more like 20%. I had seen this kind of thing before, and it rarely worked out. I have a general heuristic: there is no such thing as half a person. When a manager says, “Put 20% of your time into it,” it is a guarantee that the resource is going to be pulled from whatever “it” is quickly. One by one, that happened on this team.

The second issue did involve the methodology. Two of the team members had come up with an idea for a problem that involved moving between three of the client applications. I do not remember the nature of the problem they were solving, but the persona was an administrative assistant with no development background who had to keep track of something in the workplace. They described the problem to the group in one meeting. It sounded interesting. We agreed to meet later and talk about progress.

We met a week or two later, and I asked how things were going. Did they make progress? Had they found any interesting issues?

What they told me was that they had a way they intended to solve the problem. They had worked on it for a bit but found something that wasn’t working right. So they experimented, concluded they needed to write some custom code to get around the issue, and were proceeding with that, and…

…and that was where it all fell apart.

The entire purpose of the exercise had been to find the thing that was going to impede further progress in solving the problem. The persona had been selected assuming no coding skills; the point was to use features of the apps that did not need custom code to solve the problem. Rather than working up a custom-code solution and carrying on solving the problem, they should have written up a report describing the impeding issues and then moved on to something else. They were behaving as if they really were some kind of IT shop or solution vendor making something to sell, instead of testers trying to find problem behaviors.

It was around this time that the half-a-person allocation thing came back to bite us, and the team members were called back to other pressing priorities. My experiment done, I returned to something else.

An interesting result, though, is that test leads and managers had been watching, and I saw a lot of adoption of “scenario testing” across the Office organization. Testers were formalizing as tests the attempt to simulate something a user would do – complete some task – and from that finding bugs that fell between the cracks of the functional components. They were not quite as ambitious as my integration project experiment, but it was a bit of a new tack for the organization.

But I am always reminded of the psychological stumble we ran into on this project. Somehow, these individuals forgot they were testing something and instead went full-on into making something. That something existed for no other reason than to be a test of the platform they were building on. It did not need to be finished; it did not even need to work. It only needed to motivate discoveries the tester could report as bugs. Despite that, the testers almost couldn’t help themselves. They started avoiding problems instead of finding them. They started DEALING with problems instead of reporting them.

Katja Obring

Quality in Agile and DevOps | International Speaker | Author | UKITB

1y

"The line is more of a zone and an artificial philosophical boundary meant to separate two activities whose only distinction might be intent." I think this is important, and at the heart of a lot of the confusion we see around whether testers can still test if they know the code - as you know, I firmly believe the answer is yes, and the longer I work in well-performing teams, the less I think tester needs to be a role; it is a perspective someone adopts for a certain amount of time. And your story supports my belief that people like to fix problems more than they like to report problems, which is a limitation that makes a lot of seasoned testers feel unfulfilled and unhappy at some point, and imho is one of the reasons so many talented testers end up changing career path, be it towards more of an engineering role, be it into product, or be it into management.


Jason Arbon, the opening paragraph refers back to our conversation today. The rest is just a story I recalled afterward.
