3 More Things I Wish I Had Done Differently in Google Test
Only through constant reflection can one become a master of their trade. If one thinks that their work is already perfect, they will be missing out on all the growth opportunities.
In this follow-up to my previous post How to Compensate for the Top 3 Google Test Mistakes, I will describe three more mistakes I made and how you can work around them.
For most of us, making choices is a burden.
What should I wear to work today? Which movie should I check out on Netflix? Should I go with Kung-pao chicken or Mongolian beef for dinner? These are the types of decisions we have to make on a daily basis. Let's face it: it's a boring chore.
Wouldn't it be nice if someone smart has already made the best (or at least good enough) choices for us? That's why TikTok has been so addictive to many. Sometimes, we just need mindless fun.
It's the same in software development. Every decision a framework forces the user to make is extra work for the user. Ideally, there's only one (reasonable) way to do something, so that the user doesn't have to waste their brain power on making the choice and can focus on solving their actual problems instead.
This, unfortunately, isn't always possible. As a framework evolves, new features will emerge to subsume old ones. This is inevitable unless the framework is dead.
Sometimes the framework authors deprecate old features and eventually remove them, but this is typically a lengthy process as we cannot suddenly break all the users. During this period, the users have access to both the old and the new approaches. Such inconsistencies increase their cognitive burden.
One example is how Google Test allows the users to extend their testing vocabulary by defining custom assertions. The idea is that Google Test cannot possibly provide canned solutions to every testing problem, so it's a must to have an API for defining new assertion patterns and reusing them.
Initially, I implemented what I called Predicate Assertions. This lets a user define a Boolean function that takes a number of arguments and returns true if the arguments satisfy some interesting constraint. Then this function can be used with the EXPECT_PREDn(func, arg1, ..., argn) macro to verify that arg1, ..., argn have this property. If the check fails, Google Test will automatically print the values of all arguments for debugging. For example:
// Returns true if m and n have no common divisors except 1.
bool MutuallyPrime(int m, int n) { ... }
...
EXPECT_PRED2(MutuallyPrime, 2, 3); // Succeeds.
EXPECT_PRED2(MutuallyPrime, 8, 6); // Fails.
EXPECT_PRED2(MutuallyPrime, GetFoo(), GetBar()); // On failure, prints the values of GetFoo() and GetBar().
This was all fine and well, except that:
- you had to count the arguments and pick the macro with the matching arity (EXPECT_PRED1, EXPECT_PRED2, and so on);
- the assertion read predicate-first rather than subject-first, so it didn't flow like English;
- a plain Boolean predicate couldn't be composed into new predicates or produce a custom failure message.
In other words, this feature solved a problem, but the solution was half-baked.
Later, when I designed Google Mock (the mocking component of Google Test), I was inspired by the notion of matchers in jMock (a mocking framework for Java). Matchers are glorified predicates that:
- know how to describe themselves and explain why a value doesn't match, yielding readable failure messages for free;
- can be composed to form new matchers.
So I implemented a matcher API for Google Mock and extended Google Test to allow using matchers in test assertions. The same example, when written in the matcher style, would look like:
MATCHER_P(IsMutuallyPrimeWith, rhs, "") { ... }
...
EXPECT_THAT(GetFoo(), IsMutuallyPrimeWith(GetBar()));
In this style, it's clear that GetFoo() is the subject of the test, while "is mutually prime with GetBar()" is the property we expect GetFoo() to have. Note that the last line of code reads really like English: "expect that GetFoo() is mutually prime with GetBar()". Isn't that nice?
So, although matchers were initially introduced for dispatching mock function calls, they turned out to:
- read more naturally than predicate assertions;
- produce descriptive failure messages without extra work;
- be reusable in ordinary test assertions via EXPECT_THAT, not just in mock expectations.
Now comes the "I regret" part: I regret creating the predicate assertion macros at all, and I regret not designing matchers into Google Test from day one. With EXPECT_THAT available, the EXPECT_PREDn macros are redundant, so users are left with two overlapping ways to define custom assertions and have to learn and choose between both - exactly the kind of decision burden a framework should spare its users.
Here's what I suggest you do as a Google Test user: prefer EXPECT_THAT and matchers when you need a custom assertion, and avoid the EXPECT_PREDn macros in new code.
The third regret I'll talk about today is about Google Mock: I should've made mock objects nice by default.
What do I mean?
Mocking is a technique for testing the interaction between software components. It's a form of white-box testing: to be sure that your code is interacting with an object foo correctly, you use a mock version of foo and assert that your code calls specific methods of foo with expected arguments. In mocking terms, specifying which methods are called with what arguments is known as "setting expectations".
Google Mock follows the philosophy that the test code should reflect the author's intention. If the author doesn't set any expectation on a method, it should mean that the author isn't interested in the method. Therefore, if the method is called anyway, it shouldn't be alarming - it's just some behavior unrelated to the purpose of the test. If the author really doesn't want this method to be called, they should explicitly set an expectation that the method is called 0 times.
However, I didn't completely follow this philosophy in the implementation: if such an uninteresting call happens, I let Google Mock print a message so that the test author can look into it in case this call is actually unexpected. This message in no way affects the test result - it's purely informational. I thought this would be helpful.
Well, if there's one thing certain, it's that you can never be certain of how the users will react to a feature.
Even though my intention for the message was to provide information in case it's needed, many people saw the message and interpreted it as "something is not right". They found that they could remove such warnings by adding an expectation on the function call, so that was what many did.
But that is not right: doing so couples the tests unnecessarily to the current implementation, making them brittle - now they can be broken by unrelated changes.
Imagine that someone implements a new feature or optimization, and as a result the method is no longer called. Because many test cases have already set an expectation on this call (even though they shouldn't), suddenly all these test cases are failing. Much busy work is needed now to unbreak the test cases.
This makes it very hard to change the behavior of a system. Ideally, an O(1) behavior change should only need an O(1) change in the test code. Instead, thanks to the unnecessary expectations, an O(1) behavior change may require changing a large number of tests. Tests are no longer helping us - they are working against us.
When I realized my mistake, I tried to make Google Mock just ignore uninteresting mock function calls by default (in Google Mock terms, we are making mock objects "nice"), so that people won't feel the need to add unnecessary expectations. However, that change was met with surprisingly strong resistance.
A small number of users work on critical systems where all uninteresting calls should be scrutinized, and they really liked the old behavior. Some of them got furious when my fix went out, even though I explained that they could easily get the old behavior back via a flag.
One such user was particularly loud and had a, well, not-so-subtle communication style. By that time, I had already moved on to other projects, and improving Google Mock was just a labor of love for me. I tried to hold my ground, but eventually yielded as I didn't want my peaceful life derailed by a prolonged and heated fight.
So, regretfully, I wasn't able to fix this mistake.
Today, if you are using the mocking features of Google Test, I strongly suggest always passing --gmock_default_mock_behavior=0 to your test programs. This flag takes a numeric value that controls what Google Mock does when it sees an uninteresting call:
- 0: silently ignore it ("nice" mocks - the behavior I wish were the default);
- 1: print a warning but let the test pass ("naggy" mocks - the current default);
- 2: fail the test ("strict" mocks).
What are some other things you think Google Test should've done differently? Let me know in the comments. Thank you!
Comment from a Software Engineer at Aurora (2 weeks ago):
My biggest pet peeve is the design of EXPECT_NEAR (https://google.github.io/googletest/reference/assertions.html#EXPECT_NEAR), which implicitly assumes that the arguments are `double`, or can be treated as such. I assume a strictly better design would have been to make no assumption, and work with any types that support absolute difference and can be compared via `<`. Something like: using std::abs; return abs(expected - actual) < tolerance; The biggest thing this would enable is quantity types, such as from a units library (e.g., https://github.com/aurora-opensource/au). This would even enable comparing two std::chrono::time_point instances, where the tolerance has a type of std::chrono::duration! Am I right in thinking this was just an oversight in the initial design? By the way, despite users offering multiple technically feasible solutions, Google stopped publicly acknowledging the issue (https://github.com/google/googletest/issues/890) in 2019.