登录查看更多内容

3 More Things I Wish I Had Done Differently in Google Test

Zhanyong Wan

发布日期: 2025年3月2日

Only through constant reflection can one become a master of their trade. If one thinks that their work is already perfect, they will be missing out on all the growth opportunities.

In this follow-up to my previous post How to Compensate for the Top 3 Google Test Mistakes, I will describe three more mistakes I made and how you can work around them.

For most of us, making choices is a burden.

What should I wear to work today? Which movie should I check out on Netflix? Should I go with Kung-pao chicken or Mongolian beef for dinner? These are the types of decisions we have to make on a daily basis. Let's face it: it's a boring chore.

Wouldn't it be nice if someone smart has already made the best (or at least good enough) choices for us? That's why TikTok has been so addictive to many. Sometimes, we just need mindless fun.

It's the same in software development. Every decision a framework forces the user to make is extra work for the user. Ideally, there's only one (reasonable) way to do something, so that the user doesn't have to waste their brain power on making the choice and can focus on solving their actual problems instead.

This, unfortunately, isn't always possible. As a framework evolves, new features will emerge to subsume old ones. This is inevitable unless the framework is dead.

Sometimes the framework authors deprecate old features and eventually remove them, but this is typically a lengthy process as we cannot suddenly break all the users. During this period, the users have access to both the old and the new approaches. Such inconsistencies increase their cognitive burden.

One example is how Google Test allows the users to extend their testing vocabulary by defining custom assertions. The idea is that Google Test cannot possibly provide canned solutions to every testing problem, so it's a must to have an API for defining new assertion patterns and reusing them.

Initially, I implemented what I called Predicate Assertions. This lets a user define a Boolean function that takes a number of arguments and returns true if the arguments satisfy some interesting constraints. Then this function can be used with the EXPECT_PREDn(func, arg1, ..., argn) macro to verify that arg1, ..., argn have this property. If the check fails, Google Test will automatically print the values of all arguments for debugging. E.g.

// Returns true if m and n have no common divisors except 1.
bool MutuallyPrime(int m, int n) { ... }
...
EXPECT_PRED2(MutuallyPrime, 2, 3);  // Succeeds.
EXPECT_PRED2(MutuallyPrime, 8, 6);  // Fails.
EXPECT_PRED2(MutuallyPrime, GetFoo(), GetBar());

This was all fine and well, except that:

When the assertion fails, it's not always clear which arguments are the subject of the test, and which are for calculating the expected value. For example, if the last EXPECT_PRED2() in the example above fails, how do we know whether GetFoo() or GetBar() is wrong? This lacking of intention makes it harder to debug failures of tests written by others.
It's not easy to customize the failure message to be user-friendly, especially when we try to compose predicate functions (i.e. build a more complex predicate function from simpler ones).

In other words, this feature solved a problem, but the solution was half baked.

Later, when I designed Google Mock (the mocking component of Google Test), I was inspired by the notion of matchers in jMock (a mocking framework for Java). Matchers are glorified predicates that:

make it clear which values are being tested and which are part of the test specification,
can generate really informative and user-friendly failure messages, and
can be flexibly combined to get more complex testing behaviors, and a composed matcher can leverage the sub-matchers to generate really good failure messages.

So I implemented a matcher API for Google Mock and extended Google Test to allow using matchers in test assertions. The same example, when written in the matcher style, would look like:

MATCHER_P(IsMutuallyPrimeWith, rhs, "") { ... }
...
EXPECT_THAT(GetFoo(), IsMutuallyPrimeWith(GetBar()));

In this style, it's clear that GetFoo() is the subject of the test, while "is mutually prime with GetBar()" is the property we expect GetFoo() to have. Note that the last line of code reads really like English: "expect that GetFoo() is mutually prime with GetBar()". Isn't that nice?

So, although matchers were initially introduced for dispatching mock function calls, they turned out to:

beat predicate assertions as a way to write user-defined assertions, and
unify the large number of assertion macros (EXPECT_EQ, EXPECT_LT, EXPECT_FLOAT_EQ, EXPECT_STREQ, etc) under the umbrella of EXPECT_THAT.

Now comes the "I regret" part:

I wish I had learned about matchers before I introduced predicate assertions.
I wish I had written EXPECT_THAT before I introduced all those ad hoc assertion macros.

Here's what I suggest you do as a Google Test user:

Do not define any predicate assertions. Define or reuse matchers instead. If you already have a bunch of predicate assertions, invest in converting them to matchers.
Prefer EXPECT_THAT to other assertion macros. To paragraph that famous AI paper, EXPECT_THAT is all you need.

The third regret I'll talk about today is about Google Mock: I should've made mock objects nice by default.

What do I mean?

Mocking is a technique for testing the interaction between software components. It's a form of white-box testing: to be sure that your code is interacting with an object foo correctly, you use a mock version of foo and assert that your code calls specific methods of foo with expected arguments. In mocking terms, specifying which methods are called with what arguments is known as "setting expectations".

Google Mock follows the philosophy that the test code should reflect the author's intention. If the author doesn't set any expectation on a method, it should mean that the author isn't interested in the method. Therefore, if the method is called anyway, it shouldn't be alarming - it's just some behavior unrelated to the purpose of the test. If the author really doesn't want this method to be called, they should explicitly set an expectation that the method is called 0 times.

However, I didn't completely follow this philosophy in the implementation: if such an uninteresting call happens, I let Google Mock print a message so that the test author can look into it in case this call is actually unexpected. This message in no way affects the test result - it's purely informational. I thought this would be helpful.

Well, if there's one thing certain, it's that you can never be certain of how the users will react to a feature.

Even though my intention for the message was to provide information in case it's needed, many people saw the message and interpreted it as "something is not right". They found that they could remove such warnings by adding an expectation on the function call, so that was what many did.

But that is not right. Doing so made the tests unnecessarily coupled with the current implementation. This makes the tests brittle - now they can be broken by unrelated changes.

Imagine that someone implements a new feature or optimization, and as a result the method is no longer called. Because many test cases have already set an expectation on this call (even though they shouldn't), suddenly all these test cases are failing. Much busy work is needed now to unbreak the test cases.

This makes it very hard to change the behavior of a system. Ideally, an O(1) behavior change should only need an O(1) change in the test code. Instead, thanks to the unnecessary expectations, an O(1) behavior change may require changing a large number of tests. Tests are no longer helping us - they are working against us.

When I realized my mistake, I tried to make Google Mock just ignore uninteresting mock function calls by default (in Google Mock terms, we are making mock objects "nice"), so that people won't feel the need to add unnecessary expectations. However, that change was met with surprisingly strong resistance.

A small number of users work on critical systems where all uninteresting calls should be scrutinized, and they really liked the old behavior. Some of them got furious when my fix went out, even though I explained that they could easily get the old behavior back via a flag.

One such user was particularly loud and had a, well, not-so-subtle communication style. At that time, I already moved on to other projects and Google Mock improvement was just a labor of love for me. I tried to hold my ground, but eventually yielded as I didn't want my peaceful life derailed by a prolonged and heated fight.

So, regretfully, I wasn't able to fix this mistake.

Today, if you are using the mocking features of Google Test, I strongly suggest to always pass --gmock_default_mock_behavior=0 to your test programs. This flag takes a numeric value to control what Google Mock does when it sees an uninteresting call:

0 means to do nothing,
1 (the default) means to print a message, and
2 means to fail the test.

What are some other things you think Google Test should've done differently? Let me know in the comments. Thank you!

Chip Hogg

Software Engineer at Aurora. Not interested in jobs that aren't Aurora. :)

2 周

My biggest pet peeve is the design of EXPECT_NEAR (https://google.github.io/googletest/reference/assertions.html#EXPECT_NEAR), which implicitly assumes that the arguments are `double`, or can be treated as such. I assume a strictly better design would have been to make no assumption, and work with any types that support absolute difference, and can be compared via `<`. Something like: using std::abs; return abs(expected - actual) < tolerance; The biggest thing this would enable is quantity types, such as from a units library (e.g., https://github.com/aurora-opensource/au). This would even enable comparing two std::chrono::time_point instances, where the tolerance has a type of std::chrono::duration! Am I right in thinking this was just an oversight in the initial design? By the way, despite users offering multiple technically feasible solutions, Google stopped publicly acknowledging the issue (https://github.com/google/googletest/issues/890) in 2019.

4 次回应

Albert R.

Client Technical Specialist and Chief Database Architect at Mphasis, a Blackstone company || Health AI @ DocNote.ai || GenAI Search @ MetaRAG.ai || GRC @ NIST.ai || KYC @ OFAC.ai

2 周

Zhanyong Wan nice. Our work on 'Predicate Assertions' is proving effective in handling medical jargon with DocNote.ai, compliance with NIST.ai, and supporting KYC OFAC.ai. For what it's worth, the engine is a self-adjusting B-tree in C, baselined on #GCP compute. She is cloud-agnostic

Subhadip Pahari

Senior Software Developer

3 周

Insightful. I liked the content especially the 0-1-2 .

1 次回应

查看更多评论

要查看或添加评论，请登录

Zhanyong Wan的更多文章

How to Compensate for the Top 3 Google Test Mistakes

2025年2月26日

How to Compensate for the Top 3 Google Test Mistakes

Google Test is Google's opensource framework for writing C++ tests. It's used widely inside and outside of Google.

17 条评论
End-to-End Ownership: The Differentiator in Career Success

2024年11月30日

End-to-End Ownership: The Differentiator in Career Success

Throughout my 20+ years of work experience, I've observed a powerful predictor of career growth that transcends job…

20 条评论
Wiki considered harmful for documentation

2024年8月17日

Wiki considered harmful for documentation

TL;DR: as a means of internal technical documentation, wiki is inconsistent, misleading, and unmaintainable. Migrate…

10 条评论
How to Review Code

2024年6月14日

How to Review Code

I’m a software engineer. I review a lot of code almost every day.

9 条评论
Be Kind

2024年5月24日

Be Kind

Throughout our professional lives, we work with many people. Inevitably we will discuss and debate our projects.

11 条评论
Birth of Pump - a Google Mock Story

2024年5月5日

Birth of Pump - a Google Mock Story

In my post yesterday, I gave an example of building Google Mock (aka gMock) from first principles. I mentioned that…

4 条评论
The most useful skills I learned at Google

2024年5月4日

The most useful skills I learned at Google

Two years ago, I left my Tech Lead Manager post at Google, where I built my software engineer career for 17 years. I…

13 条评论
Crush - episode 9

2024年3月13日

Crush - episode 9

(Continued from episode 8) For two hundred million years, I worked diligently, steadily rising in priority within the…

4 条评论
Crush - episode 8

2024年3月12日

Crush - episode 8

(Continued from episode 7) Space After the ethical issues of digitized life were resolved by legislation, research…
Crush - episode 7

2024年3月10日

Crush - episode 7

(Continued from episode 6) Before retiring, I made a decision and didn't tell anyone except Lao San. A week before the…

1 条评论

See all articles

Zhanyong Wan的更多文章

How to Compensate for the Top 3 Google Test Mistakes

End-to-End Ownership: The Differentiator in Career Success

Wiki considered harmful for documentation

How to Review Code

Be Kind

Birth of Pump - a Google Mock Story

The most useful skills I learned at Google

Crush - episode 9

Crush - episode 8

Crush - episode 7

社区洞察