The Competitor Algorithm
Anand Sanwal
Founder of CB Insights | I owe people money. Please buy a subscription to CB Insights
Competitive relationships often violate the rules of transitivity and symmetry. That has made identifying them one of the thornier data science / machine learning challenges we've faced at CB Insights.
One of the hardest data and technical challenges at CB Insights has been building an algorithm to identify competitors.
Today we deliver algorithmic recommendations to an internal team that curates competitor relationships before our clients see them. Our aspiration is to build an algorithm with high enough fidelity (near perfect) that it doesn’t require human assistance.
This is challenging for many reasons.
Competitor relationships, while seemingly simple on the surface, have two characteristics that make them tricky, which I detail below:
- They often violate transitivity
- They are often asymmetric
Violating transitivity – If A is a competitor of B, and B is a competitor of C, A is not necessarily a competitor of C. For example, Lyft is a competitor of Uber and Uber is a competitor of Grubhub (by virtue of its Uber Eats business), but Lyft and Grubhub are not competitors. (BTW, given Uber's S-1 filing, here's a free primer on how Uber makes money)
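To make the Lyft / Uber / Grubhub example concrete, here is a minimal sketch that checks a set of competitor pairs for transitivity violations. The data structure and function names are illustrative assumptions for this post, not how our production system stores relationships.

```python
from itertools import permutations

# Illustrative, undirected "competes with" pairs taken from the example above.
competes = {
    frozenset({"Lyft", "Uber"}),
    frozenset({"Uber", "Grubhub"}),
}


def is_competitor(a, b):
    """True if the unordered pair (a, b) is recorded as competitors."""
    return frozenset({a, b}) in competes


def transitivity_violations(companies):
    """Return (a, b, c) triples where a~b and b~c hold but a~c does not."""
    return [
        (a, b, c)
        for a, b, c in permutations(companies, 3)
        if is_competitor(a, b) and is_competitor(b, c) and not is_competitor(a, c)
    ]


violations = transitivity_violations(["Lyft", "Uber", "Grubhub"])
for a, b, c in violations:
    print(f"{a} ~ {b} and {b} ~ {c}, but {a} and {c} are not competitors")
```

The practical consequence is that you cannot grow a competitor list by chaining: following "competitor of a competitor" links would wrongly connect Lyft and Grubhub.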
But violating transitivity is not the only problem.
Asymmetric competitor relationships – Sometimes company A considers company B a competitor, but company B doesn’t consider A one. Or company A's product is just a feature of company B's. In this case, if you ask who A competes with, folks might say B, but if you ask who B’s competitors are, A would not come up.
Consider SaaS vendor Mindbody, which makes fitness, wellness, & gym management software. One of its products is a point-of-sale device for its clients, which makes Square a competitor of Mindbody. But given Mindbody's vertical focus, you might not agree that Mindbody is a competitor of Square.
Wherever you come out on this, it is a point humans can argue about, which makes deriving these relationships algorithmically even harder.
Asymmetry doesn’t come only from product or customer focus, as in the Mindbody / Square example.
Geography can also influence asymmetry. Does an online shoe retailer in China compete with one in the United States or with one in India?
Sometimes, even when the core business is the same, the stage of a company creates asymmetric competitor relationships.
Take Alibaba and Yamibuy: both are engaged in e-commerce and target the Asian market. But Yamibuy is an early-stage startup that is targeting Alibaba. So while Yamibuy likely perceives Alibaba as a competitor, it’s unclear whether Yamibuy should be listed as a competitor of Alibaba until it reaches a scale that suggests the two truly compete.
These types of asymmetric competitive relationships make this an even more interesting (and difficult) problem.
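One way to picture asymmetry is to store competitor relationships as directed edges rather than unordered pairs. The sketch below is an illustration only: the edges are one arguable reading of the examples in this post (including the assumption that Square and Alibaba would not name Mindbody or Yamibuy), and the names `perceives` and `one_sided_pairs` are made up for the example.

```python
# Directed edges: perceives[a] is the set of companies that `a` would name
# as competitors. These edges are an assumed reading of the examples above.
perceives = {
    "Mindbody": {"Square"},
    "Square": set(),
    "Yamibuy": {"Alibaba"},
    "Alibaba": set(),
    "Lyft": {"Uber"},
    "Uber": {"Lyft", "Grubhub"},
    "Grubhub": {"Uber"},
}


def one_sided_pairs(graph):
    """Return sorted (a, b) pairs where a names b but b does not name a."""
    return sorted(
        (a, b)
        for a, targets in graph.items()
        for b in targets
        if a not in graph.get(b, set())
    )


asymmetric = one_sided_pairs(perceives)
for a, b in asymmetric:
    print(f"{a} lists {b} as a competitor, but not the other way around")
```

With a directed representation, "who does A compete with?" and "who competes with A?" become two different queries, which is the distinction the Mindbody / Square and Yamibuy / Alibaba examples turn on.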
Of course, one can try to throw humans at this problem, but that doesn’t work for several reasons:
- It’s expensive and difficult to scale when trying to do this for hundreds of thousands of companies.
- It requires domain knowledge, especially in areas like biotech or enterprise software.
- It is not consistent. Human biases & perspectives will result in inconsistency even with the best procedures laid out.
- Businesses change product and focus areas over time, often subtly, which changes who their competitors are. The volume, velocity, and variety of these changes are impossible for human curators to keep on top of.
Our model for identifying competitors is, we believe, the most sophisticated available. It uses a variety of signals, ranging from search data to co-mentions in the media to keyword description overlap, plus a ton of other factors. But we’re still not at the point of providing our clients with algorithmic recommendations that have not gone through some level of human review; that is, the recommendations can’t just be pushed to production.
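For readers who want a feel for what combining signals like these can look like, here is a toy sketch. It is not our model: the three inputs mirror the signals named above, but the field names, weights, threshold, and sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairSignals:
    """Hypothetical signals for one company pair, each scaled to 0..1."""
    search_co_occurrence: float
    media_co_mentions: float
    keywords_a: frozenset
    keywords_b: frozenset


def jaccard(a, b):
    """Keyword overlap: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up weights and threshold; a real system would learn or tune these.
WEIGHTS = {"search": 0.4, "media": 0.3, "keywords": 0.3}
AUTO_PUBLISH_THRESHOLD = 0.9


def competitor_score(s):
    """Weighted sum of the three signals, in the range 0..1."""
    return (
        WEIGHTS["search"] * s.search_co_occurrence
        + WEIGHTS["media"] * s.media_co_mentions
        + WEIGHTS["keywords"] * jaccard(s.keywords_a, s.keywords_b)
    )


def route(s):
    """Send anything below the threshold to human curators."""
    if competitor_score(s) >= AUTO_PUBLISH_THRESHOLD:
        return "publish"
    return "human_review"


example = PairSignals(
    search_co_occurrence=0.8,
    media_co_mentions=0.6,
    keywords_a=frozenset({"ride-hailing", "mobility", "drivers"}),
    keywords_b=frozenset({"ride-hailing", "mobility", "delivery"}),
)
print(round(competitor_score(example), 2), route(example))
```

Note that a score like this is symmetric in the two companies, so on its own it cannot express the asymmetric relationships described earlier. That is one reason a simple weighted sum falls short and human review stays in the loop.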
It's something I know we will solve.
If you find this problem interesting or have solved similar challenges, we’re aggressively hiring on our engineering and data science / machine learning teams.
Or just reach out directly to me.
BTW, learn more about our algorithm that had a 48% hit rate on predicting future unicorns. (FYI, no VC has that kind of hit rate). We're pretty good at this algorithm development stuff :)