The Competitor Algorithm
credit: CB Insights (Uber profile) https://app.cbinsights.com/profiles/c/jMy4/competitors

Competitive relationships often violate rules of transitivity and symmetry. This has made identifying competitors one of the thornier data science / machine learning challenges we've faced at CB Insights.

One of the hardest data and technical challenges at CB Insights has been building an algorithm to identify competitors.

Today, our algorithmic recommendations go to an internal team that curates competitor relationships before our clients see them. Our aspiration is to build an algorithm that does this at high enough fidelity (near perfect) that it doesn’t require human assistance.

This is challenging for many reasons.

Competitor relationships, while seemingly simple on the surface, have two characteristics that make them tricky, which I detail below:

  • They often violate transitivity
  • They are often asymmetric

Violating transitivity – If A is a competitor of B, and B is a competitor of C, A is not necessarily a competitor of C. For example, Lyft is a competitor of Uber and Uber is a competitor of Grubhub (by virtue of its Uber Eats business), but Lyft and Grubhub are not competitors. (BTW, given Uber's S-1 filing, here's a free primer on how Uber makes money)
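The Lyft / Uber / Grubhub example can be checked mechanically. The snippet below is only an illustrative sketch, not how our algorithm works: it stores the two competitor pairs from the example as unordered pairs and lists every triple where the chain breaks. The function names are made up for this illustration.

```python
from itertools import permutations

# Competitor pairs taken from the example above, stored as unordered pairs.
competes = {
    frozenset({"Lyft", "Uber"}),
    frozenset({"Uber", "Grubhub"}),
}

def is_competitor(a, b):
    """True if the pair (a, b) is recorded as competitors, in either order."""
    return frozenset({a, b}) in competes

def transitivity_violations(companies):
    """Return (a, b, c) triples where a-b and b-c compete but a-c do not."""
    return [
        (a, b, c)
        for a, b, c in permutations(companies, 3)
        if is_competitor(a, b) and is_competitor(b, c) and not is_competitor(a, c)
    ]

violations = transitivity_violations(["Lyft", "Uber", "Grubhub"])
print(violations)
```

This flags (Lyft, Uber, Grubhub) and its mirror image. It also hints at why simply grouping companies into connected clusters falls short: Lyft and Grubhub would end up in the same cluster even though they do not compete.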

But violating transitivity is not the only problem.

Asymmetric competitor relationships – Sometimes, company A considers company B a competitor, but company B doesn’t consider A a competitor. Or company A’s offering is just a feature of company B’s. In this case, if you ask who A competes with, folks might say B, but if you ask who B’s competitors are, A would not come up.

Consider SaaS vendor Mindbody, which makes fitness, wellness, & gym management software. One of its products is a point-of-sale device for its clients. Square is a competitor. But given its vertical focus, you might not agree that Mindbody is a competitor of Square.

Wherever you come out on this, it is a point that humans can reasonably argue, which makes deriving these relationships algorithmically even harder.

Asymmetry doesn’t come only from product/customer focus, as it does in the Mindbody / Square example.

Geography can also influence asymmetry. Does an online shoe retailer in China compete with one in the United States or with one in India?

Sometimes, even when the core business is the same, a company’s stage creates asymmetric competitor relationships.

Take Alibaba and Yamibuy: both are engaged in e-commerce and target the Asian market. But Yamibuy is an early-stage startup that is targeting Alibaba. So while Yamibuy likely perceives Alibaba as a competitor, it’s unclear whether Yamibuy should be listed as a competitor of Alibaba until it reaches a scale that suggests the two truly compete.

These types of asymmetric competitive relationships make this an even more interesting (and difficult) problem. 
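One way to picture asymmetry is as a directed graph, where an edge from A to B means "A would name B as a competitor" and the reverse edge may or may not exist. The sketch below is illustrative only. The one-sided edges encode one reading of the Mindbody / Square and Yamibuy / Alibaba examples (which, as noted, can be argued), the mutual Lyft / Uber edges are an assumption added for contrast, and the helper name is hypothetical.

```python
from collections import defaultdict

# Directed edges: (a, b) means "a would name b as a competitor".
# One-sided edges follow one reading of the examples in the text;
# the mutual Lyft/Uber pair is an assumption added for contrast.
edges = [
    ("Mindbody", "Square"),
    ("Yamibuy", "Alibaba"),
    ("Lyft", "Uber"),
    ("Uber", "Lyft"),
]

names_as_competitor = defaultdict(set)
for a, b in edges:
    names_as_competitor[a].add(b)

def one_sided_pairs(graph):
    """Return sorted (a, b) pairs where a names b but b does not name a."""
    return sorted(
        (a, b)
        for a, targets in graph.items()
        for b in targets
        if a not in graph.get(b, set())
    )

asymmetric = one_sided_pairs(names_as_competitor)
print(asymmetric)
```

Here the Mindbody and Yamibuy edges come back as one-sided, while Lyft / Uber does not. Storing direction explicitly means "who does A compete with?" and "who competes with A?" can return different answers.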

Of course, one can try to throw humans at this problem, but that doesn’t work for several reasons:

  • It’s expensive and difficult to scale when trying to do this for hundreds of thousands of companies.
  • It requires domain knowledge, especially in areas like biotech or enterprise software.
  • It is not consistent. Human biases & perspectives will result in inconsistency even with the best procedures laid out.
  • Businesses change product and focus areas over time, often subtly, which changes who their competitors are. The volume, velocity, and variety of these changes are impossible for human curators to keep on top of.

Our model to identify competitors is the most sophisticated of anything available. It uses a variety of signals, ranging from search data to co-mentions in the media to keyword description overlap, along with many other factors. But we’re still not at the point of providing our clients with algorithmic recommendations that have not gone through some level of human review; that is, they can’t just be pushed to production.
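To make the idea of combining signals concrete, here is a deliberately simplified sketch. It is not our model: the weights, the threshold, the signal values, and the keyword sets are all invented for illustration, and keyword overlap is measured with plain Jaccard similarity as a stand-in. It only shows the general shape of scoring a pair and routing anything below a confidence bar to human review.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical weights for three of the signals named above; they sum to 1.
WEIGHTS = {"search": 0.40, "co_mentions": 0.35, "keywords": 0.25}

def competitor_score(signals: dict) -> float:
    """Weighted sum of signals, each assumed to be scaled to [0, 1]."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)

def route(score: float, auto_accept: float = 0.9) -> str:
    """Publish only very confident pairs; send everything else to a curator."""
    return "publish" if score >= auto_accept else "human_review"

# Made-up keyword descriptions for two unnamed companies.
keywords_a = {"point-of-sale", "payments", "fitness", "scheduling"}
keywords_b = {"point-of-sale", "payments", "invoicing", "payroll"}

signals = {
    "search": 0.5,        # invented value
    "co_mentions": 0.5,   # invented value
    "keywords": jaccard(keywords_a, keywords_b),
}

score = competitor_score(signals)
decision = route(score)
print(round(score, 3), decision)
```

Note that a score like this is symmetric in the two companies, so on its own it cannot express the one-directional relationships described earlier; that would need direction-aware signals.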

It's something I know we will solve.

If you find this problem interesting or have solved similar challenges, we’re aggressively hiring on our engineering and data science / machine learning teams.  

Or just reach out directly to me.

BTW, learn more about our algorithm that had a 48% hit rate on predicting future unicorns. (FYI, no VC has that kind of hit rate.) We're pretty good at this algorithm development stuff :)


Supriyo Sanyal

Helping Startups Scale | Hybrid Cloud Solutions| Data Center Solutions| Program Management

5 years ago

Very interesting problem and well presented! Thanks!

Sumeru 'Sumo' Chatterjee

Launch Marketing for Startups | 4 X unicorns … Follow to shortcut your marketing journey

5 years ago

This is one of the best recruiting posts I have ever read. I want to restudy CS just so I could respond to this call.

Anand Thaker

Navigating PE, VC | Decision Intelligence AI R&D | 4X Exits | GTM

5 years ago

I'm sure you all will ultimately figure it out enough at least to help aid/assist people at scale. Lots of movement creates last mile decisions for anything.

Srinivas K.

Chief Strategy & Product Officer | AI | Michigan MBA | Manufacturing | Supply Chain | Forbes & HBR Council Member

5 years ago

Also, I don’t think being a competitor or not is a binary outcome in reality (or as an output of your algorithm, I hope). When you refer another entity as a competitor, you qualify them with the top correlating vectors around which they are competitors. Correct?

