A Co-investment Network Analysis: Finding the Most Influential and Connected Investors in the Netherlands
Mustafa Torun
Committed to showcasing data science/AI’s critical role for a carbon-negative economy | Growing data culture in [Dutch] investment ecosystem | Driving data-driven investing | Championing deeptech as Europe’s future.
Introduction:
In the complex tapestry of the investment world, the visible market trends and shifts in funding often overshadow the underlying mosaic of relationships and collaborations. By delving into this hidden realm of co-investments, one can glean invaluable insights about the key influencers, rising sectors, and potential market trajectories. This article aims to guide you through the intricate corridors of co-investment networks, shedding light on the power players and their strategic alignments. Whether you're an investor, an entrepreneur scouting for backers, or simply an enthusiast eager to learn, this exploration promises illuminating revelations.
1. Harnessing the Power of Co-Investment Network Analysis
The investment ecosystem thrives on collaborations. Decoding co-investment networks provides a panoramic view of these strategic alliances.
Imagine a web where entities, represented as nodes, are interconnected. In our narrative, these nodes symbolize investors, with co-investments forming the interconnecting threads or edges. So, how do we pinpoint the linchpins in this web?
This is where centrality metrics come into play, acting as compasses to navigate the network maze. This article relies on four of them: degree, eigenvector, betweenness and closeness centrality.
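As a rough illustration of how these measures can disagree, the sketch below computes degree, eigenvector, betweenness and closeness centrality with networkx on a toy network. The node names and edges are hypothetical and are not taken from the Dutch data set.

```python
import networkx as nx

# Toy network (hypothetical investors): two tight trios joined only through X
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"),  # first trio
    ("D", "E"), ("D", "F"), ("E", "F"),  # second trio
    ("C", "X"), ("X", "D"),              # X is the only link between the trios
])

degree = nx.degree_centrality(G)                            # share of other nodes a node touches
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)   # being connected to well-connected nodes
betweenness = nx.betweenness_centrality(G)                  # how often a node sits on shortest paths
closeness = nx.closeness_centrality(G)                      # how near a node is to all others

for node in sorted(G.nodes()):
    print(node,
          round(degree[node], 3),
          round(eigenvector[node], 3),
          round(betweenness[node], 3),
          round(closeness[node], 3))
```

In this toy case X has only two connections, fewer than C or D, yet it ranks first on betweenness because every path between the two trios runs through it.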
2. Crafting the Co-Investment Blueprint: A DIY Guide
Data Schema:
Begin with a structured dataset highlighting the primary investor, their co-investor, and the collaborative deal. For instance, using Dealroom's transactions endpoint, I extracted deal data on VC investments in the Netherlands. The processed dataframe, with DealID, Company, Amount, Investors, and more, serves as the foundation. To transform this into a co-investment data set, split the investors of each round into pairwise combinations. A trio of investors, say A, B, and C, in a single round yields the pairs A-B, B-C, and A-C, painting a mutual, undirected network picture: the order within a pair is inconsequential, which makes it apt for co-investment networks. It's crucial to remember that a single connection between two investors doesn't always denote a single investment round, since the same pair may have co-invested more than once.
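The pairwise split can be sketched with itertools.combinations. This is a minimal sketch, not the exact preprocessing used for this article: it assumes the Investors column already holds a Python list per deal, and the deals, investor names and the helper to_pairs are made up for illustration.

```python
from itertools import combinations

import pandas as pd

# Hypothetical deals; the Investors column is assumed to hold one list per deal
deals = pd.DataFrame({
    "DealID": [1, 2],
    "Investors": [["A", "B", "C"], ["A", "B"]],
})

def to_pairs(deals):
    """Turn each deal into one row per unordered investor pair."""
    rows = []
    for deal_id, investors in zip(deals["DealID"], deals["Investors"]):
        # sorted(set(...)) removes duplicates and fixes the order inside each pair
        for investor, co_investor in combinations(sorted(set(investors)), 2):
            rows.append({"DealID": deal_id, "Investor": investor, "CoInvestor": co_investor})
    return pd.DataFrame(rows, columns=["DealID", "Investor", "CoInvestor"])

pairs = to_pairs(deals)
print(pairs)
```

Because each pair is stored once, an investor that only ever lands in the CoInvestor column will not show up when values are later mapped onto the Investor column. If that matters, one option is to emit both (A, B) and (B, A) for every pair.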
Choosing the Network Paradigm:
Investor-Co-investor Network: A one-mode web in which every node is an investor and every edge a direct co-investment relationship, ideal for deciphering collaboration blueprints.
Investor-Company-Co-investor Network: A two-mode (bipartite) construct linking investors to the specific companies they back, offering deeper insights into overlapping portfolios and co-investment tendencies.
Your analytical objectives dictate the choice. Our focus remains on elucidating investor collaborations, making the investor-co-investor paradigm the pick of the lot.
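The two paradigms are related: a network that links investors to companies can be collapsed into an investor-only network by projection. The sketch below uses networkx's bipartite module for this. The investors and companies are hypothetical, and the article's own pipeline builds the investor pairs directly rather than through this projection.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical (investor, company) investments
investments = [
    ("Investor A", "Company X"), ("Investor B", "Company X"),
    ("Investor A", "Company Y"), ("Investor B", "Company Y"),
    ("Investor C", "Company Y"),
]

investors = {investor for investor, _ in investments}
companies = {company for _, company in investments}

# Two-mode graph: investors on one side, companies on the other
B = nx.Graph()
B.add_nodes_from(investors, bipartite=0)
B.add_nodes_from(companies, bipartite=1)
B.add_edges_from(investments)

# Project onto investors; each edge weight counts the companies two investors share
co_investment = bipartite.weighted_projected_graph(B, investors)

for u, v, data in co_investment.edges(data=True):
    print(u, v, data["weight"])
```

The weight attribute keeps the information that a plain edge loses, namely how many companies a pair of investors has in common.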
DIY: Codebase
Thus, your data set should have the following format:
With your data set in this format, you are ready to create the network graphs:
import pandas as pd
import networkx as nx
from pyvis.network import Network
# Load the dataset
df = pd.read_excel("/path/to/YourDataSet.xlsx")
# Create an empty undirected graph
G = nx.Graph()
# Add one edge per row using the Investor and CoInvestor columns
for investor, co_investor in zip(df['Investor'], df['CoInvestor']):
    G.add_edge(investor, co_investor)
With the graph G in place, each centrality measure for the full graph, and later for the sub-graphs, is a single function call.
# Calculate all centrality measures for the entire graph
eigenvector_centrality = nx.eigenvector_centrality(G,max_iter=500)
degree_centrality = nx.degree_centrality(G)
betweenness_centrality = nx.betweenness_centrality(G)
closeness_centrality = nx.closeness_centrality(G)
# Add the centrality values as new columns to the dataframe
df['Eigenvector Investor'] = df['Investor'].map(eigenvector_centrality)
df['Degree Investor'] = df['Investor'].map(degree_centrality)
df['Betweenness Investor'] = df['Investor'].map(betweenness_centrality)
df['Closeness Investor'] = df['Investor'].map(closeness_centrality)
Next, calculate the centrality measures for each sub-sector (called thematic teams in the script), which requires creating sub-graphs:
# Define the list of thematic_team columns
thematic_teams_columns = ['Subsectors Agrifood', 'Subsectors Biocircular', 'Subsectors Energy', 'Subsectors Deeptech', 'Subsectors LSH']
# Iterate over each thematic_team column
for thematic_team_column in thematic_teams_columns:
    # Extract the name of the thematic team from the column name
    thematic_team = thematic_team_column.split(' ')[1]
    # Create an empty undirected graph for the thematic team
    subgraph_thematic_team = nx.Graph()
    # Add edges that match the thematic team to the subgraph
    for i, row in df.iterrows():
        # pd.notna() treats both None and NaN (how pandas stores empty cells) as missing
        if pd.notna(row[thematic_team_column]):  # Check if the row is classified as thematic_team
            subgraph_thematic_team.add_edge(row['Investor'], row['CoInvestor'])
    # Skip this thematic team if the subgraph is empty
    if subgraph_thematic_team.number_of_nodes() == 0:
        print(f'Subgraph for {thematic_team} is empty')
        continue
    # Calculate eigenvector centrality for the subgraph
    eigenvector_centrality = nx.eigenvector_centrality(subgraph_thematic_team, max_iter=1000)
    # Add the eigenvector centrality values for the subgraph as new columns to the dataframe
    df['Eigenvector Investor ' + thematic_team] = df['Investor'].map(eigenvector_centrality)
    # Calculate degree centrality for the subgraph
    degree_centrality = nx.degree_centrality(subgraph_thematic_team)
    # Add the degree centrality values for the subgraph as new columns to the dataframe
    df['Degree Investor ' + thematic_team] = df['Investor'].map(degree_centrality)
    # Calculate betweenness centrality for the subgraph
    betweenness_centrality = nx.betweenness_centrality(subgraph_thematic_team)
    # Add the betweenness centrality values for the subgraph as new columns to the dataframe
    df['Betweenness Investor ' + thematic_team] = df['Investor'].map(betweenness_centrality)
    # Calculate closeness centrality for the subgraph
    closeness_centrality = nx.closeness_centrality(subgraph_thematic_team)
    # Add the closeness centrality values for the subgraph as new columns to the dataframe
    df['Closeness Investor ' + thematic_team] = df['Investor'].map(closeness_centrality)
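The four near-identical steps per graph could also be folded into one helper. The function name centrality_table and the toy graph below are illustrative assumptions and are not part of the original script.

```python
import networkx as nx
import pandas as pd

def centrality_table(graph, max_iter=1000):
    """Return one row per node and one column per centrality measure."""
    return pd.DataFrame({
        "Eigenvector": nx.eigenvector_centrality(graph, max_iter=max_iter),
        "Degree": nx.degree_centrality(graph),
        "Betweenness": nx.betweenness_centrality(graph),
        "Closeness": nx.closeness_centrality(graph),
    })

# Toy graph: a triangle A-B-C with D attached to C
G = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
table = centrality_table(G)
print(table)
```

The resulting table is indexed by investor name, so it can be joined onto the main dataframe per sub-graph instead of mapping each measure separately.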
Since the absolute values of the centrality measures vary too much, it is better to scale them to a common range for comparison:
from sklearn.preprocessing import MinMaxScaler
# Create an instance of the MinMaxScaler
scaler = MinMaxScaler(feature_range=(0,100))
# Define the columns to be scaled
cols_to_scale = [col for col in df.columns if 'Eigenvector' in col or 'Degree' in col or 'Betweenness' in col or 'Closeness' in col]
# Fit the scaler to the data
scaler.fit(df[cols_to_scale])
# Transform the data and update the dataframe
df[cols_to_scale] = scaler.transform(df[cols_to_scale])
Your sub-graphs and centrality measure values for each investor in the graphs are ready! We used PowerBI for visualizing the networks.
Decoding Industry Categorization: A Necessary Conundrum
The crux of the issue lies in the fluidity of definitions. What one investor might label as 'biotech', another might perceive as 'healthcare technology'. Data vendors, too, with their vast repositories of information, have their own taxonomies, which might not always align with conventional industry nomenclature. This divergence in categorization isn't necessarily due to oversight; it's an inherent challenge. The business landscape is dynamic, with companies often straddling multiple sectors, making it tricky to pigeonhole them into one category.
Our Approach to Industry Labelling:
To bring some semblance of order to this chaos, we've adopted a pragmatic approach. Leveraging data from Dealroom.co, we employ a keyword search strategy. This method, while not infallible, offers a reasonable degree of accuracy. Companies often use specific terminologies in their descriptions, mission statements, and product listings. By targeting these keywords, we can assign industry labels with a higher confidence level.
For instance, a company mentioning 'solar', 'renewable', and 'photovoltaic' in its profile is a strong candidate for the 'Renewable Energy' category. Similarly, mentions of 'AI', 'neural networks', or 'machine learning' could point towards a 'Deep Tech' classification.
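A keyword search of this kind might look like the sketch below. It is only an illustration: the keyword lists are the examples from the paragraph above, while the function name label_company and the sample descriptions are hypothetical, and the real keyword sets used for this study are not shown here.

```python
import re

# Example keywords taken from the text above; a real taxonomy would be much larger
THEME_KEYWORDS = {
    "Renewable Energy": ["solar", "renewable", "photovoltaic"],
    "Deep Tech": ["AI", "neural networks", "machine learning"],
}

def label_company(description, theme_keywords=THEME_KEYWORDS):
    """Return every theme whose keywords occur as whole words in the description."""
    labels = []
    for theme, keywords in theme_keywords.items():
        # \b word boundaries stop short keywords such as "AI" from matching inside "said";
        # they also mean "renewable" does not match the plural "renewables"
        pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"
        if re.search(pattern, description, flags=re.IGNORECASE):
            labels.append(theme)
    return labels

print(label_company("We build photovoltaic modules and solar trackers"))
print(label_company("Machine learning for solar forecasting"))
```

A company can receive several labels, which fits the observation that companies often straddle multiple sectors.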
The Continuous Evolution of Categorization:
It's essential to acknowledge that industry categorization isn't a one-time task. As businesses pivot, diversify, or evolve, their industry affiliations might change. Our keyword-based approach, while effective now, will require periodic recalibrations to stay relevant. More robust alternatives could include LLM-based classification or clustering and other unsupervised learning methods.
3. Deciphering the Analysis
Centrality scores, while indicative of influence, need contextual anchoring.
4. The Revelations
Based on centrality measures, the spotlight falls on:
Agrifood Network Graph
Biocircular Network Graph
Energy Network Graph
Life Science and Health (LSH) Network Graph
Deeptech Network Graph
Disclaimer: Deeptech classification is somewhat cumbersome. For example, in this study I included blockchain (but not digital currencies) and some digital solutions, while excluding SaaS. These choices are debatable and can be adjusted to fit a different taxonomy.
Entire Network Graph
An interpretation of high degree centrality but low eigenvector centrality:
High Degree Centrality: This VC has co-invested in many rounds with a variety of other VCs. They are very active in terms of collaborations and have a broad network of co-investors.
Low Eigenvector Centrality: Even though this VC is active and collaborates with many other VCs, the VCs they collaborate with are not themselves very central or influential in the overall network. In other words, they frequently co-invest with VCs who have fewer co-investments overall or who don't collaborate with other influential VCs.
Interpretation in the context of VC investments:
The VC in question is very active and likely has a diverse portfolio, given the many co-investments. However, they might be operating more on the periphery of the "main action" in the VC community, co-investing with VCs who aren't the major players or don't have strong influence in the broader VC network. This could suggest a niche focus, a different investment thesis, or a strategy that diverges from the main VC clusters. Alternatively, it could also mean that this VC is newer or is yet to form strong collaborative ties with the most influential VCs in the ecosystem.
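The pattern is easy to reproduce on a toy network. In the hypothetical sketch below, five "Core" investors all co-invest with each other, while "Hub" co-invests with six small investors that are otherwise barely connected. All names are made up for illustration.

```python
from itertools import combinations

import networkx as nx

G = nx.Graph()

# Five core investors that all co-invest with each other
core = [f"Core{i}" for i in range(1, 6)]
G.add_edges_from(combinations(core, 2))

# Hub co-invests with six small investors on the periphery
small = [f"Small{i}" for i in range(1, 7)]
G.add_edges_from(("Hub", name) for name in small)

# A single link keeps the network connected
G.add_edge("Small1", "Core1")

degree = nx.degree_centrality(G)
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

for node in ["Hub", "Core1", "Core2", "Small1"]:
    print(node, round(degree[node], 3), round(eigenvector[node], 3))
```

Hub has the most co-investors in this network, yet its eigenvector score falls well below that of the core investors, because its partners are themselves peripheral.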
5. Navigational Tips and Traps
While the analysis provides a roadmap, it's essential to tread with caution.
Conclusion:
In the nuanced tapestry of co-investment networks, understanding centrality measures offers a unique lens through which we can discern the patterns, influence, and roles of various investors in different sectors. Let's break down the results:
Agrifood:
Pale Blue Dot emerges as the most influential entity in this network, indicating its ability to associate with other influential partners in strategic collaborations.
Voyagers Fund, with the most connections, suggests a broad collaborative strategy in the Agrifood sector, possibly indicating diverse investment interests.
Shift Invest plays a pivotal role as a bridge, connecting disparate clusters of investors, suggesting they could be a crucial player in syndicate formations or deal brokering.
With Shift Invest also leading in closeness, it indicates their strategic positioning in the network, providing them with faster access to information flow or deal opportunities.
Biocircular:
Invest-NL stands out as the most dominant player in multiple facets - from influence to connections to being a bridge. Their central role implies a significant footprint in the Biocircular sector, and they seem to be at the heart of many co-investment collaborations.
Energy:
Invest-NL again emerges prominently, but it's interesting to note Shift Invest's role as the primary bridge. This suggests that while Invest-NL might be involved in more co-investments, Shift Invest plays a key role in facilitating connections between diverse investment groups.
Life Science and Health (LSH):
EQT Life Sciences' influence indicates their strategic co-investments with other influential entities. However, BOM's position, with the most connections and also leading in betweenness and closeness, showcases its central role in the LSH sector, possibly acting as a hub in this network.
Deeptech:
Sfermion's top influence score hints at its strategic partnerships with other key players. BOM, with its prominence in connections, betweenness, and closeness, suggests it's not just active but also central in facilitating co-investments in the Deeptech arena.
Overall Insights:
Invest-NL has a pronounced presence across multiple sectors, emphasizing its broad investment strategy and pivotal role in the Netherlands' investment ecosystem.
BOM and Inkef are recurrent names across sectors, hinting at their extensive collaborative strategies and central roles in various industry networks.
Influence (eigenvector centrality) doesn't always correlate with the number of connections (degree centrality). An investor might have fewer but more strategic collaborations, emphasizing quality over quantity.
The bridging role (betweenness centrality) of entities like Shift Invest underscores the importance of players who might not always be the most active but are strategically positioned to influence information flow and deal formations.
In the complex world of co-investments, these centrality measures provide a compass, helping stakeholders navigate the landscape, identify potential partners, and understand market dynamics.
Diving deep into co-investment networks has surfaced the latent collaboration contours among investors. The analysis underscores the significance of strategic co-investments and their bearing on sectoral dominance. As the investment mosaic continues to evolve, staying attuned to these undercurrents becomes paramount. By wielding network analysis as our beacon, we can adeptly traverse the multifaceted investment terrains.
Coming Soon
We will investigate the activity of foreign and domestic investors in the co-investment network in the Netherlands. Stay tuned!
Of course I need to mention the contributions of my colleagues Ruud Zandvliet, Elisabeth Storm de Grave - Huyssen van Kattendijke, Reda Atibi, Rik Pantjes and Guy de Sévaux, especially for sector classifications, interpretation and limitations.