Now in its third year, the Data-driven VC landscape report is the definitive guide to how data and AI initiatives are evolving in venture capital. This year’s report compiled key takeaways from 300+ VC firms, of which 235 were identified as data-driven firms. These are firms defined as using data to “objectively analyze the full opportunity set to optimize for the global maximum”, versus traditional firms that are limited by “deal coverage and personal experience”.
Firms were surveyed on how they’re thinking about data, where it’s adding value in the deal cycle, and how they see the future of the industry taking shape.
Affinity is proud to support this important piece of research. Here the report’s author (and Partner at Earlybird Venture Capital), Dr. Andre Retterath, shares his perspective on the data-driven VC landscape.
Key takeaways
- The biggest opportunities for data-driven VC initiatives lie in deal sourcing and screening (where two-thirds of venture capital value is created), while portfolio value creation remains an overlooked area with significant potential.
- Major challenges for data-driven investors include technical decisions (build vs. buy), data management issues, and cultural barriers, particularly around leadership buy-in and encouraging traditional investment teams to adopt new technologies.
- While measuring ROI in venture capital remains challenging due to long feedback cycles, firms are increasingly focusing on metrics like hit rates, deal coverage, and team efficiency to evaluate the effectiveness of their data-driven initiatives.
1. Where are the biggest opportunities for VCs to become more data-driven, and what’s commonly overlooked?
Studies have shown that two-thirds of venture capital value is created in the sourcing and screening parts of the process. It’s a game of finding and picking the winners. This is why VCs typically start their data-driven journey with deal flow generation, deal flow screening, and deal flow management. Data can help identify new opportunities, enrich them with additional context, and prioritize the most promising ones so you can find the inflection point that tells you it’s the right time to engage.
Where I see VCs overlook data-driven opportunities is with portfolio value creation. Most tools focus on the first stages of the value chain (sourcing and screening) and few firms have built tooling to support portfolio value creation.
There’s an opportunity here to help portfolio companies achieve their goals. Consider what’s involved in portfolio management—things like supporting introductions to potential customers, talent, investors, and even acquirers down the road. With the data on your firm’s network of relationships you can build something like a LinkedIn on steroids. Whether portcos are looking for talent specific to industry, specialism, or location, you can help find and match the best people.
And then there are other components like competitive landscape benchmarking. Firms already do a competitive landscape analysis during due diligence, but these insights can also be really helpful for portfolio companies. They give them a bird’s eye view of their market that they can keep tracking and benchmarking themselves against with metrics like headcount growth, website traffic, and funding raised.
2. What are the biggest challenges that data-driven investors face, and how can they overcome these challenges?
There are two main challenges that data-driven investors face: technical and cultural.
Technical challenges
VCs have to decide what to build versus buy off-the-shelf. There’s not always a need to reinvent the wheel, but you do need to be able to identify opportunities where you can make a difference and create a competitive edge.
Once that’s decided you also need to think about talent and team structure. You need to solve technical issues related to data collection, entity matching/deduplication, all the things that it takes to make data actionable. Once you have the full cycle of investment data in the system, that’s when you can start to build a data flywheel.
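To make the entity matching and deduplication step a little more concrete, here is a minimal Python sketch. It is not taken from the report or from any particular product. The record fields (`name`, `website`, `source`), the suffix list, and the rule of preferring the website domain as the match key are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to strip; a real pipeline would need a longer one.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "gmbh", "corp", "co"}

def normalize_domain(url):
    """Reduce a URL to a bare lowercase domain, e.g. 'https://www.Acme.io/about' -> 'acme.io'."""
    domain = re.sub(r"^https?://", "", url.strip().lower())
    domain = domain.split("/")[0]
    return domain[4:] if domain.startswith("www.") else domain

def normalize_name(name):
    """Lowercase the name, drop punctuation, and remove trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def match_key(record):
    """Assumed rule: prefer the website domain as the key, fall back to the cleaned name."""
    if record.get("website"):
        return "domain:" + normalize_domain(record["website"])
    return "name:" + normalize_name(record["name"])

def deduplicate(records):
    """Group records that share a match key and merge each group into one entity."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [
        {
            "key": key,
            "names": sorted({r["name"] for r in group}),
            "sources": sorted({r["source"] for r in group}),
        }
        for key, group in groups.items()
    ]

# Made-up records standing in for the same companies arriving from different sources.
records = [
    {"name": "Acme, Inc.", "website": "https://www.acme.io/about", "source": "crm"},
    {"name": "ACME", "website": "acme.io", "source": "newsletter"},
    {"name": "Beta Labs GmbH", "website": "", "source": "event_list"},
    {"name": "Beta Labs", "website": None, "source": "crm"},
]
entities = deduplicate(records)
for entity in entities:
    print(entity)
```

Exact-key matching like this is only a starting point. It will not link a record that has a website to a record for the same company that has only a name, which is where fuzzy matching or manual review usually comes in.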
Cultural challenges
Cultural issues can be broken down into two categories: top-down buy-in and team-wide aversion to change.
The first relates to General Partners, those who eventually own the budget for data-driven initiatives. It’s a question of whether the firm’s leadership is truly bought in on the approach, or if they’re window dressing to keep up with the competition or appease LPs. Ultimately, are they fulfilling an intrinsic or extrinsic need? Those two motivations carry different perspectives, and that affects the available budget and how you’re able to think about innovation.
The second part of cultural challenges comes down to this: we work in the financial industry and many people were trained in very traditional institutions—investment banks, consultancies, and other places with traditional processes. Since venture capital began in the 1950s, the only real innovation has been from pen and paper to computers. People are used to working in a manual way and thinking they can increase the output by increasing the input. But that doesn’t apply in VC.
We need to culturally change how people interact with their daily tools, and how they think about time allocation. Getting investment teams to adopt a new platform is the biggest area where I hear firms are struggling.
3. What’s the biggest pushback you hear from firms that are not data-driven?
Generally, the sentiment has changed a lot over the last few years. When I got into the industry about eight years ago, people told me it didn’t make sense to become data-driven because VC is mostly private investors investing in private companies. Neither of them has to disclose any information, and the information that is available is there because the company or its investors want it to be. The other argument I would hear is that with early-stage investing, the data that’s available is unstructured and qualitative, and that’s hard to process.
We know today that there are way more digital footprints available on companies than you could get by looking into a data room, for example. There’s a huge volume of both quantitative and qualitative data available about every company and the challenge is to find the signal in the noise of that digital footprint. We can do that now with large language models—we can process that unstructured data, semi-quantify it, and make sense of it.
With the launch of ChatGPT and generative AI more generally, we see more and more firms trying to act in the data-driven space. There was a 21% increase in data-driven VCs over the past year. This year’s report found that 63% want to increase their budgets in this area, 16% lower than last year—indicating slower budget growth as the industry matures from foundational investments to refining existing tools.
The few firms that are not moving in this direction tend to be small: angel investors, solo GP firms, and boutique firms. These are firms managing funds ranging from under $50 million to around $200 million, and they have less of a need for advanced, proprietary, data-driven infrastructure because they can rely on their network to source the number of investment opportunities they need.
Problems arise as you grow, institutionalize, and need to deploy more capital. As soon as that happens, you need to see more relevant opportunities and you’re in competition with many more firms. That’s when being data-driven becomes a competitive advantage.
4. How are data-driven VCs measuring ROI?
Venture capital is a long-term game. If you’re an early-stage investor, the feedback cycles can last decades. Ideally, we’d look at data-driven initiatives through the lens of an A/B test where one group follows the traditional business model with established deal sourcing norms and the same team, while the other group uses data-driven tools for sourcing, screening, prioritizing, and so on. That would be the perfect setup for comparing the Distributions to Paid-In Capital (DPI) ratio of the two groups.
The problem is there are so many dependencies—partners involved, time, market conditions. It’s impossible to get and then control a significant sample size.
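For readers less familiar with the metric, DPI is cumulative distributions to LPs divided by the capital they have paid in. The Python sketch below shows the comparison that the idealized A/B test would boil down to. The two groups and all figures are hypothetical and say nothing about real fund performance.

```python
def dpi(distributions, paid_in_capital):
    """Distributions to Paid-In Capital: cumulative cash returned per unit of capital called."""
    if paid_in_capital <= 0:
        raise ValueError("paid_in_capital must be positive")
    return sum(distributions) / paid_in_capital

# Hypothetical figures in $ millions, purely illustrative.
group_a = dpi(distributions=[10.0, 25.0, 40.0], paid_in_capital=50.0)
group_b = dpi(distributions=[15.0, 30.0, 60.0], paid_in_capital=50.0)

print(f"group A DPI: {group_a:.2f}x, group B DPI: {group_b:.2f}x")
```

The arithmetic is trivial. The hard part, as noted above, is that partners, timing, and market conditions differ between any two real groups, so a difference in DPI cannot be cleanly attributed to the tooling.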
Instead, firms can introduce more short-term measures to assess their data-driven performance. The first measure is effectiveness. This can include multiple dimensions like your hit rate, your coverage, your conversion and win rates, and longer-term performance data. How many deals are being done by competitors before and after introducing a data-driven tool? Did your hit rate improve?
The second measure is team efficiency. Can your team now handle more deals? Can they dive deeper on each one? This looks at whether your team can get more done with less.
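As a rough illustration of how a firm might track these shorter-term measures before and after introducing a tool, here is a small Python sketch. The field names and the metric definitions (coverage as the share of relevant deals seen, conversion as term sheets per screened deal, win rate as term sheets won per term sheet offered) are assumptions made for this example, not definitions from the report. Hit rate is left out because it depends on outcome data that only arrives later.

```python
from dataclasses import dataclass

@dataclass
class PeriodStats:
    """Counts for one measurement period; every field here is a hypothetical choice."""
    relevant_deals_in_market: int  # deals done by anyone that fit the firm's thesis
    deals_seen: int                # of those, deals the firm saw before they closed
    deals_screened: int
    term_sheets_offered: int
    term_sheets_won: int
    investors: int

def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def summarize(stats):
    """Compute effectiveness and efficiency measures under the assumed definitions."""
    return {
        "coverage": ratio(stats.deals_seen, stats.relevant_deals_in_market),
        "conversion": ratio(stats.term_sheets_offered, stats.deals_screened),
        "win_rate": ratio(stats.term_sheets_won, stats.term_sheets_offered),
        "deals_screened_per_investor": ratio(stats.deals_screened, stats.investors),
    }

# Made-up counts for the period before and after a tool was introduced.
before = PeriodStats(200, 80, 60, 6, 3, 5)
after = PeriodStats(200, 130, 110, 9, 6, 5)

before_summary = summarize(before)
after_summary = summarize(after)
for metric in before_summary:
    print(f"{metric}: {before_summary[metric]:.2f} -> {after_summary[metric]:.2f}")
```

A before-and-after comparison like this is still confounded by market conditions and team changes, so it is best read as a directional signal and not as proof of impact.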
5. What data-driven tools are you most excited about?
I’ve been observing the market for a long time, looking at probably 500+ tools over the last eight years. I remember about five years ago I analyzed how many tools I used in one week, from calendar scheduling to email, CRM, and portfolio management. I was using around 80 tools and the context-switching was too much. Even when they were best of breed, they weren’t natively integrated, and we needed a single source of truth.
Fast forward to today and it’s great to see providers like Affinity expanding horizontally, either via partnerships or by organically building the connections. The industry is moving from a very fragmented tech stack to a more horizontal stack with one point of interaction for the user. This year we’ll also see the rise of AI agents in the VC space—there’s more in the report on what this will mean for different aspects of the investor’s job.
6. What was the most surprising data point from the report, and why?
When we asked about the future of VC, 12% of respondents in 2024 said they believed that the “quant VC” will be the dominant model in the future. That is, no human involvement in the investment process at all. This year, that number dropped to 5%.
In contrast, the percentage predicting that the augmented VC will be the model of the future rose from 75% to 94%. This aligns with my own thinking: an augmented VC model, with human involvement, will dominate. After all, founders still want to work with humans. But investors can be more efficient and effective when augmented with data-driven approaches and AI. That said, quant VC might work for dedicated follower strategies.
7. What does the future of the data-driven VC look like?
We’re going to see massive changes in the industry in the next five years. Now the awareness is there and there are lots of firms just starting to answer questions like: Do we build or buy? Should we hire a team? Do we have the resources? How can we create an edge with competitive insight?
These firms are still in the exploration phase but some are already ahead and establishing a competitive edge. It’s clear to me that every fund larger than that of a boutique firm needs to become data-driven and have the technology and mindset to use data and AI to become a better investor. Otherwise they just won’t be competitive in the years to come.
On the flip side, we will continue to see some solo GPs and smaller firms that can work with limited tooling and rely primarily on their networks. But that’s a fraction of the market and capital to be deployed.
In the next five years, we’ll see firms move from awareness into interest and desire, and eventually into action. And they’ll be supported by developments in the investment tech space, with existing companies growing and expanding their value proposition and new entrants coming into the space.