
Welcome to the New Database Era

The new category of cloud database services emerging

One of the most profound, and maybe non-obvious, shifts driving this is the emergence of the cloud database. Services such as Amazon S3, Google BigQuery, Snowflake, Databricks, and others have solved computing on large volumes of data and have made it easy to store data from every available source. Enterprises want to store everything they can in the hopes of delivering improved customer experiences and new market capabilities.

It has been a good time to be a database company.

Database companies have raised over $8.7B over the last 10 years, with almost half of that, $4.1B, just in the last 24 months, up from $849M in 2019 (according to CB Insights).

It’s not surprising, given the sky-high valuations of Snowflake and Databricks and the $16B in new revenue up for grabs in 2021 from market growth alone. A market that doubled in the last four years to almost $90B is expected to double again over the next four. Safe to say there is a huge opportunity to go after.

See here for a solid list of database financings in 2021.

Twenty years ago, you had one option: a relational database.

Today, thanks to the cloud, microservices, distributed applications, global scale, real-time data, deep learning, and more, new database architectures have emerged to hyper-solve new performance requirements: different systems for fast reads and fast writes; systems specifically to power ad-hoc analytics; systems for data that is unstructured, semi-structured, transactional, relational, graph, or time-series; and systems for data used for caching, search, indexes, events, and more.

Each came with different performance characteristics and guarantees, including high availability, horizontal scale, distributed consistency, failover protection, partition tolerance, and serverless or fully managed operation.

As a result, enterprises on average store data across seven or more different databases (e.g., Snowflake as your data warehouse, ClickHouse for ad-hoc analytics, Timescale for time-series data, Elastic for search data, S3 for logs, Postgres for transactions, Redis for caching or application data, Cassandra for complex workloads, and Dgraph for relationship data or dynamic schemas). That’s all assuming you are colocated to a single cloud and that you’ve built a modern data stack from scratch.

The level of performance and guarantees from these services and platforms is unparalleled compared to 5–10 years ago. At the same time, the proliferation and fragmentation of the database layer are creating new challenges: syncing across the different schemas and systems, writing new ETL jobs to bridge workloads across multiple databases, constant cross-talk and connectivity issues, the overhead of managing active-active clustering across so many different systems, and data transfers when new clusters or systems come online, each with different scaling, branching, propagation, sharding, and resource requirements.

What’s more, new databases emerge monthly to solve the next challenge of enterprise scale.

The New Age Database

So the question is, will the database of the future continue to be defined by what a database is today?

I’d make the case that it shouldn’t.

Instead, I hope the next generation of databases will look very different from the last. They should have the following capabilities:

  • Act primarily as compute, query, and/or infrastructure engines that sit on top of commodity storage layers.
  • Require no migration or restructuring of the underlying data.
  • Require no rewriting or parsing of queries.
  • Work on top of multiple storage engines, whether columnar, non-relational, or graph.
  • Move the complexity of configuration, availability, and scale into code.
  • Allow applications to call into a single interface, regardless of the underlying data infrastructure.
  • Work out of the box as a serverless or managed service.
  • Be built for developer-first experiences, in both single-player and multiplayer modes.
  • Deliver day-0 value for both existing (brownfield) and new (greenfield) projects.

There are many secular trends driving this future:

1. No one wants to migrate to a new database. The cost of every new database introduced into an organization scales roughly as N² with the number of databases you already have, since each new system can require integration with every existing one. Migrating to a new architecture, schema, and configuration, and re-optimizing for rebalancing, query planning, scaling, resource requirements, and more, often yields a [value/(time+cost)] of close to zero. It may come as a surprise, but there are still billions of dollars in Oracle instances powering critical apps today, and they likely aren’t going anywhere.

2. The majority of the killer features won’t be in the storage layer. Separating compute and storage has increasingly enabled new levels of performance, allowing for super-cheap raw storage costs and finely tuned, elastically scaled compute/query/infra layers. The storage layer can sit at the center of the data infrastructure and be leveraged in different ways, by multiple tools, to solve routing, parsing, availability, scale, translation, and more.

3. The database is slowly unbundling into highly specialized services, moving away from the overly complex, locked-in approaches of the past. No single database can fully solve transactional and analytical use cases, with fast reads and writes, with high availability and consistency, all while solving caching at the edge and horizontally scaling as needed. But unbundling into a set of layers sitting on top of the storage engine can introduce new services that deliver new levels of performance and guarantees. For example: a dynamic caching service that optimizes caches based on user, query, and data awareness; sharding managed based on data distribution, query demand, and data change rates; a proxy layer to enable high availability and horizontal scale, with connection pooling and resource management; a data management framework to solve async and sync propagation between schemas; or translation layers between GraphQL and relational databases. These multi-dimensional problems can be built as programmatic solutions, in code, decoupled from the database itself, and perform significantly better.
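One of those unbundled layers, the data-aware cache, can be sketched in a few lines. This is only a toy (real services also weigh query patterns, data change rates, and users), but it shows the core idea of invalidating cached results based on which tables a query actually read; all names here are made up for illustration:

```python
class QueryCache:
    """Toy data-aware query cache: invalidate a cached result whenever
    any table that result was derived from receives a write."""

    def __init__(self):
        self.results = {}       # sql text -> cached result
        self.tables_read = {}   # sql text -> set of tables the query touched

    def get(self, sql):
        return self.results.get(sql)

    def put(self, sql, tables, result):
        self.results[sql] = result
        self.tables_read[sql] = set(tables)

    def on_write(self, table):
        # A write to `table` makes every cached query that read it stale.
        stale = [sql for sql, t in self.tables_read.items() if table in t]
        for sql in stale:
            self.results.pop(sql, None)
            self.tables_read.pop(sql, None)
```

A proper service would hook `on_write` into the database's replication or change stream rather than being called by hand.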

4. Scale and simplicity have been trade-offs up until now. Postgres, MySQL, and Cassandra are very powerful but difficult to get right. Firebase and Heroku are super easy to use but don’t scale. These database technologies have massive install bases and robust engines, and have withstood the test of time at Facebook and Netflix-level scale. But tuning them for your needs often requires a Ph.D. and a team of database experts, as teams at Facebook, Netflix, Uber, and Airbnb all have. The rest of us struggle with consistency and isolation, sharding, locking, clock skews, query planning, security, networking, and more. What companies like Supabase and Hydras are doing, leveraging standard Postgres installs while building powerful compute and management layers on top, allows for the power of Postgres with the simplicity of Firebase or Heroku.

5. The database index model hasn’t changed in 30+ years. Today we rely on general-purpose, one-size-fits-all indexes such as B-trees and hash maps, taking a black-box view of our data. Being more data-aware, such as leveraging a cumulative distribution function (CDF) as we’ve seen with learned indexes, can lead to smaller indexes, faster lookups, increased parallelism, and reduced CPU usage. We’ve barely begun to see next-generation indexes that adapt to both the shape of our data and how it changes.
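The learned-index idea can be sketched in miniature: approximate the CDF of the sorted keys with a model (here just a linear fit through the endpoints, where real systems train much richer models), predict a position, and then correct within a known error bound:

```python
import bisect

class LearnedIndex:
    """Toy learned index: a linear approximation of the keys' CDF predicts
    a position; a bounded local search corrects it. A sketch of the idea
    only, not a production structure."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        k_min, k_max = self.keys[0], self.keys[-1]
        # Fit position ~ slope * key + intercept through the endpoints.
        self.slope = (n - 1) / (k_max - k_min) if k_max != k_min else 0.0
        self.intercept = -self.slope * k_min
        # Record the worst-case prediction error so lookups stay exact.
        self.max_err = max(abs(i - self._predict(k))
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        n = len(self.keys)
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(n, guess + self.max_err + 1)
        # Search only inside the narrow error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None
```

On uniformly distributed keys the error window stays tiny, which is exactly the smaller-index, faster-lookup effect the research reports; skewed data is where the learned model has to work harder.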

6. There is little to no machine learning used to improve database performance. Instead, today we define static rule sets and configurations to optimize query performance, cost modeling, and workload forecasting. These combinatorial, multi-dimensional problem sets are too complex for humans to configure and are perfect machine learning problems. Resources such as disk, RAM, and CPU are well characterized, query history is well understood, and data distribution can be defined. We could see 10x step-ups in query performance, cost, and resource utilization, and never see another nested-loop join again.
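As a toy illustration of the learned approach, here is a cost model fit from query history rather than hand-written rules. The features, names, and numbers are made up; a real system would learn over far richer workload telemetry:

```python
def fit_cost_model(history):
    """Least-squares fit of latency ~ w_scan*rows_scanned + w_join*rows_joined,
    solving the 2x2 normal equations directly. `history` is a list of
    ((rows_scanned, rows_joined), observed_latency) pairs. Assumes the two
    features are not collinear (so the determinant is nonzero)."""
    sxx = sxy = syy = sxl = syl = 0.0
    for (x, y), latency in history:
        sxx += x * x
        sxy += x * y
        syy += y * y
        sxl += x * latency
        syl += y * latency
    det = sxx * syy - sxy * sxy
    w_scan = (sxl * syy - syl * sxy) / det
    w_join = (syl * sxx - sxl * sxy) / det
    return w_scan, w_join
```

A planner could then rank candidate plans by predicted latency, re-fitting as the workload drifts, instead of relying on a static, hand-tuned cost formula.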

7. Data platform and engineering teams don’t want to be DBAs, DevOps, or SREs. They want their systems and services to just work, out of the box, and not have to think about resources, connection pooling, cache logic, vacuuming, query planning, updating indexes, and more. Teams today want a robust set of endpoints that are easy to deploy, and just work.

8. The need for operational real-time data is driving hybrid systems. Transactional systems can write new records into a table rapidly, with a high level of accuracy, speed, and reliability. Analytics systems can search across a set of tables and data rapidly to find an answer. With streaming data and the need for faster responsiveness in analytical systems, the idea of HTAP (hybrid transactional/analytical processing) systems is emerging, particularly for use cases that are highly operational in nature, meaning a very high rate of new writes/records and more responsive telemetry or analytics on business metrics. This introduces a new architectural paradigm, where transactional and analytical data and systems start to reside much closer to each other, but not together.

A New Category of Databases

A new category of cloud database companies is emerging, effectively deconstructing the traditional monolithic database stack into core layered services: storage, compute, optimization, query planning, indexing, functions, and more. Companies like ReadySet, Hasura, Xata, OtterTune, Apollo, PolyScale, and others are examples of this movement and are quickly becoming the new developer standard.

These new unbundled databases are focused on solving the hard problems of caching, indexes, scale, and availability, and are beginning to remove the trade-off between performance and guarantees: databases that are fast, always on, handle mass scale, and are data-aware, blurring the traditional divisions between operational and analytical systems. The future looks bright.

Welcome to the New Database Era was originally published on TechCrunch.


Orchestrating the Modern Data Platform, Venrock’s Investment into Astronomer

The backstory of our 2020 Series A lead investment into Astronomer.

— originally posted at —

Update since we invested (March 2022)

So much has happened since we led the Series A for Astronomer. Here is a brief snippet:


Ten years ago, the modern enterprise stack looked quite a bit different: teams of network admins managing data centers, running one application per server, deploying monolithic services through waterfall processes, with staged releases managed by an entire role labeled “release manager.”

Today, we have multi and hybrid clouds, serverless services, continuous integration and deployment, and infrastructure-as-code, with DevOps and SRE fighting to keep up with the rapid scale.

Companies are building more dynamic, multi-platform, complex infrastructures than ever. We see the ‘-aaS’ of the application, data, runtime, and virtualization layers. Modern architectures force extensibility to work with any number of mixed-and-matched services, and fully managed services, which effectively leave operations and scale to the service providers, are becoming the gold standard.

With limited engineering budgets and resource constraints, CTOs and VP Engs are increasingly looking for ways to free up their teams. They’re moving from manual, time-consuming, repetitive work to programmatic workflows, where infrastructure and services are written as code and abstracted into manageable operators such as SQL or DAGs, owned by developers.

If the 2010s represented a renaissance for what we can build and deliver, the 2020s have begun to represent a shift to how we build and deliver, with a focused intensity on infrastructure, data, and operational productivity.

Every company is now becoming a data company.

The last 12 months, in particular, have been a technology tipping point for businesses in the wake of remote work. Every CIO and CTO has been burdened with the increasing need to leverage data for faster decision-making, mounting pressure to move workloads to the cloud, and the realization that technology investments are a competitive advantage in a digital world. The digital transformation playing out over the last few years has compressed the next five years of progress into one.

With that, the data infrastructure has become a focal point in unlocking new velocity and scale. Data engineering teams are now the fastest-growing budget within engineering, and in many organizations, the fastest-growing budget, period.

Every company is now becoming a data company, with the data infrastructure at heart.

Building on a brittle data infrastructure

To understand an enterprise data infrastructure, start here: any company that relies on multiple data sources to power its business or make critical business decisions needs some form of data infrastructure and pipelines. These are systems and sequences of tasks that move data from one system to another and transform, process, and store the data for use. Metrics aggregations, instrumentation, experimentation, derived data generation, analytics, machine learning feature computation, business reporting, dashboards, and more all require automated data processes and tasks to compute data into the required formats.

Traditional approaches to building data infrastructure leveraged heavy ETL (extract, transform, load) tools (Informatica, SAS, Microsoft, Oracle, Talend) built on relational databases that require difficult, time-consuming, and labor-intensive rules and config files. With the introduction of modern application stacks and the massive increase in data and processing required, traditional ETL systems have become overly brittle and slow, unable to keep up with the need for increased scale, modularity, and agility.

The slightly more modern approach to scaling up data pipelines relies on batch processing: static scripts and schedulers that kick off specific tasks, i.e., pulling data from a particular source > running a computation > aggregating with another source > populating the updated aggregations to a data warehouse.

Static by design: data engineers must define the relationships between steps in a job, pre-define expected durations based on the worst-case scenario, and define run-time schedules, hoping the pipelines run as expected.
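The brittleness is easy to see in a sketch of this static style. The step functions here (`pull_source`, `compute`, `load_warehouse`) are hypothetical stand-ins for the real external systems:

```python
def pull_source(name):
    # Stand-in for pulling from a real source; if that source is down,
    # the exception goes unhandled and the whole job dies.
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": 15}]

def compute(rows):
    # Stand-in computation; one malformed record raises and kills the run.
    return sum(r["amount"] for r in rows)

def load_warehouse(total):
    # Stand-in load step; a partial load leaves no machine-readable trace.
    return total

def run_nightly_job():
    # Fixed order, no retries, no timeouts, no status reporting: any
    # failure anywhere means a human restarts the whole job by hand.
    return load_warehouse(compute(pull_source("orders")))
```

Every failure mode named below (stalls, overruns, overlapping runs) has to be handled by hand in scripts like this, which is exactly the gap orchestration fills.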

But static scripts become more brittle as the scale of dependencies increases. If a job stalls or errors for any reason, i.e., a server goes down, an exception is found in a data set, a job takes longer than its expected duration, or a dependent task is stalled, both the data engineering and DevOps teams have to spend countless hours manually identifying, triaging, and restarting jobs.

There are no systems to programmatically retry or queue jobs, prevent overlapping runs, enforce timeouts, or report errors and metrics in a machine-readable way. Further exacerbating the problem, the first question a data engineer asks when walking into the office every morning is “did all of my jobs run?”, and there is no single source of truth to answer it.

As jobs and pipelines grow in number and complexity, data engineering and DevOps spend most of their time manually monitoring, triaging, and reconfiguring their pipelines to keep data flowing to support the business, resulting in more energy spent on the underlying platforms than on actually running the data pipelines.

Meet Apache Airflow

In 2015, a project at Airbnb, aptly named Airflow, focused on solving the brittle data pipeline problem, replacing cron jobs and legacy ETL systems with workflow orchestration as infrastructure-as-code, allowing users to programmatically author, schedule, and monitor data pipelines. The core belief: when workflows are defined as code, they become more maintainable, versionable, testable, collaborative, and performant.
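Airflow's actual API is far richer, but the "workflows as code" idea can be sketched in dependency-free Python: tasks declared with their dependencies and retry counts, then executed in dependency order. Everything here is a made-up miniature, not Airflow itself:

```python
class Task:
    def __init__(self, name, fn, deps=(), retries=2):
        self.name, self.fn = name, fn
        self.deps, self.retries = list(deps), retries

def run_dag(tasks):
    """Run tasks in topological order, retrying failures up to each
    task's retry budget. Raises if the graph has a cycle."""
    done, results = set(), {}
    pending = list(tasks)
    while pending:
        progressed = False
        for t in list(pending):
            if all(d in done for d in t.deps):
                for attempt in range(t.retries + 1):
                    try:
                        results[t.name] = t.fn(results)
                        break
                    except Exception:
                        if attempt == t.retries:
                            raise
                done.add(t.name)
                pending.remove(t)
                progressed = True
        if not progressed:
            raise ValueError("cycle or unsatisfiable dependency in DAG")
    return results

# A tiny extract -> transform -> load pipeline, declared as code:
dag = [
    Task("extract", lambda r: [3, 4, 5]),
    Task("transform", lambda r: sum(r["extract"]), deps=["extract"]),
    Task("load", lambda r: f"loaded {r['transform']}", deps=["transform"]),
]
```

Because the pipeline is just code, it can be versioned, reviewed, and tested like any other software, which is the property the Airbnb team was after.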

Airflow today is the largest and most popular data infrastructure open-source project in the world. Adopted by the Apache Software Foundation in 2016 and promoted to a top-level project in 2019, it has, as of this writing, over 25K GitHub stars, 1,900 contributors, and over 8M downloads per month, and is one of the most energetic and passionate open-source communities we have ever seen, having just surpassed Apache Spark in contributors and Apache Kafka in stars and contributors. Airflow is used by thousands of organizations and hundreds of thousands of data engineers.

Airflow has been reinforced by the community as the standard and leader in data pipeline orchestration. With a mass migration off legacy ETL systems and the increasing popularity of the Airflow project, companies of all sizes and segments, from the F500 to the most emerging brands, are migrating to Airflow, so data engineering and DevOps teams no longer spend most of their time managing and maintaining data pipelines instead of building and scaling up new ones. Airflow hands control of pipelines back to data engineering from DevOps, removes hours of debugging failed or slow jobs, and expands teams’ capacity to build more complex and performant pipelines and run more jobs, faster, creating real business value.

And with the recent release of Airflow 2.0, the team introduced capabilities that go beyond workflow orchestration into job and task execution, replacing the many, many tasks that teams today push through heavy ETL processes rather than moving into Airflow, and marking the start of powering mission-critical operational use cases in addition to core analytics needs.

Meet Astronomer, the modern orchestration platform

Astronomer unifies your distributed data ecosystem through a modern orchestration platform

In most organizations using Airflow today, it has become one of the most important pieces of infrastructure, changing how data engineering teams operate.

But like most popular open-source solutions, Airflow was designed by the community, for the community. It lacks the enterprise capabilities needed as it proliferates across an organization, such as cloud-native integrations, flexibility to deploy across varied environment and infrastructure setups, high availability and uptime, performance monitoring, access rights, and security. Teams have to rely on homegrown solutions to solve these challenges as they scale up their installations to offer Airflow-as-a-service to internal customers.

That’s until we met Joe and Ry.

Joe, just finishing his tour of duty as CEO at Alpine Data and previously SVP Sales at Greenplum/Pivotal, and Ry, a top contributor to Airflow, saw an early but promising open-source project, with little roadmap and project direction, that could become the central data plane for the data infrastructure. They dedicated themselves to reinvigorating the project, bringing life and focus to its original intentions, realizing that data orchestration and workflow weren’t just a part of the data pipeline or infrastructure but the core of it. They were convinced that Airflow needed to be brought to the enterprise, and thus Astronomer was born.

They saw what Airflow could enable for organizations of all sizes: an enterprise-grade solution that was cloud-native, secure, and easy to deploy across any infrastructure or environment, cloud or customer-owned, solving the key immediate challenges enterprises face when deploying Airflow and needing high availability, testing and robustness, and extensibility into any infrastructure setup.

Taking it one step further, they deployed Astro Runtime, the managed Airflow service, engineered for the cloud as one integrated, managed platform. It gives enterprises complete visibility into their data universe, including lineage and metadata, through a single pane of glass. Start in less than an hour, scale to millions of tasks.

The result has been astro*nomical. Customers of all sizes and walks look to Astronomer to solve the challenges of scaling: availability, robustness, extensibility into their infrastructure, security, and support.

Today their customer base spans every industry, segment, and vertical, from the top F500s to the most emerging brands, drawn to the power of the Airflow project and community and confidence in the enterprise Astronomer platform.

Leading the Series A investment into Astronomer

We’ve long believed the data pipeline is central to the entire enterprise data stack; that workflow and orchestration form the meta-layer responsible for the speed, resiliency, and capability of the data infrastructure, with the underlying primitives (connectors, storage, transformations, etc.) easily replaced or augmented. As the needs of the enterprise increasingly shift from analytical use cases to powering business-critical operational needs, the orchestration layer will prove to be greater than the parts it executes.

As a result, we were fortunate to have been able to lead Astronomer’s Series A in April 2020, joining the board along with our friends at Sierra Ventures. Since then we’ve had the good fortune of being able to welcome our good friends Scott Yara and Sutter Hill who led their Series B, and Insight Partners, Salesforce Ventures, and Meritech with our Series C.

Exciting announcements are always sweeter with one more thing, so we also welcome Laurent, Julien, and the rest of the Datakin team to Astronomer!

This is just the beginning, and we can’t wait to share what’s next.

Investing in Astronomer was originally published on


Unlocking the Modern Data Infrastructure, Venrock’s investment into Decodable

Announcing our Seed and Series A investments into Decodable

Beginnings of the real-time data enterprise

Data is rapidly transforming every industry. Brick-and-mortar stores are turning into micro-distribution centers through real-time inventory management. Organizations are delivering 10x customer experiences by building up-to-the-second 360° customer views. Complex logistics services with real-time views into operational performance are delivering on time more than ever before. Data is allowing enterprises to respond to changes in their business faster than ever, and the volume of that data is increasing exponentially.

Just having the data stored in your data warehouse is no longer enough.

Real-time data isn’t for the faint of heart

Today, real-time has become the central theme in every data strategy.

Whether you are trying to understand user behavior, updating inventory systems as purchases occur across multiple stores, monitoring logs to prevent an outage, or connecting on-prem services to the cloud, the majority of use cases can be enabled by filtering, restructuring, parsing, aggregating, or enriching data records, then coding these transformations into pipelines that drive microservices, machine learning models, and operational workflows, or that populate datasets.
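Decodable expresses these transformations in SQL; as a language-neutral sketch, here is what the record-level operations (filter, restructure, enrich) look like over a stream of dict records in plain Python. The field names are made up for illustration:

```python
def transform(records, country_names):
    """Filter out bad records, restructure the rest, and enrich them
    with a lookup table, one record at a time as a generator."""
    for r in records:
        if r.get("amount", 0) <= 0:           # filter: drop bad records
            continue
        yield {                                # restructure + enrich
            "user": r["user_id"],
            "amount_usd": r["amount"] / 100,   # cents to dollars
            "country": country_names.get(r["country_code"], "unknown"),
        }
```

Because it is a generator, records flow through one at a time, the same shape a streaming pipeline takes; the hard part, as the next paragraphs argue, is everything around this logic, not the logic itself.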

But building the underlying real-time data infrastructure to power them has always been another story.

What originally seemed like a rather straightforward deployment of a few Kafka APIs quickly became an unwieldy hardcore distributed systems problem.

This meant that instead of focusing on building new data pipelines and creating business value, data engineers toil in the time-sink of low-level Java and C++, agonizing over frameworks, data serialization, messaging guarantees, distributed state checkpoint algorithms, pipeline recovery semantics, data backpressure, and schema change management. They’re on the hook for managing petabytes of data daily, at unpredictable volumes and flow rates, across a variety of data types, requiring low-level optimizations to handle the compute demands with ultra-low processing latencies. Systems must run continuously, without any downtime, flexing up and down with volume, without a quiver of latency.

Many Ph.D. theses have been written on this topic, with many more to come.

The underlying infrastructure needed to disappear

We’ve long believed that the only way to unlock real-time data was for the underlying infrastructure to disappear. Application developers, data engineers, and data scientists should be able to build and deploy a production pipeline in minutes, using industry-standard SQL, without worrying about distributed systems theory, message guarantees, or proprietary formats. It had to just work.

Introducing Decodable

Peeking under the hood of the Decodable platform

We knew we’d found it when we met Eric Sammer and Decodable.

We first met Eric while he was VP & Distinguished Engineer at Splunk, overseeing the development of their real-time stream processing and infrastructure platform. He joined Splunk through the acquisition of Rocana, where he was co-founder and CTO, building the ‘real-time data version of Splunk’. Eric was an early employee at Cloudera and even wrote the O’Reilly book on Hadoop Operations (!). A long way of saying Eric is one of the most sought-after and respected thought leaders in real-time, distributed systems.

His realization ran deep. Enterprises would be able to deploy real-time data systems at scale only after:

  1. The underlying, low-level infrastructure effectively disappeared, along with the challenges of managing availability, reliability, and performance
  2. The real-world complexities of real-time data such as coordinating schema changes, testing pipelines against real data, performing safe deployments, or just knowing their pipelines are alive and healthy, were solved under the hood
  3. It just worked out of the box by writing SQL

Eric started Decodable with the clarity that developers and teams want to focus on building new real-time applications and capabilities, free from the heavy lifting needed to build and manage infrastructure.

Abstracted away from infrastructure, Decodable’s developer experience needed to be simple and fast: create connections to data sources and sinks, stream data between them, and write SQL for the transformations. Decodable works with existing tools and processes, within your existing data platform, across clouds and data infrastructure.

Powered by underlying infrastructure that is invisible and fully managed, Decodable has no nodes, clusters, or services to manage. It can run in the same cloud providers and regions as your existing infrastructure, leaving the platform engineering, DevOps, and SRE to Decodable.

It was simple and easy to build and deploy, within minutes, not days or weeks, with no proprietary formats, and with just SQL.

No more low-level code and stitching together complex systems. Build and deploy pipelines in minutes with SQL.

Partnering with Eric & Decodable

Enterprises need to build differentiation by unlocking the value of their data, not in the underlying infrastructure that powers it. That infrastructure is nevertheless crucial, and it needs to be powered by a platform that just works, as if you had a 25+ person data engineering team dedicated to its uptime and performance. It needs to abstract the complexity and allow developers to create new streams in minutes with just SQL.

Eric is building the future of the data infrastructure, and today the platform has moved into general availability!

So we’re thrilled to partner with Eric and announce our Seed and Series A investments into Decodable, co-led with our good friends at Bain Capital. We couldn’t be more excited to support Eric to unlock the power of real-time data.

Unlocking the Modern Data Infrastructure, Venrock’s investment into Decodable was originally published on Medium.


Investment Follow-Up: Announcing Atom Computing’s Series B

Congratulations to Ben, Jonathan, Rob, and the team at Atom Computing on the close of their $60M Series B, and welcome to our new friends at Third Point Ventures and Prime Movers Labs to the team!

Since announcing Atom Computing’s seed round in 2018, we thought it would be fun to follow up on the original thesis to our investment compared to where we are today.

Atom Computing’s 100-qubit quantum computer, Phoenix

We’ve long been believers in the near-term reality of quantum computing. Quantum computers promise to unlock a new dimension of computing capabilities that are not possible today with classical computers, regardless of size, compute power, or parallelization.

We wrote in 2020 about the progress setting the stage for a commercial era of quantum computers and the drivers accelerating the age of commercial-ready quantum computing.

From risk analysis, Monte Carlo simulations, and determining chemical ground states, to dynamic simulations, FeMoCo, image and pattern recognition, and more, fundamental questions with economic impacts in the tens to hundreds of billions of dollars will be unlocked thanks to quantum computing.

Our original hypothesis on QC came from the belief that, regardless of Moore’s law, classical computers are fundamentally limited by a sequential processing architecture. Even with breakthroughs in SoCs, 3D-integrated memory, optical interconnects, ferroelectrics, ASICs, beyond-CMOS devices, and more, sequential processing means ‘time’ is the uncontrollable constraint.

For complex, polynomial, or combinatorial problem sets, with answers that are probabilistic in nature, sequential processing is simply not feasible due to the sheer amount of time it would take. Quantum computing, even with a few hundred qubits, can begin to offer transformational capabilities.
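A quick back-of-envelope shows why the classical wall appears so fast: simulating n qubits classically requires tracking 2^n complex amplitudes, so memory (and time) grows exponentially with qubit count:

```python
def classical_state_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store a full n-qubit state vector classically.
    16 bytes corresponds to one double-precision complex amplitude."""
    return (2 ** n_qubits) * bytes_per_amplitude
```

At 30 qubits the state vector is already about 17 GB; well before a few hundred qubits, the full state no longer fits in any classical machine, which is where even modest quantum hardware becomes transformational.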

But the challenges in quantum computing have traditionally stemmed from a lack of individual qubit control, sensitivity to environmental noise, limited coherence times, limited total qubit volume, overbearing physical resource requirements, and limited error correction. Solving all of these is a required precursor to bringing quantum computing into a real-world domain.

This understanding helped us develop a clear thesis to what a scalable architecture would need to look like when evaluating potential investments (re-pasted from our original seed post):

  1. An architectural approach that could scale to a large number of individually controlled and maintained qubits.
  2. Could demonstrate long coherence times in order to maintain and feedback on the quantum entangled state for long enough to complete calculations.
  3. Designed to have limited sensitivities to environmental noise and thus simplify the problem of maintaining quantum states.
  4. Could scale up the number of qubits without needing to scale up the physical resources (i.e. cold lines, physical wires, isolation methods) required to control every incremental qubit.
  5. Could scale up qubits in both proximity and volume in a single architecture to eventually support a million qubits.
  6. Could scale up without requiring new manufacturing techniques, material fabrication, or yet-to-be-invented technologies.
  7. The system could be error corrected at scale with a path to sufficient fault tolerance in order to sustain logical qubits to compute and validate the outputs.

We originally led Atom Computing’s seed in 2018 based on the belief that we were backing a world-class team, and that the architectural approach based on neutral atoms would be the right building block for a scalable system.

Three Years Later…

Atom Computing is the first to build nuclear-spin qubits out of optically trapped atoms, demonstrating the fastest time to 100 qubits ever, in under two years from founding, still a high-water mark for the QC industry in both qubit count and time.

They demonstrated the longest-ever coherence time for a quantum computer with multiple qubits, at over 40 seconds, compared to a next best measured only in milliseconds or microseconds, cementing that neutral atoms produce the highest-quality qubits, with higher fidelity, longer coherence times, and the ability to execute independent gates in parallel compared to any other approach.

Long coherence is a critical precursor to error correction and eventual fault tolerance.

They demonstrated scaling up to 100 qubits wirelessly controlled in a free space of less than 40μm, with no physical lines or resources for each qubit, no dilution chambers, and no isolation requirements. In the coming years they’ll show over 100,000 qubits in the same space a single qubit occupies in a superconducting approach.

And they proved they could recruit a world-class team of executives to bring the Atom Computing platform to the market, as Rob Hays joined as our CEO from Lenovo & Intel, Denise Ruffner joined as our Chief Business Officer from IonQ and IBM Quantum, and Justin Ging joined as our Chief Product Officer from Honeywell Quantum.

The heart of Phoenix, where qubits are made

They proved they could build a machine with a large number of individually controlled and maintained atoms. With the longest coherence time on record. Without the need for complex isolation chambers or resources. Could scale up quickly and densely. And required no new manufacturing, materials, or physics.

As the team begins to bring forward the Atom Computing platform with our second-generation system to run breakthrough commercial use-cases, the next three years will be even more exciting*.

We couldn’t be more proud of the team and what they’ve accomplished. The last three years have demonstrated a lot of firsts in the quantum computing industry. Can’t wait to share what happens in the next three years.

Investment Follow-Up: Announcing Atom Computing’s Series B was originally published on Medium.


Let’s Make A Deal: A Crash Course On Corporate Development

Wash, rinse, repeat: A startup is founded, first product ships, customers engage, and then a larger company’s corporate development team sends a blind email requesting to “connect and compare notes.”

If you’re a venture-backed startup, it would be wise to generate a return at some point, which means either getting acquired or going public.

If you’re going to get acquired, chances are you’re going to spend a lot of time with corporate development teams. With a hot stock market, mountains of cash and cheap debt floating around, the environment for acquisitions is extremely rich.

And as I’ve been on both sides of these equations, an increasing number of my FriendDA partners have been calling for advice on corporate development mating rituals.

Here are the highlights.

You need to take the meeting

Book a 45-minute initial meeting. Give yourself an hour on the calendar, but only burn the full 60 minutes if things are going well. Don’t be overly excited, be a pleaser, or ramble on. Pontificate? Yes, but with precision.

You need to demonstrate a command of the domain you’ve chosen. Also, demonstrate that you’re humble and thoughtful, but never come to the first meeting with a written list of “ways we can work together.” That will smell of desperation.

In the worst-case scenario, you’ll get a few new LinkedIn connections and you’re now a known quantity. The best-case scenario will be a second meeting.

But they’re going to steal my brilliant idea!

No, they aren’t. I hear this a lot and it’s a solid tell that an entrepreneur has never operated within a large enterprise before. That’s fine, as not everyone gets to have an employee ID number with five or six digits.

Big companies manage operational expenses, including salaries and related expenses, pretty tightly. And there frequently aren’t enough experts to go around for the moneyball startups in new domains, let alone older enterprises.

So there’s no secret lab with dozens of developers and subject matter experts waiting for a freshly minted MBA to return with their meeting notes and start pilfering your awesomeness. Plus, a key component to many successful startups is go-to-market (GTM), and most larger enterprises don’t have the marketing and sales domain knowledge to sell a stolen product.

They still need you and your team.

You will be the wisest person there

In these meetings, do not assume that the majority of the attendees will have the same level of product and market knowledge that you possess. In fact, treat early meetings as educational and exploratory.

Use this to your advantage: Have points of view that plant traps for your competitors and accentuate your uniqueness. By gauging how well a corporate development team knows a market and competitors, you can understand where they are in their process, how serious they are, and if you’re just a box being checked because the real target has been identified.

Who else is in the room?

I made my first acquisition in my mid-20s after failing to get operational expenditure to build a new product concept. In retrospect, the leadership of the company we acquired must have thought it odd that this random child was pursuing them, and they must have wondered if I was even authorized to be having the conversations I was having (no comment).

But you definitely want to understand who’s in the meeting. If you’ve got the general manager of a multi-billion dollar business unit, excellent. If it’s a summer intern who’s writing in crayon … this might not be a priority. But also remember this is a journey, so do the meeting, be respectful and hope things continue.

What is strategic?

Before my first company was acquired, I believed that every acquisition I’d ever read about was strategic and well thought out. I pictured a secret bunker with multiple forms of authentication protecting walls covered in market maps, competitive weaknesses and targets that will complete the quest for market leadership.

I was blindingly wrong. Acquisitions generally fall into three categories:

  1. Acqui-hire: Opportunistic — interesting tech, couldn’t get explosive growth, investors got tired, no one retires but the tech lives on, and many team members keep their jobs.
  2. Strategic: Decently well thought-out, high-growth business leading their market, premium dollars paid, founders get an earnout that delays private island purchase, lucite statues for everyone.
  3. Executive hubris: Sometimes a senior executive will get an idea in their mind and everyone around them is powerless to prevent impending disaster. Often, premium dollars are paid, but there’s no obvious synergy with the existing business and the offending executive is the only person who speaks in meetings, phrasing questions in the form of statements to their underlings in attendance.

It’s a marathon with a lot of sprinting

If there’s appetite for an acquisition from a larger company, you’re going to have a lot of intense moments followed by (perceived) lulls in activity. I tell folks on a path to get acquired that it’s going to take 17 meetings. But it’s actually going to be variations of the same meeting 17 times. Each meeting, there will be one new person and they are the Most Important Person in the meeting.

It’s likely they’re the next rung up the leadership ladder or part of a new function that needs to engage on the deal (Sales, for example). If you get to 11 meetings and then no more, well, you know that the 11th Most Important Person more than likely played their veto card (sorry).

At the same time, quiet does not mean over. When a larger company is doing an acquisition there are dozens (sometimes hundreds) of people who get involved. That’s a lot of people to brief, answer questions from, advocate with, etc. It doesn’t mean your potential buyer has lost interest just because they don’t call you daily.

But don’t be surprised if you suddenly get a flurry of questions after periods of silence. If things are going well, this is your chance to partner with the corporate development team and make them look like heroes by getting back the best information you can to answer the (likely) random questions.

What about bankers?

If you aren’t running a formal sale process, corporate development teams will get annoyed if you show up with a banker at the first meeting. And, well, all the meetings. The really good investment bankers will always annoy a corporate development team, because this is Now Serious and the power dynamic will shift from the big company to the banker (note: it does not shift to you, the entrepreneur).

Assuming there’s no process leading up to the first meeting, you should seek out a banker only once the intentions of the larger organization are clear. This will likely be after things like joint customer exploration, a partnership, etc. Also, bankers are expensive, so make sure a senior leader at the potential acquirer has signaled that they want to get married and at a price that generates an ROI on those fees.

Bankers will also be happy if you’ve been engaged with multiple corporate development organizations, because they can then manufacture FOMO.

Strategic options

Every corporate development team does the same exercise: build, buy, partner. I doubt there has ever been an acquisition where someone didn’t have to generate this spreadsheet.

The reality is that this exercise has never generated an “oh, we should build this” result once you’re past a few meetings. But if you’re in early meetings and there’s an engineering VP who’s demonstrating their intelligence, yes, they might try to build something.

So the discussions will go on ice, the company will try to build — and fail, and then discussions will resume. (Again, they aren’t going to steal your source code, etc.) Remember: Play the long game.

The partner option is a little harder to navigate. Don’t be fooled by someone who says, “We have 18,000 sales people who will sell your product.” It takes a lot of work to partner with a bigger company and they might drag you into deep water before you know it.

Let’s Make A Deal: A Crash Course On Corporate Development was originally published in TechCrunch.


13 Problem Areas in Data Infrastructure to build NewCos

Sharing the problem areas and opportunities where we could see large, rapid growth as new challenges emerge.

A year ago, we decided to start open-sourcing our thinking around problem areas of interest, starting with 15 Problem Spaces in Developer Tools & Infrastructure We’re Excited About at Venrock, with the hope that it would spark conversations and bring teams together to go after exciting opportunities. With so much having changed in the world, we decided to follow up almost a year later with our next iteration.

The last 12 months have been a technology tipping point for businesses in the wake of remote work, the increased need to leverage data for faster decision making, increased pressures of moving workloads to the cloud, and the realizations of technology investments as a competitive advantage in a digital world. The digital transformation we’ve seen play out over the last few years just compressed the next five years of progress into one.

Companies such as Astronomer*, DBT, Decodable*, Imply, Materialize, Superconductive, and more — built on core open source projects — have seen meteoric rises as a result of increased focus on data engineering and delivering business value through unlocking data velocity.

Private valuations soared for companies such as Databricks, Confluent, and DataRobot, ushering in massive infrastructure transitions from the likes of legacy incumbents Cloudera, Talend, Oracle, and Informatica as they modernize their enterprise capabilities.

Public companies such as Snowflake, Mongo, Cloudflare*, and Twilio are seeing historic 20x–60x EV/revenue multiples as ‘digital transformation’ shifts into second gear, with a concerted focus on modernizing the infrastructure and data planes in order to unlock data as a competitive edge and reduce operational overhead to move faster. We’ve previously written about this as the evolution to an ‘everything as code’ model and the era of the programmatic infrastructure.

While a lot has changed in the last 12 months, much also remains the same, with continued opportunities to evolve how organizations build services, deploy infrastructure, distribute resources, increase data velocity, secure applications, and begin to leverage machine learning for workload-specific optimizations.

If the 2010s represented a renaissance for what we can build and deliver, the 2020s have begun to clearly represent a shift to how we build and deliver, with a focused intensity on infrastructure, data, and operational productivity.

As we look forward over the next 12–24 months, here are 13 more problem areas we’ve been excited about. If any of them resonate with you, or if you have comments/thoughts, please reach out!

  1. Persistence layer replication is still an unsolved problem in true multi-cloud deployments. The evolution of the multi-cloud is allowing applications to become more cloud-agnostic. You can now deploy wherever there is capacity or specialized services available. While you can elastically scale up your application servers, there is no way to auto-scale your persistence layer. As soon as you talk about disk storage, cross-DC communication latency becomes untenable. The bigger your persistence layer footprint is, the more sharded your data becomes, and the more replication becomes an architecturally limiting problem.
  2. The AWS console is the worst. Not data infra specific, but a challenge that plagues the entire infrastructure workflow.
  3. Streaming data continues to require significant resources to ingest, build, and manage new pipelines. Streaming data requires a very different approach than batch: higher data velocity, near real-time processing, and unpredictable data loads, with limited out-of-the-box infrastructure available. Companies such as Decodable* are making this easier, but it’s still early days.
  4. Kubernetes management still relies on YAML. For all the advancements containers and orchestration enable for the modern application stack, we are still managing and configuring them like it’s 2008. Imagine a world without YAML. Companies such as Porter* are making this easier, but YAML continues to be the leading cause of insomnia for SREs and DevOps.
  5. Snowflake is becoming the new Oracle (the good and the bad). The move from hardware to a fully managed cloud data warehouse was a huge leap in capability, flexibility, and cost (we thought so at least), and has reinvigorated a ‘SQL renaissance’. At the same time, it has left a lot to be desired in an attempt to be the ‘catch-all’ data warehouse. Concurrency is often limited to 8–15 queries before needing to spin up a new node. No indexes exist so you must rely on system compression strategies and metadata options. Some queries can be painfully slow if you rely on any type of joins or scans. Setup requires explicit knowledge of what the data is, how it will be used, and when to separate ingestion from reporting, when to split deployments, etc. Migrating from a traditional database is riddled with data quality issues due to the lack of indexes, unique or primary keys, etc.
  6. The Data Lakehouse idea of bringing the best of both Data Warehouses and Data Lakes together (engineers are terrific at naming) has been a meaningful step forward but continues to be too complex to deploy and manage. Data ingestion vastly differs between streaming and batch data. Compute and storage need to be decoupled. Storage layers need to be purpose-designed based on the data (document, graph, relational, blob, etc.). Active cataloging is required to keep track of sources, data lineage, metadata, etc. Do we need/want more one-size-fits-all solutions vs. vertical/specialized ones?
  7. Machine learning in infrastructure management and operations is still incredibly nascent. While we significantly overestimated the likely number of machine learning models in production powering business-critical use cases in user applications, applying ML to both stateful and stateless infrastructure would be a no-brainer. Available structured log data for training, known downstream and upstream dependencies, available time-series event data, and bounded capacity constraints make for a perfect use case for supervised and unsupervised learning to build management planes that take the reactive blocking & tackling out of infrastructure management.
  8. The underlying theory behind database indexes hasn’t changed in decades. We continue to rely on one-size-fits-all, general-purpose indexes such as B-Trees, Hash-maps, and others that take blackbox views of our data, assuming nothing about the data or common patterns prevalent in our data sets. Designing indexes to be data-aware using neural nets, cumulative distribution functions (CDF) and other techniques could lead to significant performance benefits leading to smaller indexes, faster lookups, increased parallelism, and reduced CPU usage. Whether multi-dimensional or high volume transaction data systems, memory-based or in-disk, data-aware systems are already demonstrating step function benefits over the current state-of-the-art systems.
  9. There is little-to-no machine learning used to improve database performance. From improved cardinality estimation, query optimization, cost modeling, workload forecasting, to data partitioning, leveraging machine learning can have a substantial impact on query cost and resource utilization, especially in multi-tenant environments where disk, RAM, and CPU time are scarce resources. Gone can be the days of nested loop joins and merge joins + index scans!
  10. Data quality and lineage are still mostly unsolved, despite many attempts at pipeline testing and similar solutions. Unlike software development, there is limited ‘develop locally, test in staging, push to production’. How do business users and analysts know when to feel confident in a certain dataset or dashboard? Can we apply tiers or ratings to certain data sources or pipelines as a way to determine confidence in uptime/lineage/quality/freshness? And how can engineering or ops track and remediate issues once models/workloads are in production?
  11. Our modern data stacks have been overbuilt for analytics and underbuilt for operations. While the current analytics-centric approach provides a strong foundation, the shift to powering operational, more complex use cases is still in its infancy. Enterprise executives are beginning to ask how these data infrastructure investments can help speed up supply chain fulfillment, connect demand forecasting to capacity planning, improve preventative maintenance, respond to user engagement/problems faster, act on clickstream data, and more. Where are the modern equivalents of Snowflake, DBT, Fivetran, etc. for operational business needs?
  12. Where does application serving fit into the modern data stack, where high concurrency and low latency are required (the opposite of a data warehouse)? While solutions exist for read-only workloads, they usually mean copying data over to Redis, Memcached, etc. Check out Netflix’s bulldozer for an idea of how this can be done at production scale: a self-serve data platform that moves data efficiently from data warehouse tables to key-value stores in batches, making the data warehouse tables more accessible to different microservices and applications as needed. The enterprise ‘bulldozer’ could be a massive hit.
  13. “Excel is production” is the unfortunate standard for many critical business workloads, often because data engineering is a bottleneck in moving business-critical workloads into the production services they should be. The challenge is multifold. Data ingestion and processing are often managed through a series of highly sequenced and brittle scripts. Excel or Google Sheets is used as the data warehouse. Complex, 500-line queries are driving business processes. The migration to a production-quality service is untenable without a complete rewrite by data engineering. How can we build services that enable data analysts and business users to create production-grade workloads from the start?
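To make item 8 concrete, here is a minimal sketch of a data-aware index: a simple least-squares linear model approximates the CDF of a sorted key set to predict a key's position, and the model's recorded worst-case error bounds a local binary search. All names here are hypothetical illustrations, not any specific system's implementation; real learned indexes use more sophisticated model hierarchies.

```python
# Hypothetical sketch of a "learned" (data-aware) index: a linear model
# approximates the CDF of a sorted key array to predict each key's
# position, and a bounded binary search corrects the prediction error.
import bisect

class LearnedIndex:
    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        # Fit position ~= slope * key + intercept by least squares.
        mean_k = sum(sorted_keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in sorted_keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(sorted_keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error bounds the search window.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(sorted_keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        # Search only within [pred - max_err, pred + max_err].
        pred = self._predict(key)
        lo = max(0, pred - self.max_err)
        hi = min(len(self.keys), pred + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

On near-linear key distributions the error bound is tiny, so each lookup touches a handful of positions instead of the O(log n) node hops of a general-purpose B-Tree, which is the intuition behind the claimed smaller-index/faster-lookup benefits.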
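And for item 12, a toy sketch of the bulldozer pattern: periodically batch-copy warehouse rows into a key-value store keyed by primary key, so latency-sensitive services do point lookups against the cache instead of querying the warehouse. The `bulldoze` function and dict-backed store are hypothetical stand-ins for a real warehouse client and Redis/Memcached, not Netflix's actual implementation.

```python
# Toy sketch of warehouse-to-KV batch serving: copy (key, row) pairs
# from a warehouse query into a key-value store in bulk-write batches.
from typing import Callable, Dict, Iterable, Tuple

def bulldoze(
    fetch_rows: Callable[[], Iterable[Tuple[str, dict]]],
    kv_store: Dict[str, dict],
    batch_size: int = 1000,
) -> int:
    """Copy rows into the KV store in batches; return the row count."""
    batch, copied = {}, 0
    for key, row in fetch_rows():
        batch[key] = row
        if len(batch) >= batch_size:
            kv_store.update(batch)  # one bulk write per batch
            copied += len(batch)
            batch = {}
    kv_store.update(batch)  # flush the final partial batch
    return copied + len(batch)

# Example: "warehouse" rows become point lookups for an application.
rows = [(f"user:{i}", {"id": i, "ltv": i * 1.5}) for i in range(5)]
cache: Dict[str, dict] = {}
total = bulldoze(lambda: rows, cache, batch_size=2)
```

The design choice worth noting is the bulk write per batch: against a real KV store each `update` would be a pipelined `MSET`-style call, amortizing network round-trips across the batch.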

*Venrock portfolio company

13 Problem Areas in Data Infrastructure to build NewCos was originally published on Medium.


First Time Founder – Now What?

A discussion of the steps that developers can take to becoming entrepreneurs and turn their coding project into a funded start-up company.

Over the past year, a number of early-stage startups led by IT practitioners-turned-founders have been funded, and countless more are currently seeking capital. Many of those founders are on unfamiliar ground, building a startup for the first time. Making the transition from practitioner to founder is a non-trivial move, and it can be difficult to navigate without the right resources and guidance, but there are some universal steps that will set you on the right path.

If you’re still reading this, there’s a good chance you’re an IT practitioner. And chances are you’ve been solving problems for your company while gaining a unique insight into where today’s technology is lacking. You’re probably leading your company’s IT department or managing your cloud infrastructure, but are itching to explore your budding start-up idea. While you have the insight and expertise to make it happen, you’re unsure about how to switch gears and take those first steps towards entrepreneurship. Unlike developers who’ve been checking in code forever, you might not know where to begin – technically. You’ve also probably come across a number of companies that have proven themselves successful by becoming a unicorn, getting acquired for a large sum, or going public; you want to join the club.

For those looking for more information on starting a company, here are some tips for navigating the start-up ecosystem and maybe even thwarting some common mistakes.

Building a Start-Up 101 

  1. Start with some key questions: It’s important to determine whether your solution is a feature, a product, or a company. You’ll also want to figure out if you truly have product-market fit, all the while making sure that you’ll be able to have more than just a few great customers – think long-term.
  2. Truly understand the market that you’re going into. Research how big the market is and know who is in the market right now. Figure out how you are going to be better than what’s already out there, or understand why no one else has solved this problem already. If someone has already failed at trying to solve the problem you’re focusing on, why did they fail? Or if you are building off of an existing technology, what would motivate someone to choose your solution specifically?
  3. Build a strong founding team. It’s crucial to consider how a potential co-founder works on a professional and interpersonal level. Once you’ve started a company with someone, you’re going to be in it for the long haul; in other words, it’s really hard to get a divorce. The start of a journey is when everyone is on their best behavior, so note any red flags. Within your founding team, you’ll also want to have someone who specializes in the technical aspects, someone who has product experience, and someone who understands marketing and go-to-market. Determining responsibilities early on will also help in the long run.
  4. Know your story. Make sure you can describe your business in a way that is universally easy to understand and compelling. Ask yourself, “Can I explain my solution to this complex problem through a 30-second elevator pitch?” Then practice, practice, practice.
  5. A great pitch deck will go a long way. Cover the basics by including the problem, the market, your solution, your team, your traction (if any), and your special sauce or why you are going to win.
  6. Understand how venture capital operates. It may sound non-obvious, but when raising money, it’s helpful to consider what your *next* round will look like. Always “pre-bake” for the next funding round by outlining key milestones which will pave a path to the next round. Also, keep in mind who you want to be involved in that round. These days, the lines between rounds are blurred, but generally, you can think of the first round as the one to get you to product-market fit, the second round as the one that helps you sell the product in the market, the third round as the one that helps you scale up, and the fourth as the one to further expand your sales motion.
  7. Honesty. Be honest, communicate honestly, and create a culture of honesty. Bonus: culture will shine through to the customer and help scale sales growth. Upfront communication between colleagues will also help maintain relationships and help set you up for success.
  8. Have a solid advisory board. You want an advisor who: 1. provides business advice, 2. opens the door to their network (a.k.a. potential customers), and 3. has a strong reputation that will help you gain credibility. You likely won’t get advisors that check all the boxes, but if you do, that’s amazing!
  9. Saying no is just as important as saying yes. Early pilot customers are extremely useful for gaining hands-on experience and tremendous amounts of feedback, but you’ll need to sift out the bad advice. Venn Diagram all of the feedback you receive to see what overlaps across different aspects – that’s what you should be focusing on.

Most importantly, remember that it’s a journey. This won’t be an overnight success and there isn’t just one predefined path. In many cases, you’re going to feel like every problem or failure you encounter will be the one that ends it all, but, in hindsight, it won’t be. Remember that you are not alone. In fact, it’s extremely common to run into an avalanche of challenges when building a company – you just have to keep going.

First Time Founder – Now What? was originally published on DZone.


Building Commercial Open Source Software: Part 4 — Deployment & Business Model

Building a Commercial Open Source Company

In our time investing and supporting open-source companies, we’ve learned several lessons about project & community growth, balancing roadmap and priorities, go-to-market strategy, considering various deployment models, and more.

In the spirit of open source, we wanted to share these learnings, organized into a series of posts that we’ll be publishing every week — enjoy!

PART 4: Your deployment model is your business model

  1. Serverless does not make a company: Offering a ‘serverless’ managed service for your project can be a significant boon to kick-start customer adoption; it’s easy to get up and running, and since operating a sophisticated service 24/7 is non-trivial, it removes the operational complexities of devops and infrastructure. At the same time, a hosted open core does not make for a company. Offering a serverless version of your open core can be a great initial start but is not sufficient. You need to ask yourself: can your offering be differentiated enough from a cloud provider offering the same thing? Can a customer scale up on your managed service or will they eventually need to migrate onto their own infrastructure for security, residency, policy, or other enterprise needs? Can you continually build layers of value on top of your open core and deliver them all through the managed service? Or are you really just reselling cloud compute resources from AWS? Companies such as Mongo, Redis, Astronomer*, and Cockroach have gone well beyond managed versions of their open cores in order to drive value on top of them.
  2. My service, your infrastructure: The type of workloads, users, and use cases should dictate the deployment strategy, and doesn’t need to be limited to managed service or self-hosted. Think about strategies that meet the infrastructure requirements of the customer while making it as easy to deploy, manage, and scale as possible. You might offer managed orchestration but running within the customer’s infrastructure — allowing the customer to host within their cloud account but solving the operational challenges of keeping the service running. Or offer as a private ‘as a service’ for an entire customer organization to run on, making it easy to deploy on any cloud or platform within a customer’s infrastructure.
  3. Security as a deployment model: Workloads running in data centers or in the private cloud still make up two-thirds of worldwide IT infrastructure. This means it’s highly likely that your customer may be migrating from an on-prem or private cloud instance to your service. Recently, VPCs have been the baby-step solution for this, but they lack many of the benefits of a multi-tenant, public cloud service. This is where offerings such as MongoDB’s Atlas have introduced a pseudo-VPC, or VPC-as-a-service: an enterprise-friendly ‘-aaS’ offering that is still performant, reliable, scalable, and ultra-secure. It’s effectively a fully managed environment and service, offering the performance, reliability, and scale of a multi-tenant public cloud service but with the data security, namespaces, and isolation of a virtual or on-premise private cloud. Some services can now offer network isolation, role-based access management, bring-your-own SSO/SAML, and end-to-end encryption, but on a multi-tenant cloud offering — operating like cattle, not pets.
  4. Support-first is not a death wish, it’s a distribution strategy: Support-focused models get a bad rap, often being called a ‘lifestyle business’: enough revenue for a comfortable lifestyle, but unlikely to generate VC returns. In general I agree with this, with one caveat, which is the idea of *starting* with a support-first model to drive broader adoption of an OSS project built on fairly new technology. It’s a compelling way to acquire customers and generate value/lock-in before introducing new closed features that amplify the value of the open core. Heptio was a great example of this (open core + support/services), building a powerful land/expand motion that eventually led to more powerful enterprise features.
  5. Pricing should be a feature, not a bug: Think about your pricing strategy as a way to further reinforce the value you are delivering. Whether you are building a faster DB, a better way to build data pipelines, automation workflows, etc., pricing should reflect how developers use and benefit from your solution. Whether it means charging by instance, consumption, users, volume, features/capabilities, it should be easy to predict and grow as the customer value grows.

Building Commercial Open Source Software: Part 4 — Deployment & Business Model was originally published on Medium.


Earn While Doing What You Love

When I was in high school I had the good fortune to earn a spot on the Jones Beach Lifeguard Corps. It was a job that was every bit as fun as it sounds, and because we were unionized state employees, it paid decently too. Our days involved sitting on the lifeguard stand every other hour, staring intently at our patch of ocean, followed by an hour off, during which we were encouraged to exercise, take out the surf lifeboats, or “patrol” the sand. I remember commenting to one of the grizzled veterans several decades my senior that “I would do this job for free.” He looked at me with a knowing eye, tinged with the pitying look of a chess player who knows they are at least two moves ahead of you, “but the thing is kid, they do pay us to do this.” Those summers were an early lesson in the harmony of getting paid for doing something you truly love.

Because I am passionate about entrepreneurship and software, I am still earning a living doing what I love, as an early stage technology venture capitalist. For many people, however, neither business nor technology sparks joy. For them, teaching yoga, or fitness, or cooking, or magic, or art, or you-name-it, is what they love. Being chained to a laptop with seven browser tabs open so they can create email campaigns, manage customer lists, process payments, and balance their business accounts, is at best a necessary evil to enable them to earn income pursuing their passion.  

The aforementioned state of affairs had long held, but in March of 2020 Covid-19 threw a curveball at small businesses everywhere, especially those dependent on serving clients face to face. All of a sudden small business owners needed to go virtual by figuring out how to use Zoom, accept online payments, and hopefully make up some of their lost revenue by serving a potentially bigger, geographically dispersed audience. Many employees of these small businesses saw their work hours shrink or faced painful furloughs. For some of them, however, necessity led them to branch out on their own, serving clients directly through video conferencing, with neither the limitations nor the safety net of working for someone else. Add to the mix countless others who saw the opportunity to turn a personal hobby into an income-producing “side hustle” as virtual services quickly went mainstream.

Enter Luma, a startup founded, quite appropriately, by two engineers who had only ever met over video conference.* In March 2020, Dan and Victor quickly saw the need to help solopreneurs, small businesses, and groups invite people to virtual events, accept payments, and manage customer relationships. They applied their skills as full-stack programmers to quickly launch an MVP, which met with quick success. Because Zoom was designed primarily for business meetings and webinars, Luma saw an opportunity to leverage Zoom for many other use cases by enabling customizable event pages, CRM and membership management, subscriptions, payments, and easily understandable analytics for event hosts. Luma has been used for hosting fitness classes, magic shows, cooking classes, writers workshops, live podcasts, PTA speaker series, and a myriad of other activities. The list of future features, use cases, and target user segments grows longer every day.

While Dan and Victor were quick to jump into action with Luma back in April, now that Zoom has become a verb they are hardly the only ones to see the need for a virtual event platform. What drew me to invest in these two founders, however, is their incredible ability to get stuff done, their high bar for quality and customer service, and the relentless intellectual curiosity that drives them to understand how to improve the lives of their users, so that hosts and guests alike can spend more time doing what they love while the fiddly bits of technology and managing a business become nearly invisible.

One great example of Dan and Victor’s commitment to customer centricity: one evening a few months ago I was about to log on to a parent education event hosted by Common Ground Speaker Series when I realized I had failed to pre-register and so was missing the Zoom link. I found a live chat help button, not knowing whether anyone from Common Ground would actually be there at that late hour, and lo and behold, Victor popped up in the chat within seconds and immediately worked behind the scenes with the event host to get my registration accepted so I could receive the link. Victor himself was providing live support to an event host at the end of a day filled with coding new features, working on strategic planning, creating marketing campaigns, recruiting team members, and donning the dozens of other hats a startup founder wears. All that, and I’ve never seen either Victor or Dan without a huge smile on their faces. Luma’s founders embody the commitment, optimism, and truth seeking that great founders embrace, which is ultimately why we invested in them and why we are so excited for the journey ahead. Luma helps people earn a living doing what they love. I am fortunate to earn my living helping great founders like Dan and Victor.

*Dan has since relocated to San Francisco and the two founders are now bubble-mates working together shoulder to shoulder.

Earn While Doing What You Love was originally published on VCWaves.


Building Commercial Open Source Software: Part 3 — Distribution & GTM

Building a Commercial Open Source Company

In our time investing in and supporting open-source companies, we’ve learned several lessons about project and community growth, balancing roadmaps and priorities, go-to-market strategy, deployment models, and more.

In the spirit of open source, we wanted to share these learnings, organized into a series of posts that we’ll be publishing every week — enjoy!

PART 3: Sequence your distribution & GTM strategy in layers


1. Vibrant communities make for the best lead generation:
The open-source popularity of a project can become a significant factor in driving far more efficiency and virality in your go-to-market motion. The user is already a “customer” before they even pay for it. As the initial adoption of the project comes from developers organically downloading and using the software, you can often bypass both the marketing pitch and the proof-of-concept stage of the sales cycle.

2. Start bottoms up, developers first: Focus on product-led growth by building love and conviction at the individual developer level. Make it easy to sign up, deploy, and show value. Leverage developer referrals, community engagement, and content marketing, and build the product-led growth mentality into each function of your company. Nailing the developer experience can lead to growth curves that look much more like a consumer business than an enterprise one, and to much larger deals later on.

3. Nail bottoms up before enterprise selling: You can focus on product-led growth (bottoms up) or account-based marketing (top-down), but not both at the same time. Start with product-led growth to build an experience that developers love. Once you’ve reached some critical mass with a flywheel of developer acquisition, begin to introduce account-based marketing, starting with the expansion of existing customers to learn the enterprise sales motion before going after new accounts.

4. Developer first doesn’t mean developer-only: While nailing the developer-first experience is key to driving strong customer growth, it’s often not sufficient when trying to scale the project into larger deployments. Transforming a proof of concept into multiple large-scale deployments across the customer’s organization requires a different set of decision-makers and requirements (i.e. security, policies, control, SLAs). Be sure to understand how the needs of the organization may differ from the needs of the developer when planning how to expand deal sizes and go after larger customers.

5. Build your sales funnel based on project commitment: Customers come in three coarse flavors: (1) they have already deployed the OSS project internally, (2) they are starting to deploy it, or (3) they have just decided to adopt it. Tailor the sales motion to each stage of the customer journey in order to focus on solving the right problems and challenges.

6. Target the ‘right’ developer: It’s critical to know who you are solving for and what you are solving for them. Going after the wrong developer persona can make a critical difference in whether or not the developer community understands and embraces your solution. Is this a solution for DevOps or data engineering? Technical business users or data scientists? An example data infrastructure project could be seen as (a) making it easier for DevOps to manage, (b) shifting power from DevOps to engineering, (c) helping data engineering leverage better code patterns, or (d) making it more secure for SecOps to manage data access. All four are value props of the same solution, yet each persona has very different problems, with different value associated with them. Focusing on the right persona, with the most painful problem, where you can continually layer value over time, is critical to building wider community love and commercial adoption.

7. Sell impact, not solutions: Help the customer understand the total cost of ownership (TCO) of your solution versus an existing closed or in-house system; this matters, and it is rarely done well by customers making buy/build decisions. Understanding the value and ROI your service delivers, both hard and soft, allows you to sell on impact to the business rather than on a technical solution. Are you saving developer headcount? Increasing developer productivity? Reducing infrastructure costs? Taking cost out of a more expensive legacy system? Being clear on the cost savings and velocity benefits of your solution drives up customer contract values.
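A TCO comparison like the one described above can be sketched as simple back-of-the-envelope arithmetic. All numbers below (infrastructure spend, fractional headcount, loaded salary) are made-up illustrations, not benchmarks.

```python
# Hypothetical annual TCO comparison: running an OSS system in-house vs.
# buying it as a managed service. Every figure here is illustrative.

def annual_tco(infra_cost: float, engineer_count: float,
               loaded_salary: float = 200_000) -> float:
    """Yearly cost = infrastructure spend + fractional engineering headcount."""
    return infra_cost + engineer_count * loaded_salary

# Self-managed: cheaper infrastructure, but ~1.5 engineers to run it.
in_house = annual_tco(infra_cost=120_000, engineer_count=1.5)

# Managed service: higher sticker price, but only ~0.25 engineers of upkeep.
managed = annual_tco(infra_cost=180_000, engineer_count=0.25)

print(f"in-house: ${in_house:,.0f}")   # in-house: $420,000
print(f"managed:  ${managed:,.0f}")    # managed:  $230,000
print(f"savings:  ${in_house - managed:,.0f}")
```

The point of the exercise is that the "soft" cost (engineering time) often dominates the "hard" infrastructure line item, which is exactly the impact argument the post recommends leading with.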

Building Commercial Open Source Software: Part 3 — Distribution & GTM was originally published on Medium.