30 startups that show how open source ate the world in 2021



It has been a busy year in the open source software sphere, from high-profile license changes to critical zero-day vulnerabilities that sent businesses into meltdown. But amid all the usual hullabaloo that permeates the open source world, countless open source startups launched new products, attracted venture capital (VC) money, and generally reminded us of the role that open source plays in today’s technological landscape, including the data sovereignty and digital autonomy it promises companies of all sizes.

Here, we take a look at some of the fledgling commercial open source companies that gained traction in the past year, revealing where enterprises and investors are betting on the power of community-driven software.

Polar Signals: Cutting cloud bills

Continuous profiling belongs to the software monitoring category known as observability. It’s chiefly concerned with monitoring the resources that an application is using, such as CPU or memory, to give engineers deeper insights into what code — down to the line number — is consuming the most resources. This can help companies reduce their cloud bill, given that most of the major cloud platform providers charge on a consumption basis.
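To make the idea concrete, here is a minimal sketch of profiling a single call with Python's standard-library cProfile module. It is a one-off profile of an invented workload, not a continuous profiler and not how Parca works; the functions slow_path, fast_path, and handle_request are hypothetical stand-ins for application code.

```python
import cProfile
import pstats


def slow_path(n: int) -> int:
    # Deliberately wasteful: a pure-Python loop that dominates CPU time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_path() -> int:
    return 42


def handle_request() -> int:
    # Hypothetical application entry point that calls both code paths.
    return slow_path(200_000) + fast_path()


def hottest_function(entry_point) -> tuple[str, int]:
    """Profile one call; return the (name, line number) with the most own CPU time."""
    profiler = cProfile.Profile()
    profiler.enable()
    entry_point()
    profiler.disable()
    # stats maps (file, line, name) -> (call count, primitive calls, own time, cumulative time, callers)
    stats = pstats.Stats(profiler).stats
    (_, line, name), _ = max(stats.items(), key=lambda item: item[1][2])
    return name, line


if __name__ == "__main__":
    name, line = hottest_function(handle_request)
    print(f"Most CPU time spent in {name} (defined at line {line})")
```

A continuous profiler gathers this kind of data all the time, at low overhead, across a running fleet, rather than around one call as shown here.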

While there are a few continuous profiling products on the market already, Polar Signals officially went to market back in October with the launch of an open source project called Parca. At the same time, Polar Signals also raised $4 million in seed funding from Alphabet’s venture capital arm GV and Lightspeed, as it gears up to launch a commercial hosted product in 2022.

Unleash: Open source feature management

Above: Unleash: An open source feature flag platform

Feature management is an important part of the continuous integration/continuous deployment (CI/CD) process, one that allows developers to test new features incrementally with a small subset of users, turn features on or off, and A/B test alternatives to gain insights into what works best, all without having to ship a whole new version.
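As a rough illustration of an incremental rollout, the sketch below hashes each user into a stable bucket and enables a flag for a chosen percentage of users. This is a generic technique written for this article, not Unleash's API; the flag name and user IDs are invented.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage."""
    return bucket(flag_name, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical flag rolled out to roughly 10% of users.
    users = [f"user-{i}" for i in range(1000)]
    enabled = sum(is_enabled("new-checkout", u, 10) for u in users)
    print(f"{enabled} of {len(users)} users see the new checkout")
```

Because the bucket depends only on the flag name and user ID, a given user gets the same answer on every request, and raising the percentage only ever adds users to the enabled group.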

Unleash is an open source platform that promises companies greater flexibility and control over their data and feature management deployment. The company raised $2.5 million last year to build on its recent growth, which has seen it secure customers such as Lenovo and U.S. manufacturing giant Generac.

Conduktor: A GUI for Kafka

Above: Conduktor founders Stéphane Maarek (CMO), Stéphane Derosiaux (CTO), and Nicolas Orban (CEO)

Companies that need real-time data in their applications often use Kafka, an event streaming platform built to handle common business use cases such as processing ecommerce payments, managing signups, and matching passengers with drivers in ride-hailing apps. Some 80% of Fortune 100 companies use Kafka to store, process, and connect all their disparate data streams, but fully leveraging it requires significant technical nous and resources. This is where Conduktor is setting out to help, with an all-in-one graphical user interface (GUI) that makes it easier to work with Kafka via a desktop client.

Conduktor last year raised $20 million in a series A round of funding led by Accel, as it looks to “simplify working with real-time data” on the Kafka platform.

Scarf: Measuring open source software

Above: Scarf: Example dashboard

While open source software may well have eaten the world, the developers and companies behind open source projects often lack meaningful insights into their project’s use and distribution, something that Scarf has set out to solve.

The company’s core Scarf Gateway product serves as a central access point to all open source components and packages wherever they are hosted, and provides key usage data that the registry provider typically doesn’t offer. This includes which companies are installing a particular package; which regions a project is most popular in; and what platforms or cloud providers the package is most commonly installed on — it’s a little like Google Analytics, but for open source software.

After emerging from stealth back in March with $2 million in seed funding, the company went on to raise a further $5.3 million.

RudderStack: Leveraging customer data

Above: RudderStack: Integrations

Customer data platforms (CDPs) bring the utility of customer analytics to non-technical personnel such as marketers, allowing them to derive key insights from vast swathes of data — CDPs serve as a unified customer database built on real-time data such as behavioral, transactional, and demographic, drawn from myriad sources.

While there are many CDPs to choose from, RudderStack is a developer-centric, open source alternative that affords companies more flexibility in terms of how they deploy their CDP. Indeed, it’s pitched as “data warehouse-first,” which means that users can retain full control over all their data in their own warehouse.

To accelerate its growth, RudderStack last year announced a $21 million series A round of funding led by Kleiner Perkins.

AtomicJar: Integration testing

AtomicJar is setting out to commercialize Testcontainers, a popular open source integration testing framework used at major companies including Google, Oracle, and Uber.

While unit testing is all about testing individual software components in isolation, integration testing is concerned with checking that all the components operate as they should when connected together as part of an application.
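The distinction can be sketched in a few lines of Python. The example is illustrative only and does not use Testcontainers: an in-memory SQLite database stands in for the real dependency that an integration test would normally talk to, and the functions apply_discount and save_order are hypothetical.

```python
import sqlite3


def apply_discount(price_cents: int, percent: int) -> int:
    """Pure business logic: easy to unit test in isolation."""
    return price_cents - (price_cents * percent) // 100


def save_order(conn: sqlite3.Connection, item: str, price_cents: int) -> int:
    """Persist an order and return its row id."""
    cur = conn.execute(
        "INSERT INTO orders (item, price_cents) VALUES (?, ?)", (item, price_cents)
    )
    conn.commit()
    return cur.lastrowid


def test_apply_discount_unit() -> None:
    # Unit test: one function, no external dependencies.
    assert apply_discount(1000, 25) == 750


def test_order_flow_integration() -> None:
    # Integration test: the discount logic and a real (in-memory) database together.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, price_cents INTEGER)"
    )
    order_id = save_order(conn, "keyboard", apply_discount(1000, 25))
    row = conn.execute(
        "SELECT item, price_cents FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    conn.close()
    assert row == ("keyboard", 750)


if __name__ == "__main__":
    test_apply_discount_unit()
    test_order_flow_integration()
    print("all tests passed")
```

The unit test would still pass if the table schema were wrong; only the integration test exercises the pieces together and would catch that kind of mismatch.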

Founded last March, AtomicJar has invited a number of enterprises to participate in a private beta to trial various enhancements and extensions that it’s adding to Testcontainers. To help, the company last year raised $4 million in seed funding.

Bit: Bringing microservices to the frontend

Above: Bit: How a hotel booking website might look when broken down into multiple components

Microservices is a familiar concept in backend engineering, but it has been gaining steam in the frontend sphere too as companies explore ways to leverage a flexible, component-based architecture across the entire development process. And this is where Bit is hoping to carve its niche.

Bit provides open source tools and a cloud platform to help frontend developers collaborate and build component-driven software. At its core, Bit makes it easier for companies to split frontend development into smaller features and codebases, allowing teams to develop features independently, while continuously integrating as part of a unified application.

The company announced a $25 million series B round of funding back in November, as it prepares to launch new products such as Ripple CI, which continuously integrates component changes from across all applications and teams in an organization. Ripple CI is scheduled to launch later in 2022.

Cerbos: Managing user permissions

Companies often need to enable different user permissions in their software, so some employees can only submit expense reports, for example, while others can “approve” the expenses or mark them as “paid.” These various permissions might vary by team, department, and geographic location — and companies need to be able to set their own user permission rules.
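A minimal sketch of the expense-report scenario above might look like the following. The roles, the policy table, and the department rule are assumptions made up for illustration; this is not Cerbos's policy format or API.

```python
from dataclasses import dataclass

# Hypothetical policy table: role -> actions allowed on an expense report.
POLICY: dict[str, frozenset[str]] = {
    "employee": frozenset({"submit"}),
    "manager": frozenset({"submit", "approve"}),
    "finance": frozenset({"submit", "approve", "mark_paid"}),
}


@dataclass(frozen=True)
class User:
    name: str
    role: str
    department: str


def is_allowed(user: User, action: str, report_department: str) -> bool:
    """Allow an action only if the role grants it and the departments match.

    Submitting is exempt from the department check in this sketch.
    """
    allowed = action in POLICY.get(user.role, frozenset())
    if action == "submit":
        return allowed
    return allowed and user.department == report_department


if __name__ == "__main__":
    alice = User("alice", "employee", "sales")
    bob = User("bob", "manager", "sales")
    print(is_allowed(alice, "approve", "sales"))  # employees cannot approve
    print(is_allowed(bob, "approve", "sales"))    # managers can, within their department
```

Keeping the rules in a data structure separate from application code is the general idea behind policy-driven access control: the rules can change without touching the code that enforces them.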

There are plenty of tools in the identity and access management (IAM) space that allow for this already, but a young company called Cerbos is setting out to streamline how software developers and engineers manage user permissions, while also addressing the myriad access control compliance requirements driven by regulations and standards such as GDPR and ISO-27001.

Cerbos is adopting a self-hosted, open source approach to the user permissions problem, one that works across languages and frameworks — and one that gives companies full visibility into how it’s handling user data. To help build a commercial product on top of the open source platform, Cerbos recently announced it had raised $3.5 million in a seed round of funding.

Chatwoot: Customer engagement

Above: Chatwoot: Shared inbox

Chatwoot has built an open source platform to challenge some of the major players in the customer engagement software space, including multi-billion dollar publicly traded Zendesk.

The core Chatwoot platform constitutes a shared inbox that allows companies to connect all their various communication channels in a single, centralized location, while it also offers a live chat tool, native mobile apps, and myriad out-of-the-box integrations. As with other open source companies, Chatwoot promises greater data control and extensibility versus the proprietary incumbents.

The Y Combinator (YC) alum announced a $1.6 million seed round of funding back in September.

Cal.com: An open source Calendly alternative

Calendly is now a $3 billion company, a milestone that has shone a light on the broader meeting scheduling space as companies search for new tools to cut down on needless, repetitive admin.

With that in mind, Cal.com last year launched what it calls “scheduling infrastructure for everyone,” aimed at anyone from yoga instructors and SMEs all the way through to enterprises. Similar to Calendly, meeting organizers use Cal.com to share a scheduling link with invitees, who are then asked to choose from a set of time slots — the slot that everyone can make is then added to everyone’s calendar.
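The slot-matching step described above boils down to a set intersection. The sketch below is a simplified illustration with invented names and dates; it is not Cal.com's code and ignores time zones, durations, and existing calendar conflicts.

```python
from datetime import datetime


def common_slots(
    offered: list[datetime], picks: dict[str, list[datetime]]
) -> list[datetime]:
    """Return the offered slots that every invitee selected, earliest first."""
    common = set(offered)
    for chosen in picks.values():
        common &= set(chosen)
    return sorted(common)


if __name__ == "__main__":
    mon = datetime(2022, 1, 10, 10, 0)
    tue = datetime(2022, 1, 11, 14, 0)
    wed = datetime(2022, 1, 12, 9, 0)
    # Hypothetical invitees and the slots each one ticked.
    picks = {"ana": [mon, tue], "ben": [tue, wed], "cy": [tue]}
    print(common_slots([mon, tue, wed], picks))
```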


Above: Cal.com in action

As an open source product available via GitHub, however, companies using Cal.com can also retain full control of all their data through self-hosting. Moreover, they can manage the entire look-and-feel of their Cal.com deployment via its white-label offering. If users don’t want the hassle of self-hosting, Cal.com is available as a fully-hosted service too.

Cal.com recently announced that it has raised $7.4 million in seed funding from a slew of angel investors and institutional backers, including YouTube cofounder and former CEO Chad Hurley.

PostHog: Open source product analytics

Above: PostHog: Feature flags

PostHog is an open source alternative to popular product analytics platforms such as Amplitude. It gives companies data on how people are using their products and insights into notable trends, ultimately helping them remove bottlenecks and reduce churn.

The company last year raised $15 million in a series B round of funding from notable backers including Alphabet’s venture capital arm GV, while it also launched a new self-hosted plan that lets companies track their product engagements on their own infrastructure for free.

Hoppscotch: Open source API development

Above: Hoppscotch for teams

APIs (application programming interfaces) are the glue that holds most modern software together — they are what bring data to sales and marketing teams; privacy to banking and health care apps; and maps to your fitness-tracking app. And that is why Hoppscotch is striving to build what it calls an “API development ecosystem,” with open source at its core.

The Hoppscotch platform includes several integrated API development tools, aimed at engineers, software developers, quality assurance (QA) testers, and product managers. In pursuit of commercialization, Hoppscotch recently announced it had raised $3 million in a seed round of funding from a slew of investors including WordPress.com parent company Automattic and OSS Capital.

Element: Open source team communications

Above: Element: An instant message app built on Matrix

There are several open source Slack alternatives out there, one of which is Element — the company behind an end-to-end encrypted team messaging platform powered by the Matrix protocol. Matrix is something akin to a telephone network or email, insofar as it’s an interoperable communication system that doesn’t lock people into a closed ecosystem.

Because Element is built on Matrix, it essentially serves as a catalyst for the growth of the broader Matrix network. And to help it push further into the commercial sphere, Element last year raised $30 million in a series B round of funding.

MindsDB: Giving enterprise databases a brain

MindsDB enables companies to make machine learning-powered predictions directly from their database using standard SQL commands, and visualize them in their application or analytics platform of choice. In the company’s own words, it wants to “democratize machine learning by giving enterprise databases a brain.”

There are many use cases for MindsDB, such as predicting customer behavior, improving employee retention, credit-risk scoring, and predicting inventory demand — it’s all about using existing data to figure out what that data might look like at a later date.

MindsDB ships in three broad variations, including a free and open source incarnation that can be deployed anywhere. To further develop and commercialize its product, MindsDB recently announced it had raised $3.75 million in seed funding and unveiled partnerships with major database brands, including Snowflake, SingleStore, and DataStax.

TerminusDB: Open source graph database

Knowledge graphs enable businesses to extract new information by aggregating and analyzing connections between large volumes of internal data. Music streaming services, search engines, fraud detection software, and more all use knowledge graphs to derive insights from disparate data that may not seem closely related.

While several larger established graph database companies raised sizable sums in 2021, some newer players also raised VC cash, suggesting that the graph database space has room for growth. One of those was TerminusDB, which raised $4.3 million in seed funding to build what it calls a “knowledge collaboration infrastructure” for the internet, combining an open source graph database and document store with TerminusHub, a commercial, cloud-based collaboration service built on top of TerminusDB. The company is also working on a cloud-based version of TerminusDB.

Open source Firebase rivals

Above: Nhost founders Johan Eliasson and Nuno Pato.

The burgeoning backend-as-a-service (BaaS) market was pegged at $1.6 billion in 2020, a figure that’s predicted to grow to nearly $8 billion within six years. The value for companies and developers is that BaaS enables them to forget about infrastructure and put all their efforts into the front end, while open source can also help ensure that they are not locked into any specific ecosystem.

With that in mind, a handful of young open source upstarts have emerged to challenge the big incumbents such as Google’s Firebase.
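For a sense of scale, the market figures quoted above imply a steep compound annual growth rate. The calculation below is a back-of-the-envelope sketch that assumes the end value is exactly $8 billion and the period is exactly six years; the forecast itself says "nearly" and "within," so the true implied rate would differ somewhat.

```python
def implied_cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate implied by a start value, end value, and period."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Assumed inputs: $1.6 billion growing to $8 billion over six years.
    rate = implied_cagr(1.6, 8.0, 6)
    print(f"Implied growth: {rate:.1%} per year")
```

Under those assumptions the market would need to grow by about 31% a year.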

Nhost

With Nhost, companies can automate their entire backend development and cloud infrastructure spanning file storage, databases, user authentication, APIs, and more. The company last year raised $3 million from a slew of notable investors, including GitHub founders Scott Chacon and Tom Preston-Werner.

Appwrite

Similarly, Appwrite is a self-hosted BaaS solution for web and mobile app development — it includes user authentication, file storage, a database for storing and querying data, API management, security and privacy, and more. The company last year announced $10 million in funding as it prepares to launch its cloud product in 2022.

Supabase

Much like Nhost and Appwrite, Supabase pitches itself as an open source Firebase alternative, one that allows developers to create an entire backend in minutes. The company announced a $30 million series A round of funding back in September.

Data integration

Above: Airbyte: Data replication

Businesses often have a wealth of data spread across tools such as CRM, marketing, customer support, and product analytics. While accessing the data isn’t the problem, deriving meaningful insights from data stored in different locations and formats is — this means that businesses have to combine it in a centralized location and transform it into a common format that makes it easier to analyze.

A typical process for achieving this is what’s known as “extract, transform, load” (ETL), which involves transforming the data before it arrives in a central data warehouse. A more modern alternative, “extract, load, transform” (ELT), allows companies to transform the raw data on demand once it’s already in the warehouse. While there are pros and cons to both methods, we’re seeing countless companies emerge to tackle the broader data integration problem, with open source serving as a common theme throughout.
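The difference between the two approaches is mostly about where the transformation runs. The sketch below uses an in-memory SQLite database as a stand-in for a data warehouse; the sample records, table names, and cleaning rules are invented for illustration and are not taken from any of the tools covered here.

```python
import sqlite3

# Hypothetical raw records as they might arrive from a source system.
RAW_EVENTS = [
    {"email": " Ada@Example.com ", "amount": "19.99"},
    {"email": "bob@example.com", "amount": "5.00"},
]


def clean(record: dict[str, str]) -> tuple[str, int]:
    """Normalize the email and convert the amount to integer cents."""
    return record["email"].strip().lower(), round(float(record["amount"]) * 100)


def run_etl(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    # ETL: transform in application code first, then load only the clean rows.
    conn.execute("CREATE TABLE etl_payments (email TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO etl_payments VALUES (?, ?)", [clean(r) for r in RAW_EVENTS]
    )
    return conn.execute(
        "SELECT email, amount_cents FROM etl_payments ORDER BY email"
    ).fetchall()


def run_elt(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    # ELT: load the raw rows untouched, then transform inside the warehouse with SQL.
    conn.execute("CREATE TABLE raw_payments (email TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_payments VALUES (:email, :amount)", RAW_EVENTS)
    conn.execute(
        """
        CREATE VIEW elt_payments AS
        SELECT LOWER(TRIM(email)) AS email,
               CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER) AS amount_cents
        FROM raw_payments
        """
    )
    return conn.execute(
        "SELECT email, amount_cents FROM elt_payments ORDER BY email"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_etl(conn) == run_elt(conn))
    conn.close()
```

Both paths end with the same clean rows. The practical difference is that the ELT path keeps the raw data in the warehouse, so the transformation can be revised and re-run later without going back to the source.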

Airbyte

It was a rollercoaster 12 months for open source data integration platform Airbyte, which announced its $5.2 million seed fundraise in March and then swiftly followed this up with a $26 million series A and a $150 million series B, which valued the company at $1.5 billion. In the midst of all this, Airbyte, which was only founded in 2020, announced its first data lake integration, starting with Amazon’s Simple Storage Service (S3).

Dbt Labs

Fishtown Analytics, the company behind an open source “analytics engineering” tool called dbt (data build tool), rebranded as Dbt Labs and raised $150 million in a series C round of funding at a $1.5 billion valuation. Analytics engineering refers to the process of taking raw data after it enters a data warehouse and preparing it for analysis, meaning that dbt effectively serves as the “T” in ELT.

Estuary

Combining data from SaaS applications and other sources to unlock insights is a major undertaking, one made all the more difficult when it comes to real-time, low-latency data streaming. And this is where Estuary enters the fray, with a fully-managed ELT service, built on top of the open source Gazette project, that combines the benefits of both “batch” and “stream” data processing pipelines. The company raised a $7 million seed round of funding last year.

Meltano

GitLab initially debuted Meltano back in 2018, and through various iterations it ended up as an open source platform for data integration and transformation. Last year, however, GitLab spun out Meltano as a standalone business, with backing from major investors including Alphabet’s GV.

Preset

Preset was founded by Apache Superset (and Airflow) creator Maxime Beauchemin. Superset is a data exploration and visualization platform, upon which Preset offers enterprise hosting, security, compliance, governance, and more. The company last year launched its fully-managed cloud service out of beta and raised $35.9 million in series B funding.

Treeverse

Data lakes that comprise petabytes of different datasets can become unwieldy and difficult to manage. This is where young startup Treeverse is setting out to help, with an open source platform called LakeFS that enables enterprises to manage their data lake in a way similar to how they manage their code. This includes version control and other Git-like operations such as branch, commit, merge, and revert, along with full reproducibility of all data and code. The company last year raised $23 million in a series A round of funding.
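To show what Git-like operations on data can mean in practice, here is a toy in-memory model of branch, commit, merge, and revert over a set of files. It is a deliberately naive sketch written for this article, not LakeFS's API or storage design: every name is hypothetical, the merge simply lets the source branch overwrite the target, and deletions are not handled.

```python
class DataRepo:
    """Toy in-memory model of Git-like versioning for data files."""

    def __init__(self) -> None:
        # Each branch is a list of snapshots; a snapshot maps path -> content.
        self.branches: dict[str, list[dict[str, str]]] = {"main": [{}]}

    def read(self, branch: str) -> dict[str, str]:
        """Return a copy of the latest snapshot on a branch."""
        return dict(self.branches[branch][-1])

    def commit(self, branch: str, changes: dict[str, str]) -> None:
        """Record a new snapshot: the previous one plus the given changes."""
        snapshot = self.read(branch)
        snapshot.update(changes)
        self.branches[branch].append(snapshot)

    def branch(self, name: str, source: str = "main") -> None:
        """Create a new branch with a copy of the source branch's history."""
        self.branches[name] = [dict(s) for s in self.branches[source]]

    def merge(self, source: str, target: str = "main") -> None:
        # Naive merge: the source branch's files overwrite the target's.
        self.commit(target, self.read(source))

    def revert(self, branch: str) -> None:
        """Drop the latest snapshot, keeping the initial empty one."""
        if len(self.branches[branch]) > 1:
            self.branches[branch].pop()


if __name__ == "__main__":
    repo = DataRepo()
    repo.commit("main", {"sales.csv": "v1"})
    repo.branch("experiment")
    repo.commit("experiment", {"sales.csv": "v2"})
    print(repo.read("main"), repo.read("experiment"))
```

The point of the branch is isolation: changes on "experiment" leave "main" untouched until they are merged, and a bad merge can be undone with a revert.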

Cube Dev

Once a company has combined and transformed all their data, how do they actually leverage this data to create internal business intelligence dashboards or add analytics to existing customer applications? This is something that Cube Dev is setting out to solve.

Cube Dev is the company and core developer behind the open source “analytical API platform” Cube.js, which gives developers the backend infrastructure to connect their aggregated and transformed data to end-user visualizations. It helps circumvent many of the technical barriers (such as SQL generation, caching, API design, and security) that are involved in making data useful.

Back in July, Cube Dev announced it had raised $15.5 million in a series A round of funding to commercialize Cube.js, which included launching a cloud-hosted SaaS version of the open source project.

Kubernetes is king

Above: Nirmata dashboard

The rise of Kubernetes since Google open-sourced the project back in 2014 highlights a broader industry push toward containerized applications. This was a trend that continued into last year, with the recently published State of Cloud Native Development report indicating that 31% of all backend developers use Kubernetes today, representing a 67% year-on-year increase. And as with just about every other hot open source project out there, Kubernetes is giving rise to a slew of commercial companies.

Nirmata

Nirmata is setting out to “conquer Kubernetes complexity” with a unified management platform for Kubernetes clusters. The company is the creator of and chief contributor to Kyverno, an open source policy engine for Kubernetes, and last year it raised $3.6 million in pre-series A funding to “capitalize on the full potential of Kubernetes-native policy management.”

Rafay Systems

Rafay Systems is a platform that unifies the lifecycle management for Kubernetes infrastructure and apps, bringing together capabilities spanning automation, security, visibility, and governance — the company last year raised $25 million in a series B round of funding.

Loft Labs

Loft Labs promises self-service Kubernetes access for all developers in a company. The company, which raised a $4.6 million seed round of funding last year, has open-sourced several Kubernetes projects, on top of which sits its commercial product known as Loft, which enables enterprises to “scale self-service access” to Kubernetes across the engineering workforce.

Kubermatic

Similar to Loft Labs, Kubermatic targets developers with a self-service Kubernetes platform for deploying their clusters across any infrastructure, and enabling them to centrally manage all their workloads from a single dashboard. The company last year raised $6 million in a seed round of funding.

Akuity

Akuity emerged from stealth last year with $4.5 million in seed funding to be the “Argo enterprise company for Kubernetes app delivery.” Akuity was founded by the co-creators of Argo, a popular open source project for orchestrating Kubernetes-native application delivery that is used at major companies including Google, Tesla, GitHub, and Intuit.

VentureBeat
