Aporia takes aim at ML observability, responsible AI and more

Is there a line connecting machine learning observability to explainability, leading to responsible AI? Aporia, an observability platform for machine learning, thinks so.

After launching its platform in 2021, and seeing good traction, Aporia today announced a $25 million Series A funding round.

Aporia CEO and co-founder Liran Hason met with VentureBeat to discuss Aporia’s vision, its inner workings and its growth.

Observability for data and machine learning models

Hason, who founded Aporia in 2019, has a background in software engineering. After a five-year stint in the elite technological unit of the Israeli intelligence forces, he joined Adallom, a cloud security startup that was later acquired by Microsoft. Following that, he spent three years investing in early-stage companies, mostly in AI and cybersecurity, while working at Vertex Ventures.

Throughout his career, Hason saw the good and the not-so-good sides of machine learning, and experienced first-hand the painstaking process of bringing machine learning to production. This is what inspired him to start Aporia to help companies monitor their machine learning models.

Hand-coding tailor-made scripts to monitor machine learning models quickly becomes a maintenance nightmare, so in 2019 Hason set out to build the solution he had wanted. Aporia’s platform was unveiled in 2021, and it caught on almost immediately.

Aporia has experienced 600% growth in customers over the past six months. According to Aporia’s press release, hundreds of leading companies like New Relic, Lemonade and Armis trust Aporia’s solution to gain visibility into and explainability of machine learning models in production. Aporia’s platform consists of four parts:

The first part is visibility: enabling data scientists and machine learning engineers to see what decisions are being made by their models in production.
The second part is explainability: enabling Aporia users to explain the predictions their models make.
The third part is a monitoring engine that constantly tracks and monitors the data and model behavior, alerting users when there is performance degradation, potential bias or discrimination.
The fourth part is the so-called investigation toolbox, enabling users to slice and dice the data and find the root cause of issues so they can remediate them.

A key focus for Aporia is concept drift. Machine learning models are trained and tested on historical data, organized into training and testing datasets. If the data, the process and the model are of good quality, then the end result can be a model that performs well when deployed in production and fed with real-world data. The real world, however, is ever-changing.

The changes can originate in the data pipeline, as systems and configurations change, or in “reality itself,” as Hason put it: marketing campaigns, cultural changes, seasonality, or something like COVID-19. All of them affect the behavior and the performance of the model. By monitoring their models, Aporia users can step in and adjust them should they start going astray.
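To make the idea of drift concrete, here is a minimal sketch of one common way to quantify a shift between training and production data: the Population Stability Index (PSI). This is not a description of how Aporia detects drift, which the company did not detail. The function names, the sample data and the 0.25 alert threshold (a widely quoted rule of thumb) are all assumptions made for illustration.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a training and a production sample."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if every training value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_alert(score, threshold=0.25):
    """Hypothetical alert rule: flag the feature when PSI exceeds the threshold."""
    return score > threshold


# Illustrative data: production values are the training values shifted upward.
training = [i / 100 for i in range(1000)]
production = [v + 3 for v in training]

print("PSI, training vs. itself:", psi(training, training))
print("Drift alert on shifted data:", drift_alert(psi(training, production)))
```

A monitoring job could run a check like this per feature and per time window, raising an alert only when the score stays above the threshold.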

There are different issues that can lead to model performance degradation. They can range from the trivial, such as a mismatch in data types, to the profound, such as models that fail when presented with data volume and variety beyond what was used to train them.
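The trivial end of that range, a mismatch in data types, is the kind of problem a simple schema check can catch. The sketch below assumes a hypothetical schema and record layout; the field names are invented for illustration and do not come from Aporia.

```python
def find_type_mismatches(expected_schema, record):
    """Return (field, expected type, actual type) for every problem found."""
    problems = []
    for field, expected_type in expected_schema.items():
        if field not in record:
            problems.append((field, expected_type.__name__, "missing"))
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append((field, expected_type.__name__, actual))
    return problems


# Hypothetical schema inferred from the training data.
schema = {"age": int, "income": float, "country": str}

# A production record where "age" arrives as a string and "country" is absent.
record = {"age": "42", "income": 51000.0}

for field, expected, actual in find_type_mismatches(schema, record):
    print(f"{field}: expected {expected}, got {actual}")
```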

Aporia comes with certain predefined metrics out of the box, such as accuracy or precision. However, users can both tinker with the definitions of those metrics and define custom metrics of their own.
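As a rough illustration of predefined versus custom metrics, the sketch below implements accuracy and precision by hand and then registers a user-defined metric alongside them. The `METRICS` registry and the `evaluate` helper are hypothetical names chosen for this example and are not Aporia's API.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Share of positive predictions that are actually positive."""
    truths = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths:
        return 0.0
    return sum(t == positive for t in truths) / len(truths)


# Hypothetical registry of built-in metrics.
METRICS = {"accuracy": accuracy, "precision": precision}


def false_positive_rate(y_true, y_pred, positive=1):
    """Custom metric: share of true negatives that were predicted positive."""
    preds = [p for t, p in zip(y_true, y_pred) if t != positive]
    if not preds:
        return 0.0
    return sum(p == positive for p in preds) / len(preds)


# Registering the custom metric next to the built-in ones.
METRICS["false_positive_rate"] = false_positive_rate


def evaluate(y_true, y_pred):
    """Compute every registered metric for one batch of predictions."""
    return {name: fn(y_true, y_pred) for name, fn in METRICS.items()}


labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 1, 1, 0, 0, 0, 1, 0]
print(evaluate(labels, predictions))
```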

Aporia’s platform doesn’t just monitor model predictions, but also the data that was used to train the models, and the data that is fed to the models in production. What this means in practice is that the platform ends up ingesting large volumes of data, and also works as a de facto data governance platform.

The platform ingests all the data from production — everything that a machine learning model is fed. Then it analyzes all of that data, using statistics, metrics and aggregation. It generates different distributions for training and production, different time slices, and different population segments.
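The kind of aggregation described here can be sketched in a few lines: group the ingested rows by source (training or production), time slice or population segment, and compute summary statistics per group. The row layout and field names below are assumptions made for illustration, not Aporia's data model.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows, feature, by):
    """Summary statistics of one feature, grouped by the given keys."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in by)
        groups[key].append(row[feature])
    return {
        key: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Hypothetical ingested rows: one feature, tagged with source, month and segment.
rows = [
    {"source": "training", "month": "2022-01", "segment": "new", "amount": 10.0},
    {"source": "training", "month": "2022-01", "segment": "new", "amount": 20.0},
    {"source": "training", "month": "2022-01", "segment": "returning", "amount": 50.0},
    {"source": "production", "month": "2022-02", "segment": "new", "amount": 40.0},
    {"source": "production", "month": "2022-02", "segment": "new", "amount": 60.0},
    {"source": "production", "month": "2022-02", "segment": "returning", "amount": 55.0},
]

for key, stats in summarize(rows, "amount", by=("source", "segment")).items():
    print(key, stats)
```

Comparing the training and production summaries for the same segment is one simple way to spot where a distribution has moved.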

In addition to having its own data ingestion and storage in place, Aporia’s platform is engineered to operate in self-hosted mode, in customers’ own cloud environments. This means sensitive data stays within the customer’s environment.

Observability, explainability, and responsible AI

As far as model observability goes, the focus is on the data. This means that Aporia can be model agnostic. Whether users are using neural networks or decision trees, the system can monitor them properly either way. Explainability, however, is a different story.

To provide explanations as to why a specific prediction was made, models need to be examined in a white box manner. This was the part that raised the most questions for us. There is a wide variety of machine learning models, some of which are entirely explainable (like decision trees), and some of which are black boxes (like neural networks). How can Aporia possibly provide explainability for all of them? The short answer is that it can’t.

As Hason acknowledged, explainability and observability / monitoring are different capabilities. He went on to add, however, that there is a link connecting them. The link, Hason said, is responsible AI: helping organizations and society at large implement AI in a responsible manner:

“When monitoring machine learning models in production, from time to time people may see results they don’t expect. It could be unintentional bias or just weird predictions. And then the next question that comes to mind is: okay, but why did the model end up with this prediction? You want to debug the model and because it’s a black box it’s really cumbersome and really challenging,” said Hason.

When dealing with loan applications in the US, for example, regulation dictates that applicants have to be given explanations as to why their loans were rejected, and what they can do to improve their score. This applies regardless of how loan applications are processed: it could be a manual process, a rule-based system, or a neural-network based system. In practice, however, a neural-network based system would not be a very good choice in terms of explainability.

Neural networks, Hason said, are far too complex for humans to really comprehend the way a decision is made. However, he went on to add, there are different aspects of explainability. Explaining how a neural network works in general is an open research question. But explaining how it works for a specific problem is doable, and this is what Aporia does, Hason claimed.
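One simple way to explain a single prediction from a black-box model is to replace each input feature with a baseline value, one at a time, and record how much the score moves. The sketch below shows that technique on a toy stand-in for a loan-scoring model. It is not Aporia's method, which the article does not describe, and the model, feature names and baseline values are all hypothetical.

```python
def local_attribution(predict, instance, baseline):
    """Score change when each feature is individually reset to its baseline."""
    original_score = predict(instance)
    contributions = {}
    for name in instance:
        perturbed = dict(instance)
        perturbed[name] = baseline[name]
        # Negative value: this feature pulls the score below the baseline case.
        contributions[name] = original_score - predict(perturbed)
    return contributions


def toy_credit_model(x):
    """Hypothetical black-box scorer; only its outputs are inspected."""
    return 2 * x["income_k"] - 50 * x["debt_ratio"] + 0.1 * x["credit_score"]


# Hypothetical applicant and a baseline representing a typical approved applicant.
applicant = {"income_k": 40, "debt_ratio": 0.6, "credit_score": 600}
typical = {"income_k": 60, "debt_ratio": 0.3, "credit_score": 700}

contributions = local_attribution(toy_credit_model, applicant, typical)
for name, value in sorted(contributions.items(), key=lambda item: item[1]):
    print(f"{name}: {value:+.1f}")
```

Ranking features by contribution gives the kind of per-decision reason a rejected applicant could act on. Single-feature ablation ignores interactions between features, so more robust attribution methods are generally preferred in practice.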

Aporia users can use the platform’s capabilities via a portal. However, the tools it champions, such as MLNotify, can be used by adding import statements in Python code. MLNotify is a tool that notifies users when their models have completed their training.

Hason said MLNotify exemplifies Aporia’s philosophy and that, along with MLOps.toys (a collection of MLOps open-source resources) and Train Invaders (a space invaders clone), it has helped Aporia’s organic growth. Perhaps more importantly, however, Aporia’s platform is geared towards self-serve.

Aporia offers a free-to-use community edition, which includes all the capabilities of the platform except for some integrations. Aporia charges users based on the number of models they want to monitor, and offers an unlimited number of users and an unlimited amount of data, Hason said.

The company will use the funding to expand its team, which currently numbers 21 people, all based in Israel. The goal is to triple the team and expand in the US.

Product expansion is also in Aporia’s plans: the aim is to become a full-stack observability platform for machine learning, according to Hason. Computer vision is next in line, and Hason was optimistic that Aporia’s traction and funding could make it the market leader by the end of 2023.

The announcement comes just 10 months after Aporia’s $5 million seed round, bringing Aporia’s total amount raised to $30 million. The round was led by venture capital giant Tiger Global, with participation from Samsung Next as well as existing investors TLV Partners and Vertex Ventures.
