Uncertainty in machine learning deployments
What if the deployment begins before the data extraction?
The failure rate of software projects ranges between 50% and 80%. In data science, nearly 90% of projects never make it into production. [1]
In software projects, success most often means that the system meets a business objective in production. In data science projects, the goal can be more flexible: the insights collected along the way can inform a business decision or explain a phenomenon even if the model never reaches production.
Despite that flexibility of purpose, there are many reasons why a useful machine learning model never goes into production: lack of stakeholder buy-in, poor data quality (and availability), model complexity, lack of interpretability, lack of scalability, regulatory and compliance issues, and so on.
Process
The most common way of running a machine learning project is sequential: analyze the problem, extract the data, build the model, deploy it, and operate it, one phase after the other.
The linear process is easy to assimilate and promises that only the best possible model goes into production. However, it is a naive approach because it does not try to reduce the project's uncertainty early: it assumes that each phase will simply work out. The universe is rarely so lazy.
Imagine the following scenario: the business team develops some hypotheses and opportunities for automating a few processes. The data science team works on data extraction with the area responsible for the databases and starts modeling. The modeling phase lasts three months and produces an impressive result, captivating all stakeholders, who approve the work and green-light the deployment.
The data science team collaborates with other teams to integrate the solution. A few weeks later, they realize that some features are unavailable in production or carry strange values. They had extracted the data from a consolidated, processed database: part of it had already been cleaned up, and part came from sources the model cannot access in production.
The modeling phase restarts to remove or transform the problematic features. The results are not so appealing anymore. The stakeholders involved in the deployment lose interest and find other priorities. The project fails because the model is never deployed.
In the data science team's view, they delivered the model, and the other team is responsible for not deploying it. In the sponsors' view, the project failed to capture value despite all the work and the good results.
Embracing uncertainty
In software engineering, the most experienced engineers work to anticipate risks and resolve the biggest uncertainties first, before building out the rest of the system or process. The uncertainty surrounding the project guides the planning of deliveries, determines which proofs of concept will be necessary, and shapes the development and delivery process.
Let me introduce a radical notion. What if the deployment phase begins before the data extraction phase?
You might wonder how someone can deploy something they have not built yet, or how to start the deployment process without knowing the results of the model.
Imagine the following scenario: the business team develops a hypothesis to use machine learning to automate part of the auditing process, previously done entirely by humans. The data science team begins by understanding how the auditing system will use the model: after receiving an audit request, the system calls the model through an API, passing data about the customer and the current request.
The API returns whether the request can be approved automatically or requires human review. The data science team presents a proposal to stakeholders: an initial model that automates at least 5% of requests while staying within an agreed loss threshold. The initial version of the API simply receives the data, writes it to a log file, and always responds that the request needs human approval.
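As an illustration, the initial stub could look like the sketch below, written here with FastAPI and hypothetical field names; the real contract would be whatever the teams agree on.

```python
# A minimal sketch of the initial API stub, assuming FastAPI and
# hypothetical field names; it only logs the payload and defers to humans.
import json
import logging

from fastapi import FastAPI
from pydantic import BaseModel

# Write one raw JSON payload per line so the log can be analyzed later.
logging.basicConfig(filename="audit_requests.log", level=logging.INFO,
                    format="%(message)s")
app = FastAPI()


class AuditRequest(BaseModel):
    request_id: str
    customer_id: str
    amount: float          # hypothetical feature
    previous_audits: int   # hypothetical feature


@app.post("/audit/decision")
def audit_decision(request: AuditRequest) -> dict:
    # Log every payload so it can later be compared with the training data.
    logging.info(json.dumps(request.dict()))
    # No model yet: every request falls back to human review.
    return {"request_id": request.request_id, "decision": "human_review"}
```

Even without a model behind it, this stub already settles the contract with the application team and starts collecting the production payloads the next steps depend on.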
After the minimal API is developed, the application team starts the integration by sending a minimum set of features about the audit request, the customer, and their history. The data science team extracts the same features from the data lake and starts the modeling phase, using a no-code AutoML platform to create a baseline and quickly prove viability.
After some iterations, the baseline proves feasible to deploy. The data science team uses the log file collected by the API to check whether the production data has the same distribution and format as the training data. After making some adaptations, the team presents the initial results to the sponsors and implements the model.
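That check can be quite simple. The rough sketch below assumes the API logs one JSON payload per line and that the column names match the training extract: it compares the schemas and runs a two-sample Kolmogorov-Smirnov test on each shared numeric column.

```python
# A rough sketch of the production-vs-training comparison; file names,
# column layout, and the drift threshold are all illustrative assumptions.
import pandas as pd
from scipy.stats import ks_2samp

train = pd.read_parquet("training_features.parquet")   # hypothetical training extract
prod = pd.read_json("audit_requests.log", lines=True)  # payloads logged by the API stub

# Format check: which training columns never show up in production?
missing = set(train.columns) - set(prod.columns)
print("Columns missing in production:", missing or "none")

# Distribution check: a two-sample KS test per shared numeric column.
numeric_cols = train.select_dtypes("number").columns.intersection(prod.columns)
for col in numeric_cols:
    stat, p_value = ks_2samp(train[col].dropna(), prod[col].dropna())
    flag = "check for drift" if p_value < 0.01 else "ok"
    print(f"{col}: KS={stat:.3f} p={p_value:.3f} -> {flag}")
```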
The implementation goes smoothly because the data science team has already resolved the most common problems and mapped the transformations the production data needs.
The data science team connects the model to the API and starts the rollout process. The API now responds that 10% of audit requests can be approved automatically. The data science team continues extracting new data and modeling.
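One way to wire the model in without changing the contract is to keep human review as the default and only auto-approve when the model is confident enough to respect the loss threshold. The sketch below assumes a scikit-learn-style classifier saved with joblib; the artifact name, feature columns, and threshold are illustrative. During the rollout, the same function can additionally be limited to a fraction of traffic.

```python
# A sketch of replacing the stub's fixed answer with a model-backed decision,
# keeping human review as the default path; names and thresholds are assumptions.
import json
import logging

import joblib
import pandas as pd

model = joblib.load("audit_baseline.joblib")  # hypothetical model artifact
APPROVAL_THRESHOLD = 0.95  # tied to the loss threshold agreed with stakeholders


def audit_decision(payload: dict) -> dict:
    # Keep logging the payloads: they feed the next modeling iterations.
    logging.info(json.dumps(payload))
    # Keep only the model's feature columns (illustrative field names).
    features = pd.DataFrame([payload]).drop(columns=["request_id", "customer_id"])
    proba = model.predict_proba(features)[0, 1]  # estimated probability the request is safe to approve
    decision = "auto_approve" if proba >= APPROVAL_THRESHOLD else "human_review"
    return {"request_id": payload["request_id"], "decision": decision}
```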
In the next iteration of the model, the teams decide to implement a new round of feature engineering without changing the API contract. The API now approves 15% of audit requests automatically.
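Keeping the contract fixed is possible because the new features can be derived inside the model pipeline from fields the API already receives. A sketch with scikit-learn, using an illustrative derived feature:

```python
# A sketch of adding engineered features without touching the API contract:
# the new features are computed inside the pipeline from existing payload fields.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Hypothetical derived feature built from fields already in the contract.
    out["amount_per_audit"] = out["amount"] / (out["previous_audits"] + 1)
    return out


pipeline = Pipeline([
    ("features", FunctionTransformer(engineer_features)),
    ("model", GradientBoostingClassifier()),
])
# The API keeps receiving the same fields; only the pipeline behind it changes.
```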
The data science team keeps iterating and identifies data that is unavailable in production but would have a significant impact on the model. The team can now weigh the implementation cost of making it available against the automation gain.
Instead of delivering value only at the end of the deployment phase, the data science team adds value constantly.
[1] https://venturebeat.com/ai/why-do-87-of-data-science-projects-never-make-it-into-production/