Simplifying machine learning lifecycle management

23 August 2024

In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s data science and data engineering teams work with a variety of machine learning libraries, as well as data ingestion and data storage technologies. Risk and compliance considerations mean that in certain application domains, the ability to reproduce machine learning workflows is essential for passing audits. And as data science and data engineering teams continue to expand, tools need to enable and facilitate collaboration.

Because Doddi specializes in helping teams turn machine learning prototypes into production-ready services, I wanted to hear what he has learned while working with organizations that aspire to “become machine learning companies.”

Here are some highlights from our conversation:

A central platform for building, deploying, and managing machine learning models

In one of the companies where I worked, we had built infrastructure related to Spark. We were a heavy Spark shop. So we built everything around Spark and other components. But later, when that organization grew, a lot of people came from a TensorFlow background. That suddenly created a little bit of frustration in the team because everybody wanted to move to TensorFlow. But we had invested a lot of time, effort and energy in building the infrastructure for Spark.

… We suddenly had hidden technical debt that needed to be addressed. … Let’s say right now you have two models running in production and you know that in the next two or three years you are going to deploy 20 to 30 models. You need to start thinking about this ahead of time.

… That’s why these days I have observed that organizations are creating centralized teams. The centralized team is responsible for maintaining flexible machine learning infrastructure that can be used to deploy, operate, and monitor many models simultaneously.
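To make the idea of shared, framework-agnostic infrastructure a little more concrete, here is a minimal sketch in Python. It is not Datatron's product or any specific library's API: the `ModelRegistry` class, its method names, and the two toy models are all hypothetical, and plain functions stand in for models that might really come from Spark or TensorFlow. The point is only that a central layer can deploy and monitor many models behind one interface.

```python
from typing import Callable, Dict, Sequence

# Assumption: every model, whatever framework trained it, is wrapped
# as a callable that maps a feature vector to a single score.
Predictor = Callable[[Sequence[float]], float]


class ModelRegistry:
    """Hypothetical central registry that serves and monitors many models."""

    def __init__(self) -> None:
        self._models: Dict[str, Predictor] = {}
        self._call_counts: Dict[str, int] = {}

    def deploy(self, name: str, predictor: Predictor) -> None:
        # Register a model under a unique name and start its usage counter.
        if name in self._models:
            raise ValueError(f"model {name!r} is already deployed")
        self._models[name] = predictor
        self._call_counts[name] = 0

    def predict(self, name: str, features: Sequence[float]) -> float:
        # Route the request to the named model and record the call.
        self._call_counts[name] += 1
        return self._models[name](features)

    def usage_report(self) -> Dict[str, int]:
        # A stand-in for real monitoring: number of requests per model.
        return dict(self._call_counts)


# Toy stand-ins for models built with different frameworks.
def spark_style_model(features: Sequence[float]) -> float:
    return sum(features)


def tensorflow_style_model(features: Sequence[float]) -> float:
    return max(features)


registry = ModelRegistry()
registry.deploy("churn_v1", spark_style_model)
registry.deploy("fraud_v1", tensorflow_style_model)

churn_score = registry.predict("churn_v1", [1.0, 2.0])
fraud_score = registry.predict("fraud_v1", [1.0, 2.0])
report = registry.usage_report()
```

Because callers only see `deploy` and `predict`, a team could in principle add a model from a new framework without rebuilding the serving layer, which is the kind of flexibility described above.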

Feature store: Create, manage, and share canonical features

When I talk to companies these days, everybody knows that their data scientists are duplicating work because they don’t have a centralized feature store. Everybody I talk to really wants to build or even buy a feature store, depending on what is easiest for them.

… The number of data scientists within most companies is increasing. And one of the pain points I’ve observed is that when a new data scientist joins an organization, there is an extremely long ramp-up period. A new data scientist needs to figure out what the data sets are, what the features are, and so on. But if an organization has a feature store, the ramp-up period can be much shorter.
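As a rough illustration of what a feature store provides, the following Python sketch keeps canonical feature definitions in one place so they can be discovered and reused instead of rewritten. Everything here is an assumption made for illustration: the `FeatureDefinition` and `FeatureStore` names, the example features, and the owning team names are hypothetical, and a real feature store would also persist definitions and precomputed values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FeatureDefinition:
    """Metadata plus the function that computes one canonical feature."""
    name: str
    description: str
    owner: str
    compute: Callable[[dict], float]


class FeatureStore:
    """Hypothetical in-memory catalog of shared feature definitions."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDefinition] = {}

    def register(self, feature: FeatureDefinition) -> None:
        # Refuse duplicates so a feature name has exactly one definition.
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already registered")
        self._features[feature.name] = feature

    def list_features(self) -> List[str]:
        # Lets a new team member discover what already exists.
        return sorted(self._features)

    def describe(self, name: str) -> str:
        f = self._features[name]
        return f"{f.name} (owner: {f.owner}): {f.description}"

    def compute(self, names: List[str], record: dict) -> Dict[str, float]:
        # Compute the requested features for one raw record.
        return {n: self._features[n].compute(record) for n in names}


store = FeatureStore()
store.register(FeatureDefinition(
    name="order_total_usd",
    description="Sum of item prices in an order",
    owner="payments-team",  # hypothetical team name
    compute=lambda r: float(sum(r["item_prices"])),
))
store.register(FeatureDefinition(
    name="item_count",
    description="Number of items in an order",
    owner="catalog-team",  # hypothetical team name
    compute=lambda r: float(len(r["item_prices"])),
))

record = {"item_prices": [10.0, 5.5, 4.5]}
available = store.list_features()
features = store.compute(["item_count", "order_total_usd"], record)
```

A newly hired data scientist could call `list_features` and `describe` to see what already exists before writing any new feature logic, which is where the shorter ramp-up would come from.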

Related resources:

“Lessons learned turning machine learning models into real products and services”
“What are machine learning engineers?”: examining a new role focused on creating data products and making data science work in production
“MLflow: A Platform for Managing the Machine Learning Lifecycle”
“Managing risk in machine learning models”: Andrew Burt and Steven Touw on how companies can manage models they cannot fully explain
“We need to build machine learning tools to augment machine learning engineers”
“When models go rogue”: David Talby on hard-earned lessons about using machine learning in production

Post topics: AI & ML, Data, O’Reilly Data Show Podcast

Post tags: Podcast
