How To Improve Machine Learning Solutions In Production

We went live with a new machine learning product that texts the “job of the day” to our associates. This solution leverages our serverless recommendations engine powered by machine learning.

Our KPIs are meeting or exceeding expectations.

Our users love the experience.

Production deployments are smooth and streamlined via a robust CI/CD pipeline.

Machine learning models are performing as expected, as evidenced by active monitoring and automatic re-training when necessary.

This has been a cross-functional collaboration between product management, data engineering, data science, machine learning engineering, a dedicated AWS team, SecOps, software teams, and business stakeholders.

What do we do next? Which new features or enhancements do we prioritize?

Product managers can build a product roadmap as a function of two major dimensions:

  • Daily feedback calls with users and surrounding stakeholders
  • Analytics on usage data, conversion rates, and business outcomes

The former is extremely valuable for obtaining qualitative feedback, context around edge cases, unexpected outcomes, and information about the end-to-end user experience. This is information that may not be possible to infer accurately from the data alone.

The latter is essential to assess quantitative feedback in the form of KPIs. All events are logged, aggregated, and visualized on a daily basis. Ultimately, the numbers don’t lie – they are the business truth.

User feedback such as “the pay is too low” or “the job is too far” or “I don’t like restaurant jobs” is a gold mine of feature engineering ideas for the serverless recommendations engine behind the scenes. User and stakeholder feedback also helps inform new data points to collect.

Metrics such as “job acceptance rate” and “job completion rate” per user segment help assess the overall success of the product. After all, the ML recommendations engine and the UI/UX are decoupled from each other, but they do work together to create the final user experience and business outcomes.
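As a rough illustration of per-segment metrics, here is a minimal Python sketch. The event names, the segment labels, and the exact rate definitions (acceptance as accepted over recommended, completion as completed over accepted) are assumptions made for this example, not a description of our production logging schema.

```python
from collections import defaultdict

# Hypothetical event log: (user_segment, event_type) pairs.
# Segment names and event types are illustrative assumptions.
events = [
    ("new_associate", "recommended"),
    ("new_associate", "recommended"),
    ("new_associate", "recommended"),
    ("new_associate", "recommended"),
    ("new_associate", "accepted"),
    ("new_associate", "accepted"),
    ("new_associate", "completed"),
    ("veteran", "recommended"),
    ("veteran", "recommended"),
    ("veteran", "accepted"),
    ("veteran", "accepted"),
    ("veteran", "completed"),
    ("veteran", "completed"),
]


def segment_rates(events):
    """Aggregate raw events into acceptance and completion rates per segment."""
    counts = defaultdict(lambda: {"recommended": 0, "accepted": 0, "completed": 0})
    for segment, event_type in events:
        # Event types outside the three tracked ones are ignored.
        if event_type in counts[segment]:
            counts[segment][event_type] += 1

    rates = {}
    for segment, c in counts.items():
        # Assumed definitions; a rate is None when its denominator is zero.
        acceptance = c["accepted"] / c["recommended"] if c["recommended"] else None
        completion = c["completed"] / c["accepted"] if c["accepted"] else None
        rates[segment] = {
            "job_acceptance_rate": acceptance,
            "job_completion_rate": completion,
        }
    return rates


rates = segment_rates(events)
for segment, r in sorted(rates.items()):
    print(segment, r)
```

Splitting by segment matters because a healthy overall rate can hide a segment where recommendations are consistently rejected.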

We use both qualitative and quantitative feedback to develop hypotheses as to which new features will move the KPIs in the desired direction, faster.

[Image: Build-Measure-Learn feedback loop]

This is the famous Build-Measure-Learn feedback loop from Lean Startup (a must-read, company policy for all new hires). It allows us to systematically:

  1. Develop value hypotheses
  2. Build corresponding product features
  3. Release to users in small batches
  4. Measure performance
  5. Validate or invalidate hypotheses
  6. Refine hypotheses or develop new ones based on data
  7. Repeat
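For the "measure" and "validate or invalidate" steps, one common option is to compare a KPI between a control batch and a batch that received the new feature. The sketch below uses a two-proportion z-test on job acceptance counts. The choice of test, the function name, and the sample numbers are assumptions for illustration, not a record of how we evaluate releases.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two proportions.

    Group A is the control batch and group B the batch with the new feature.
    Returns (z, p_value).
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero; the rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 40 of 100 jobs accepted in control, 60 of 100 with the feature.
z, p_value = two_proportion_z_test(40, 100, 60, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value supports the hypothesis that the feature moved the KPI; a large one suggests refining the hypothesis or collecting more data before deciding.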

For example, if we consistently hear “the pay is too low” from users, we might develop the following hypothesis:

  • “Engineering a new feature involving the rolling average pay for previously accepted jobs will improve model performance and yield more accurate job recommendations.”
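A feature like this could be sketched in plain Python as follows. The function name, the `(pay, accepted)` record layout, and the window size are hypothetical. The one design choice worth noting is that each value uses only jobs accepted before the current recommendation, so the feature does not leak the outcome it is meant to help predict.

```python
from collections import deque


def rolling_avg_accepted_pay(history, window=3):
    """Rolling average pay over a user's previously accepted jobs.

    `history` is a chronological list of (pay, accepted) tuples for one user.
    For each record, the result holds the mean pay of up to `window` most
    recent jobs accepted strictly before that record, or None if there are none.
    """
    recent = deque(maxlen=window)  # keeps only the latest `window` accepted pays
    features = []
    for pay, accepted in history:
        # Compute the feature before looking at the current outcome.
        features.append(sum(recent) / len(recent) if recent else None)
        if accepted:
            recent.append(pay)
    return features


# Made-up hourly pay values for a single user.
history = [(15.0, True), (12.0, False), (17.0, True), (20.0, True), (22.0, True)]
features = rolling_avg_accepted_pay(history, window=2)
print(features)  # [None, 15.0, 15.0, 16.0, 18.5]
```

The model can then compare a candidate job's pay against this value, which is one way to encode the "pay is too low" signal.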

As the ML product evolves, collaborate with technical product owners (TPOs) and tech lead(s) to convert master features into epics and stories, then prioritize accordingly. Also, build a visual product roadmap with milestones to provide visibility to business leaders.

This agile, iterative approach to product development especially applies to machine learning products. Waterfall simply will not work due to uncertainty and fast-paced changes. We treat new ML solutions as “mini lean startups” within the larger organization.

Over time, if the KPIs are not meeting expectations, then the answer could also be a pivot – a fundamental change in strategic direction. It’s a form of “positive failure” that points to a better product direction. We believe teams can always converge on “the right answer” systematically based on empirical data.

How do you approach ML product development after going live in production? Let us know in the comments!

If you need help implementing cloud-native MLOps, Well-Architected production ML software solutions, or training/inference pipelines, monetizing your ML models in production, or answering specific solution architecture questions, or if you would just like us to review your solution architecture and provide feedback based on your goals, contact us or send me a message and we will be happy to help.

Subscribe to my blog at: https://gradientgroup.ai/blog/

Follow me on LinkedIn: https://linkedin.com/in/carloslaraai
