The following blog was written together with Petar Torre, Solutions Architect at Intel. This blog describes how Cloudify automates the deployment and monitoring of machine learning systems by orchestrating an Intel-optimized TensorFlow workload running inference with a pre-trained ResNet-50 model from the Intel Model Zoo. In a nutshell, a container running a Jupyter Notebook with the Intel-optimized TensorFlow model is scheduled as a Kubernetes pod on K3S on AWS EC2.
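To make the architecture concrete, the deployment described above could be sketched as a pod manifest like the following. This is a minimal illustration, not the actual blueprint Cloudify generates: the pod name, labels, image tag, and resource requests are all assumptions.

```yaml
# Hypothetical sketch of the kind of pod that would be scheduled on K3S.
# The image name and all values below are illustrative assumptions,
# not taken from the actual Cloudify deployment.
apiVersion: v1
kind: Pod
metadata:
  name: intel-tf-resnet50-notebook
  labels:
    app: intel-tf-inference
spec:
  containers:
    - name: jupyter
      # Assumed: an Intel-optimized TensorFlow image with Jupyter included
      image: intel/intel-optimized-tensorflow:latest-jupyter
      ports:
        - containerPort: 8888   # default Jupyter Notebook port
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
```

In the setup the blog describes, Cloudify handles applying a spec like this to the K3S cluster running on AWS EC2, rather than it being created by hand with `kubectl`.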
Nowadays, when most people think about the term “machine learning,” they think of advanced, refined applications such as ChatGPT, the deep-learning-based chatbot and text generator, or AlphaGo, the computer program that is currently the “world’s best player” of the board game Go.
The adoption of AI/ML in financial services is increasing as companies seek to drive more robust, data-driven decision processes as part of their digital transformation journey. For global banking, McKinsey estimates that AI technologies could deliver up to $1 trillion of additional value each year. But productionizing machine learning at scale is challenging.
With more and more applications moving to the cloud, an increasing amount of telemetry data (logs, metrics, traces) is being collected, which can help improve application performance, operational efficiency, and business KPIs. However, analyzing this data is extremely tedious and time-consuming given the tremendous volumes being generated. Traditional methods of alerting and simple pattern matching (visual inspection, basic searching, etc.) are no longer sufficient for IT Operations teams and SREs.
MLOps (short for machine learning operations) is evolving into an independent approach to the machine learning lifecycle that covers every step, from data gathering to governance and monitoring. It will become a standard as artificial intelligence moves from being an innovative activity to being part of everyday business.
While AI seems to be the topic of the moment, especially in the tech industry, the need to make it happen reliably is becoming more obvious, and MLOps as a practice must keep maturing to stay relevant amid the latest trends. Solutions like ChatGPT and MidJourney dominated internet chatter last year, but the main question remains: what do we foresee in the MLOps space this year, and where is the community of MLOps practitioners focusing its energy?