Traditional Predictive Modeling: A Timeless Treasure in the AI Revolution


Adeshola Afolabi

In the rapidly evolving landscape of Artificial Intelligence, generative AI has captured the imagination of many, particularly in late 2022 and throughout 2023. Its capabilities in creating novel content have opened new avenues for innovation and creativity. Amidst this excitement, however, it’s crucial to revisit a foundational element of AI: traditional predictive modeling. These algorithms have been the starting point or backbone for countless applications, from recommendation engines and fraud detection to customer segmentation and churn prediction. This article highlights the enduring relevance of key concepts from traditional predictive modeling in an era dominated by generative AI, exploring important evolutions in the field and how companies have efficiently solved basic problems at scale.


Image Reference: XKCD

Let’s dive into how this unfolds in our daily routines, making the complex seem simple. Predictive modeling’s influence is widespread, touching many parts of our lives we often overlook. Picture this: a person usually spends about $2,000 monthly on their credit card but unexpectedly splurges $10,000 in one particular month. Is it a stolen credit card, or are they simply enjoying a lavish holiday? Either way, an intelligent system is needed to discern whether this is fraud or just a seasonal splurge. Or think about the excitement around Spotify Wrapped and Apple Music Replay at the end of the year. What kept us glued to these music applications? Did we discover new music along the way through personalized recommendations, or does the system simply know what’s relevant to us?
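
To make the fraud example concrete, here is a minimal sketch (my own illustration, not drawn from any production system) of how a simple anomaly detector might flag such an out-of-pattern month, using scikit-learn’s IsolationForest on hypothetical spend data:

```python
# A minimal sketch of anomaly detection on monthly credit-card spend.
# The amounts and the contamination rate are hypothetical, for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

# Eleven months of typical spend (~$2,000) plus one $10,000 outlier.
monthly_spend = np.array(
    [2050, 1980, 2100, 1900, 2200, 2010, 1950, 2080, 2150, 1990, 2060, 10000]
).reshape(-1, 1)

# Train an isolation forest; contamination is a guess at the outlier rate.
detector = IsolationForest(contamination=0.1, random_state=42)
labels = detector.fit_predict(monthly_spend)  # -1 = anomaly, 1 = normal

for month, (spend, label) in enumerate(zip(monthly_spend.ravel(), labels), 1):
    if label == -1:
        print(f"Month {month}: ${spend} flagged for review")
```

In practice such a flag would feed a downstream review process rather than block the card outright, since a splurge can be perfectly legitimate.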

In my experience working in Advertising Technology (AdTech) and Automation (Time Tracking), I have implemented some of these algorithms myself. The process usually begins by pinpointing a business need and then gathering the available data and resources to address it. What often goes unnoticed, however, are the errors made along the way and the time required to rectify them. For example, in the rush to capitalize on the AI boom, many companies prematurely labeled themselves “AI-driven” when they were merely at the analytics stage and had not established the systems needed to collect quality data. This led to inefficient algorithms and the classic problem of garbage in, garbage out.


Reflecting on these experiences has been invaluable for honing my approach to optimizing Machine Learning (ML) systems. A crucial insight is the necessity of striking a balance between refining business needs and developing an algorithm capable of scaling effectively. Beginning with a clear vision of the desired outcome is paramount. For instance, a low-latency application with an API round-trip requirement of a few milliseconds demands a significantly different approach than a batch-type application with more lenient latency demands. Recognizing and aligning with these specific needs is fundamental to designing an appropriate system architecture. A case in point from my work involved using an Online Analytical Processing (OLAP) database for the inference architecture, which fell short of our latency needs. This led to a redesign around a feature store (Tecton) with pre-computed features, ensuring that our ML engine could meet its millisecond round-trip requirement. Such understanding and adjustment are pivotal in developing efficient and responsive ML solutions. This journey of optimization and refinement brings us to two key concepts I have found useful along the way:

  • Feature Stores
  • MLOps

Feature Stores: Paving the Way for Speed and Efficiency

Feature stores have emerged as essential components of ML pipelines, acting as centralized hubs for managing and sharing features across teams and significantly reducing the time required for model training and deployment. They facilitate seamless data integration, automated feature generation, real-time predictions, and the creation of training and serving datasets. In a fraud detection case, for example, relevant features might include the average amount spent per transaction, the average session duration on the banking app, and the volume of transactions over a given period. Pre-computing these values, rather than calculating them at inference time, can make predictions much faster and more accurate. Setting up a feature store involves considering data sources (batch or real-time), entities (such as primary keys), transformations and aggregations, and the right storage solutions for speed and efficiency. By leveraging feature stores, we address one of the significant challenges in predictive modeling: ensuring that data is readily available and in a usable format for real-time decision-making. This strategic move not only improves model performance but also accelerates the path from concept to deployment.


Image Reference: Tecton
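
To illustrate the pattern, here is a minimal sketch using plain pandas, with a dictionary standing in for the online store of a real feature platform such as Tecton; the data and feature names are hypothetical:

```python
# A simplified sketch of the feature-store idea: pre-compute aggregates
# offline in a batch job, then serve them via a fast key-value lookup at
# inference time instead of recomputing them on every request.
import pandas as pd

transactions = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "amount":  [120.0, 80.0, 95.0, 40.0, 1500.0],
})

# Offline job (batch): compute per-user features once, on a schedule.
features = transactions.groupby("user_id")["amount"].agg(
    avg_amount="mean", txn_count="count"
)

# "Online store": pre-computed features keyed by entity (user_id).
online_store = features.to_dict("index")

def get_features(user_id: int) -> dict:
    """Millisecond-scale lookup at inference time; no recomputation."""
    return online_store[user_id]

print(get_features(2))  # {'avg_amount': 770.0, 'txn_count': 2}
```

The key design choice is the same one described above: the expensive aggregation happens offline, so the inference path is reduced to a lookup by entity key.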

MLOps: A Guiding Framework for Operational Excellence

The evolution of MLOps is a testament to our learning journey in streamlining the integration and automation of ML models in production. It focuses on practices like Continuous Integration and Deployment (CI/CD), regular model monitoring and maintenance, and fostering collaboration among different roles. In our projects, for example, we’ve leveraged SageMaker for deploying model artifacts and GitHub Actions for triggering workflows on code merge. Metrics are tracked using MLflow, allowing us to continually monitor and improve our models’ performance. MLOps is not just about technology; it embodies a culture of continuous improvement and collaboration, ensuring that machine learning models deliver value in a real-world context. As we strive for excellence in predictive modeling, understanding and integrating MLOps practices becomes crucial for operational efficiency and maintaining a competitive edge in the fast-paced world of AI.
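
As a small illustration of the tracking piece, here is a minimal sketch of logging parameters and metrics with MLflow; the run name, parameters, and metric values are hypothetical:

```python
# A minimal sketch of experiment tracking with MLflow: log how a model was
# configured and how it performed, so regressions are visible run over run.
import mlflow

with mlflow.start_run(run_name="churn-model-v2"):
    # Record the model configuration for this run...
    mlflow.log_param("n_estimators", 200)
    mlflow.log_param("max_depth", 8)

    # ...and its evaluation results, for comparison across runs.
    mlflow.log_metric("auc", 0.91)
    mlflow.log_metric("precision_at_k", 0.78)
```

In a setup like the one described above, a CI workflow (GitHub Actions in our case) would trigger such a run on code merge and hand the resulting artifact to SageMaker for deployment; the exact pipeline depends on the team’s infrastructure.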

Having touched briefly on the concept of feature stores and MLOps, let’s not forget a crucial ingredient for success in creating powerful machine learning solutions: the people making it all happen. In the realm of effective predictive modeling, the confluence of skills and expertise from various professionals becomes indispensable. Data engineers, data scientists, and machine learning engineers form the trifecta of talent needed to turn data into decisions. Each role contributes distinctively — data engineers architect the pipelines that fuel the system, data scientists extract insights and patterns, and machine learning engineers weave these into operational business processes.

The synergy of this multidisciplinary team is not just about technical skills; it’s about understanding the dynamics of collaboration. My journey has underscored the significance of each team member not only understanding their role but also how it fits into the larger picture. Aligning with the business team on problem definition, ensuring data quality with the data engineering team, or fine-tuning models with machine learning engineers — each step is a dance of coordination and clarity. This collaborative spirit is not just beneficial but necessary for navigating the intricate world of machine learning and ensuring the success of any predictive modeling initiative.

In conclusion, it is important to recognize that while the strides in generative AI are truly captivating, they stand on the shoulders of traditional predictive modeling. By leveraging lessons from the past, such as optimizing for specific system requirements, and embracing concepts like feature stores and MLOps, organizations can effectively harness the power of these foundational AI techniques. The right team composition further ensures these technologies are not only implemented but also evolve with the changing landscape, securing their place as a timeless treasure in the AI revolution. As someone who has navigated this journey, I hope sharing my experiences and lessons will help others embark on their path to leveraging AI’s full potential.