up:: [[DBA806 - Applied AI Innovation]]
tags:: #source/course #on/AI #on/ML
people:: [[Praphul Chandra]]

# DBA806 M9 - Productionizing AI and ML

#### Key Discussions in Lecture

[MLOps - YouTube](https://www.youtube.com/watch?v=tZOsEAzbgjY)

![[Pasted image 20240401004940.png]]

The professor introduces the engineering side of AI/ML, highlighting that it goes beyond building models and involves considerations like data collection, resource management, and continuous learning. The concept of MLOps is explained, with a focus on making AI/ML solutions production-ready and scalable. The session delves into the AI/ML tools landscape, presenting a framework for understanding the stack, from infrastructure at the bottom to applications at the top. Differentiating between traditional (green) and new-generation AI capabilities (orange), the professor categorizes the applications that create business value and the underlying AI/ML capabilities. The discussion concludes with a clarification on foundation models and pre-trained models.

[[Foundation Models]], used synonymously with pre-trained models, represent a shift from traditional data-centric AI approaches.

**Key Points:**

1. Pre-trained models involve taking a model built by someone else and applying it in a different context, a practice termed transfer learning.
2. Transfer learning is not universally applicable, especially when the data or domain differs significantly.
3. Foundation models, like GPT-4 and GPT-3.5, are trained on vast datasets, making them versatile across applications, hence the term "foundation" models.
4. **The AI/ML field lacks strict standards; popular tools and libraries become de facto standards.**
5. NLP has seen significant advancements with the introduction of generative models such as GPT.
6. The speaker suggests the ACM (Association for Computing Machinery) as a source of AI-related information.
7. Infrastructure choices often revolve around major cloud providers like AWS, Microsoft, and Google, with Nvidia dominating GPU chipsets.
8. Frameworks like TensorFlow and PyTorch serve as de facto standards in the deep learning domain.
9. Model formats (ONNX, pickle) and APIs help deploy and share machine learning models as services.
10. Web frameworks like FastAPI, Flask, and Django expose models as services, and containerization facilitates the scalability of AI/ML models.
11. Kubernetes is employed for efficient model deployment and scaling.
12. Monitoring AI/ML models involves tracking performance, resource consumption, and potential issues.
13. Fine-tuning is crucial, especially for foundation models, enabling adaptation to specific enterprise data.
14. Orchestration addresses aspects like retrieval, augmentation, and governance of models, with a focus on explainability and auditing.

The discussion then revolves around the themes of [[Orchestration (MLOps)]], [[Fine-Tuning (ML)]], and the color coding used in the stack diagram. Orchestration is highlighted as a crucial element in integrating machine learning with applications and building enterprise workflows. The importance of fine-tuning and orchestration has grown considerably in the last 18 to 24 months.

**Key Points**

* Deployment is defined as making a model available for consumption in various applications.
* Fine-tuning involves optimizing a pre-trained model on a specific dataset (see the sketch below).
* Orchestration is explained as integrating a foundation model into an enterprise workflow.
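To make the fine-tuning idea concrete, here is a minimal PyTorch sketch of transfer learning: a pre-trained image model is loaded, its backbone is frozen, and only a new task-specific head is trained. The dataset, class count, and training step are illustrative assumptions, not from the lecture.

```python
# Minimal transfer-learning sketch (dataset and class count are hypothetical)
import torch
import torch.nn as nn
from torchvision import models

# 1. Load a model pre-trained on ImageNet (the "foundation" we build on)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# 2. Freeze the pre-trained backbone so its weights are not updated
for param in model.parameters():
    param.requires_grad = False

# 3. Replace the classification head for a hypothetical 5-class task
model.fc = nn.Linear(model.fc.in_features, 5)

# 4. Fine-tune: only the new head's parameters are optimized
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch
images = torch.randn(8, 3, 224, 224)   # stand-in for real images
labels = torch.randint(0, 5, (8,))     # stand-in for real labels
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
```

Because the backbone stays frozen, this adapts a model trained on one domain to a related task with relatively little data, which is where transfer learning works well, and why it breaks down when the target data or domain differs too much.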
![[Pasted image 20240331185347.png]]

The discussion then transitions to the bottom four layers of the **AI/ML stack**: infrastructure, data, frameworks, and deployment. Infrastructure, as the foundation, is discussed in the context of cloud computing, emphasizing the advantages of on-demand compute, global accessibility, security, and compliance. The technical view of cloud computing for model training is explored, including the benefits of distributed computing, the availability of specialized chipsets, and pre-configured environments for AI and ML. The build-versus-buy dilemma is raised, emphasizing the strategic decision organizations face over which parts of the stack to control in-house and which to consume as a service from cloud providers. The scalability and dynamic resource allocation features of cloud services are highlighted, along with considerations for load balancing.

**Key Points:**

1. Orchestration is crucial for integrating machine learning with applications and building enterprise workflows.
2. The significance of fine-tuning and orchestration has grown in the last 18 to 24 months.
3. Deployment makes a model available for consumption, fine-tuning optimizes pre-trained models, and orchestration integrates models into workflows.
4. Cloud computing offers advantages such as on-demand compute, global accessibility, security, and compliance.
5. Cloud providers handle technical aspects like distributed computing, specialized chipsets, and pre-configured environments.
6. The build-versus-buy dilemma requires strategic decisions on controlling parts of the stack in-house versus using cloud services.

**Summary on Data**

On infrastructure, the speaker emphasizes the common split between proofs of concept (PoCs) run on-premises and production run in the cloud for enterprise-scale AI/ML models. The discussion then shifts to data, highlighting databases as a crucial starting point, including common systems like IBM Db2, Microsoft SQL Server, MySQL, and PostgreSQL. The speaker emphasizes structured data stored in relational databases and introduces the distinction between batch and stream data processing. Logs are identified as another essential but often overlooked data source, providing information about system behavior, user interactions, and more; they are voluminous and can be a powerful input for data mining and machine learning. The discussion extends to customer relationship management (CRM) tools like Salesforce and Zoho, emphasizing the importance of the customer data these systems hold. Social media feeds are recognized as valuable yet complex data sources containing text, images, videos, likes, comments, and more; their format varies and can be both structured (e.g., JSON) and unstructured. External data sources, including weather data, invoices, GPS locations, real estate prices, and satellite data, are discussed as additional valuable inputs obtainable from third-party service providers.

**Key Points:**

- Infrastructure for AI/ML models often splits between on-premises PoCs and cloud-based production.
- Databases, including IBM Db2, Microsoft SQL Server, MySQL, and PostgreSQL, are crucial sources of structured data.
- Batch data processing involves periodic analytics, while stream data processing deals with real-time updates (see the sketch after this list).
- Logs are valuable but often overlooked data sources, offering insights into system and user behavior.
- CRM tools like Salesforce store essential customer data, impacting revenue streams.
- Social media feeds provide diverse, sometimes unstructured, data formats requiring both batch and stream processing.
- External data sources, such as weather, invoices, GPS locations, real estate prices, and satellite data, offer valuable information for AI/ML applications.
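As a minimal illustration of the batch pattern, the sketch below pulls structured records from a relational database into pandas for periodic analytics. The connection string, table, and columns are hypothetical placeholders, and a PostgreSQL driver (e.g., psycopg2) is assumed to be installed.

```python
# Batch read from a relational database (connection details and schema are hypothetical)
import pandas as pd
from sqlalchemy import create_engine

# SQLAlchemy connection string for a PostgreSQL database (placeholder credentials)
engine = create_engine("postgresql://user:password@db-host:5432/sales")

# Pull the previous day's orders in one batch for periodic analytics
query = """
    SELECT customer_id, order_total, created_at
    FROM orders
    WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'
"""
df = pd.read_sql(query, engine)

print(df.describe())  # quick summary statistics over the batch
```

Stream processing, by contrast, would consume the same records one event at a time, for example from a message queue, rather than in periodic batches.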
**Summary:**

The speaker discusses the challenges of dealing with diverse data formats, including documents, Excel sheets, PDFs, images, and more, emphasizing the need for software modules that can read and consume data from these sources for AI and machine learning models. They highlight the *emergence of IoT data*, such as sensor readings from devices like Apple's Vision Pro, which generate continuous streams of data.

The discussion moves on to data pipelines, which are essential for processing and transforming data from its source format into one usable by AI and ML models. The speaker explains [[Extract, Transform, Load (ETL)]] and differentiates between data warehouses and data lakes, with *data warehouses storing structured data for fast querying and data lakes serving as centralized repositories that do not enforce a specific structure*. The presentation delves into the role of data pipelines in collecting raw data from various sources and delivering it in a consumable format. Audience questions are addressed, clarifying the differences between data warehouses and data lakes, *schema on write versus schema on read*, and the evolving trend toward a "lakehouse" architecture that combines both approaches.

Finally, the discussion touches on the tools and frameworks used by data science teams, with the Jupyter Notebook mentioned as a staple tool for data scientists. The speaker briefly acknowledges the challenges of deploying deep learning models on mobile devices, hinting at complexities involving model size, network constraints, and target shifts.

**Key Points:**

1. Diverse data formats, including documents, Excel sheets, PDFs, images, etc., pose challenges for building software modules that read and consume data for AI and ML models.
2. IoT data, like sensor readings from devices, contributes continuous streams of data that require efficient processing.
3. Data pipelines play a crucial role in transforming data from source formats into usable formats for AI and ML models (a minimal ETL sketch follows this list).
4. ETL (extract, transform, load) workflows and the differences between data warehouses and data lakes are explained.
5. The "lakehouse" architecture is introduced as a real-world combination of data warehouses and data lakes.
6. The Jupyter Notebook is highlighted as a common tool used by data science teams for model development.
7. Challenges in deploying deep learning models on mobile devices, such as model size and network constraints, are briefly mentioned.
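As referenced above, here is a minimal ETL sketch in pandas: raw CSV data is extracted, lightly transformed, and loaded into a columnar store. File names and columns are illustrative assumptions, and writing Parquet assumes a pyarrow (or fastparquet) engine is installed.

```python
# Minimal ETL sketch with pandas (file names and columns are hypothetical)
import pandas as pd

# Extract: read raw data from a source system's CSV export
raw = pd.read_csv("raw_orders.csv")

# Transform: clean types, drop bad rows, derive a new column
raw["created_at"] = pd.to_datetime(raw["created_at"], errors="coerce")
clean = raw.dropna(subset=["created_at", "order_total"]).copy()
clean["order_month"] = clean["created_at"].dt.to_period("M").astype(str)

# Load: write the result in a query-friendly columnar format
# (a data-warehouse load would enforce the schema here: "schema on write")
clean.to_parquet("orders_clean.parquet", index=False)
```

In a data lake, by contrast, the raw CSV might simply be landed as-is and interpreted only when read ("schema on read"); the lakehouse pattern combines both approaches.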
**Summary on Frameworks**

The speaker treats models deployed on end devices as a separate use case. For model development, data scientists often use Jupyter notebooks as a staple tool. Jupyter is an integrated development environment (IDE) used for interactive data analysis, prototyping, and building small machine learning models. Anaconda, a Python distribution that bundles Jupyter, serves as a software package manager, handling dependencies and ensuring version compatibility; it is crucial for managing the complexity of dependencies in software development.

The core machine learning library mentioned is scikit-learn, widely used for classification, regression, clustering, and more. It is open source, well documented, and a good starting point for data scientists. For deep learning and neural networks, the speaker introduces PyTorch, an open-source deep learning framework suitable for working with unstructured data like text, images, audio, and video. TensorFlow, a framework developed by Google, is mentioned as a popular alternative for large-scale production environments. When choosing a framework, scikit-learn is recommended for structured data, while PyTorch and TensorFlow are preferable for deep learning tasks. The speaker emphasizes understanding licenses when using open-source components: the community provides best-effort support, but users bear the responsibility for software reliability.

**Key Points:**

1. Models on end devices are a different use case.
2. Jupyter notebooks are fundamental for model development.
3. Anaconda serves as a software package manager, handling dependencies.
4. Scikit-learn is a core machine learning library for various tasks.
5. PyTorch is recommended for deep learning on unstructured data.
6. TensorFlow is a popular alternative for large-scale production environments.
7. Choosing a framework depends on the nature of the data and tasks.
8. Understanding licenses is crucial when using open-source components.
9. Users bear the responsibility for software reliability.
10. Pre-trained models are available from platforms like OpenAI and Hugging Face.

**Summary on Deployment**

The speaker discusses the process of deploying machine learning models, focusing on the deployment layer. They emphasize the importance of making machine learning models available to other applications through APIs (Application Programming Interfaces), mentioning tools like FastAPI, Flask, and Django for converting models into web services.

1. Deployment of machine learning models involves exposing them as APIs for use by other applications.
2. FastAPI, Flask, and Django are tools used to convert machine learning models into web services.
3. APIs allow applications to call and use machine learning models over the web.
4. Wrapping the machine learning model's functionality in an API enables it to be accessed remotely.
5. Scaling becomes a consideration with a large number of users; cloud services or load balancers can manage the computational load.
6. An alternative approach is "BNPL" (build now, predict later), where models are stored until they are ready for deployment.
7. Model serialization involves saving machine learning models for later use; popular tools for this include pickle and ONNX.
8. [[ONNX (Open Neural Network Exchange)]] is optimized for deep learning models and promotes interoperability between deep learning frameworks.
9. Stored models can be used for further training or deployment, providing flexibility in the machine learning workflow.
10. The location of stored model files is flexible: local machines, servers, or cloud storage platforms (see the serving sketch after this list).
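To make this concrete, here is a minimal, hypothetical sketch: a scikit-learn model is trained and serialized with pickle ("build now, predict later"), then loaded back into memory and exposed as a FastAPI endpoint. The endpoint name and input schema are illustrative assumptions.

```python
# Minimal sketch: serialize a model with pickle, then serve it with FastAPI.
# The model, endpoint name, and input schema are illustrative assumptions.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# "Build now": train and serialize the model to disk
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# "Predict later": load the stored model into memory at serving time
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

app = FastAPI()

class Features(BaseModel):
    values: list[float]  # one row of input features (4 values for iris)

@app.post("/predict")
def predict(features: Features):
    # the API keeps the deserialized model in memory and calls it per request
    return {"prediction": int(model.predict([features.values])[0])}
```

Run with `uvicorn module_name:app`; any application can then call the model over HTTP without knowing which framework produced it.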
The speaker then turns to three related deployment concepts, addressing audience questions about file storage, APIs, and the use of containers. The discussion delves into the differences between pickle, ONNX files, and containers, emphasizing the latter's role in packaging models together with their dependencies for deployment on different hardware.

1. **Pickle vs. ONNX:**
   - ONNX is, in effect, a serialization format specialized for deep learning.
   - It is optimized for neural-network architectures.
2. **Containers:**
   - Containers are like pickle and ONNX files but more powerful.
   - They package the model and all necessary dependencies for running on different hardware.
   - Docker is a popular containerization platform.
3. **Model Deployment:**
   - Questions about storing models as ONNX files and using them for API creation.
   - Clarification that an API needs the model loaded in memory.
   - Distinction between models as software objects and the APIs that expose them.
   - Containers include not only the model but also all required libraries and software modules.
   - Anaconda's role is different and not directly related to container creation.
4. **Deployment Challenges:**
   - Version dependencies in Python libraries can lead to challenges.
   - Mitigation involves professional cloud services or handling dependencies in-house.
5. **Cloud Providers:**
   - The choice between cloud providers (Azure, GCP, AWS) depends on company standards, budget, and talent.
6. **Monitoring and Management:**
   - Importance of tracking model performance using Prometheus and Grafana.
   - Alert systems for model crashes, performance degradation, or high resource usage.
   - The ELK stack (Elasticsearch, Logstash, Kibana) for log analysis.
7. **Cloud Provider Tools:**
   - AWS SageMaker, Azure ML tools, and Google AI Platform provide cloud-specific solutions.
   - These tools cover model development, data wrangling, training, and deployment.

#### Summary of Slide Deck

[MLOps](https://cdn.upgrad.com/uploads/production/ee131d7b-50b0-435a-9523-1b6a1956cd5a/6_MLOps.pdf)

The slide deck provides an overview of MLOps (Machine Learning Operations), emphasizing the importance of productionizing AI/ML solutions and addressing challenges such as big data and continuous learning. It also highlights the diversity of AI, ML, and data platforms available in the market and the need to develop a clear understanding of their purposes. The AI/ML stack encompasses various components, including applications, AI capabilities, orchestration, fine-tuning, deployment, frameworks, data, and infrastructure. Key considerations for infrastructure selection, particularly leveraging cloud services for training and deploying models, are discussed, citing benefits such as scalability, cost optimization, collaboration, and global reach.

**Key Points:**

- MLOps involves productionizing AI/ML solutions and addressing challenges like big data and continuous learning.
- Understanding the purpose of the various AI, ML, and data platforms is crucial.
- The AI/ML stack includes applications, AI capabilities, orchestration, fine-tuning, deployment, frameworks, data, and infrastructure.
- Cloud infrastructure offers benefits such as scalability, cost optimization, collaboration, and global reach for training and deploying models.

The deck also delves into practical aspects of model development, deployment, and tracking. It discusses tools like [[Jupyter Notebook]] and Anaconda for model development, along with frameworks like [[Scikit Learn]], [[PyTorch]], and [[TensorFlow]] for different types of tasks. Deployment options include REST APIs, containerization, and serialization for storing models efficiently (a minimal ONNX export sketch follows).
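As an illustration of the serialization step, the sketch below exports a small PyTorch model to ONNX. The model architecture, tensor shapes, and file name are illustrative assumptions.

```python
# Minimal ONNX export sketch (model, shapes, and file name are hypothetical)
import torch
import torch.nn as nn

# A tiny stand-in network: 4 input features -> 3 output classes
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

# The exporter traces the model with one example input
dummy_input = torch.randn(1, 4)
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["features"], output_names=["logits"],
)
# The resulting model.onnx can be loaded by other runtimes
# (e.g., ONNX Runtime) independent of PyTorch, which is the
# interoperability benefit noted above.
```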
Docker is highlighted for its role in creating consistent and portable development environments, crucial for deploying models as microservices or within complex systems. Kubernetes is mentioned for automating container deployment and management.

**Key Points:**

- Tools like Jupyter Notebook and Anaconda aid in model development.
- Frameworks such as Scikit Learn, PyTorch, and TensorFlow cater to different ML tasks.
- Deployment options include REST APIs, containerization, and serialization.
- Docker ensures consistent execution across different systems, crucial for deploying models in complex setups.
- Kubernetes automates container deployment and management.

Furthermore, the deck discusses tracking deployments and performance monitoring using tools like [[Prometheus]] and [[Grafana]] to collect metrics on model performance, inference times, and resource utilization. It also touches on analyzing deployment logs with the ELK Stack (Elasticsearch, Logstash, Kibana) for monitoring and alerting purposes.

**Key Takeaways:**

- MLOps involves the productionization of AI/ML solutions and addressing challenges such as big data and continuous learning.
- Cloud infrastructure offers scalability, cost optimization, and global reach for training and deploying models.
- Tools like Jupyter Notebook and Anaconda, and frameworks like Scikit Learn, PyTorch, and TensorFlow, aid in model development.
- Deployment options include REST APIs, containerization, and serialization, with Docker ensuring consistency across systems.
- Kubernetes automates container deployment and management.
- Monitoring tools like Prometheus, Grafana, and the ELK Stack facilitate deployment tracking, performance monitoring, and log analysis for monitoring and alerting.

The deck then covers deploying and managing machine learning (ML) models without directly handling the underlying server infrastructure. Key points include:

**Load Balancing:**
- Managed platforms offer load-balancing services that distribute incoming traffic across multiple instances for high availability and scalability.
- This ensures efficient handling of traffic to maintain application performance.

**API Endpoints:**
- They facilitate exposing ML models as RESTful APIs for seamless integration with other applications.
- This simplifies interacting with ML models over HTTP.

**Container Orchestration:**
- Enables the deployment and scaling of containerized ML models based on demand.
- Provides flexibility and efficiency in managing resources for ML model deployment.

**Development Environment:**
- Encompasses stages such as data preparation, exploration, model training, packaging, serialization, deployment, monitoring, management, scalability, resource management, experiment tracking, and cost monitoring and optimization.
- Ensures a structured approach to ML model development and deployment.

**Deployment Workflow:**
- Involves a series of steps including data preparation, model training, packaging, deployment, monitoring, and management.
- Streamlines the process of deploying ML models into production environments.

**Platforms and Tools:**
- Includes [[SageMaker]], Azure ML, Google AI Platform (GCP), Jupyter Notebooks, and libraries such as pandas, scikit-learn, PyTorch, and TensorFlow.
- Offers a range of services and features tailored for ML model development and deployment.
**Technologies:**
- Utilizes Docker, Kubernetes, Flask, FastAPI, [[ONNX (Open Neural Network Exchange)]], Pickle/Joblib, Prometheus, Grafana, [[ELK (Elasticsearch, Logstash, Kibana)]], Dask, MLflow, and TensorBoard.
- Together, these provide a comprehensive ecosystem for building, deploying, and monitoring ML applications.
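To close the loop on monitoring, here is a minimal sketch using the `prometheus_client` Python library to expose prediction metrics that Prometheus can scrape and Grafana can chart. The metric names and the simulated prediction workload are illustrative assumptions.

```python
# Minimal monitoring sketch with prometheus_client
# (metric names and the simulated prediction workload are hypothetical)
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("predictions_total", "Total predictions served")
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency in seconds")

def predict(x):
    # stand-in for a real model call
    time.sleep(random.uniform(0.01, 0.05))
    return x * 2

if __name__ == "__main__":
    start_http_server(8000)  # metrics exposed at http://localhost:8000/metrics
    while True:
        with LATENCY.time():  # records how long each prediction takes
            predict(random.random())
        PREDICTIONS.inc()
```

Prometheus would scrape the `/metrics` endpoint on a schedule, and a Grafana dashboard or alert rule (for example, on latency spikes or resource usage) would sit on top, matching the performance-degradation alerts described in the lecture.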