up:: [[DBA806 - Applied AI Innovation]]
tags:: #source/course #on/AI #on/ML
people:: [[Praphul Chandra]]

# DBA806 M7 - Innovation Frameworks with AI

#### Key Discussions in Lecture

[Session 7 - Dr Praphul Chandra - Innovation Frameworks with AI - YouTube](https://www.youtube.com/watch?v=oFPmop3cTCs)

The segment discusses various concepts related to machine learning, focusing on [[Supervised Learning]]. It mentions the *importance of addressing imbalance in data sets*, with [[Undersampling]] being one technique to restore balance. The general idea of supervised learning is to find the relationship between a dependent variable (Y) and independent variables (X1, X2, ..., XP). Two variants of supervised learning are highlighted: regression, where the Y variable is numeric, and classification, where the Y variable takes categorical values. Regression is illustrated with a simple linear regression model, emphasizing the importance of finding the equation of the model (represented by a blue line) to make predictions. Classification, on the other hand, involves finding a separating boundary (dotted curve) to distinguish between different classes (e.g., red and blue points). The lecture also draws an *analogy between how humans and machines learn, using the example of teaching a child to recognize animals in picture books*. It emphasizes the concept of generalization, where machines learn to recognize patterns and apply that knowledge to new, unseen data.

**Key Points:**
1. Undersampling is a technique to address imbalance in data sets.
2. Supervised learning involves finding the relationship between a dependent variable (Y) and independent variables (X1, X2, ..., XP).
3. Regression deals with numeric Y variables, while classification deals with categorical Y variables.
4. Classification involves finding a separating boundary to distinguish between different classes.
5. Generalization is a key concept, allowing machines to extrapolate knowledge from seen examples to new, unseen data.

In this segment, the discussion revolves around [[Convolutional Neural Networks (CNN)]] and the realization that *machines store a multitude of features, forming a hierarchy, which allows them to understand and recognize objects like a cat*. The question arises of how much data is required, with the acknowledgment that there is no fixed amount: it depends on the complexity of the problem. The conversation touches on AI offerings like [[DALL-E]] and [[Midjourney]], emphasizing the vast amount of data they process. The more complex the problem, particularly when there are a myriad of categories, the larger the dataset required. The session also delves into supervised and unsupervised learning, drawing an analogy between clustering and classification. The discussion clarifies that supervised learning requires a dataset with desired outputs, while unsupervised learning, like clustering, does not require labeled data. A comparison is made between *teaching preschoolers to group images without words (clustering) and introducing words for specific groupings (classification)*. The machine learning process involves training models by feeding them examples, allowing them to learn relationships between input data and output labels. The session provides a teaser about guiding machines to build models efficiently by specifying the type of functions or models they should aim for.
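To make the regression versus classification distinction concrete (and to preview the train/test evaluation discussed next), here is a minimal sketch assuming scikit-learn and its built-in toy datasets are available; the dataset choices and variable names are illustrative, not from the lecture:

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Regression: Y is numeric (a disease-progression score).
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = LinearRegression().fit(X_train, y_train)        # learn the "blue line"
print("regression R^2 on unseen data:", reg.score(X_test, y_test))

# Classification: Y is categorical (a flower species).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learn a separating boundary
print("classification accuracy on unseen data:", clf.score(X_test, y_test))
```

The `.score(...)` calls on held-out data mirror the train/test evaluation introduced in the next segment: the model is judged on examples it did not see during training.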
Different machine learning model families, such as *linear models, tree-based models, distance-based models, and neural network-based models*, are mentioned as ways to guide machines in building models. The importance of evaluating model performance is emphasized, introducing the concept of splitting data into a [[Training Set]] and a [[Test Set]]. The session concludes by highlighting the hands-on aspect of the course, emphasizing that coding is essential, but that understanding the high-level concepts, the applications, and when to apply them is equally crucial.

**Key Points:**
1. CNNs store a hierarchy of features for understanding objects like cats.
2. The amount of data needed in AI is not fixed; it depends on problem complexity.
3. AI offerings like DALL-E and Midjourney process vast amounts of data.
4. Supervised learning involves labeled datasets, while unsupervised learning (clustering) doesn't.
5. Machines learn by training on examples, building relationships between input and output.
6. Guiding machines to build specific kinds of models efficiently reduces computational costs.
7. Machine learning model families include linear, tree-based, distance-based, and neural network-based models.
8. Evaluating model performance involves testing on a separate dataset (the test set).

**Summary:**

The speaker discusses the process of machine learning, emphasizing the importance of data and the various steps involved. They highlight that *data labeling is often the most expensive part of training a model and requires human experts*. The machine learning journey includes building and training the model, running it on a server or in the cloud, and using it for predictions as new data becomes available. The speaker emphasizes that in real-world settings, the *majority of the time (around 80%) is spent on building the data pipeline*. This involves *data identification, aggregation from multiple sources, cleaning, labeling, and sometimes augmenting with metadata*. They stress the need for a diverse team, including data engineers, cloud engineers, machine learning engineers, business and domain experts, and solution designers.

The talk introduces the concept of [[Self-Supervised Learning]], *addressing the challenge of expensive data labeling*. The idea is to predict the next word in a sentence without explicitly labeling data. The speaker mentions applications such as autocomplete features on phones, search engines, e-commerce platforms, and email editors. The talk delves into the details of self-supervised learning, using the example of predicting the next word in a sentence. The speaker explains the [[Context Window]], which determines how far back the model looks to predict the next word. They also touch upon the role of probability in predicting words and discuss the auto-regressive model, where the model uses its own output to predict further output.

Lastly, the speaker introduces the idea of *Generative AI, where autocomplete can be put in a loop to generate longer sentences or answers*. They discuss the power of self-supervised learning in utilizing existing knowledge and literature without explicitly labeled data. The talk concludes by connecting these concepts back to the overarching idea of predicting the next word in a sentence.

**Key Points:**
- Machine learning involves data labeling, model building, training, running on servers/cloud, and using the model for predictions.
- Building the data pipeline constitutes around 80% of the time in real-world ML projects.
- A diverse team, including data engineers, cloud engineers, ML engineers, business/domain experts, and solution designers, is essential.
- Self-supervised learning aims to predict the next word without explicit data labeling.
- Applications of self-supervised learning include autocomplete features in phones, search engines, e-commerce platforms, and email editors.
- The context window determines how far back the model looks to predict the next word.
- Auto-regressive models use their own output to predict further output (see the toy sketch after this list).
- Generative AI puts autocomplete in a loop to generate longer sentences or answers.
- Self-supervised learning leverages existing knowledge and literature for predictions without explicit labels.
- Self-supervised learning involves models generating their own supervision signals from the data.
- It can be extended to the image domain, e.g., inpainting or image colorization.
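As a toy illustration of the context window and the auto-regressive loop described above, the sketch below uses a hand-written probability table in place of a real language model; the vocabulary, the table, and the function names are invented for illustration, not taken from the lecture:

```python
import random

# Toy "language model": given the last two words (context window = 2),
# return a probability distribution over possible next words.
NEXT_WORD_PROBS = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.4},
    ("cat", "sat"): {"on": 0.9, "down": 0.1},
    ("sat", "on"): {"the": 1.0},
    ("on", "the"): {"mat": 0.7, "sofa": 0.3},
}

def predict_next(context):
    """Sample the next word given the last words in the context window."""
    probs = NEXT_WORD_PROBS.get(tuple(context), {"<end>": 1.0})
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

def generate(prompt, window=2, max_words=6):
    """Put 'autocomplete in a loop': feed the model's own output back in."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(words[-window:])
        if nxt == "<end>":
            break
        words.append(nxt)  # the predicted word becomes part of the next context
    return " ".join(words)

print(generate("the cat"))  # e.g. "the cat sat on the mat"
```

A real model learns its probabilities from millions of sentences rather than from a hand-written table, but the generation loop is the same in spirit: the predicted word is appended to the context and the model is asked again.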
**Language Modeling Example**
- The model hides one word in a sentence and tries to predict it from the rest.
- It learns language without explicit human labeling.
- Language itself becomes a vast training set, eliminating the need for labeled data.
- Millions of text sentences serve as training data points for text generation.
- Cost-effective, as it utilizes existing digital data for training.

**Image Inpainting Example**
- In the image domain, a white square is placed over part of an image, and the model is tasked to fill it in based on the surrounding context.
- Inpainting is an example of self-supervised learning in the image domain.
- Applied to tasks like image editing, restoration, and removing objects from images.
- As in language modeling, the model learns to predict missing parts from surrounding features.

**Image Colorization Example**
- Another application involves predicting missing information, such as color in grayscale images.
- The model learns to colorize images based on the surrounding context.
- Used for tasks like automatic image editing and colorization.
- Demonstrates the versatility of self-supervised learning across different domains.

The latter part of the session transitions to [[Reinforcement Learning]], emphasizing its roots in psychology, particularly [[Positive Reinforcement]] and [[Negative Reinforcement]]. The core idea is that agents, which could be animals, robots, or software, learn optimal actions by receiving rewards or penalties based on their behavior. Applications of *reinforcement learning include game-playing agents*, as demonstrated by AI defeating a Go world champion, and self-driving cars making decisions like steering, accelerating, and braking to reach a destination.

**Self-Supervised Learning:**
- Predicting pixel colors without explicit labeling (see the data-preparation sketch after this list).
- A preprocessing module converts images to grayscale, so the original colors act as the labels.
- Language models like GPT use self-supervised learning on large datasets.
- Applications include missing data imputation and graph representation learning.
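A minimal sketch of how such self-supervision signals can be created from unlabeled images, assuming NumPy and Pillow are available; the file name and helper functions are hypothetical, and a real pipeline would feed these pairs into a trained model rather than stop here:

```python
import numpy as np
from PIL import Image

def colorization_pair(path):
    """Colorization: the grayscale version is the input, the original colors are the target."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    gray = rgb.mean(axis=2, keepdims=True)   # crude grayscale conversion
    return gray, rgb                          # (input, label) with no human labeling

def inpainting_pair(path, size=32):
    """Inpainting: mask a square; the model must predict the hidden patch from context."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    h, w, _ = rgb.shape
    top, left = (h - size) // 2, (w - size) // 2
    target_patch = rgb[top:top + size, left:left + size].copy()
    masked = rgb.copy()
    masked[top:top + size, left:left + size] = 1.0   # paint a white square over the patch
    return masked, target_patch                      # (input, label) again comes for free

# Hypothetical usage: every unlabeled photo yields training pairs automatically.
# x, y = colorization_pair("photo.jpg")
```

The point of the sketch is that the labels are manufactured from the data itself, which is exactly why self-supervised learning sidesteps the expensive human labeling discussed earlier.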
**Reinforcement Learning:**
- Rooted in psychology, with positive and negative reinforcement.
- Agents learn optimal actions by exploring and receiving rewards or penalties.
- Applications include game-playing agents (e.g., defeating a Go champion) and self-driving cars making decisions.
- Rewards and penalties influence agent behavior in tasks such as chess moves or self-driving actions.
- Emphasis on learning by doing and exploring different sequences of actions.

##### Reinforcement Learning

The speaker discusses *reinforcement learning in the context of decision-making agents*, using examples such as *setting tax rates and driving cars*. The abstract framework involves an agent taking actions, resulting in changes to the environment, and receiving rewards. The *environment's state changes with each action, leading to a continuous loop of action, state change, and reward*. One key example is the [[Multiarm Bandit Problem]], where an agent must decide which slot machine to play to maximize rewards. The challenge lies in *balancing exploration (trying different machines) and exploitation (choosing the best machine)* to achieve the highest expected aggregate reward over time.

1. **Abstract Framework:**
   - Agents take actions in an environment.
   - Actions lead to state changes and rewards.
   - Continuous loop of action, state change, and reward.
2. **Multi-Armed Bandit Problem:**
   - Scenario: choosing slot machines in a casino to maximize rewards.
   - Exploration vs. exploitation trade-off.
   - Limited budget and unknown reward distributions.
3. **Exploration-Exploitation Policies:**
   - Naive policy: randomly try all machines.
   - Epsilon-greedy policy: occasionally try something new (simulated in the sketch at the end of this section).
   - Softmax policy: initially explore a lot, then shift towards exploitation.
4. **Challenges in Reinforcement Learning:**
   - Out-of-sample examples: a model trained on certain data struggles with very different new data.
   - Example: self-driving cars in Australia struggling with kangaroos due to a lack of training on such cases.
5. **Limitations of Reinforcement Learning:**
   - Applicability depends on modeling problems as agent-environment-reward systems.
   - Difficult to use in scenarios without a clear agent-environment structure.
6. **Markov Decision Processes (MDPs):**
   - Introduction to [[Markov Decision Processes (MDPs)]] as an extension of multi-armed bandits.
   - MDPs introduce the concept of state, where actions impact the future state and reward.
7. **State in Decision Processes:**
   - Contrast with the multi-armed bandit problem, where actions are independent.
   - In MDPs, actions impact the future state and reward.
8. **Overall Objective in Reinforcement Learning:**
   - Maximize the expected aggregate reward over time.
   - Balancing exploration and exploitation is crucial in decision-making.
9. **Future Challenges:**
   - Adapting reinforcement learning to fields like robotics and algorithmic trading.
   - Continued research on addressing limitations and improving applicability.

Reinforcement learning involves *learning a policy, which is a probability distribution over actions, with the added challenge that the reward of an action depends on the current state*. Competitive games like chess or Go make the problem harder due to the vast number of possible sequences of states. [[Deep Reinforcement Learning]] is introduced as a solution, *utilizing neural networks to represent complex functions such as state-value, action-value, and policy functions*. The fundamental idea is to maximize expected aggregate rewards over time. The policy function is not deterministic; it provides a probability distribution over actions based on the current state. The *challenge is to consider future scenarios and make decisions that might sacrifice short-term rewards for long-term success*.

**Key Points:**
- In reinforcement learning, actions affect the state, and the reward of an action depends on the current state.
- Competitive games pose challenges due to the multitude of possible sequences of states.
- Deep reinforcement learning uses neural networks to learn complex functions like state-value, action-value, and policy functions.
- The policy function is a probability distribution over actions based on the current state.
- Decision-making may involve sacrificing short-term rewards for long-term success.
- The goal is to maximize expected aggregate rewards over time.
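A small simulation of the epsilon-greedy policy on a multi-armed bandit, as referenced in the exploration-exploitation list above; the machine payout probabilities and parameter values are made up for illustration:

```python
import random

TRUE_PAYOUT_PROBS = [0.20, 0.50, 0.35]               # unknown to the agent: one "machine" per entry
EPSILON = 0.1                                        # how often to explore
pulls = [0] * len(TRUE_PAYOUT_PROBS)                 # times each machine has been played
value_estimates = [0.0] * len(TRUE_PAYOUT_PROBS)     # running average reward per machine

total_reward = 0
for t in range(10_000):
    if random.random() < EPSILON:                    # explore: try a random machine
        arm = random.randrange(len(TRUE_PAYOUT_PROBS))
    else:                                            # exploit: play the best-looking machine
        arm = max(range(len(TRUE_PAYOUT_PROBS)), key=lambda a: value_estimates[a])
    reward = 1 if random.random() < TRUE_PAYOUT_PROBS[arm] else 0
    pulls[arm] += 1
    value_estimates[arm] += (reward - value_estimates[arm]) / pulls[arm]   # incremental mean
    total_reward += reward

print("estimated payouts:", [round(v, 2) for v in value_estimates])
print("total reward over 10,000 plays:", total_reward)
```

With epsilon = 0 the agent never explores and can get stuck on a mediocre machine; with epsilon = 1 it never exploits what it has learned. Balancing the two is exactly the trade-off the lecture describes.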
#### Summary of Slide Deck

[AI Computational Capabilities](https://cdn.upgrad.com/uploads/production/3328d4f7-44d3-40a2-928c-11bd957fd2c8/7_AComputerScientistsGuideToAICapabilities_Part2.pdf)

The text provides insights into the foundational concepts of AI, focusing on the *computational view of data and machine learning*. It covers the significance of [[Relational Data Models]], [[Feature Engineering]], and the representation of various data types, including text, images, and audio, in a relational format. The distinction between unsupervised and supervised learning is explained, emphasizing the tasks of *inferring hidden structures* and finding functions to describe relationships in labeled data. The learning process for machines is compared to teaching a baby to recognize patterns, where examples are crucial. Training machine learning models involves *feeding examples, pattern recognition, and generalization*. The text introduces the idea of guiding machines in building models to reduce computational load and highlights the diverse machine learning model families with varying capacities.

The evaluation of machine learning models is discussed, emphasizing the *importance of testing model performance on unknown data*. The practical application of various machine learning algorithms, including [[Logistic Regression]], [[Decision Trees]], [[Random Forests]], [[Support Vector Machines]], [[k-Nearest Neighbors]], and [[Naive Bayes]], is outlined. The broader perspective is presented by laying out the overall process of AI/ML, involving raw data collection, [[Human Labeling]], [[Pattern Recognition]], and model deployment. It emphasizes the time-intensive nature of AI/ML projects, with *80% of the time spent on the data pipeline*. The text also touches upon the challenges and expenses associated with data labeling and raises the question of whether AI/ML can be *achieved without explicit data labeling*.

A shift to [[Self-Supervised Learning]] is introduced, particularly in language models and generative AI. The concept of *predicting the next word in a sequence without explicit human labeling* is explored. The text delves into various applications of self-supervised learning, such as *language modeling, sentence embeddings, and computer vision* tasks like image [[Inpainting and Colorization]], [[Missing Data Imputation]], and [[Graph Representation Learning]].

The final sections cover the use of large neural networks in language models and reinforcement learning, with examples from AlphaGo, Tesla Autopilot, and Alphabet Waymo One. The concept of [[Reinforcement Learning]] is explained through multi-armed bandit problems, sequential decision-making, and the application of deep reinforcement learning techniques like [[Deep Q-Network]] and [[Policy Gradient Methods]].
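The action-value idea behind methods such as Deep Q-Network can be sketched in tabular form; a DQN replaces the table below with a neural network, but the update follows the same idea. The tiny chain environment, rewards, and parameter values here are invented for illustration:

```python
import random
from collections import defaultdict

# Tiny chain environment: states 0..4, actions move left/right, reward 1 only for reaching state 4.
ACTIONS = [-1, +1]

def step(state, action):
    next_state = min(max(state + action, 0), 4)
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4

Q = defaultdict(float)                   # action-value table: Q[(state, action)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2    # learning rate, discount factor, exploration rate

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy choice over the current action-value estimates
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Q-learning target: immediate reward plus the discounted value of the best next action
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

print({s: round(max(Q[(s, a)] for a in ACTIONS), 2) for s in range(5)})
```

The discount factor gamma is what lets the agent trade a small immediate reward for a larger one later, which is the short-term versus long-term tension described above.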
1. **Foundational Concepts:**
   - Matrix, data frame, data table, and spreadsheet as variants of a relational data model.
   - Feature engineering for text, images, and audio.
   - Representation of all data in a relational format.
2. **Learning Processes:**
   - Unsupervised learning: inferring hidden structures from unlabeled data.
   - Supervised learning: finding functions for labeled data (regression, classification).
3. **Teaching Machines:**
   - Machines learn from examples and by understanding patterns in data.
   - Training involves feeding examples, pattern recognition, and generalization.
4. **Training ML Models:**
   - Guiding machines in building models to reduce computational load.
   - Different ML model families with varying learning capacities.
5. **Evaluating Machines:**
   - Testing model performance on unknown data (train and test data).
   - Practical application of various machine learning algorithms.
6. **AI/ML Process:**
   - Raw data collection, human labeling, pattern recognition, and model deployment.
   - Challenges in the data pipeline and explicit data labeling.
7. **Self-Supervised Learning:**
   - Language models predicting the next word without explicit human labeling.
   - Applications in language modeling, computer vision, and various other tasks.
8. **Reinforcement Learning:**
   - Exploration to learn optimal actions and policies.
   - Applications in AlphaGo, Tesla Autopilot, and Alphabet Waymo One.
9. **Deep Reinforcement Learning:**
   - Use of deep neural networks to approximate complex functions.
   - Techniques like Deep Q-Network and Policy Gradient Methods.
10. **Multi-Armed Bandit Problems:**
    - Choosing actions at each time step to maximize expected rewards.
    - Utilizing policies and state changes to optimize aggregate rewards (see the sketch after this outline).
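As a closing sketch tying the last outline item back to the earlier discussion: a policy can be represented as a probability distribution over actions (here a softmax over action preferences), and the quantity being maximized is the expected discounted aggregate reward. The preference values and reward sequence below are invented for illustration:

```python
import math
import random

def softmax_policy(preferences):
    """Turn raw action preferences into a probability distribution over actions."""
    exps = [math.exp(p) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(probs):
    """Draw an action according to the policy's probabilities (non-deterministic choice)."""
    return random.choices(range(len(probs)), weights=probs)[0]

def discounted_return(rewards, gamma=0.95):
    """Aggregate reward: later rewards count less, by a factor of gamma per step."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

# Hypothetical preferences for three actions in the current state.
probs = softmax_policy([2.0, 0.5, -1.0])
print("policy:", [round(p, 2) for p in probs], "sampled action:", sample_action(probs))
print("discounted return of [0, 0, 10]:", round(discounted_return([0, 0, 10]), 2))
```

Because the policy is a distribution rather than a single fixed choice, the agent keeps some chance of trying non-obvious actions, which is how it can give up a small short-term reward in pursuit of a larger long-term one.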