up:: [[DBA806 - Applied AI Innovation]]
tags:: #source/course #on/AI #on/data
people:: [[Praphul Chandra]]

# DBA806 M2 - A Practitioners Guide to AI Capabilities

#### Key Discussions in Lecture

[Session 2 - A Practitioners Guide to AI Capabilities | Dr. Praphul Chandra - YouTube](https://www.youtube.com/watch?v=1qDvlZMvP_Q&feature=youtu.be)

**Key Points:**
- The main focus is on clarifying terms related to [[Artificial Intelligence (AI)]], [[Machine Learning (ML)]], [[Deep Learning]], and [[Data Science]].
- The speaker emphasizes the importance of understanding these terms for effective communication within the class and the field.
- The objective is to provide frameworks for thinking about *AI capabilities*, specifically looking at data in today's session and a computational view in the next.
- The democratization of AI is mentioned, with the *statistic that 46% of code on GitHub is already using AI-powered tools*.
- Definitions are provided: AI as a problem statement aiming to replicate human intelligence, ML as a set of algorithms, and deep learning as a subset of ML inspired by the human brain.
- Data science is introduced as a slightly different approach, focusing on specialized skills and solving business problems using tools like [[Exploratory Data Analysis (EDA)]].
- AI can be done using methods other than machine learning (ML). In the early decades, *rule-based systems, involving if-then-else statements, were used for AI*.
- Currently, the dominant approach to AI is using machine learning and deep learning.
- Data science problems involve *visualization, exploratory data analysis, and machine learning*. Searching and optimization techniques are considered under data science.
- Search, especially web search like Google, is considered an application of AI under natural language processing and text processing.
- Optimization is often hidden under machine learning and deep learning, *involving minimizing a loss function using algorithms*.
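
The last key point, that optimization sits underneath ML as the minimization of a loss function, can be made concrete with a small sketch. This example is not from the lecture: it assumes a one-feature linear model, a mean-squared-error loss, and plain batch gradient descent. The names `mse_loss` and `fit_line`, the toy data, the learning rate, and the step count are all illustrative choices.

```python
# Illustrative sketch only: fit y ≈ w*x + b by gradient descent on the MSE loss.
# The data, learning rate, and step count are assumptions, not lecture material.

def mse_loss(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, lr=0.05, steps=2000):
    """Minimize the MSE loss with plain (batch) gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1

w, b = fit_line(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, loss={mse_loss(w, b, xs, ys):.6f}")
```

The "learning" here is just the loop that repeatedly nudges `w` and `b` in the direction that lowers the loss. Deep learning frameworks follow the same idea with far more parameters and with gradients computed automatically.
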

During a Q&A session, a participant asks for reading recommendations on [[Generative Adversarial Network (GAN)]]. The speaker directs them to check the bottom right of each slide for reading recommendations. The speaker recommends two books for the course: "[[The Age of AI]]" and "[[The Coming Wave]]" by Mustafa Suleyman, the founder of an AI company. The course, titled "Innovating with AI," aims to provide a broad overview of AI concepts and applications.

#### Summary of Slide Deck

Dr. Praphul Chandra explores the scope of AI, its applications, and the linguistic framework used in the field. The document touches upon the impact of AI on various aspects of daily life, from *determining flight prices to influencing job hiring decisions*. It also delves into the classification of data types, [[Structured Data]] and [[Semi-Structured Data]], and their applications in real-world scenarios. The practitioner's guide outlines the significance of AI in diverse domains, including drug discovery, autonomous vehicles, and coding assistance. The text concludes with a detailed examination of data types, real-world deployments, and challenges in [[Natural Language Processing (NLP)]] and computer vision.

**AI Achievements:**
- DeepMind's AlphaGo victory in 2016.
- [[Tesla Autopilot (2015)]] and Alphabet [[Waymo One (2018)]]'s AI-driven autonomous driving.
- [[MIT Halicin (2019)]] discovering antibiotics.
- [[GitHub Copilot (2021)]] revolutionizing AI-powered coding.

**AI Applications in Daily Life:**
* [[AI Pricing]], when you book a flight, the price you pay is determined by an AI algorithm.
* [[AI Security]], at the airport, AI systems observe and monitor what you do.
* AI Co-Pilot (Flight), on the plane, AI systems assist pilots in flying planes.
* [[AI Loan]], when you apply for a loan, AI systems decide whether or not you get a loan.
* [[AI Hiring]], when you apply for a job, AI systems influence whether you get hired or not.
* [[AI Editor]], when you scroll social media, AI algorithms determine which content you see next.
* AI Cars, [[Self-Driving Cars]] are powered by AI.
* AI Weapons, [[Autonomous Weapons]] are powered by AI.
* AI Jailer, AI algorithms are being used to decide who gets parole.

**Data Types and Sources:**
- [[Structured Data]]: Relational databases, spreadsheets, CRM, etc.
    - MySQL, Postgres, Redis, Memcached
    - Spreadsheets, CSV
    - Enterprise Data Warehouses
    - ERP e.g. SAP, Oracle, Zoho, Tally
    - CRM e.g. Salesforce, Zendesk, Adobe
    - HR mgmt. e.g. Workday, Ceridian, Oracle
- [[Semi-Structured Data]]: Document databases, graphs, social media platforms.
    - MongoDB, CouchDB, Neo4j, HBase
    - Data exchange between server and client in apps, web services
    - Configuration files in software applications
    - Application log files w/ info about events, errors, user interactions
    - *Used for encoding geographical data e.g. GeoJSON*
    - IoT devices exchange data between sensors, devices, and servers.
- **Text**: Storage and usage in media, written content, and entertainment.
    - Webpages and PDFs
    - Books, Novels & Literature
    - Social Media Platforms e.g. Twitter, Facebook
    - Chat and Forums e.g. WhatsApp, Stack Overflow, Quora
    - Research Publications, arXiv
- **Images**: Storage and usage in media, movies, and entertainment.
    - User Generated Content from Social Media platforms
    - *Satellite imagery e.g. from NASA or commercial providers*
    - *Medical Imaging e.g. CT-Scans, MRIs, X-Rays*
    - Frames extracted from videos
- **Video**: Storage and usage in media, movies, and entertainment.
    - Video streaming platforms like YouTube
    - OTT platforms like Netflix, Amazon Prime
    - EdTech platforms e.g. Khan Academy
    - Broadcast media from TV, Cable, Satellite
    - Movies, Documentary and other entertainment
- **Audio**: Storage, streaming, and diverse applications.
    - Streaming platforms like Spotify, Wynk
    - Podcasts & Audiobooks
    - Webinars and Event livestreams
    - Field Recordings of nature sounds, animals, machinery, cities
    - Voice assistants like Amazon Alexa, Google Assistant

#### Business Use Cases

- **Marketing Data**, improve sales and marketing with…
    - [[Customer Segmentation]]
    - Predicting [[Customer Lifetime Value (CLV)]]
    - [[Churn Prediction]]
    - [[Personalized Recommendations]]
    - [[Dynamic Pricing]]
    - [[Lead Scoring]]
    - [[Customer Journey Analytics]]
    - [[Channel Attribution]] Modelling
- **Supply Chain Data**, improve supply chain efficiency with…
    - [[Demand Forecasting]]
    - [[Inventory Optimization]]
    - [[Route Optimization]]
    - [[Risk Modelling]] for disruptions & delays
    - [[Supplier Segmentation]]
    - Commodity [[Price Forecasting]]
    - Supply Chain Network Design
- **Manufacturing**, improve manufacturing operations efficiency with…
    - [[Predictive Maintenance]], [[Asset Management]]
    - [[Quality Control]] & Defect Detection
    - [[Energy Optimization]]
    - [[Demand Forecasting]]
    - Fault Detection & [[Root Cause Analysis (RCA)]]
    - Automated [[Production Planning]]
    - Process Optimization

#### Text Data, NLP, NLU

**Text Classification:**
* Email Spam Detection
* [[Sentiment Analysis]] of Social Media
* Product Reviews
* Fake News Detection
* Customer Support
* Ticket Routing
* [[Topic Categorization]] of News
* [[Language Identification]]
* Resume Screening
* Legal Case Prediction
* **Challenges:**
    * Lack of annotated data | Creating labelled datasets for learning is costly
    * Imbalanced datasets | Some labels are naturally rare, making learning hard
    * Language Ambiguity | Words have different meanings in different contexts
    * Concept Drift | Topics evolve, so does user behaviour
    * Domain Specificity | Models trained for one domain may not do well on others
    * Misc.… | Short texts, Relevance Subjectivity in evaluation, Long tail of queries, Multiple Labels, Fine grained labels, Multilingual content

**Text Summarization**
* News
* Legal Documents
* Scientific Papers
* Discussion threads on social media
* Webpages for search engine snippets
* Multiple product reviews
* Lecture transcripts, textbooks
* Healthcare records
* **Challenges:**
    * Extractive | Select the most important sentences and re-arrange
    * Abstractive | Generate new sentences: Harder but maybe more coherent
    * Semantics | Language nuances, ambiguity, context-dependent interpretations
    * Content Diversity | News vs. Legal vs. Health vs. Financial reports vs. …
    * Information Loss vs. Length | The shorter the summary, the more lossy it is.
    * Others | Evaluation Metrics, Document structure, Multimodal documents, Real time requirements, Personalized summaries

**Machine Translation**
* Website, Mobile App Localization
* E-Commerce Product Description
* Subtitle and Captions
* Legal Documents
* Medical Records
* Scientific Papers
* Real-time multilingual customer support
* Real-time international business communication
* **Challenges:**
    * Language Ambiguity | Words have different meanings in different contexts
    * Syntax, Grammar Variations | Different languages have different grammar
    * [[Word Order Variability]] | Noun before Verb or …
    * Lack of parallel data | Creating ‘same’ corpus in different languages is costly
    * Rare Language Pairs | Need training data for every unique pair
    * Others | Cultural nuances like idioms, Domain-specific terminology, Ambiguous pronouns, Gender sensitivity, Rare words, Evolving language

**Text Generation**
* Autocomplete
* Chatbots & Virtual Assistants
* Marketing Content & Personalized Ads
* Product Description
* [[Code Autocomplete]]
* Generating Code Comments
* Data Augmentation for ML
* Storyline and Dialogue Generation for Games
* **Challenges:**
    * [[Language Ambiguity]] | Words have different meanings in different contexts
    * Coherence | Maintaining coherence and context over long passages is hard
    * Realistic Dialogue | Conversational dynamics include context, tone, sentiment
    * Concept Drift | Docs contain images, tables, multiple languages
    * Domain Specificity | Corpus changing may require significant retraining
    * Others | Corpus Scale, Evaluation (Precision, Recall, F1), Relevance Subjectivity, Long tail of queries with Multiple Labels, Fine grained labels

**Information Retrieval**
* Web Search
* Enterprise Search
* E-Commerce Product Search
* Social Media Search
* Job Search
* Geo-Spatial Information Retrieval
* Customer Support Knowledge Base
* Legal document search for precedents
* **Challenges:**
    * Semantic Gap | Gap between what users want from queries and what documents contain
    * Language Ambiguity | Apple? Bank? Jaguar?
    * Context & User Intent | “Food” – Recipe? Restaurant? Crop? Prey?
    * Multimodal, Multilingual | Docs contain images, tables, multiple languages
    * Dynamic Database | Corpus changing in real-time e.g. Social Media
    * Others | Corpus Scale, Evaluation (Precision, Recall, F1), Relevance Subjectivity, Long tail of queries with Multiple Labels, Fine grained labels

#### Image, Video Data

[Everything You Need To Know About Computer Vision 2023 | DDD Blog](https://www.digitaldividedata.com/blog/everything-about-computer-vision)

**Object Detection, Recognition**
* [[Self-Driving Cars|Autonomous Vehicles]] | Detecting passengers, vehicles, obstacles
* Traffic Management | Detecting congestion, Intelligent traffic signals
* Video Surveillance | Monitoring public spaces, crowd control
* Retail Analytics | Counting customers, Tracking in-store movement
* Disaster Response | Identifying survivors, damages for search & rescue
* Environment Monitoring | Monitoring wildlife, deforestation, pollution sources
* Content Retrieval | condensed representation makes it easier to index, retrieve

**Anomaly Detection**
* [Anomaly detection using edge computing in video surveillance system: review | International Journal of Multimedia Information Retrieval](https://link.springer.com/article/10.1007/s13735-022-00227-8#Fig1)
* A truck moving on the footpath
* Pedestrian walking on the lawn
* A person throwing an object
* A person carrying a suspicious bag
* Incorrect parking of vehicle
* People fighting
* A person catching a bag
* Vehicles moving on the footpath

**Face Recognition**
* Access Control and Security
* Unlocking Smartphones and Devices
* Law Enforcement and Surveillance
* Identity Verification in Financial Transactions
* Attendance Tracking
* Employee Time and Attendance Management
* Airport Security and Border Control
* Social Media Tagging

**Video Summarization**
* [Introducing AI Video Summarization from Cloudinary Labs](https://cloudinary.com/blog/introducing-the-video-summarization-ui-from-cloudinary-labs)
* Surveillance & Security | Quickly review events, identify anomalies
* Shorts | Create short, informative clips from news stories or media content
* Highlights | Key moments from sports broadcasts, live events, gaming streams
* Editing, Film making | Generating short previews or summaries of raw footage
* Consumer Video | Provide users with condensed previews of longer videos
* Emergency Response | Quick analysis of surveillance footage during crises

#### Audio Data

**Speech Recognition**
* [Speech AI Concepts You Should Know | NVIDIA Technical Blog](https://developer.nvidia.com/blog/a-guide-to-understanding-essential-speech-ai-terms/)
* Voice Assistants | Understand and respond to user commands e.g. Alexa, Siri
* Speech-to-Text Transcription | Note-taking (medical), generating subtitles, captions
* IVR | understand & respond to user inputs in customer service, bill payments, etc.
* Voice Search | For search engines, mobile devices, and smart home devices
* Accessibility | For those with disabilities and/or hands-free features in cars etc.
* Language Translation | Pre-requisite for translating spoken words to another language

**Speaker Recognition**
* [Speaker Diarization — NVIDIA NeMo](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_diarization/intro.html)
* Phone Banking | Enhance security of phone-based banking by verifying user identity
* Voice Assistant | Control access to personalized information by verifying user identity
* Law Enforcement | Aid forensics in criminal investigations by identifying the speaker
* Home Security | Allows residents to control home access based on their voiceprints
* UX Customization | [[Personalized UX]] by identifying individuals e.g. in Smart TV
* Vehicle Control | Hands-free feature authenticates the speaker before taking action

**Sound Analysis**
* [Frontiers | From Soundwave to Soundscape: A Guide to Acoustic Research in Captive Animal Environments](https://www.frontiersin.org/articles/10.3389/fvets.2022.889117/full)
* Industrial Events | Detecting machine malfunction, equipment failure
* Healthcare Monitoring | Remote patient monitoring e.g. irregularities in heartbeat
* Home Security | Detect sounds like glass breaking, alarms etc.
* Vehicle Health Monitoring | Detect engine problems, abnormal vibrations etc.
* Disaster Response | Identifying survivors, damages for search & rescue
* Wildlife Monitoring | Track animal behaviour, identify species, [[Biodiversity]]
* Urban Sound Monitoring | Assessing [[Noise Pollution]], environmental impact
* Sonar Analytics | Underwater navigation, marine biology research, and detection of underwater objects