class: title-slide, center, middle, remark-slide-content, inverse, title-slide, hljs-github

# Get to Know Your Algorithms

### Jared P. Lander

### Chief Data Scientist

<img src="data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/Lander_logo.png" width="40%" />

???

- Tour of different machine learning algorithms
- Pros/cons
- When to use them
- Use cases from personal experience at my company
- Not too technical
- A little math for reference
- Classically trained statistician...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/ProfessorFink.gif")

???

- Lifelike image of me in grad school
- Statisticians helped win WWII
- Not considered cool
- Barely better than an actuary
- Ten years ago rebranded as a data scientist...
- Rock stars...

<!-- ```{r show-hide-slides,echo=FALSE,include=FALSE} -->
<!-- hide_slides <- !params$show_all -->
<!-- ``` -->
<style type="text/css">
.largest { font-size: 200% }
.large { font-size: 130% }
.small { font-size: 70% }
.smallest { font-size: 50% }
.smallCode .remark-code { /* font-size: 50%; */ font-size: 12px; }
.remark-slide-number { font-size: 10pt; margin-bottom: -11.6px; margin-right: 10px; color: #FFFFFF; /* white */ opacity: 0; /* default: 0.5 */ }
.center2 { margin: 0; position: absolute; top: 50%; left: 50%; -ms-transform: translate(-50%, -50%); transform: translate(-50%, -50%); }
.indent1 { display:block; text-indent: 5%; }
.indent2 { display:block; text-indent: 10%; }
.indent3 { display:block; text-indent: 15%; }
</style>
<style type="text/css">
.section{ vertical-align:middle; /* display: block; */ text-align: center; background-color: #272822; color: #d6d6d6; text-shadow: 0 0 20px #333; }
.section h3, .remark-slide .section h3 { font-family: 'Roboto Condensed', 'Avenir Next', 'Helvetica Neue', 'Helvetica', sans-serif; font-weight: 400; padding-top: 15px; font-size: 28px; }
.section h3:nth-of-type(2), .remark-slide .section h3:nth-of-type(2) { margin-top: 10px; font-size: 28px; }
.section table, .remark-slide .section table { padding-top: 35px; }
.section i.fa, .remark-slide .section i.fa { font-size: 2em; }
.section h1, .inverse h2, .inverse h3 { color: #f3f3f3; }
</style>
<style type="text/css">
.remark-slide-content:after { content: "www.landeranalytics.com"; position: absolute; bottom: 5px; right: 115px; height: 40px; width: 120px; background-repeat: no-repeat; background-size: contain; }
</style>

---
class: middle, center
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/SpringsteenGuitar.jpg")
background-size: contain

???

- Data scientists: rock stars
- Sexiest job of 21st century: Harvard Business Review
- That wasn't enough...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/daft-punk.jpg")
background-size: contain

???
- Now we do AI

---
class: center, middle

# Let's Talk About AI

???

- You hear all about it
- Let's dive in
- Define artificial intelligence
- I like...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/fundraising-ai-tweet.png")

???

- Baron Schwartz
- https://twitter.com/xaprb/status/930674776317849600
- linguistic implications
- deflate the hype

---
class: center, middle

# Can We Do Better?

???

- Let's try something else

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/VennDiagram_AI.gif")

???

- From [Deep Learning](http://amzn.to/2t6wT2a): Goodfellow, Bengio & Courville
- nested Venn diagram
- What's hard for computers but easy for humans
- ML is subset of AI
- Usually when people talk about AI...

---
class: middle, center

# So We're Really Talking About Machine Learning

???

- Most of the time AI means machine learning
- Let's define machine learning
- Also called ML

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/stats-ml-10-year-challenge.png")
background-size: contain

???

- this tweet speaks to me
- we've been doing ML or stats for hundreds of years
- just do it faster and bigger now
- https://twitter.com/andrewjdyck/status/1086084972937539584
- Call it ML: paid more

---
class: center, middle

# Types of Machine Learning

???

- break it down
- Into three broad realms...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/ml-types.png")
background-size: contain

???
- This image all over internet
- Reinforcement Learning: Semi-supervised
- learning tasks
- AlphaGo
- Pac-Man
- Computer beat human in games
- Unsupervised Learning: mainly clustering
- Supervised Learning: predict or explain
- regression
- classification
- focus for today

---
class: middle, center

# Supervised Learning

???

- Most common
- Really powerful

---
class: section

# Main Modes of Supervised Learning

???

- Main outcomes for supervised learning
- Determined by outcome variable
- What you are trying to model

---
class: middle, center

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/tip-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Regression
- Outcome: Number
- Income
- Altitude
- Intensity
- Average output conditional on input
- Next most common...

---
class: middle, center

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/binomial-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Binary classification
- win/lose
- live/die
- true/false
- Click/don't click
- What class something falls in
- Extends naturally to...

---
class: middle, center

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/multinomial-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Multiclass Classification
- Trade: fraud/fat finger/bad judgment
- Vessel: trawler/speed boat/fishing boat/skimmer
- Vehicle: Pickup Truck/SUV/Sedan/Convertible/Van
- Regression and classification by far the most common...

---
class: middle, center

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/poisson-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Count Data
- Number of accidents
- Number of lawsuits
- Number of children
- Number of attacks
- Often overlooked
- Doesn't get the glory
- Likewise...
---
class: middle, center

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/cox-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Survival Models
- Health over time
- Probability of dying
- Equipment failure
- Similar
- Proportional hazards
- Lastly...

---
class: center, middle

| Country | Rank | GDP | Population |
|---|---|---|---|
| United States | 1 | $20.49T | 328.20M |
| China | 2 | $13.40T | 1.40B |
| Japan | 3 | $4.97T | 126.30M |
| Germany | 4 | $2.83T | 83.02M |
| United Kingdom | 5 | $2.78T | 66.65M |

???

- Ranking
- Most Wanted
- Greatest Threat
- Chess players
- Best selling products
- Importance
- Just a subset of outcomes
- Separate but related...

---
class: section

# Main Machine Learning Algorithms

???

- Each mode can be solved by multiple algorithms
- Each algorithm can solve multiple modes
- Today focus mainly on
- Regression
- Classification
- Hazards
- Many drawn from personal experience of Lander Analytics
- Just some of the algorithms available are...

---
class: middle

- Linear Regression / Logistic Regression
- Cox Proportional Hazards
- Poisson Regression
- Non-linear Regression
- Decision Trees
- Random Forests
- Boosted Trees
- Support Vector Machines
- Multivariate Adaptive Regression Splines
- Cubist Models
- Multilevel Models
- Generalized Additive Models
- Linear Discriminant Analysis
- Bayesian Additive Regression Trees
- K-Nearest Neighbors
- Neural Networks

???

- Many, many different ways
- Each can solve multiple modes
- Linear/logistic Regression
- Decision Tree
- Random Forest
- Boosted Tree
- Multilevel Models
- Generalized Additive Models
- More than this
- Start at the top...

---
class: section

# Generalized Linear Models

???

- GLMs
- Regression
- Classification
- Count Data

---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/tip-plot-1.png" width="50%" style="display: block; margin: auto;" />

???

- Model an outcome with linear math
- Here: regression
- Also: classification, count data
- Straight line
- Doesn't have to be straight, not what linear means
- Find slope and intercept of line

---
class: center, middle

$$ y \sim \beta_0 + \beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p $$

???

- Can solve with pencil and paper
- Beta is the effect, or weight, of an input
- Goal: Solve for betas
- Generalize to...

--

$$ \log\left(\frac{p}{1-p}\right) \sim \beta_0 + \beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p $$

???
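A minimal Python sketch of the two formulas above: the same linear combination of inputs feeds both models, and only the left-hand side changes. The weights and inputs are made-up values for illustration, not estimates from any data in this talk.

```python
import math

def linear_predictor(betas, xs):
    """Right-hand side shared by both formulas: beta_0 + beta_1*x_1 + ... + beta_p*x_p."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))

def inverse_logit(eta):
    """Turn log odds back into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

betas = [-1.0, 0.5, 2.0]  # hypothetical intercept and two weights
xs = [2.0, 0.25]          # hypothetical inputs

eta = linear_predictor(betas, xs)  # regression: this is the predicted y
p = inverse_logit(eta)             # classification: eta is log(p / (1 - p))
print(eta, p)
```

Solving for the betas is the job of the fitting routine; this sketch only shows how a fitted model turns inputs into a prediction.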
- Two class outcome
- Same linear combination
- Logit: log odds
- Extends to multiclass

--

$$ \log(\lambda) \sim \beta_0 + \beta_1x_1 + \beta_2x_2 + \cdots + \beta_px_p $$

???

- Count data
- Same general idea for all
- Left hand is different: Generalized
- Right hand is the same: Linear

---
class: center, middle

## Why Use a Generalized Linear Model

???

- When is this useful

---
class: middle

- Baseline
- Understandable
- Fast
- Simple

???

- Baseline: Always start here
- Understandable
- Fast
- Simple

---
class: center, middle

## Drawbacks of a Generalized Linear Model

???

- What's wrong?

---
class: middle

- Simplifying assumptions are not always true
- Hard to capture complex relationships between inputs and output

???

- Simplifying assumptions are not always true
- Hard to capture complex relationships between inputs and output
- Failings aside: very useful
- Often the go-to model

---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/network-viz-1.png" width="65%" style="display: block; margin: auto;" />

???

- Network analysis
- Typically think of clustering
- Model flows of information
- Based on network metrics
- Like centrality, betweenness
- Find key people in terrorist networks
- Find key people in IP theft rings
- Simple regression ---> complex uses
- Other times, not complex enough

---
class: section

# Decision Trees

???

- Captures non-linearity
- Common in real life

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/DecisionTreeImagination3.jpg")
background-size: contain

???

- Ask a series of questions
- Elevation: below ---> price per sq ft ---> year built
- 74% chance of good
- Creating buckets
- Data in bucket gives prediction

---
class: center, middle

$$ \hat{f}(x) = \sum_{m=1}^M \text{avg}(y_i | x_i \in R_m)\mathbb{I}(x \in R_m) $$

???
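The regression-tree formula above amounts to a lookup: find the bucket $R_m$ that contains $x$ and return the average outcome in that bucket. A small Python sketch, assuming a single input, a hand-picked cut point (a real tree algorithm searches for the cut points), and made-up data:

```python
from bisect import bisect_right
from statistics import mean

def bucket_of(x, cut_points):
    """Index m of the region R_m that x falls in, given sorted cut points."""
    return bisect_right(cut_points, x)

def fit_buckets(xs, ys, cut_points):
    """avg(y_i | x_i in R_m) for every bucket that contains training data."""
    buckets = {}
    for x, y in zip(xs, ys):
        buckets.setdefault(bucket_of(x, cut_points), []).append(y)
    return {m: mean(values) for m, values in buckets.items()}

def predict(bucket_means, x, cut_points):
    """The indicator I(x in R_m) picks exactly one bucket average.
    Raises KeyError if that bucket had no training data."""
    return bucket_means[bucket_of(x, cut_points)]

# Hypothetical data: elevation as the input, price as the outcome.
elevation = [5, 12, 28, 40, 55]
price = [100, 120, 110, 300, 320]
cut_points = [30]  # assumed split: below 30 vs. 30 and above

bucket_means = fit_buckets(elevation, price, cut_points)
print(predict(bucket_means, 20, cut_points))  # average of the low-elevation bucket
print(predict(bucket_means, 45, cut_points))  # average of the high-elevation bucket
```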
- Create buckets
- Take average in each bucket
- Regression

--

$$ \hat{p}\_{mk} = \frac{1}{N\_m} \sum_{x_i \in R_m} \mathbb{I}(y_i=k) $$

???

- Classification
- Percent of each class in each bucket
- Math is easy
- Algorithm has to compute buckets
- Which questions to ask

---
class: center, middle

## Why Use a Decision Tree

???

- When is this useful

---
class: middle

- Better predictions than linear models
- Somewhat understandable
- Fast

???

- Better predictions than linear models
- Somewhat understandable
- Fast

---
class: center, middle

## Drawbacks of a Decision Tree

???

- What's wrong?

---
class: middle

- Highly volatile
- Too many buckets can be hard to understand

???

- Highly volatile
- Too many buckets can be hard to understand

---
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/classroom-drob.jpg")
background-size: cover

???

- Predict someone finishing school
- Segment user base
- Who needs extra attention: both top and bottom
- Fix volatility...
- One of the first attempts

---
class: section

# Random Forests

???

- Why have just one tree?
- Have a forest
- Solves volatility problem

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/RandomForestImagination.png")
background-size: contain

???

- Grow hundreds of trees
- Each built randomly
- Rows
- Columns
- Average results

---
class: middle, center

$$ \hat{f} = \frac{1}{B} \sum_{b=1}^B \hat{f}^{*b}(x) $$

???

- Make prediction from each tree
- Average the predictions
- Bagging: Bootstrap aggregating
- Regression, classification, others

---
class: center, middle

## Why Use a Random Forest

???

- Why is this useful

---
class: middle

- More powerful predictions than linear models
- More consistent predictions than a decision tree
- Fast
- Easy to tune

???
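Why a forest is more consistent than one tree: each tree sees a different bootstrap sample of the rows, and the forest averages their predictions. The Python sketch below is a simplification under stated assumptions: one input, one-split trees at a fixed threshold (real forests search for splits and also sample columns), and hypothetical data.

```python
import random
from statistics import mean

def fit_stump(xs, ys, threshold):
    """One-split tree: the average outcome on each side of the threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    overall = mean(ys)
    left_avg = mean(left) if left else overall     # fall back if a side is empty
    right_avg = mean(right) if right else overall
    return lambda x: left_avg if x <= threshold else right_avg

def fit_forest(xs, ys, n_trees, threshold, seed=0):
    """Bagging: fit each tree on rows drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of row indices
        trees.append(fit_stump([xs[i] for i in rows], [ys[i] for i in rows], threshold))
    return trees

def forest_predict(trees, x):
    """Average the B individual tree predictions."""
    return mean(tree(x) for tree in trees)

xs = [1, 2, 3, 4, 5, 6]        # hypothetical input
ys = [10, 10, 10, 20, 20, 20]  # hypothetical outcome
trees = fit_forest(xs, ys, n_trees=25, threshold=3)
print(forest_predict(trees, 2))
```

Because each tree is fit independently of the others, the loop in `fit_forest` is the part that can run in parallel.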
- More powerful predictions than linear models
- More consistent predictions than a decision tree
- Fast: grow trees in parallel
- Easy to tune: will become important
- ML is just brute forcing through tuning parameters

---
class: center, middle

## Drawbacks of a Random Forest

???

- Biggest problem is...

---
class: middle

- Very hard to understand

???

- Very hard to understand
- We know how they work
- But not the reasoning behind choices
- Can be used for many things...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/aluminum-block.jpg")
background-size: cover

???

- Predictive maintenance
- Survival forests: instead of proportional hazards
- Perform maintenance before failure
- Engines
- Pumps
- Turbines
- Propeller shafts
- Anything that spins
- Service before failure
- Eliminates downtime
- Jeeps
- Aircraft
- Another fix for volatility...

---
class: section

# Boosted Trees

???

- Trees that learn from each other
- Darling of Kaggle competitions

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/Boosted-Tree-Cycle.png")
background-size: contain

???

- Sequence of small, stumpy trees
- Can boost anything
- Each time improve over previous fit
- Add them all up

---
class: middle, center

$$ \hat{y}\_i^{(t)} = \sum_{k=1}^t f_k(x_i) = \hat{y}_i^{(t-1)} + f_t(x_i) $$

???

- looks simple
- Estimate a series of functions
- Each is a tree
- Learned from previous trees
- But much more involved
- Fit a tree, learn from mistakes

---
class: center, middle

## Why Use a Boosted Tree

???

- Reason they're so popular

---
class: middle

- Better predictions than most other algorithms
- Fast

???

- Better predictions than most other algorithms
- Fast
- Even though it computes quickly...

---
class: center, middle

## Drawbacks of a Boosted Tree

???

- Can run into issues

---
class: middle

- Many tuning parameters
- Very hard to understand

???
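One reason a boosted model is hard to read: the final prediction is a sum over every tree in the sequence, following the update $\hat{y}^{(t)} = \hat{y}^{(t-1)} + f_t(x)$ from the formula slide. A minimal Python sketch, assuming squared-error loss, one input, one-split trees, and a `learning_rate` shrinkage factor that is not part of the slide formula (set it to 1.0 to match the formula exactly); the data are made up.

```python
from statistics import mean

def fit_stump(xs, targets):
    """One-split tree (returned as a function) that best fits targets by squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # assumes at least two distinct x values
        left = mean(t for x, t in zip(xs, targets) if x <= threshold)
        right = mean(t for x, t in zip(xs, targets) if x > threshold)
        error = sum((t - (left if x <= threshold else right)) ** 2
                    for x, t in zip(xs, targets))
        if best is None or error < best[0]:
            best = (error, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def boost(xs, ys, n_rounds, learning_rate=0.5):
    """Fit trees in sequence, each one to the mistakes (residuals) of the fit so far."""
    trees = []
    fitted = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - f for y, f in zip(ys, fitted)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        fitted = [f + learning_rate * tree(x) for f, x in zip(fitted, xs)]
    return trees

def predict(trees, x, learning_rate=0.5):
    """Add up the (shrunken) contribution of every tree."""
    return sum(learning_rate * tree(x) for tree in trees)

def squared_error(trees, xs, ys, learning_rate=0.5):
    return sum((y - predict(trees, x, learning_rate)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4]            # hypothetical input
ys = [1.0, 4.0, 9.0, 16.0]   # hypothetical outcome
for rounds in (1, 5, 50):
    print(rounds, squared_error(boost(xs, ys, rounds), xs, ys))
```

The printed training error shrinks as rounds are added, which is the "learn from mistakes" step in action; the number of rounds and the learning rate are two of the tuning parameters.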
- Not necessarily faster
- Many tuning parameters
- 7 for xgboost: most popular
- Very hard to understand: almost impossible
- Hundreds of trees
- Each: Multiple questions
- Very powerful predictions...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/StockExchange.jpg")
background-size: contain

???

- Financial crimes
- Detect fraudulent trades
- Money laundering
- Good at predicting things that are hard to find
- Unbalanced data: few fraudulent samples
- Just saw three tree-based models
- Back to linearity

---
class: section

# Generalized Additive Models

???

- GAM
- Linear Model with wiggliness...

---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/tips-gam-plot-1.png" width="504" style="display: block; margin: auto;" />

???

- Earlier: these points with straight line
- Now curvy line
- Actually wiggly: technical term
- A mixture of penalized basis functions

---
class: middle, center

$$ y \sim \beta_0 + \beta_1f_1(x_1) + \beta_2f_2(x_2) + \cdots + \beta_pf_p(x_p) $$

???

- Outcome y
- Regression or classification
- Fitting y onto linear combination of smoothing curves of the x's
- Multiplied by weights
- Splines
- A lot of computation goes into this

---
class: center, middle

## Why Use a Generalized Additive Model

???

- When is this useful

---
class: middle

- Better predictions than linear models
- Allows for more complex relationships
- More understandable than random forests and boosted trees

???

- Better predictions than linear models
- Allows for more complex relationships
- More understandable than random forests and boosted trees
- Provides confidence intervals

---
class: center, middle

## Drawbacks of a Generalized Additive Model

???

- What's wrong?

---
class: middle

- Still searching

???

- Still searching
- These are magical
- Can be a little slower
- Apply to so many domains
- One fascinating use...
---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/nyc-map-plot-1.png" width="65%" style="display: block; margin: auto;" />

???

- Spatio-temporal data
- GIS
- Maps
- Vessel movement
- Tracking fisheries
- Vehicle movement
- Optical sensors
- Where looking
- Related to...

---
class: section

# Multilevel Models

???

- Linear Model
- Effects vary by group

---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/tips-multilevel-plot-1.png" width="504" style="display: block; margin: auto;" />

???

- Multiple regression lines
- Each: different intercept and slope
- Each group learns from itself
- And the other groups
- Take advantage of structure in the data

---
class: middle, center

$$ y\_i \sim \beta\_{0j[i]} + \beta\_{1j[i]}x\_{1i} + \beta\_{2j[i]}x\_{2i} + \cdots + \beta\_{pj[i]}x\_{pi} $$

???

- Bayesian: hot topic
- Effect sizes vary for each group

---
class: center, middle

## Why Use a Multilevel Model

???

- When is this useful

---
class: middle

- Better predictions than linear models
- Allows for more complex relationships
- Can understand group dynamics
- Interpretable

???

- Better predictions than linear models
- Allows for more complex relationships
- Can understand group dynamics
- Interpretable
- Confidence intervals

---
class: center, middle

## Drawbacks of a Multilevel Model

???

- What's wrong?

---
class: middle

- Can be slow to fit
- Intimidating to people getting started

???

- Can be slow to fit: MCMC
- Intimidating to people getting started
- Intimidation worth it...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/cargo-container.jpg")
background-size: cover

???
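How a group can learn from itself and from the other groups, as the multilevel slides describe: its estimate is a weighted blend of its own average and the overall average, and small groups are pulled harder toward the overall trend. A minimal Python sketch for group means only, assuming the within-group variance `sigma2` and between-group variance `tau2` are known (a real multilevel model estimates them, and estimates the overall mean jointly rather than taking the plain average used here); the container weights are made up.

```python
from statistics import mean

def partial_pool(groups, sigma2=1.0, tau2=1.0):
    """Shrink each group's average toward the overall average.

    sigma2: assumed within-group variance; tau2: assumed between-group variance.
    """
    overall = mean(y for ys in groups.values() for y in ys)
    pooled = {}
    for name, ys in groups.items():
        n = len(ys)
        # More data in the group means more weight on the group's own average.
        weight = (n / sigma2) / (n / sigma2 + 1 / tau2)
        pooled[name] = weight * mean(ys) + (1 - weight) * overall
    return pooled

# Hypothetical container weights: one sparsely sampled type, one well sampled type.
weights_by_type = {"small": [20.0], "large": [10.0] * 9}
pooled = partial_pool(weights_by_type)
print(pooled)  # "small" is pulled from 20.0 toward the overall average of 11.0
```

A container whose observed weight sits far from its pooled group estimate is a candidate for inspection.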
- Anomaly detection
- Compare typical weight of different containers
- Different containers are different
- Learns overall trend
- And group trend
- Know when something is off ---> investigate
- Great with sampling: Can't look at every container
- A completely different class of model...

---
class: section

# Time Series

???

- Time-dependent data

---
class: center, middle

<img src="data:image/png;base64,#GetToKnowYourAlgorithms_files/figure-html/time-series-plot-1.png" width="504" style="display: block; margin: auto;" />

???

- Forecast into future
- Based on previous time points
- Usually univariate
- Can be multivariate

---
class: middle, center

$$ (1 - \phi_1B^1 - \ldots - \phi_pB^p)(1-B)^d y_t = (1 + \theta_1B^1 + \ldots + \theta_qB^q)\epsilon_t $$

???

- Look to previous time steps for the future
- Worst math we'll see
- Backshift
- Superscripts
- ARIMA: Autoregressive Integrated Moving Average
- Other Types: ETS, GAM, etc.

---
class: center, middle

## Why Use a Time Series Model

???

- When is this useful

---
class: middle

- Time needs to be accounted for

???

- Time needs to be accounted for
- Time is everywhere

---
class: center, middle

## Drawbacks of a Time Series Model

???

- What's wrong?

---
class: middle

- Only meant for time-dependent data
- Not straightforward to interpret

???

- Only meant for time-dependent data
- Not straightforward to interpret: no choice
- Time is everywhere...

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/electric-towers.jpg")
background-size: cover

???

- Predict energy usage
- Gas consumption
- Other resources: water
- Electricity today vs yesterday
- Five minute intervals
- Big manufacturers buy electricity in advance
- Drug sales
- User acquisition
- Commodity pricing
- Transit ridership
- Many methods for one problem
- Larger problem
- Threaded through many examples
- Happy to discuss offline
- Or connect with right people
- You've heard a lot about...
---
class: section

# Deep Learning

???

- Neural network
- Loosely inspired by the brain

---
class: center, middle
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/NeuralNetImagination.jpeg")
background-size: contain

???

- Input data
- Before relating to output
- Go through hidden states
- All the rage
- So much hype
- Extreme non-linear modeling

---
class: middle, center

$$ h_1 = f_1(XW_1 + b_1) $$

$$ h_2 = f_2(h_1W_2 + b_2) $$

$$ h_3 = f_3(h_2W_3 + b_3) $$

$$ \vdots $$

$$ y = f_o(h_pW_o + b_o) $$

???

- Multiply inputs by weights: affine transformation (linear)
- Call non-linear activation function
- Repeat
- Solving for weights: hard
- A lot of work goes into this
- Everyone wants to use deep learning these days...

---
class: center, middle

## Why Use Deep Learning

???

- That's because

---
class: middle

- Extremely good predictions
- Particularly for vision and language

???

- Extremely good predictions
- Especially for vision and language
- And a lot of hype
- Strong predictions...

---
class: center, middle

## Drawbacks of Deep Learning

???

- The problem is...

---
class: middle

- Complete black box
- Slow to fit
- Slow to tune

???

- Complete black box
- Slow to fit: GPU ---> electricity
- Slow to tune: so many tuning variables
- Number of layers
- Size of layers
- Batch size
- Epochs
- Learning rate
- Optimizer
- Dropout percent
- Batch normalization
- Regularization
- Makes xgboost look easy
- Pretty amazing...

---
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/yolo_basketball.gif")
background-size: contain

???

- Object recognition
- YOLOv5: you only look once
- Like the movies
- But real
- Detect objects in photos
- Real time video
- Threat detection
- Assist search teams
- Remote sensing
- All open source...

---
class: middle, center

# Open Source

???
- Everything we've seen
- Free as in speech and beer
- Audit the code
- Innovation
- Have to foster the community
- Brings value: both intrinsic and economic
- DOD: Insisted we open source the work
- Benefit from the community ---> contribute back
- Learn more: nyhackr, R Conference - NY, R Gov: rstats.ai

---
class: section

# Where is this being done?

???

- Who does ML

---
class: middle, center

# Some of Our First-Hand Knowledge

???

- Been sharing first-hand knowledge
- Who we helped

---
class: middle, center
background-image: url("data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/client-logos.png")
background-size: contain

???

- Just some customers
- we implemented
- CarFax: car prices
- Macmillan: Book sales
- Vikings: draft picks
- Department of Defense: Many projects
- Particularly: location data
- Developed some really great methods
- They want you to know about it
- Happy to discuss
- Many companies and organizations

---
class: middle, center

# Other Industries

???

- Other industries
- Can't name companies

---
class: middle

.pull-left[
- Banking
- Insurance
- Retail
- Pharmaceutical
- Sports
- Law
- Manufacturing
- Genetics
- Tech
- Public Safety
- Telecom
]

.pull-right[
- Disaster Response
- Investments
- Publishing
- Food
- Mining
- Construction
- Marketing
- Human Resources
- Defense
- Politics
- Healthcare
]

???

- Industries we touched
- Banking
- Defense
- Legal
- Political
- Manufacturing
- Pharmaceuticals
- Finance
- Insurance
- Small company and we've done this
- Imagine what's out there

---
class: section

# Wrapping Up

???

- To draw down
- On everything we've seen

---
class: middle, center

# ML Can Model Many Types of Data

???

- Numeric
- Classification
- Count data
- Survival rates
- Ranking
- Outcome type indicates the mode
- Then choose model...

---
class: middle, center

# There are Many Model Types

???
- Generalized Linear Models
- Tree-based models
- Bayesian methods
- Deep Learning
- To name a few
- Can solve multiple outcome types

---
class: middle, center

# Countless Use Cases

???

- Object recognition
- Anomaly detection
- Failure prevention
- Remote sensing

---
class: middle, center

# In Just About Any Industry

???

- Government
- Private sector
- Academic
- Non-profit
- The key...

---
class: middle, center

# Know When to Apply Which Model

???

- Knowing the right method for the given problem
- Know when the gains are worth the investment
- Sometimes simpler is better
- And faster
- Hopefully better sense

---
class: middle, center

# Thank You

<img src="data:image/png;base64,#C:/Users/jared/Documents/Consulting/talks/images/Lander_logo.png" width="40%" style="display: block; margin: auto;" />

info@landeranalytics.com

???

- Thank you
- Be in touch
- We're available to help
- Love to hear from you