Introduction to machine learning

Tanujit Chakraborty, Indian Statistical Institute, Kolkata. Email: tanujitisi@gmail.com. July 10, 2019. Talk by Tanujit Chakraborty, Workshop on Data analytics.

Statistics

• “Statistics is the universal tool of inductive inference, research in natural and social sciences, and technological applications. Statistics, therefore, must always have purpose, either in the pursuit of knowledge or in the promotion of human welfare.”
  - P. C. Mahalanobis, Father of Statistics in India

• Role of Statistics:
  ◦ making inference from samples
  ◦ development of new methods for complex data sets
• Remember: “Figures won't lie, but liars figure”

Machine Learning

• “Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed”
  - Arthur L. Samuel, AI pioneer
• Role of Machine Learning: efficient algorithms to
  ◦ solve an optimization problem
  ◦ represent and evaluate the model for inference
• Remember: “Prediction is very difficult, especially if it’s about the future”
  - Niels Bohr, Father of Quantum

Introduction to Machine Learning

• Designing algorithms that ingest data and learn a model of the data
• The learned model can be used to
  ◦ Detect patterns/structures/themes/trends etc. in the data
  ◦ Make predictions about future data and make decisions
[Figure: whole data (blue) and training set (green)]
• Modern ML algorithms are heavily “data-driven”
• Optimize a performance criterion using example data or past experience

Talk by Tanujit Chakraborty Workshop on Data analytics

Taxonomy for Machine Learning

Machine learning provides systems the ability to automatically learn

Supervised learning: learning using labeled data
Unsupervised learning: learning using unlabeled data (usually considered harder)
Many other specialized flavors of ML also exist.
Reinforcement learning (RL) doesn't use “labeled” or “unlabeled” data in the traditional sense! In RL, an agent learns via its interactions with an environment (feedback-driven “policy” learning).

A Typical Supervised Learning Workflow (for Classification)

Supervised Learning: Predicting patterns in the data

A Typical Unsupervised Learning Workflow (for Clustering)

Unsupervised Learning: Discovering patterns in the data
Note: Unsupervised Learning too can have (and often has) a “test” phase. E.g., in this case, given a new cat/dog image, predict which of the two clusters it belongs to. Can do it by assigning the image to the cluster with the closer centroid.
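The test-phase rule for clustering can be sketched in a few lines. This is only an illustration, assuming each image has already been turned into a numeric feature vector and that the two centroids were computed during training; the feature values and the name `assign_to_cluster` are hypothetical.

```python
import math

def assign_to_cluster(x, centroids):
    """Return the index of the centroid closest to x (Euclidean distance)."""
    distances = [math.dist(x, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical 2-D feature vectors; real image features would be much longer.
centroids = [(1.0, 1.0), (5.0, 5.0)]   # cluster 0 and cluster 1 from training
new_image = (4.2, 4.8)                 # features of a new, unseen image
cluster_id = assign_to_cluster(new_image, centroids)
```

Here `cluster_id` is the index of the nearer centroid, so the new image is placed in the second cluster.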

A Typical Reinforcement Learning Workflow

Reinforcement Learning: Learning a “policy” by performing actions and getting rewards (e.g., robot controls, beating games)
The agent:
- Senses/observes the environment
- Takes an action based on its current policy
- Receives a reward for that action
- Updates its policy
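The observe, act, reward, update loop can be made concrete with a toy example. The sketch below uses tabular Q-learning with an epsilon-greedy policy, which is one common choice and not something the slide prescribes. The corridor environment, the `step` function, and all hyperparameters are assumptions made for illustration.

```python
import random

N_STATES, GOAL = 5, 4          # corridor cells 0..4; reward only at cell 4
ACTIONS = (-1, +1)             # action 0 = move left, action 1 = move right

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy policy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # the policy is greedy w.r.t. q
    for _ in range(episodes):
        state = 0                                # sense the starting state
        for _ in range(1000):                    # safety cap on episode length
            action = choose_action(q[state], epsilon, rng)    # act on current policy
            next_state, reward, done = step(state, action)    # observe and get reward
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])  # update policy
            state = next_state
            if done:
                break
    return q

q_table = train()
greedy_policy = [row.index(max(row)) for row in q_table[:GOAL]]
```

After training, `greedy_policy` holds the preferred action for each non-goal cell; in this corridor the agent learns to move right everywhere.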

Classification

Example: Credit scoring
Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF Income > θ₁ AND Savings > θ₂ THEN low-risk ELSE high-risk
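The credit-scoring discriminant is a rule with two thresholds, one on income and one on savings. A minimal sketch follows; the function name `credit_risk`, the parameter names `theta1` and `theta2`, and the threshold values are hypothetical, and in practice the thresholds would be learned from past customer data.

```python
def credit_risk(income, savings, theta1, theta2):
    """Apply the discriminant: low-risk only if both thresholds are exceeded."""
    return "low-risk" if income > theta1 and savings > theta2 else "high-risk"

# Hypothetical thresholds, chosen only to make the example concrete.
THETA1, THETA2 = 40_000, 10_000
customers = [(55_000, 20_000), (55_000, 5_000), (30_000, 20_000)]
labels = [credit_risk(income, savings, THETA1, THETA2) for income, savings in customers]
```

Only the first customer exceeds both thresholds, so only that customer is labeled low-risk.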

Classification: Learn a linear/nonlinear separator (the “model”) using training data consisting of input-output pairs (each output is a discrete-valued “label” of the corresponding input). Use it to predict the labels for new “test” inputs.
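As an illustration of learning a linear separator from input-output pairs, here is a sketch using the perceptron update rule. The perceptron is one simple option among many, and the training pairs are made-up numbers, so treat this as an example under those assumptions and not as the method used in the talk.

```python
def train_perceptron(inputs, labels, epochs=1000, lr=1.0):
    """Learn weights w and bias b so that sign(w.x + b) matches labels in {-1, +1}."""
    w = [0.0] * len(inputs[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(inputs, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:                     # misclassified: nudge the separator
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:                          # every training point is correct
            break
    return w, b

def predict(w, b, x):
    """Predict the label of a new input from the side of the separator it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical training pairs: (income, savings) in arbitrary units; +1 = low-risk.
train_x = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.5), (5.0, 4.0), (6.0, 5.5), (4.5, 6.0)]
train_y = [-1, -1, -1, 1, 1, 1]
w, b = train_perceptron(train_x, train_y)
test_predictions = [predict(w, b, x) for x in [(1.5, 1.7), (5.2, 5.2)]]
```

The two test inputs sit inside the low and high groups respectively, so a separator that fits the training data classifies them accordingly.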

Other Applications: Image Recognition, Spam Detection, Medical Diagnosis

Regression

• Example: Price of a used car
• X: car attributes; Y: price and Y = f(X, θ)
• f( ) is the model and θ are the model parameters
• Regression: Learn a line/curve (the “model”) using training data consisting of input-output pairs (each output is a real-valued number for the corresponding input)
• Other Applications: Process Improvement, Weather Forecasting
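For the used-car example, the simplest regression model is a straight line fitted by ordinary least squares. The sketch below assumes a single attribute (car age) and invented prices; the names `fit_line`, `theta0`, and `theta1` are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = theta0 + theta1 * x (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx                   # slope
    theta0 = mean_y - theta1 * mean_x    # intercept
    return theta0, theta1

# Hypothetical data: car age in years vs. price in thousands.
ages = [1, 2, 3, 4, 5]
prices = [18.0, 16.0, 14.0, 12.0, 10.0]
theta0, theta1 = fit_line(ages, prices)
predicted_price = theta0 + theta1 * 6    # price estimate for a 6-year-old car
```

Because the invented prices fall exactly on a line, the fit recovers an intercept of 20 and a slope of -2, giving a predicted price of 8 for a six-year-old car.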

Clustering
Clustering: Learn the grouping structure for a given set of unlabeled inputs
Homogeneous groups as latent structure: Clustering
Other Applications: Topic Modelling, Image Segmentation, Social Networking
[Figure: original unclustered data and clustered data]
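One standard way to learn a grouping structure is k-means (Lloyd's algorithm). The sketch below is a bare-bones version, assuming 2-D points and a naive initialization from the first k points; real implementations use better initialization, and the data here are invented.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    centroids = [points[i] for i in range(k)]       # naive init: first k points
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:            # assignments stable: converged
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:                             # keep old centroid if cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment

# Hypothetical unlabeled inputs forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.8, 1.1), (9.2, 8.9), (8.8, 9.1)]
centroids, assignment = kmeans(points, k=2)
```

`assignment` gives a cluster index per point, and `centroids` can be reused later to place new inputs in the nearest cluster.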

Dimensionality Reduction

• Low-dimensional latent structure: Dimensionality Reduction
• Goal: Learn a low-dimensional representation for a given set of high-dimensional inputs
• Note: DR also comes in supervised flavors (supervised
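The slide does not name a particular method, so as an illustration here is principal component analysis (PCA), a common unsupervised choice, computed with NumPy's SVD. The synthetic data and the helper name `pca` are assumptions for the example.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal directions (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    Z = X_centered @ components.T            # low-dimensional representation
    explained = singular_values**2 / np.sum(singular_values**2)
    return Z, components, explained[:n_components]

# Hypothetical 3-D inputs that actually lie close to a 1-D line.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
direction = np.array([[2.0, -1.0, 0.5]])
X = t @ direction + 0.01 * rng.normal(size=(200, 3))
Z, components, explained = pca(X, n_components=1)
```

Because the inputs were generated near a single line, one component captures almost all of the variance, and `Z` is a one-number summary of each three-dimensional input.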

A Simple Example: Fitting a Polynomial

• The green curve is the true function (which is not a polynomial)
• We will use a loss function that measures the squared error in the prediction of y(x) from x. The loss for the red polynomial is the sum of the squared vertical errors.
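The squared-error loss for a fitted polynomial can be computed directly. The sketch below assumes the true function is sin(2πx) with a little added noise, which is a common textbook setup but is not stated on the slide; `squared_error_loss` is an illustrative name.

```python
import numpy as np

def squared_error_loss(coeffs, x, y):
    """Sum of squared vertical errors between the polynomial and the targets."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

# Assumed setup: noisy samples of sin(2*pi*x), a true function that is not a polynomial.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.shape)

coeffs_cubic = np.polyfit(x, y, deg=3)       # least-squares cubic
loss_cubic = squared_error_loss(coeffs_cubic, x, y)
loss_line = squared_error_loss(np.polyfit(x, y, deg=1), x, y)
```

`np.polyfit` chooses the coefficients that minimize exactly this loss, so the cubic's loss on these points is lower than the straight line's.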

Some fits to the data: which is best?

The right model complexity?

Desired: hypotheses that are not too simple, not too complex (so as to not overfit on the training data)
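One practical way to compare model complexities is to fit several polynomial degrees and measure the error on held-out data as well as on the training data. This is a sketch under assumed conditions: synthetic sine data, a random split, and degrees 1, 3, and 9 picked arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Draw n noisy observations of an assumed true function sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, size=n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=n)

x_train, y_train = sample(15)      # small training set
x_val, y_val = sample(200)         # held-out data standing in for future inputs

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}                       # degree -> (training error, validation error)
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
```

Training error can only go down as the degree grows, while validation error typically falls and then rises again; the degree with the lowest validation error is the usual pick.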

Overfitting and Generalization

• Doing well on the training data is not enough for an ML algorithm
• Trying to do too well (or perfectly) on training data may lead to bad “generalization”
• Generalization: Ability of an ML algorithm to do well on future “test” data
• Simple models/functions tend to prevent overfitting and generalize well: A key principle in designing ML algorithms (called “regularization”)
• No Free Lunch Theorem
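Regularization can take many forms. As one illustration, the sketch below adds an L2 penalty on the weights of a degree-9 polynomial (ridge regression), which pushes the fit toward simpler functions. The penalty form, the data, and the values of `lam` are assumptions chosen for the example.

```python
import numpy as np

def fit_ridge_poly(x, y, degree, lam):
    """Minimize ||Phi w - y||^2 + lam * ||w||^2 for polynomial features Phi."""
    Phi = np.vander(x, degree + 1, increasing=True)      # columns 1, x, x^2, ...
    A = Phi.T @ Phi + lam * np.eye(degree + 1)           # penalty also covers the intercept
    return np.linalg.solve(A, Phi.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.shape)

w_weak = fit_ridge_poly(x, y, degree=9, lam=1e-8)    # almost unregularized
w_strong = fit_ridge_poly(x, y, degree=9, lam=1e-2)  # noticeably regularized
norm_weak, norm_strong = np.linalg.norm(w_weak), np.linalg.norm(w_strong)
```

A larger `lam` shrinks the weight vector, which gives a smoother curve that follows the noise in the training points less closely.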

Probabilistic Machine Learning

• Supervised Learning (“predict y given x”) can be thought of as estimating p(Y|X)
A two-step approach (“generative modeling”): estimate p(image, class), then derive p(class|image)
[Figure: training images labeled “cat” and “dog”; unlabeled training data]
• Harder for Unsupervised Learning because there is no supervision y
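The two-step generative route to p(class|image) can be illustrated with a tiny discrete example: first estimate the joint distribution from counts, then condition on the input. The feature names and data below are invented, and a real image model would be far richer.

```python
from collections import Counter

# Hypothetical labeled training data: (feature, class) pairs.
data = [("whiskers", "cat"), ("whiskers", "cat"), ("whiskers", "dog"),
        ("floppy_ears", "dog"), ("floppy_ears", "dog"), ("floppy_ears", "cat"),
        ("whiskers", "cat"), ("floppy_ears", "dog")]

# Step 1: estimate the joint distribution p(x, y) from counts.
counts = Counter(data)
total = sum(counts.values())
joint = {pair: n / total for pair, n in counts.items()}

# Step 2: condition on x to get p(y | x) = p(x, y) / sum over y' of p(x, y').
def posterior(x):
    evidence = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / evidence for (xi, y), p in joint.items() if xi == x}

p_class_given_whiskers = posterior("whiskers")
```

With these counts, three of the four "whiskers" examples are cats, so the posterior puts probability 0.75 on "cat" and 0.25 on "dog".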

Function Approximation in Machine Learning

[Figure: a function maps an input image to a latent representation of the image (e.g., cluster id)]

Machine Learning: A Brief Timeline and Some Milestones

- Origins of Bayes Theorem: Thomas Bayes and Pierre-Simon Laplace (~1800)
- Least Squares method: Legendre (1805)
- Early works on PCA (Pearson), Factor Analysis (Spearman), CCA (by Hotelling), for exploratory data analysis
- Early works on Discriminant Analysis methods (Fisher) for classification
- McCulloch-Pitts model of the neuron
- Minsky & Edmonds’ neural net machine (SNARC)
- Arthur Samuel’s Checkers Player based on Machine Learning
- Rosenblatt's Perceptron algo
- Widrow-Hoff's ADALINE algo
- K-means clustering algo (Lloyd)
- Nearest Neighbors algorithm
- Early origins of Reinforcement Learning
- Neural nets slumber due to lack of compute power
- Early works on Genetic Algorithms inspired by natural evolution (John Holland)

Sunny days of AI are back!
- Multi-layer Perceptrons (can learn nonlinear functions)
- The Backpropagation algorithm to train deep neural nets
- Decision Trees, ID3 (Quinlan)
- NetTalk: Neural nets that can learn to pronounce English words
- Modern Reinforcement Learning
- Support Vector Machines (SVM), Kernel methods, Bayesian methods
- Random Forests, Boosting
- Continued focus on classical statistical and probabilistic models, connections b/w learning & cognition as opposed to rule-based methods
- ML based algorithm wins the Netflix Challenge
- Neural nets re-emerged and rebranded as Deep Learning (Hinton, Bengio, LeCun, Ng, and others), thanks to improved training, GPUs

2010-2020
- Continued work on neural nets for images, sequences (CNN, LSTM, etc.)
- Automatic Differentiation (later
- Software frameworks (e.g., TensorFlow, PyTorch, ease implementing ML algos)
- Drones, self-driving cars, etc.
- A lot of industry/media focus, excitement and hype
- Focus on Fairness, Accountability, and Transparency in ML algorithms

Machine Learning in the real-world

Broadly applicable in many domains (e.g., internet, robotics, healthcare and biology, computer vision, NLP, databases, computer systems, finance, etc.)

Machine Learning helps Natural Language Processing

ML algorithm can learn to translate text
[Figure: translation between English and Hindi]

Machine Learning meets Speech Processing

Machine Learning helps Computer Vision

• Automatic generation of text captions for images: A convolutional neural network is trained to interpret images, and its output is then used by a recurrent neural network trained to generate a text caption
• The sequence at the bottom shows the word-by-word focus of the network on different parts of the input image while it generates the caption word-by-word
[Figure: input image, convolutional feature extraction, RNN with attention over image, word-by-word generation]

Machine Learning helps Recommendation systems

• A recommendation system is a machine-learning system that is based on data that indicate links between a set of users (e.g., people) and a set of items (e.g., products)
• A link between a user and a product means that the user has indicated an interest in the product in some fashion (perhaps by purchasing that item in the past)
• The machine-learning problem is to suggest other items to a given user that he or she may also be interested in, based on the data across all users
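A very small version of this idea scores each item a user has not seen by how much the people who have it overlap with that user. This neighborhood-style heuristic is only one of many approaches and is not claimed to be the one in the talk; the users, items, and `recommend` function are hypothetical.

```python
from collections import Counter

# Hypothetical user -> items links (e.g., past purchases).
links = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "pen"},
    "carol": {"lamp", "desk"},
    "dave": {"book", "pen", "desk"},
}

def recommend(user, links, top_n=2):
    """Rank unseen items by the summed item overlap with the users who have them."""
    own = links[user]
    scores = Counter()
    for other, items in links.items():
        if other == user:
            continue
        overlap = len(own & items)            # similarity = number of shared items
        for item in items - own:
            scores[item] += overlap
    # Sort by descending score, then alphabetically to make ties deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]

suggestions = recommend("bob", links)
```

For "bob", both "desk" and "lamp" are held by users who share two items with him, so both are suggested, with the tie broken alphabetically.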

Machine Learning helps Chemistry

[Figure: RL: Reinforcement learning; RNN: Recurrent neural network; hybrid approaches. Policy gradient with Monte Carlo tree search (MCTS): incomplete SMILES (state), next action (char), MC search, reward upon completion, metrics]
Source: “Inverse molecular design using machine learning: Generative models for matter engineering” (Science, 2018)

Machine Learning helps Image Recognition

Machine Learning helps Many Other Areas
Biology
[Figure: images of neurons, convolutional layers, fully connected layers]
Finance
