Thursday, February 25, 2016

Introduction to Machine Learning

Machine Learning

According to Tom Mitchell: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

For example, in the game of checkers, E is the experience of playing tens of thousands of games, T is the task of playing checkers, and P is whether the program wins or loses.

There are different types of machine learning algorithms:
  • Supervised Learning: Teach the computer how to do something, and then let it use its newfound knowledge to do it. Here the algorithm generates a function that maps inputs to desired outputs.
  • Unsupervised Learning: Let the computer learn how to do something on its own, and use this to determine structure and patterns in data.
  • Reinforcement Learning: The algorithm learns a policy of how to act given an observation of the world.
  • Recommender Systems: These typically produce a list of recommendations in one of two ways: through collaborative filtering or content-based filtering.


We use machine learning algorithms dozens of times a day without knowing it. For example:
  • During a web search, the search engine runs an ML algorithm in the background to rank the web pages.
  • The tagging feature in Facebook uses an ML algorithm to identify your friends.
  • Email systems use an ML algorithm to identify spam and filter it into the spam folder.
  • ML is used in data mining, e.g. on web-click (clickstream) data, medical records, etc.
  • ML is used in applications that can’t be programmed by hand, e.g. NLP, computer vision, etc.

Supervised Learning

Supervised learning is the task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function (a predictor model), which is called a classifier if the output is discrete or a regression function if the output is continuous. The inferred function should predict the correct output value for any valid input object. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.
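The idea of turning (input, desired output) pairs into an inferred function can be sketched in a few lines. The example below is only an illustration, not a method from the course: it uses a 1-nearest-neighbor rule, and the function name fit_nearest_neighbor, the feature vectors, and the labels "small" and "large" are all hypothetical.

```python
def fit_nearest_neighbor(examples):
    """Return a predictor that outputs the label of the closest training input.

    `examples` is a list of (input_vector, desired_output) pairs.
    """
    def predictor(x):
        # Pick the training pair whose input has the smallest squared distance to x.
        _, nearest_output = min(
            examples,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        return nearest_output

    return predictor


# Hypothetical training set: (input vector, desired output) pairs.
training_examples = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((5.0, 5.5), "large"),
    ((6.0, 5.0), "large"),
]

classify = fit_nearest_neighbor(training_examples)

# The input below does not appear in the training set, so the predictor
# has to generalize from the examples it has seen.
print(classify((0.9, 1.1)))
```

Because the outputs here are discrete labels, the inferred function is a classifier. With numeric outputs, the same pairing of inputs and desired outputs would describe a regression problem.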

Supervised learning algorithms fall into two groups based on their outputs:
  • Regression (output is continuous)
  • Classification (output is discrete)

1) Regression

Let’s consider an example data set of housing prices and how they relate to house size in square feet.
Given the above data, suppose your friend has a 750-square-foot house and wants to know its price. To estimate the price, we can draw a straight line through the data, as shown below:

From the above graph, the straight-line fit predicts a price of 150. Let’s see what the predicted price will be if we fit a quadratic function through the data instead of a straight line.
Here we get a predicted price of 210. Later we will learn how to decide whether to fit a straight line or a quadratic function through the data. This is an example of regression, which is used to predict a continuous-valued output, here the price.
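A minimal sketch of the two fits, assuming a small made-up data set (the sizes and prices below are hypothetical and will not reproduce the 150 and 210 read off the graphs). It fits a straight line and a quadratic by ordinary least squares via the normal equations; the helper names solve, fit_polynomial, and predict are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + c2*x**2 + ..."""
    k = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical data: house sizes in thousands of square feet, prices in thousands.
sizes = [0.5, 1.0, 1.5, 2.0, 2.5]
prices = [100.0, 200.0, 260.0, 300.0, 320.0]

line = fit_polynomial(sizes, prices, degree=1)
quadratic = fit_polynomial(sizes, prices, degree=2)

query = 0.75  # a 750-square-foot house
print("straight-line prediction:", predict(line, query))
print("quadratic prediction:", predict(quadratic, query))
```

The two models give different predictions for the same house, which is why the choice between them matters. Sizes are expressed in thousands of square feet here only to keep the numbers in the normal equations small.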

2) Classification

Consider a dataset used to predict whether a breast tumor is malignant or benign based on its size.

Here we classify the data into one of two discrete classes, i.e. malignant or benign. This is an example of a classification problem. In classification problems, the output takes one of a discrete number of possible values; for example, there may be four values: 0 – benign, 1 – type 1 cancer, 2 – type 2 cancer, 3 – type 3 cancer. We may also have more than one feature, for example the age of the patient and the tumor size.


Suppose a patient’s tumor size and age are marked with a star.

The learning algorithm may fit the data by drawing a straight line that separates the two classes of tumors, thereby deciding that the tumor marked with the star is benign.
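One simple way to find such a separating line is the perceptron rule, sketched below. This is an assumption made for illustration; the source does not say which algorithm draws the line. The patient data (tumor size in cm, age in decades), the star patient, and the name train_perceptron are all hypothetical.

```python
def score(w, b, x):
    """Signed score of point x relative to the line w . x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify_point(w, b, x):
    """Return 1 (malignant) if x lies on the positive side of the line, else 0 (benign)."""
    return 1 if score(w, b, x) > 0 else 0


def train_perceptron(points, labels, max_epochs=10_000):
    """Learn weights w and bias b with the perceptron update rule.

    Stops early once every training point is classified correctly,
    which is guaranteed to happen eventually only if the classes are
    linearly separable.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            error = y - classify_point(w, b, x)  # -1, 0, or +1
            if error != 0:
                # Nudge the line toward classifying x correctly.
                w = [wi + error * xi for wi, xi in zip(w, x)]
                b += error
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


# Hypothetical patients: (tumor size in cm, age in decades); 0 = benign, 1 = malignant.
patients = [
    (1.0, 3.0), (1.5, 4.5), (2.0, 2.5), (1.0, 6.0),
    (4.0, 5.0), (4.5, 6.5), (5.0, 4.0), (3.5, 7.0),
]
diagnoses = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_perceptron(patients, diagnoses)

star_patient = (1.25, 4.0)  # the hypothetical patient marked with a star
print("star patient class:", classify_point(weights, bias, star_patient))
```

The learned line is w[0]*size + w[1]*age + b = 0; points on its positive side are labeled malignant. Other linear classifiers, such as logistic regression, would draw a different but similar boundary on the same data.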

Here we have two features, age and tumor size. Sometimes we may have more than two features, such as clump thickness, cell size, cell shape, etc. But what if we have an infinite number of features? Support vector machines (SVMs) use a mathematical trick to handle that case; we will learn about them in later chapters.

Unsupervised Learning

In supervised learning the dataset already contains the correct answers (labels), while in unsupervised learning we are given data without labels, or with labels that do not distinguish the examples. The types of algorithms used in unsupervised learning include:
  • Clustering
  • Auto-Encoders
  • Dimensionality Reduction
Suppose we are given an unlabeled data set, as shown below.

The learning algorithm will decide that the data lie in two different groups (clusters). Therefore, it is called a clustering algorithm.
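As a rough illustration of how an algorithm can discover two groups in unlabeled data, here is a minimal k-means sketch. K-means is one common clustering method and is an assumption here, since the text does not name a specific algorithm. The data points are made up, and the initial centroids are simply the first k points, a simplification that practical implementations usually replace with randomized initialization.

```python
def kmeans(points, k, max_iters=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic initialization
    labels = None
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical unlabeled data with two visible groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

cluster_labels, cluster_centers = kmeans(data, k=2)
print(cluster_labels)
print(cluster_centers)
```

No labels are supplied anywhere: the grouping comes only from the distances between points.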

Applications of clustering algorithms include:
  • Google News
  • Genomics
  • Social network analysis
  • Astronomical data analysis


[Note: Adapted from Machine Learning by Stanford University]