Machine Learning Algorithm Selection

pallab_haldar

In this session we will discuss techniques for identifying the right ML algorithm for your implementation scenario. I will cover only the most popular and widely used algorithms.

Let's start with the supervised learning algorithms:

1. Regression :

A. When to consider Linear Regression : Linear regression fits the linear equation Y=MX+C. Like logistic regression, it is a parametric method. It is a good choice when continuous numeric values need to be predicted. For example, the variable you are predicting (Y) may keep a linear relationship with several variables, X1 and X2, and depend on both: Y=M1X1+M2X2+C. It can be trained on large data sets.

For example, we can use linear regression to -

Forecast material demand in a supply chain landscape.
Help movie or political analysts draw insights about, for example, a movie's marketing expense and revenue, or a team's followers.
Forecast the weather.
Forecast sales from the previous year's unit demand, unit cost and usage.
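
To make the two-variable equation Y=M1X1+M2X2+C concrete, here is a minimal sketch that fits it with ordinary least squares using NumPy. The data values and the predict helper are made up for illustration; a real project would load its own observations.

```python
import numpy as np

# Hypothetical data: Y depends linearly on X1 and X2 (Y = 2*X1 + 3*X2 + 5).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 5.0

# Append a column of ones so the intercept C is fitted together with M1 and M2.
A = np.column_stack([X, np.ones(len(X))])

# Ordinary least squares solution of A @ [M1, M2, C] = y.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
m1, m2, c = coef


def predict(x1, x2):
    """Predict Y for new values of X1 and X2 using the fitted coefficients."""
    return m1 * x1 + m2 * x2 + c


print(m1, m2, c)
```

Because the sample data is generated without noise, the fitted coefficients should come out very close to the values used to build it.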

B. When to consider Polynomial Regression :

1. Collect the data points from the sample set.

2. In Python, plot the data points using matplotlib.

3. If the points follow a curve rather than a straight line and continuous numeric values need to be predicted, use Polynomial Regression.

A polynomial regression model is a machine learning model that captures non-linear relationships between variables by fitting a curve instead of a straight line, which is not possible with simple linear regression. It is used when a linear regression model does not adequately capture the complexity of the relationship. It is a special case of multiple linear regression in which the powers of X (X, X^2, ..., X^n) are treated as separate predictors.

Y=C+M1X+M2X^2+.....+MnX^n
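
The three steps above can be sketched with NumPy's polyfit. The sample points and the degree (2) are assumptions chosen for illustration, and plot_fit is a hypothetical helper that needs matplotlib only when you call it.

```python
import numpy as np

# Step 1: hypothetical sample that follows a curve, Y = 1 + 2X + 0.5X^2.
x = np.linspace(-3.0, 3.0, 13)
y = 1.0 + 2.0 * x + 0.5 * x**2

# Step 3: fit a degree-2 polynomial. polyfit returns the highest power first.
m2, m1, c = np.polyfit(x, y, deg=2)

# Predict a continuous value for a new X from the fitted curve.
y_new = np.polyval([m2, m1, c], 4.0)


def plot_fit():
    """Step 2: plot the data points and the fitted curve (requires matplotlib)."""
    import matplotlib.pyplot as plt

    xs = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x, y, label="sample points")
    plt.plot(xs, np.polyval([m2, m1, c], xs), label="degree-2 fit")
    plt.legend()
    plt.show()


print(c, m1, m2, y_new)
```

In practice you would plot first, judge from the shape how high a degree is reasonable, and only then fit.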

C. When to consider Lasso Regression :

Lasso Regression is useful when you want a sparse model, i.e. -

1. When there is a large number of independent (predictor) variables and only a few are important (carry greater weight). In that case Lasso sets the coefficients of the less important features to 0. This helps to avoid overfitting.

2. When the number of independent (predictor) variables p is much larger than the number of observations n, i.e. n << p, we choose Lasso regression over ordinary multiple linear regression.

Lasso Regression: Ordinary Least Squares is modified to also minimize the sum of the absolute values of the coefficients (called L1 regularization). Ridge Regression: Ordinary Least Squares is modified to also minimize the sum of the squared coefficients (called L2 regularization). Both are forms of regularized multiple linear regression.

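Lasso regression minimizes the expression below, where λ >= 0 controls the strength of the penalty and p is the number of predictors -

Sum over all observations of (Y - (C + M1X1 + M2X2 + ..... + MpXp))^2 + λ(|M1| + |M2| + ..... + |Mp|)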
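
To see how the L1 penalty pushes unimportant coefficients toward zero, here is a small sketch that minimizes the sum of squared errors plus lam times the sum of absolute coefficients using coordinate descent. This is one common way to solve Lasso, not the only one; the data, the value of lam and the function names are assumptions for illustration, and the data is centered so no intercept is needed.

```python
import numpy as np


def soft_threshold(rho, t):
    """Shrink rho toward zero by t; anything within [-t, t] becomes 0."""
    return np.sign(rho) * max(abs(rho) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize sum((y - X @ w)**2) + lam * sum(abs(w)), one coefficient at a time."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with feature j's current contribution added back.
            r_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r_j
            z = X[:, j] @ X[:, j]
            w[j] = soft_threshold(rho, lam / 2.0) / z
    return w


# Hypothetical data: 5 predictors, only the first two actually drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=50)

# Center the data so this sketch can skip the intercept term.
X = X - X.mean(axis=0)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, lam=20.0)
print(np.round(w, 3))
```

With this setup the first two coefficients should stay large (slightly shrunk), while the other three should end up at or very near zero. A library implementation such as scikit-learn's Lasso is the better choice for real work.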

2. Classification:

A. When to consider Logistic Regression : When the required outcome can take only two values, we use Logistic Regression. It models one dependent variable that has only two possible outcomes, e.g. pass or fail. Despite its name, it is a classification method and is best suited to situations where the outcome is binary. The model may depend on one or more independent variables.

It is a classification algorithm used for scenarios like medical diagnostics, customer feedback analysis, potential market area recognition, spam detection etc.

It uses the logistic (sigmoid) function, an S-shaped curve that takes any real-valued number and maps it into a value between 0 and 1, but never exactly at those limits:

f(x) = 1 / (1 + e^(-x))

where e is the base of the natural logarithm (the EXP() function).
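
A short sketch of the sigmoid and of how it turns a linear score into a pass/fail decision. The coefficients m and c in predict_pass are made-up values, not fitted ones; a real model would learn them from data.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_pass(hours, m=1.5, c=-4.0):
    """Hypothetical model: probability of passing given hours studied."""
    probability = sigmoid(m * hours + c)
    # Classify as pass when the probability reaches the 0.5 threshold.
    return probability >= 0.5


print(sigmoid(0.0), predict_pass(2.0), predict_pass(4.0))
```

Note that math.exp overflows for very large negative inputs, so production code usually relies on a numerically stable library implementation.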

B. When to consider Decision Tree : When complex data, or the logic of a compound object, is difficult to understand. A decision tree breaks the complex data or logic down into more manageable parts, each with simple logic, which combine to produce the final result.

Decision trees suit binary classification tasks, predicting the result as “yes” or “no”.

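A decision tree uses Entropy to measure how mixed the classes in a set S are. The formula for Entropy, where p_i is the proportion of items in S that belong to class i, is -

Entropy(S) = -Sum over i of p_i * log2(p_i)

Entropy is then used to compute the information gain of splitting S on an attribute A, where S_v is the subset of S in which A has the value v -

Gain(S, A) = Entropy(S) - Sum over v of (|S_v| / |S|) * Entropy(S_v)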
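
The standard definitions of entropy and information gain can be checked with a few lines of Python. The "yes"/"no" labels and the split below are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical split of 10 samples into two branches.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

gain = information_gain(parent, [left, right])
print(entropy(parent), gain)
```

A tree-building algorithm evaluates this gain for every candidate split and picks the one with the highest value.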

Use case where Decision Tree ML is applied : A car dealer can identify potential customers who are most likely to be interested in buying or selling a car in a particular area.

C. When to use SVM (Support Vector Machines) : It is widely used for text classification and analysis.

Its objective is to find a hyperplane in an N-dimensional space (N is the number of features) that distinctly classifies the data points.

The equation for the linear hyperplane is:

w.x + b = 0

The vector w is the normal vector to the hyperplane. The parameter b is the offset: the hyperplane lies at a distance |b| / ||w|| from the origin along the normal vector w, where ||w|| is the Euclidean norm of w. The signed distance of a data point x from the hyperplane is (w.x + b) / ||w||. A linear SVM classifier assigns x to one class when w.x + b >= 0 and to the other class when w.x + b < 0.
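
Here is a small sketch of how a linear SVM uses w and b once they are known. The values of w and b are made up (no training happens here), and the +1/-1 class labels are a convention I am assuming.

```python
import numpy as np

# Hypothetical parameters of an already-trained linear SVM with 2 features.
w = np.array([3.0, 4.0])  # normal vector to the hyperplane
b = -5.0                  # offset


def signed_distance(x):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)


def classify(x):
    """Return +1 on the side the normal vector points to, otherwise -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1


point_a = np.array([3.0, 4.0])
point_b = np.array([0.0, 0.0])
print(signed_distance(point_a), classify(point_a))
print(signed_distance(point_b), classify(point_b))
```

Training an SVM means choosing w and b so that the margin between the classes is as wide as possible; the sketch only covers the prediction step.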

D. When to use K-NN (K-Nearest Neighbors) :

K-NN is a supervised algorithm: it needs labeled training data and uses it to assign a label to a new, unlabeled data point. It is a simple, instance-based method that does not fit an explicit model. Instead it stores the training examples and approximates the unknown target function locally, from the examples closest to the point being classified.

It classifies a data point according to how its neighbors are classified. K-NN commonly uses Euclidean distance to find the nearest neighbors. For two points (x, y) and (a, b), the formula for Euclidean distance (d) is -

d = sqrt((x-a)²+(y-b)²)
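
A minimal K-NN sketch built on that distance formula: it finds the k closest labeled points and takes a majority vote. The training points, the labels and k=3 are assumptions for illustration.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train, query, k=3):
    """Label the query by majority vote among its k nearest training points.

    train is a list of (point, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled data: two well-separated groups.
train = [
    ((1.0, 1.0), "spam"), ((1.5, 2.0), "spam"), ((2.0, 1.0), "spam"),
    ((6.0, 6.0), "ham"), ((7.0, 5.5), "ham"), ((6.5, 7.0), "ham"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.1)))
```

An odd k is a common choice for two-class problems because it avoids tied votes.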

Known use cases of the K-NN algorithm in ML: junk email identification, ad identification on websites, and identification of different human cell conditions.

Hope it helps. In my next blog I am going to discuss deep learning, neural networks and activation functions.