pallab_haldar

In this post we will discuss how to identify the right ML algorithm for your implementation scenario. I will cover only the most popular and widely used algorithms.

Let's start with the Supervised Learning algorithms :

1. Regression : 

A. When to consider Linear Regression : Linear regression follows the linear equation Y=MX+C. Linear and Logistic Regression are both parametric methods. Linear regression is a good choice when you need to predict continuous numeric values. For example, when the target variable Y has a linear relationship with several variables X1 and X2 and depends on both, the model is Y=M1X1+M2X2+C. It can be trained on large data sets.

 

pallab_haldar_0-1714491857937.png

 

 

For example, we can use linear regression (LR) to -

  1. Forecast material demand in a supply chain landscape.
  2. Help a movie or political analyst draw insights about, for example, movie marketing expense versus revenue, or a team's followers.
  3. Forecast the weather.
  4. Forecast sales, based on the previous year's unit demand, unit cost and usage.
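As a rough illustration of the single-variable case Y=MX+C, the slope and intercept can be computed in closed form with ordinary least squares. This is only a minimal sketch in plain Python: the function name `fit_line` and the spend/revenue numbers are made up for the example, and real projects would normally rely on a library.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = M*X + C. Returns (M, C)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(X, Y) / variance(X)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point of means
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data: marketing spend vs. revenue, generated from Y = 2X + 1
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]

m, c = fit_line(spend, revenue)
predicted = m * 6.0 + c  # forecast for a spend of 6.0
```

Because the sample data lies exactly on a line, the fit recovers M=2 and C=1, and the forecast for a spend of 6.0 is 13.0.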

B. When to consider Polynomial Regression : 

1. Collect the data points from the sample set.

2. In Python, plot the data points using matplotlib.

pallab_haldar_1-1714491858165.png

 

3. If the points follow a curve and you need to predict continuous numeric values, use Polynomial Regression.

A polynomial regression model is a machine learning model that captures non-linear relationships between variables by fitting a curve rather than a straight line, which is not possible with simple linear regression. It is used when a linear regression model does not adequately capture the complexity of the relationship. It can be treated as a multiple linear regression in which the predictors are the powers of X.

Y=C+M1X+M2X^2+.....+MnX^n
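To make the "multiple linear regression on powers of X" idea concrete, here is a small sketch that fits the polynomial by solving the least-squares normal equations in plain Python. The helper names (`solve_linear_system`, `fit_polynomial`) and the sample data are assumptions for illustration only. For high degrees or real data, a numerical library is a safer choice than hand-written elimination.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot into place for stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Return [C, M1, ..., Mn] for Y = C + M1*X + ... + Mn*X^n."""
    # Each row of the design matrix holds 1, x, x^2, ..., x^degree
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    size = degree + 1
    # Normal equations: (A^T A) coeffs = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(size)]
           for i in range(size)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(size)]
    return solve_linear_system(ata, aty)


# Hypothetical curved data generated from Y = 1 + 2X + 3X^2
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
```

With this noise-free data, `coeffs` comes out close to `[1, 2, 3]`, i.e. C, M1 and M2 of the generating curve.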

pallab_haldar_2-1714491858170.png

 

 

pallab_haldar_3-1714491857938.png

 

C. When to consider Lasso Regression : 

Lasso Regression is a good fit when you need a sparse model, i.e. -

1. When there is a large number of independent (predictor) variables and only a few of them are important (with greater weight). In that case lasso sets the coefficients of the less important features to 0. This helps to avoid overfitting.

2. When the number of independent (predictor) variables p is much larger than the number of observations n, i.e. n << p, we will choose LASSO regression over multiple linear regression.

Lasso Regression: Ordinary Least Squares is modified to also minimize the sum of the absolute values of the coefficients (called L1 regularization). Ridge Regression: Ordinary Least Squares is modified to also minimize the sum of the squared coefficients (called L2 regularization). Both are forms of regularized multiple linear regression.

Lasso regression follows the equation below -

pallab_haldar_4-1714491858151.png
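The way L1 regularization pushes small coefficients to exactly 0 can be seen in a tiny coordinate-descent sketch. It assumes the objective (1/(2n)) * sum((y - Xw)^2) + alpha * sum(|w_j|) with no intercept term, which may differ in scaling from the equation in the figure. The function names and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward 0 by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(rows, ys, alpha, iterations=200):
    """Minimize (1/(2n)) * sum((y - Xw)^2) + alpha * sum(|w_j|)."""
    n = len(rows)
    p = len(rows[0])
    weights = [0.0] * p
    for _ in range(iterations):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j
            rho = 0.0
            for row, y in zip(rows, ys):
                pred_without_j = sum(row[k] * weights[k]
                                     for k in range(p) if k != j)
                rho += row[j] * (y - pred_without_j)
            rho /= n
            z = sum(row[j] ** 2 for row in rows) / n
            weights[j] = soft_threshold(rho, alpha) / z
    return weights


# Toy data: the first feature matters (weight 3), the second barely (0.05)
rows = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
ys = [3.0 * r[0] + 0.05 * r[1] for r in rows]

weights = lasso_coordinate_descent(rows, ys, alpha=0.1)
```

With `alpha=0.1`, the weight of the weak second feature is set to exactly 0, while the strong first feature keeps a weight slightly shrunk from 3 (about 2.9). Setting `alpha=0` in the same function gives an ordinary least-squares fit.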

 

 

2. Classification:

A. When to consider Logistic Regression : When the required outcome can have only two values, we use Logistic Regression. It models one dependent variable that can take only two outcomes, e.g. pass or fail, so it is best used in situations where the outcome is binary. The model may depend on one or more independent variables.

It is a classification algorithm and is used for scenarios like -

1. Medical diagnostics, customer feedback analysis, potential market area recognition, spam detection etc.

It uses the logistic function (sigmoid function), an S-shaped curve that takes any real-valued number and maps it into a value between 0 and 1, but never exactly at those limits. In the formula, e is the base of the natural logarithm (the EXP() function).

pallab_haldar_5-1714491858172.png

 

pallab_haldar_6-1714491858155.png
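A short sketch of the sigmoid and of how it turns a linear score into a pass/fail decision. The weight, bias and the "hours studied" feature are invented for the example. In practice these parameters are learned from training data.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_pass(hours_studied, weight, bias, threshold=0.5):
    """Return (probability of passing, True if predicted to pass)."""
    probability = sigmoid(weight * hours_studied + bias)
    return probability, probability >= threshold


# Hypothetical model parameters, not fitted to any real data
WEIGHT = 1.5
BIAS = -6.0

p_low, passes_low = predict_pass(2.0, WEIGHT, BIAS)    # linear score is -3
p_high, passes_high = predict_pass(6.0, WEIGHT, BIAS)  # linear score is +3
```

A score of 0 maps to a probability of 0.5, which is why 0.5 is the usual default threshold between the two classes.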

 

B. When to consider Decision Tree : When complex data or the logic of a compound object is difficult to understand. A decision tree breaks the complex data or logic down into more manageable parts, each with simple logic, which combine to produce the final result.

Decision trees suit binary classification tasks, predicting the result as “yes” or “no”; they also extend to more than two classes.

A decision tree uses Entropy to choose its splits, and the formula for Entropy is -

pallab_haldar_7-1714491858159.png

 

The entropy is then used to compute the information gain -

pallab_haldar_8-1714491858086.png
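The two quantities can be sketched in a few lines of Python. This assumes the standard Shannon entropy with log base 2, and information gain defined as the parent entropy minus the size-weighted entropy of the child groups. The "yes"/"no" labels and the split are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent_labels, child_groups):
    """Entropy of the parent minus the weighted entropy of its children."""
    total = len(parent_labels)
    weighted = sum(len(group) / total * entropy(group)
                   for group in child_groups)
    return entropy(parent_labels) - weighted


# Hypothetical node: 5 customers bought a car ("yes"), 5 did not ("no")
parent = ["yes"] * 5 + ["no"] * 5

# Candidate split on some feature: mostly "yes" on one side, mostly "no" on the other
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

gain = information_gain(parent, [left, right])
perfect_gain = information_gain(parent, [["yes"] * 5, ["no"] * 5])
```

The tree-building algorithm evaluates candidate splits this way and prefers the one with the highest gain. A split that separates the classes perfectly removes all of the parent's entropy.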

Use case where a Decision Tree is applied : A car dealer can identify potential customers who are most likely to be interested in buying or selling a car in a particular area.

C. When to use SVM (Support Vector Machines) : It is mostly used for text classification and analysis.

Its objective is to find a hyperplane in an N-dimensional space (N is the number of features) that distinctly classifies the data points.

pallab_haldar_0-1714533672422.png

The equation for the linear hyperplane is:

 
pallab_haldar_5-1714534173497.png

The vector w is the normal vector to the hyperplane. The parameter b is the offset: the hyperplane lies at a distance of |b|/||w|| from the origin along the normal vector w, where ||w|| represents the Euclidean norm of the weight vector w. For a linear SVM classifier :

pallab_haldar_6-1714534360746.png
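Once w and b are known, classifying a point only requires checking which side of the hyperplane it falls on. The sketch below assumes the hyperplane is written as w·x + b = 0 and that the two classes are labeled +1 and -1 (the figure may use a different labeling). The values of w and b are hypothetical. Finding them is the actual SVM training step, which is not shown here.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b. It is 0 exactly on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def distance_to_hyperplane(w, b, x):
    """Geometric distance |w.x + b| / ||w|| from x to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm


# Hypothetical trained parameters for a 2-feature problem
w = [3.0, 4.0]
b = -5.0

label_a = classify(w, b, [3.0, 4.0])              # score = 9 + 16 - 5 = 20
label_b = classify(w, b, [0.0, 0.0])              # score = -5
dist_origin = distance_to_hyperplane(w, b, [0.0, 0.0])
```

Here ||w|| is 5, so the origin lies at distance |b|/||w|| = 1 from the hyperplane.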

D. When to use K-NN (K-Nearest Neighbors) :

K-NN is a supervised algorithm: it needs a labeled training set and is applied to classify new, unlabeled data points. It is a simple algorithm that does not fit an explicit model; it approximates the unknown target function locally, from the stored training points that are closest to the query.

It classifies a data point based on how its neighbors are classified. K-NN commonly uses Euclidean distance to find the nearest neighbors. If we have two points (x, y) and (a, b), the formula for Euclidean distance (d) is -

d = sqrt((x-a)²+(y-b)²)

pallab_haldar_1-1715045767455.png
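A minimal K-NN classifier built directly on that distance formula: sort the labeled points by distance to the query, take the K closest, and return the majority label. The two-feature "spam"/"ham" points are invented for the example, and `knn_classify` is just an illustrative name.

```python
import math
from collections import Counter


def euclidean_distance(p, q):
    """d = sqrt((x-a)^2 + (y-b)^2), generalized to any number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_classify(training_points, training_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(training_points, training_labels),
        key=lambda pair: euclidean_distance(pair[0], query),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled e-mails described by two numeric features
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

label_near_ham = knn_classify(points, labels, (2, 2), k=3)
label_near_spam = knn_classify(points, labels, (6, 5), k=3)
```

Choosing an odd K for a two-class problem avoids tied votes. Since the distance depends on feature scale, the features are usually normalized first.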

Known use cases of the K-NN algorithm in ML: junk email identification, ad identification on web sites, and identification of different human cell conditions.

Hope it helps. In my next blog I am going to discuss deep learning, neural networks and activation functions.

 

 

 

 
