Logistic Regression: The Entry To The ML World

Saurabh Tiwari
3 min read · Apr 20, 2021

Logistic Regression is a supervised learning technique for problems where the target values are categorical, i.e. each value falls into one of a fixed set of classes. For example, given some information about a customer, predict whether they should get a loan or not.

Because of this, it is considered a classification technique. But I like to think of it as a regression technique, because all it does is adjust the parameters so that an S-shaped curve approximates the dataset, just as linear regression adjusts its parameters to fit a line to the dataset. Since we apply it to categorical problems, it is generally called a classification technique.

How?

So how is it happening? How is it forming that S shaped figure?

If you are familiar with Linear Regression, in which we try to fit a line through the dataset, the starting point is the same. First we write the usual equation of a line

Z = β.X + βo

Here, Z is our calculated value (in the case of linear regression), β holds the weights (one β for each attribute of X) and βo is the intercept, which shifts the line in the vertical direction.

To get the values of β and βo in linear regression, we take the difference between the calculated value and the expected value for each data point, square it, and sum these squares over the dataset. We then differentiate that sum w.r.t. β and βo and equate the derivatives to 0, which gives the values of β and βo at which the error is minimum.
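To make the least-squares idea concrete, here is a minimal sketch using NumPy. The toy data is made up for illustration (it lies exactly on the line y = 2x + 1), and `np.linalg.lstsq` is used as a convenient way to get the parameters that minimise the squared error.

```python
import numpy as np

# Hypothetical toy data lying exactly on the line y = 2x + 1.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Add a column of ones so the intercept βo is estimated alongside β.
A = np.column_stack([X, np.ones_like(X)])

# lstsq returns the parameters that minimise the sum of squared errors,
# which is the point where the derivative of that error is zero.
(beta, beta0), *_ = np.linalg.lstsq(A, y, rcond=None)

print(beta, beta0)
```

On real, noisy data the recovered values would only approximate the underlying slope and intercept.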

But in logistic regression, we first pass this linear function through a function that has an S shape. This function is called the sigmoid function.

Y’ = 1/(1+e^(-Z))

Notice that the value of Y’ will be between 0 and 1 irrespective of what Z is.
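A quick way to convince yourself of this is to evaluate the function on a few values of Z. The sketch below assumes NumPy, and the sample values of Z are arbitrary.

```python
import numpy as np

def sigmoid(z):
    """Map a real number (or array of them) to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary sample inputs, from strongly negative to strongly positive.
z_values = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
y_values = sigmoid(z_values)

print(y_values)
```

Large negative values of Z land close to 0, large positive values land close to 1, and Z = 0 gives exactly 0.5, which is the middle of the S.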

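Here, Y’ is our calculated value and not Z. To get the values of β and βo, we treat Y’ as the conditional probability that the target is 1 given X, and pick the parameters that make the observed labels most probable (maximum likelihood). Unlike linear regression, this has no closed-form solution, so it is solved iteratively, which involves some messy mathematics. If you are interested in the mathematics behind it, you can find it here.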
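For the curious, one common way to find β and βo is gradient descent on the log-loss. The sketch below is only an illustration of that idea: the function name `fit_logistic`, the learning rate, the iteration count and the toy data are all assumptions made for this example, and libraries generally use more sophisticated solvers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit β and βo by plain gradient descent on the average log-loss."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    beta0 = 0.0
    for _ in range(n_iter):
        y_hat = sigmoid(X @ beta + beta0)       # current predicted probabilities
        error = y_hat - y                       # gradient of log-loss w.r.t. Z
        beta -= lr * (X.T @ error) / n_samples  # step for the weights
        beta0 -= lr * error.mean()              # step for the intercept
    return beta, beta0

# Hypothetical toy data: one feature, class 1 for positive values.
X = np.array([[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

beta, beta0 = fit_logistic(X, y)
probs = sigmoid(X @ beta + beta0)
print(beta, beta0, probs.round(2))
```

The update rule looks almost identical to the one for linear regression. The only difference is that the prediction goes through the sigmoid first.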

Now we’ve got our sigmoid function. We feed it the values of X and the calculated value will be between 0 and 1. We can round it off (0.5 or above goes to class 1, anything below goes to class 0) to know which class a given set of values belongs to.
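The rounding step can be sketched as a simple threshold at 0.5. The parameter values below are hypothetical, standing in for a model that has already been fitted, and `predict_class` is just a helper name chosen for this example.

```python
import numpy as np

def predict_class(X, beta, beta0, threshold=0.5):
    """Return 1 where the sigmoid output reaches the threshold, else 0."""
    z = X @ beta + beta0
    prob = 1.0 / (1.0 + np.exp(-z))
    return (prob >= threshold).astype(int)

# Hypothetical, already-fitted parameters for a single feature.
beta = np.array([2.0])
beta0 = -5.0

X_new = np.array([[1.0], [2.0], [3.0], [4.0]])
classes = predict_class(X_new, beta, beta0)

print(classes)  # → [0 0 1 1]
```

With these parameters the boundary sits at X = 2.5, where Z is 0 and the sigmoid gives 0.5.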

Code

Doing all that mathematics in code by hand is a bit too much. But our friend Python does it for us. The scikit-learn library (a third-party package, not part of Python itself) provides a logistic regression implementation that you can feed the dataset to and get results.

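# train / result_tr: training features and labels; test: features to predict on
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(train, result_tr)   # learn β and βo from the training data
pred = classifier.predict(test)    # predicted class label for each test row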
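
The snippet above assumes that train, result_tr and test already exist. If you just want something to run end to end, here is a minimal sketch that builds them from synthetic data with scikit-learn's `make_classification` and `train_test_split`. The dataset, its size and the `result_te` name are assumptions made for this example, not a real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hold out a quarter of the rows to check the model on unseen data.
train, test, result_tr, result_te = train_test_split(
    X, y, test_size=0.25, random_state=0
)

classifier = LogisticRegression(random_state=0)
classifier.fit(train, result_tr)

pred = classifier.predict(test)               # hard class labels (0 or 1)
proba = classifier.predict_proba(test)        # probability of each class
accuracy = classifier.score(test, result_te)  # fraction of correct labels

print(pred[:5], proba[:5], accuracy)
```

`predict` gives the rounded-off class, while `predict_proba` exposes the sigmoid output itself, which is handy when you want a threshold other than 0.5.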

There you go! Now you can try logistic regression on many classification problems, especially ones with two classes.

Thank you for reading.
