The Logistic Regression is a regression algorithm used to estimate the probability that an example belongs to a particular class (e.g. benign or malignant tumors). It is simple: the algorithm refers to the estimated probability, if it is higher than 0.5, then the model predicts that the example belongs to the positive class (y = 1), or otherwise it belongs to the negative class (y = 0). That is the reason why Logistic Regression is a binary classifier. Let’s see how we can estimate the probabilities.

## Probabilities Estimation

Given the weighted sum of the input features plus a bias term, the Logistic Regression model estimates probabilities by computing the output value of the sigmoid function . Mathematically, considering n-dimensional input vectors (examples characterized by n features), we can define our hypothesis function:

$h_{x} = \sigma(\theta^T x)$

where    $\theta^T x = \theta_0 + \theta_1 x_1 + \hdots + \theta_n x_n$    and    $\sigma$  represents the sigmoid function, which is defined as    $\sigma(t) = \frac{1}{ 1 + \exp^{-t}}$ . The sigmoid function is also called logistic function, and it is greater than 0.5 only when the parameter t is positive.

Once the hypothesis function has estimated the probability that an example belongs to the positive class, the resulting prediction $y*$ is equal to 1, if $h(x)$ is greater than 0.5, 0 otherwise.

## The Cost Function

How do we train the Logistic Regression model? We can do it by using an optimization algorithm, called Gradient Descent, which estimates the parameters vector $\theta$ that leads to the lowest cost. The cost function over the whole training set is defined as the average cost over all training examples:

$J(\theta) = \frac{1}{m}\sum\limits_{i=1}^m [y^{(i)}log(y*^{(i)}) + (1-y^{(i)})log(1-y*^{(i)})]$

This cost function has the property of being convex, so the Gradient Descent is guaranteed to find the global minimum.

## Let’s code

You can easily practice with logistic regression using our sample code on Github or by developing it by yourself. In this tutorial you’re going to perform logistic regression on the Iris Dataset. Let’s start by importing the LogisticRegression class from Scikit-Learn and some other stuff.

Now it’s time to load up the dataset into a variable, and retrieve the training data and target labels.

You’re ready to train your logistic regression model. It can be done with just a single line of code.

In order to evaluate the trained model, we build an input example by generating 4 random input features and then we feed the classifier with it. If the example is classified as an Iris-Virginica, the output y* has to be equal to 1 (positive class), since we’re trying to answer to the question “Is this example an Iris-Virginica?”.

Full code is available on Github.