Logistic Regression

Dr. Ashish Tendulkar

Machine Learning Practice

IIT Madras

Introduction

Logistic Regression (also called Logit Regression) is commonly used to estimate the probability that an instance belongs to a particular class.

  • If the estimated probability is greater than 50%, then the model predicts that the instance belongs to that class (called the positive class, labeled “1”), or else it predicts that it does not (i.e., it belongs to the negative class, labeled “0”).
  • This makes it a binary classifier.
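The decision rule above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: it assumes the standard logistic (sigmoid) function as the probability estimate, and the names `sigmoid`, `predict_label`, and the score `z` are chosen here for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(z: float, threshold: float = 0.5) -> int:
    """Return 1 (positive class) if the estimated probability exceeds the threshold, else 0."""
    return 1 if sigmoid(z) > threshold else 0

# z stands for the linear score w.x + b of one instance; these values are illustrative only.
for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 3), predict_label(z))
```

A score of exactly 0 gives a probability of 0.5, which is not greater than 50%, so under this rule it falls in the negative class.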

Loading the dataset

Cleveland Heart-disease dataset

Attribute Information:

  • Age (in years)
  • Sex (1 = male; 0 = female)
  • cp - chest pain type
  • trestbps - resting blood pressure in mm Hg (anything above 130-140 is typically cause for concern)
  • chol - serum cholesterol in mg/dl (above 200 is cause for concern)
  • restecg - resting electrocardiographic results (0 = normal; 1 = having ST-T wave abnormality; 2 = showing probable or definite left ventricular hypertrophy by Estes' criteria)
  • thalach - maximum heart rate achieved

 

  • exang - exercise induced angina (1 = yes; 0 = no)
  • oldpeak - ST depression induced by exercise relative to rest
  • slope - slope of the peak exercise ST segment (1 = upsloping; 2 = flat; 3 = downsloping)
  • ca - number of major vessels (0-3) colored by fluoroscopy
  • thal - 3 = normal; 6 = fixed defect; 7 = reversible defect
  • num (target) - diagnosis of heart disease (angiographic disease status): 0 = less than 50% diameter narrowing; 1 = more than 50% diameter narrowing
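One possible way to load the data with pandas is sketched below. It rests on several assumptions that the slides do not state: that the file is the headerless, comma-separated UCI "processed.cleveland.data" layout, that missing values are written as "?", and that the column order is the one in `COLUMNS` (which includes `fbs`, a column not listed above). The helper name `load_cleveland` is hypothetical, and the two rows are illustrative input used only to show the call.

```python
import io
import pandas as pd

# Assumed column order of the headerless file; "fbs" is part of this assumption.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

def load_cleveland(source) -> pd.DataFrame:
    """Read the CSV, drop rows with missing values, and add a 0/1 target column."""
    df = pd.read_csv(source, header=None, names=COLUMNS, na_values="?")
    # Any value of num above 0 is treated as "disease present" (assumption).
    df = df.dropna().assign(target=lambda d: (d["num"] > 0).astype(int))
    return df.drop(columns="num")

# Illustrative rows in the assumed layout; replace with a file path in practice.
sample = io.StringIO(
    "63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0\n"
    "67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2\n"
)
df = load_cleveland(sample)
print(df[["age", "chol", "target"]])
```

Passing a file path instead of the `StringIO` object works the same way, since `pd.read_csv` accepts both.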

Visualizing dataset and features

Understanding the correlation between input features
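A correlation matrix is one common way to inspect how input features move together. The sketch below uses pandas `DataFrame.corr` on a tiny made-up frame that borrows three attribute names from the dataset; the values and the 0.8 cutoff are illustrative assumptions, not results from the Cleveland data.

```python
import pandas as pd

# Made-up values for three of the attributes; not taken from the real dataset.
df = pd.DataFrame({
    "age": [45, 52, 61, 38, 70],
    "trestbps": [120, 130, 145, 118, 160],
    "thalach": [170, 160, 140, 182, 125],
})

corr = df.corr()  # Pearson correlation is the default method
print(corr.round(2))

# Collect feature pairs whose absolute correlation exceeds an arbitrary cutoff.
cols = list(corr.columns)
strong = [(a, b, round(float(corr.loc[a, b]), 2))
          for i, a in enumerate(cols) for b in cols[i + 1:]
          if abs(corr.loc[a, b]) > 0.8]
print(strong)
```

Strongly correlated inputs carry overlapping information, which is worth knowing before interpreting the coefficients of a linear model such as logistic regression.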

Confusion Matrix

A confusion matrix is a summary of prediction results on a classification problem.
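As a small illustration, the sketch below builds a confusion matrix with scikit-learn's `confusion_matrix` from two made-up label lists (`y_true` and `y_pred` are illustrative, not model output from the lecture). For binary labels 0 and 1, scikit-learn puts true classes in rows and predicted classes in columns, so the flattened order is TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for eight instances.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = (int(v) for v in cm.ravel())
print(cm)
print("TN, FP, FN, TP =", tn, fp, fn, tp)  # 3, 1, 1, 3 for these lists

# Accuracy is the share of predictions on the diagonal.
accuracy = (tp + tn) / (tp + tn + fp + fn)
print("accuracy =", accuracy)  # 0.75
```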

Hyperparameter tuning with RandomizedSearchCV and GridSearchCV

RandomizedSearchCV
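A minimal sketch of a randomized search follows. It is not the lecture's code: the synthetic data from `make_classification`, the log-uniform range for `C`, and the settings `n_iter=10` and `cv=5` are all assumptions chosen to keep the example small and self-contained.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data, not the heart-disease dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Sample the inverse regularization strength C from a log-uniform range (assumed range).
param_distributions = {"C": loguniform(1e-3, 1e2)}

search = RandomizedSearchCV(
    LogisticRegression(max_iter=1000),
    param_distributions=param_distributions,
    n_iter=10,           # number of sampled candidates
    cv=5,                # 5-fold cross-validation per candidate
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Randomized search evaluates a fixed number of sampled candidates, so its cost is set by `n_iter` rather than by the size of the search space.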

GridSearchCV
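For comparison, a grid search tries every combination in an explicit grid. The sketch below is an assumption-laden illustration: the synthetic data, the choice of `C` and `class_weight` as the tuned parameters, and their candidate values are picked here and are not taken from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data, not the heart-disease dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# 4 values of C times 2 class_weight settings = 8 candidates (assumed grid).
param_grid = {
    "C": [0.01, 0.1, 1, 10],
    "class_weight": [None, "balanced"],
}

grid = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The number of fits grows as the product of the grid sizes times the number of folds, which is the main reason to prefer randomized search for larger spaces.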

With Pipeline
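One way to combine preprocessing and tuning is to wrap the scaler and the classifier in a scikit-learn `Pipeline` and search over it. The sketch below is hedged in the same way as the earlier ones: the synthetic data, the step names `scaler` and `clf`, and the grid for `C` are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data, not the heart-disease dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipe = Pipeline([
    ("scaler", StandardScaler()),                # refit on each training fold
    ("clf", LogisticRegression(max_iter=1000)),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)

test_accuracy = grid.score(X_test, y_test)
print(grid.best_params_, round(test_accuracy, 3))
```

Because the scaler sits inside the pipeline, its statistics are learned only from the training portion of each fold, which avoids leaking validation data into preprocessing.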

Logistic Regression

By Debajyoti Biswas
