Introduction to Linear Modeling
Regression Analysis
DISCLAIMER: The images, code snippets...etc presented in this presentation were collected, copied and borrowed from various internet sources, thanks & credit to the creators/owners
Agenda
-
Machine Learning
-
Regression Analysis
-
Simple Linear Regression
-
Ordinary Least Square Method
-
Example - 1
-
Coefficient of Determination (R-Square)
-
Example - 2
-
Sum of Squares Regression (SSR)
-
Total Sum of Squares (SST)
-
Sum of Square Errors (SSE)
- Correlation Coefficient
Machine Learning
(what it does)
Find patterns in the data
Use those patterns to predict
the future
Regression models are used for predicting a real value
Regression Analysis
Studying the relation between two or more variables using Regression technique
Ex: relationship between advertising expenditure and sales
Simple Linear Regression


Y-intercept
or
Slope
or
error term
simple - one independent variable & one dependent variable
formula
the variable we are trying to predict
the variable we use to predict the dependent variable


Ordinary Least Square Regression
import the data & split into training and test dataset
# Simple Linear Regression
# Importing the dataset
dataset = read.csv('Land_Price.csv')
cor(dataset$Price,dataset$Land)
# Splitting the dataset into the Training set and Test set
# install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$Price, SplitRatio = 2/3)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)Example - 1
Fitting the model & Predicting new data
# Fitting Simple Linear Regression to the Training set
regressor = lm(formula = Price ~ Land,
data = training_set)
# Predicting the Test set results
y_pred = predict(regressor, newdata = test_set)Visualising the Training & Test set results
# Visualising the Training set results
library(ggplot2)
ggplot() +
geom_point(aes(x = training_set$Land, y = training_set$Price),
colour = 'red') +
geom_line(aes(x = training_set$Land, y = predict(regressor, newdata = training_set)),
colour = 'blue') +
ggtitle('Price vs Land (Training set)') +
xlab('Land') +
ylab('Price')
# Visualising the Test set results
library(ggplot2)
ggplot() +
geom_point(aes(x = test_set$Land, y = test_set$Price),
colour = 'red') +
geom_line(aes(x = training_set$Land, y = predict(regressor, newdata = training_set)),
colour = 'blue') +
ggtitle('Price vs Land (Training set)') +
xlab('Land') +
ylab('Price')Example - 2















Other Sources for adv study
Other Models (SVM)
#svm
library(e1071)
# quick look at the data
plot(iris)
# feature importance
plot(iris$Sepal.Length, iris$Sepal.Width, col=iris$Species)
plot(iris$Petal.Length, iris$Petal.Width, col=iris$Species)
#split data
s <- sample(150, 100)
col <- c('Petal.Length','Petal.Width','Species')
iris_train <- iris[s,col]
iris_test <- iris[-s,col]
#create model
svmfit <- svm(Species ~ ., data = iris_train, kernel="linear", cost=.1, scale = FALSE)
print(svmfit)
plot(svmfit, iris_train[, col])
tuned <- tune(svm, Species~., data = iris_train, kernel='linear', ranges = list(cost=c(0.001, 0.01,.1, 1.10, 100)))
summary(tuned)
p <- predict(svmfit, iris_test[,col], type='class')
plot(p)
table(p, iris_test[,3])
mean(p==iris_test[,3])
Support Vector Machine (Classification)
Machine Learning - Linear Models
By sumendar karupakala
Machine Learning - Linear Models
Demonstration
- 78