## Linear Regression

- A simple yet useful supervised learning approach for predicting a quantitative (numeric) response
- Makes a prediction by computing a weighted sum of the input features, plus a constant called the bias term (intercept).
- Finding parameters:
  - Choose the parameters so that the predictions are close to the actual values for the training samples
  - Define a cost function, typically MSE (mean squared error), and find the parameters that minimize it
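Written out, the prediction and the cost function look as follows. The symbols are assumptions chosen for illustration: $\theta_0$ is the bias term, $\theta_1, \dots, \theta_n$ are the feature weights, and $m$ is the number of training samples.

$$
\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n
$$

$$
\mathrm{MSE}(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2
$$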

- Some methods: Least Squares Method, Gradient Descent
- Main steps:
  - Use least squares to fit a line to the data
  - Calculate R-squared
  - Calculate the p-value

- Terminology:
  - R-squared: a goodness-of-fit measure for linear regression models (the proportion of variance in the response that the model explains)
  - Null hypothesis: an initial statement claiming that there is no relationship between two measured events
  - P-value: used to test the null hypothesis; it is the probability of seeing an association at least as strong as the observed one if the null hypothesis were true
    - Low p-value (`< 0.05`): the null hypothesis can be rejected
      - The predictor is likely a meaningful addition to your model
      - Changes in the predictor's values are related to changes in the response variable
    - Large p-value: suggests there is not enough evidence that the predictor is associated with changes in the response
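The main steps listed above (least-squares fit, R-squared, p-value) can be sketched in base R. This is a minimal illustration on simulated data; the true relationship `y = 2 + 3x`, the noise level, and the names `sim` and `fit` are all assumptions made for the example.

```r
# Simulated data (assumption): y depends linearly on x, plus random noise
set.seed(42)
x <- runif(100, min = 0, max = 10)
y <- 2 + 3 * x + rnorm(100, sd = 1)
sim <- data.frame(x = x, y = y)

# Step 1: fit a line to the data with least squares
fit <- lm(y ~ x, data = sim)

# Step 2: R-squared, the share of variance in y explained by the model
fit_summary <- summary(fit)
r_squared <- fit_summary$r.squared

# Step 3: p-value for the slope (tests the null hypothesis "slope = 0")
p_value <- fit_summary$coefficients["x", "Pr(>|t|)"]

# The cost that least squares minimizes: MSE on the training data
mse <- mean(residuals(fit)^2)

print(c(r_squared = r_squared, p_value = p_value, mse = mse))
```

A p-value below `0.05` for `x` here would mean the null hypothesis of no relationship can be rejected for this simulated predictor.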

## Tidymodels steps

- Split data ({rsample})
- Prepare recipe ({recipes})
- Specify model ({parsnip})
- Tune hyperparameters ({tune})
- Fit model ({parsnip})
- Analyze model ({broom})
- Predict ({parsnip})
- Interpret the results ({yardstick})
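As a rough sketch of how these steps fit together, the example below runs them on the built-in `mtcars` data. The dataset, the formula `mpg ~ wt + hp`, the 75/25 split, and the object names are assumptions for illustration. It also uses `workflow()` from {workflows}, which is not in the list above, to bundle the recipe and the model.

```r
library(tidymodels)

set.seed(123)

# Split data ({rsample})
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Prepare recipe ({recipes}): centre and scale the numeric predictors
car_recipe <- recipe(mpg ~ wt + hp, data = car_train) %>%
  step_normalize(all_numeric_predictors())

# Specify model ({parsnip})
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Tune hyperparameters ({tune}): skipped here, because a plain linear
# regression with the "lm" engine has no hyperparameters to tune

# Fit model: bundle recipe and model in a workflow, then fit on training data
car_fit <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec) %>%
  fit(data = car_train)

# Analyze model ({broom}): coefficient table from the underlying lm object
coef_table <- tidy(extract_fit_engine(car_fit))

# Predict: returns a tibble with a .pred column
car_pred <- predict(car_fit, new_data = car_test) %>%
  bind_cols(car_test)

# Interpret the results ({yardstick}): RMSE, R-squared and MAE on the test set
car_metrics <- metrics(car_pred, truth = mpg, estimate = .pred)
print(car_metrics)
```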

## Linear Regression with the diamonds dataset

- Let's build a price predictor

- Now let's find the columns that are highly correlated with price
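One possible way to do this, assuming the `diamonds` data comes from the {ggplot2} package, is to compute the correlation of every numeric column with `price`. The variable names are illustrative.

```r
library(ggplot2)  # provides the diamonds dataset

# Keep only the numeric columns; cut, color and clarity are ordered factors
numeric_cols <- diamonds[sapply(diamonds, is.numeric)]

# Correlation of each numeric column with price, strongest first
price_cor <- cor(numeric_cols)[, "price"]
price_cor <- sort(price_cor[names(price_cor) != "price"], decreasing = TRUE)

print(round(price_cor, 2))
```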

- Now let's split the data into training and testing sets
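A minimal sketch of the split using {rsample}. The 80/20 proportion and the seed are assumptions; the original split settings are not stated.

```r
library(ggplot2)  # provides the diamonds dataset
library(rsample)

set.seed(2024)  # arbitrary seed so the split is reproducible

# Hold out 20% of the rows for testing (assumed proportion)
diamonds_split <- initial_split(diamonds, prop = 0.8)
diamonds_train <- training(diamonds_split)
diamonds_test  <- testing(diamonds_split)

print(c(train = nrow(diamonds_train), test = nrow(diamonds_test)))
```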

- Now let's use `lm` to create the model for the training data

- The broom package provides `broom::tidy(model)`, which summarizes a model as an easy-to-read data frame
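Fitting and summarizing the model might look like the sketch below. The formula `price ~ carat + x + y + z` is an assumption based on the highly correlated columns; the split settings are likewise assumed.

```r
library(ggplot2)  # provides the diamonds dataset
library(rsample)
library(broom)

set.seed(2024)
diamonds_split <- initial_split(diamonds, prop = 0.8)
diamonds_train <- training(diamonds_split)

# Fit a least-squares model on the training data (assumed formula)
model <- lm(price ~ carat + x + y + z, data = diamonds_train)

# One row per coefficient: estimate, std.error, statistic, p.value
coefs <- tidy(model)
print(coefs)

# One-row model summary, including r.squared
model_stats <- glance(model)
print(model_stats$r.squared)
```

`tidy()` returns a regular data frame, so the coefficient table can be filtered or sorted like any other data.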

- From the work so far, carat, x, y, and z can be used to predict the price of a diamond. Let's fit a model with all the variables and look at the results: y and z become insignificant once all variables are considered
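To check this, a model with every column as a predictor can be fitted and the p-values for `y` and `z` inspected. This is a sketch under the same assumed 80/20 split; the exact p-values depend on which rows land in the training set.

```r
library(ggplot2)  # provides the diamonds dataset
library(rsample)
library(broom)

set.seed(2024)
diamonds_split <- initial_split(diamonds, prop = 0.8)
diamonds_train <- training(diamonds_split)

# "price ~ ." uses every other column as a predictor
model_all <- lm(price ~ ., data = diamonds_train)
coefs_all <- tidy(model_all)

# Look at the rows for y and z; a p.value above 0.05 marks them insignificant
yz_rows <- coefs_all[coefs_all$term %in% c("y", "z"), ]
print(yz_rows)

# Overall fit of the full model
r_squared_all <- glance(model_all)$r.squared
print(r_squared_all)
```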