Practical Regression Problem
Let's run a regression analysis on the following dataset: Refer Here
Let's check for linearity between each predictor and price, first on the raw scale and then on the log scale:
- Mileage vs price
- Engine vs price
- Year vs price
- Mileage vs log(price)
- Engine vs log(price)
- Year vs log(price)
Since the categorical variable Brand has 7 distinct categories, we create six dummy variables (rule: n-1 dummies for n categories).
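The n-1 rule can be sketched in a few lines of plain Python. The brand labels below are placeholders, since this post does not list the actual seven brands, and treating the first brand alphabetically as the reference category is an assumption made for the example.

```python
# Placeholder brand labels (hypothetical; the real brand names are not listed here).
brands = ["A", "B", "C", "D", "E", "F", "G"]

# One brand acts as the reference category and gets no dummy column.
baseline, *others = sorted(brands)

def encode(brand):
    """Return the six 0/1 dummy values for one brand."""
    return [1 if brand == other else 0 for other in others]

print("reference category:", baseline)
print("dummy columns:", others)
for b in ["C", "A", "G"]:
    print(b, encode(b))
```

The reference brand is encoded as all zeros, so each dummy coefficient is read as a difference from that brand. With pandas, `pandas.get_dummies(df["Brand"], drop_first=True)` produces the same kind of encoding, assuming the column is named `Brand`.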
Let's also create one more variable, log(mileage), just for understanding.
Regression analysis of:
- log(price) and log(mileage)
- log(price) and year
- log(price) and engine
- log(price) and log(mileage), year and engine
- log(price) and log(mileage), year, engine and the categorical dummies (brands)
Let's do some more analysis
- For a 1% increase in mileage, the price decreases by 0.11%
- For each unit (litre) increase in engine volume, log price increases by 0.14
- Equivalently, each extra litre of engine volume raises price by about 15%, since e^0.14 ≈ 1.15
- When year increases by 1, log price increases by 0.04, so price increases by about 4% (e^0.04 ≈ 1.04)
- Solved sheet: Refer Here
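The conversions from a log-scale coefficient to a percentage change in price can be checked with a short calculation. The coefficients 0.14 and 0.04 are the ones quoted above; the mileage coefficient of -0.11 is inferred from the statement that a 1% increase in mileage lowers price by 0.11%. The function names are made up for this sketch.

```python
import math

def pct_change_log_level(beta, delta=1.0):
    """Percent change in price when x rises by `delta` units,
    for a model of the form log(price) = a + beta * x."""
    return (math.exp(beta * delta) - 1) * 100

def pct_change_log_log(beta, pct_increase=1.0):
    """Percent change in price when x rises by `pct_increase` percent,
    for a model of the form log(price) = a + beta * log(x)."""
    return ((1 + pct_increase / 100) ** beta - 1) * 100

print(round(pct_change_log_level(0.14), 1))  # engine volume, +1 litre -> 15.0
print(round(pct_change_log_level(0.04), 1))  # year, +1 year           -> 4.1
print(round(pct_change_log_log(-0.11), 2))   # mileage, +1%            -> -0.11
```

For small coefficients such as 0.04, the percentage change is close to 100 times the coefficient. For larger ones such as 0.14, the exponential gives a noticeably higher figure (15% rather than 14%).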