## What Is a Model?

• The goal of a model is to provide a simple, low-dimensional summary of a dataset
• Building a model has two parts
• First, define a family of models that expresses a precise but general pattern you want to capture
• Second, generate a fitted model by finding the member of the family that is closest to your data
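
The two parts above can be sketched in Python. This is only a minimal illustration: it assumes the family is the set of straight lines y = a[0] + a[1] * x and uses NumPy's least-squares fit to pick the closest member. The data values and the names `model_family` and `fit_linear` are hypothetical, made up for this example.

```python
import numpy as np

def model_family(a, x):
    """Part 1: a family of models, y = a[0] + a[1] * x (one model per choice of a)."""
    return a[0] + a[1] * x

def fit_linear(x, y):
    """Part 2: pick the family member closest to the data (least squares)."""
    slope, intercept = np.polyfit(x, y, deg=1)  # coefficients come highest degree first
    return np.array([intercept, slope])

# Hypothetical data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

fitted = fit_linear(x, y)
print("intercept, slope:", fitted)
print("predictions:", model_family(fitted, x))
```

Every choice of `a` gives one model from the family; the fitted model is simply the one choice that lies closest to this particular data.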

## CAVEAT

• A fitted model is just the closest model from the family of models
• It is the “best” model only according to some criterion
• This does not imply that you have a good model
• Nor does it imply that the model is true
• The goal of a model is not to uncover truth, but to discover a simple approximation that is still useful
• “All models are wrong, but some are useful” – George Box

## Quantify Distance

• We need a way to quantify the distance between the data and a model
• One option: find the vertical distance between each data point and the model
• Prediction: the y values given by the model
• Response: the actual y values in the data
• Distance: the difference between prediction and response
• Overall distance: collapse all the individual distances into a single number
• Commonly used method: root mean squared deviation (square the distances, average them, and take the square root)
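
The distance calculation above can be written in a few lines of Python. This is a sketch under assumptions: the function name `rmsd` and the sample numbers are hypothetical and chosen only to show the steps.

```python
import numpy as np

def rmsd(prediction, response):
    """Root mean squared deviation between predicted and actual y values."""
    distance = np.asarray(prediction) - np.asarray(response)  # one distance per point
    return float(np.sqrt(np.mean(distance ** 2)))             # collapse to a single number

# Hypothetical values, made up for illustration only.
response = np.array([2.0, 4.0, 6.0])    # actual y values in the data
prediction = np.array([2.0, 5.0, 4.0])  # y values given by some model

print(rmsd(prediction, response))
```

Squaring makes every distance non-negative, so points above and below the model cannot cancel each other out, and the final square root brings the result back to the units of y.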

## Activity: Finding Best Fitted Model

• Here we use a linear regression model. The basic idea behind this activity is to understand what is meant by building and evaluating models
• In this activity we use simulated data
• We create around 250 models with different slopes and intercepts
• We find the best-fitting model by calculating the root mean squared deviation between each model and the actual data, then choosing the model with the smallest value
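
The activity can be sketched as follows. This is a hedged illustration, not the original activity's code: the true line (y = 4 + 2x), the noise level, the sample size, and the ranges for the random slopes and intercepts are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# Simulated data: a hypothetical true line y = 4 + 2x plus random noise.
x = rng.uniform(0, 10, size=30)
y = 4.0 + 2.0 * x + rng.normal(0, 1.5, size=30)

# 250 candidate models with random intercepts and slopes (ranges are assumptions).
intercepts = rng.uniform(-20, 40, size=250)
slopes = rng.uniform(-5, 5, size=250)

# Predictions of every model at every x: shape (250, 30) via broadcasting.
predictions = intercepts[:, None] + slopes[:, None] * x[None, :]

# Root mean squared deviation of each model against the actual responses.
distances = np.sqrt(np.mean((predictions - y[None, :]) ** 2, axis=1))

# The best-fitting candidate is the one with the smallest distance.
best = int(np.argmin(distances))
print(f"best of 250: intercept={intercepts[best]:.2f}, "
      f"slope={slopes[best]:.2f}, RMSD={distances[best]:.2f}")
```

The winner here is only the closest of the 250 random candidates, so it will usually be near, but not exactly at, the line a least-squares fit would return.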
