Algorithmic Challenges
- Overfitting the training data
- The model is too complex relative to the amount and noisiness of the training data. It performs well on the training data but does not generalize well to new data
- Simplify the model by selecting one with fewer parameters, by reducing the number of features in the training data, or by constraining the model
- Gather more training data
- Reduce noise in the training data
- Underfitting the training data
- The model is too simple relative to the underlying structure of the data. It does not perform well even on the training data
- Select a more powerful model, with more parameters
- Feed better features into the learning algorithm (feature engineering)
- Reduce the constraints on the model
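The two failure modes above can be seen in a small experiment. The following is a minimal sketch, not a recommended method: it assumes synthetic data (noisy samples of sin(x)) and uses a hand-written k-nearest-neighbours regressor, where `k` stands in for the amount of constraint on the model. The names `make_data`, `knn_predict`, and `mse` are illustrative helpers, not part of any library.

```python
import math
import random


def make_data(n, noise, rng, offset=0.0):
    """Noisy samples of sin(x) on an evenly spaced grid (synthetic, for illustration)."""
    xs = [offset + 2 * math.pi * i / n for i in range(n)]
    ys = [math.sin(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(40, 0.3, rng)
# Test points sit between the training points and carry their own noise.
test_x, test_y = make_data(40, 0.3, rng, offset=0.05)

results = {}
# k=1: least constrained (memorizes the training set, prone to overfitting).
# k=len(train_x): most constrained (always predicts the global mean, underfits).
for k in (1, 5, len(train_x)):
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_error, test_error)
    print(f"k={k:2d}  train MSE={train_error:.3f}  test MSE={test_error:.3f}")
```

With `k=1` the model reproduces every training point, so its training error says nothing about how it generalizes. With `k` equal to the training-set size, the model ignores the input entirely and performs poorly even on the training data. Comparing the training and test columns for each `k` is the usual way to tell the two problems apart.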
ML Project Development Flow
- Frame the problem and look at the big picture
- Get the data
- Explore the data to gain insights
- Prepare the data for ML algorithms
- Train/explore many different models and short-list the best ones
- Fine-tune your model
- Present your solution
- Launch, monitor, and maintain
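The middle steps of this flow (get the data, prepare it, train several models, short-list the best one) can be sketched in a few lines. This is a toy illustration under stated assumptions: the data is synthetic (roughly y = 3x + 2 plus noise), the two candidate models are a mean baseline and a least-squares line, and the helper names `fit_mean`, `fit_line`, and `mse` are hypothetical. A real project would use real data and a modeling library.

```python
import random

# Get the data (synthetic here, as an assumption for the sketch).
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 3 * x + 2 + rng.gauss(0, 1)))

# Prepare the data: shuffle, then split into train / validation / test.
rng.shuffle(data)
train, valid, test = data[:120], data[120:160], data[160:]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def fit_mean(rows):
    """Baseline model: always predict the mean target."""
    mean_y = mean(y for _, y in rows)
    return lambda x: mean_y


def fit_line(rows):
    """Simple least-squares line fitted in closed form."""
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x, _ in rows)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model, rows):
    """Mean squared error of a fitted model on (x, y) rows."""
    return mean((model(x) - y) ** 2 for x, y in rows)


# Train several models and short-list the best one on validation data.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {name: mse(fit(train), valid) for name, fit in candidates.items()}
best_name = min(scores, key=scores.get)

# Refit the chosen model on train + validation, then report held-out test error.
final_model = candidates[best_name](train + valid)
test_error = mse(final_model, test)
print("validation MSE per model:", scores)
print("selected:", best_name, "| test MSE:", round(test_error, 3))
```

The test split is touched only once, after the model has been chosen, so the reported error is not biased by the selection step.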
TidyModels
- The tidymodels framework is a collection of packages for modeling and machine learning using tidyverse principles
- Refer Here for the official tidymodels page
- rsample: Provides the basic building blocks for creating and analyzing resamples of the data set
- recipes: Methods for creating data encoding and preprocessing recipes
- parsnip: Provides a tidy, unified interface to models that can be used to try a range of models without dealing with underlying package details
- tune: Facilitates hyperparameter tuning for the tidymodels packages
- yardstick: A package to estimate how well models are working using tidy data principles
Anaconda Installation
- Let's download Anaconda from Refer Here
- Once the installation is finished, launch Anaconda Navigator
- Select Environments
- Now create a new environment