DataScience Classroomnotes 06/Feb/2022

Algorithmic Challenges

  • Overfitting the training data
  • The model is too complex relative to the amount and noisiness of the training data. It performs well on the training data but does not generalize well to new data
    • Simplify the model by selecting one with fewer parameters, by reducing the number of features in the training data, or by constraining the model
    • Gather more training data
    • Reduce noise in the training data
  • Underfitting the training data
  • The model is too simple relative to the underlying structure of the data. It does not perform well even on the training data
    • Select a more powerful model, with more parameters
    • Feed better features into the learning algorithm (feature engineering)
    • Reduce the constraints on the model
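
The two failure modes above can be seen on a small synthetic problem. The sketch below is only an illustration under stated assumptions: the data (noisy samples of sin(x)), the noise level, and the helper names `make_data`, `predict_mean`, `predict_knn` and `mse` are all made up for this example and are not from the class. It compares a model that is too simple (always predict the mean), one that is too flexible (1-nearest-neighbour, which memorizes the training set), and a more constrained one (5-nearest-neighbours).

```python
import math
import random


def make_data(n, offset, noise_sd, rng):
    """Noisy samples of y = sin(x) on an evenly spaced grid (synthetic data)."""
    xs = [0.15 * i + offset for i in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(train_x, train_y, x):
    """Too simple: ignores x and always predicts the training mean."""
    return sum(train_y) / len(train_y)


def predict_knn(train_x, train_y, x, k):
    """k-nearest-neighbours regression: average the k closest training targets."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(predict, train_x, train_y, xs, ys):
    """Mean squared error of `predict` on the points (xs, ys)."""
    errors = [(predict(train_x, train_y, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(40, 0.0, 0.3, rng)
test_x, test_y = make_data(40, 0.075, 0.3, rng)  # unseen points between the training ones

models = {
    "mean (too simple)": predict_mean,
    "1-NN (too flexible)": lambda tx, ty, x: predict_knn(tx, ty, x, k=1),
    "5-NN (constrained)": lambda tx, ty, x: predict_knn(tx, ty, x, k=5),
}

results = {}
for name, model in models.items():
    train_mse = mse(model, train_x, train_y, train_x, train_y)
    test_mse = mse(model, train_x, train_y, test_x, test_y)
    results[name] = (train_mse, test_mse)
    print(f"{name:22s} train MSE = {train_mse:.3f}  test MSE = {test_mse:.3f}")
```

The 1-NN model reproduces every training target exactly, so its training error is zero while its error on unseen points is not: that gap is the signature of overfitting. The mean predictor has a large error on both sets, which is what underfitting looks like. Raising k is one simple way of constraining the model.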


Steps in a Machine Learning Project

  • Frame the problem and look at the big picture
  • Get the data
  • Explore the data to gain insights
  • Prepare the data for ML algorithms
  • Train/explore many different models and short-list the best ones
  • Fine-tune your model
  • Present your solution
  • Launch, monitor, and maintain
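
A few of the steps above (get the data, explore it, prepare it, train several models and short-list the best) can be sketched end to end in a few lines. This is a toy outline, not a recommended pipeline: the data is synthetic, the 80/20 split is an arbitrary choice, and `fit_mean`, `fit_line` and `rmse` are hypothetical helpers written for this example.

```python
import random
import statistics

rng = random.Random(42)

# Get the data: synthetic (x, y) pairs where y is roughly linear in x (an assumption).
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]

# Explore the data: a quick look at summary statistics.
print("mean x:", round(statistics.mean(xs), 2), "mean y:", round(statistics.mean(ys), 2))

# Prepare the data: hold out 20% for validation.
split = int(0.8 * len(xs))
train_x, valid_x = xs[:split], xs[split:]
train_y, valid_y = ys[:split], ys[split:]


def fit_mean(x, y):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = statistics.mean(y)
    return lambda value: mean_y


def fit_line(x, y):
    """Least-squares straight line fitted in closed form."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return lambda value: slope * value + intercept


def rmse(model, x, y):
    """Root mean squared error of a fitted model on (x, y)."""
    return (sum((model(a) - b) ** 2 for a, b in zip(x, y)) / len(x)) ** 0.5


# Train several candidate models and short-list the one with the lowest validation error.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {name: rmse(fit(train_x, train_y), valid_x, valid_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print("validation RMSE:", {name: round(score, 2) for name, score in scores.items()})
print("short-listed model:", best)
```

Comparing candidates on held-out data rather than on the training data is what makes the short-list meaningful, since a model can look good on the data it was trained on and still generalize poorly.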


Tidymodels

  • The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles
  • Refer Here for the official tidymodels page
  • rsample: Provides the basic building blocks for creating and analyzing resamples of the data set
  • recipes: Methods for creating data encoding and preprocessing recipes
  • parsnip: Provides a tidy, unified interface to models that can be used to try a range of models without dealing with underlying package details
  • tune: Facilitates hyperparameter tuning for the tidymodels packages
  • yardstick: A package to estimate how well models are working using tidy data principles
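
To make the idea of resampling more concrete, here is a plain-Python sketch of k-fold cross-validation splits. It is not the rsample API and does not use tidymodels at all; it only illustrates the concept, borrowing rsample's terms "analysis" (rows used to fit) and "assessment" (rows used to evaluate). The function name `k_fold_indices` and its parameters are assumptions made for this example.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (analysis, assessment) index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled row indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, assessment in enumerate(folds):
        # Every fold except the i-th one is used for fitting.
        analysis = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield analysis, assessment


splits = list(k_fold_indices(n_rows=10, k=5))
for analysis, assessment in splits:
    print("fit on", sorted(analysis), "evaluate on", sorted(assessment))
```

Each row lands in exactly one assessment set, so every observation is used for evaluation once and for fitting k - 1 times.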

Anaconda Installation

  • Let's download Anaconda from Refer Here
  • Once the installation is finished, launch Anaconda Navigator
  • Select Environments
  • Now create a new environment
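
Once an environment exists, one way to confirm which interpreter and environment are in use is to ask Python itself. The snippet below is a small sketch: `describe_environment` is a hypothetical helper name, and the `CONDA_DEFAULT_ENV` variable is normally set by conda when an environment is activated, so it may be absent if Python is started some other way.

```python
import os
import sys


def describe_environment():
    """Collect basic facts about the Python interpreter that is currently running."""
    return {
        "python_version": sys.version.split()[0],
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Set by conda on activation; falls back to a placeholder outside conda.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(not set)"),
    }


env_info = describe_environment()
for key, value in env_info.items():
    print(f"{key:15s} {value}")
```

Running this from a terminal or notebook started inside the new environment should show that environment's name and a `prefix` path pointing into it.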

