DataScience Classroomnotes 30/Mar/2022

Estimates of Variability

  • Location is just one dimension in summarizing a feature. A second dimension, variability (also referred to as dispersion), measures whether the data values are tightly clustered or spread out.
  • Variability lies at the heart of statistics:
    • measuring it
    • reducing it
    • distinguishing random from real variability
    • identifying various sources of real variability
    • making decisions in the presence of it
  • Refer Here for the calculations of variability estimates

  • Exercise: Calculate the standard deviation and the interquartile range (IQR) of the gpa column in the following dataset: Refer Here
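
As a rough illustration of the variability estimates mentioned above, the following sketch computes a few of them using only Python's standard library. The values are made up for illustration and are not taken from the course dataset. The "inclusive" quantile method is an assumption; other tools and methods can give slightly different quartiles on the same data.

```python
import statistics

# Hypothetical sample values, made up for illustration (not the course dataset)
values = [2.0, 2.5, 3.0, 3.5, 4.0]

mean = statistics.mean(values)

# Sample variance and sample standard deviation (both divide by n - 1)
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

# Mean absolute deviation: average distance of each value from the mean
mean_abs_dev = sum(abs(x - mean) for x in values) / len(values)

# Quartiles with linear interpolation that treats min and max as the 0th and
# 100th percentiles (assumed method; others exist)
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: spread of the middle 50% of the data
iqr = q3 - q1

print(f"variance={variance:.3f} std_dev={std_dev:.3f}")
print(f"mean_abs_dev={mean_abs_dev:.3f} iqr={iqr:.3f}")
```

The standard deviation is sensitive to extreme values because it squares each deviation, while the IQR ignores the tails entirely, so the two can disagree noticeably on skewed data.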

Exploring the Data Distribution

  • Each of the estimates we’ve covered sums up the data in a single number to describe its location or variability.
  • It is also useful to explore how the data is distributed overall.
  • Box Plots
  • Refer Here for the notebook.
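
To make the box plot idea concrete, here is a minimal sketch of the numbers a box plot typically draws. The function name `box_plot_stats` and the sample values are hypothetical, and the 1.5 × IQR whisker rule is an assumption (it is a common convention, but the notes do not specify one). This is not the code from the notebook.

```python
import statistics

def box_plot_stats(values):
    """Return the summary numbers a box plot is usually drawn from."""
    data = sorted(values)
    # Box edges and center line: first quartile, median, third quartile
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: whiskers reach the most extreme points
    # that lie within 1.5 * IQR of the box
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical sample with one extreme value
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
stats = box_plot_stats(sample)
print(stats)
```

In this sample the value 30 falls beyond the upper fence, so a box plot would draw it as a separate point rather than stretching the whisker to reach it.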

