DataScience Classroomnotes 27/Mar/2022

Visualizations Contd

  • Refer Here for the Jupyter notebook in which we explored matplotlib and seaborn visualizations

An Overview of Statistics

  • Statistics is all about working with data, be it processing, analyzing or drawing a conclusion from the data we have.
  • Statistics has two main goals
    • describing the data
    • drawing conclusions from it.
  • These two goals correspond to the two main categories of statistics
    • descriptive statistics:
      • Questions are asked about the general characteristics of a dataset (What is the average price? What are the minimum and maximum values?)
      • The answers to these questions give us a rough idea of what the dataset contains.
    • inferential statistics:
      • The goal is to go a step further: after gaining approximate insights from a given dataset, we'd like to use that information to make inferences about unknown data (for example, predictions for the future from observed data)
      • This is typically done with various statistical and machine learning models.
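
The difference between the two categories can be sketched in a few lines of Python. This is only an illustration: the price values, the month numbers and the helper name `fit_line` are made up for this example, and a simple least-squares line stands in for the "statistical and machine learning models" mentioned above.

```python
import statistics

# Hypothetical sample of observed prices (made-up values).
prices = [12.0, 15.5, 11.0, 18.0, 14.5]

# Descriptive statistics: summarize the data we already have.
summary = {
    "mean": statistics.mean(prices),
    "min": min(prices),
    "max": max(prices),
}
print("Descriptive summary:", summary)

# Inferential statistics: use observed data to say something about unseen data.
# Hypothetical observations: price recorded in months 1..5.
months = [1, 2, 3, 4, 5]
observed = [10.0, 12.0, 14.0, 16.0, 18.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line(months, observed)

# Predict the price for month 6, which is not in the observed data.
predicted_month_6 = slope * 6 + intercept
print("Predicted price for month 6:", predicted_month_6)
```

The first part only describes the sample at hand, while the second part fits a model to the observations and applies it to a month that was never observed.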

Types of Data in Statistics

  • There are two main types of data:

    • Categorical data
    • Numerical data
  • Summary of Categorical and Numerical data

| Features | Categorical data | Numerical data |
| --- | --- | --- |
| Characteristic | Discrete values | Continuous values |
| Ordinality | No | Yes |
| Models | Categorical/discrete probability distributions | Continuous probability distributions |
| Data Processing | One-hot encoding | Scaling and normalization |
| Descriptive Stats | Mode | Mean and standard deviation |
| Predictive Modeling | Classification | Regression |
| Visualization Techniques | Pie charts and bar graphs | Histograms, line graphs and scatter plots |
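
A few rows of the table above (descriptive stats and data processing) can be illustrated with a minimal sketch using only the Python standard library. The sample values and the helper names `one_hot` and `min_max_scale` are assumptions made for this example; in practice a library such as pandas or scikit-learn would usually handle these steps.

```python
import statistics

# Hypothetical categorical and numerical columns (made-up values).
colors = ["red", "blue", "red", "green", "red"]
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

# Descriptive stats: mode for categorical data,
# mean and standard deviation for numerical data.
most_common_color = statistics.mode(colors)
mean_height = statistics.mean(heights_cm)
std_height = statistics.stdev(heights_cm)  # sample standard deviation


def one_hot(values):
    """Encode each value as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(values):
    """Rescale values to the range [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


encoded_colors = one_hot(colors)          # columns ordered: blue, green, red
scaled_heights = min_max_scale(heights_cm)

print(most_common_color, mean_height, std_height)
print(encoded_colors)
print(scaled_heights)
```

One-hot encoding avoids implying an order between categories, while min-max scaling keeps the order and relative spacing of the numerical values.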
