Data Science Classroom Series – 26/Oct/2021

How to Visualize Numerical Variables

  • To visualize numerical variables, we use the following:
    • Frequency Distribution Tables
    • Histograms
  • Consider the following dataset, in which each numerical value appears only once
  • It makes more sense to group the numerical data into intervals
  • Statisticians generally prefer 5 to 20 intervals, but the right number depends on the volume of the data
  • The formula for the interval width is:
interval width = (largest number - smallest number) / number of desired intervals
  • We can then round the interval width to a convenient value
  • A number is included in an interval if it is:
    • greater than the lower bound, and
    • less than or equal to the upper bound
  • The frequency distribution table will be as shown below
  • Histograms:
    • Histograms are one of the most common ways to represent numerical data.
    • Each bar is as wide as its interval, and its height shows the frequency of that interval.
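
The interval-width formula and the inclusion rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `frequency_table`, the sample `values`, and the choice to round the width up to a whole number are all assumptions made for this example. The sketch also lets the first interval include its own lower bound so that the smallest value is counted.

```python
import math

def frequency_table(values, desired_intervals):
    """Group values into equal-width intervals and count each one."""
    smallest, largest = min(values), max(values)
    # interval width = (largest - smallest) / number of desired intervals,
    # rounded up to a whole number for convenience (assumption of this sketch)
    width = math.ceil((largest - smallest) / desired_intervals)
    table = []
    lower = smallest
    for index in range(desired_intervals):
        upper = lower + width
        count = 0
        for v in values:
            # Rule from the text: lower bound < value <= upper bound.
            # Assumption: the first interval also includes its lower bound.
            if lower < v <= upper or (index == 0 and v == lower):
                count += 1
        table.append((lower, upper, count))
        lower = upper
    return table

# Made-up sample data, for illustration only
values = [1, 9, 22, 24, 32, 41, 44, 48, 57, 66,
          70, 73, 75, 76, 79, 82, 87, 89, 95, 100]
table = frequency_table(values, 5)
for lower, upper, count in table:
    print(f"{lower} to {upper}: {count}")
```

Here the raw width is (100 - 1) / 5 = 19.8, which is rounded up to 20, so every value falls into exactly one interval.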

Exercise: Build the frequency distribution table and the histogram for the following data

  • Refer Here for the Excel file with the dataset

  • Frequency Distribution Table:

  • Histogram:
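
Once a frequency distribution table exists, a rough histogram can be drawn even without a charting tool. The sketch below prints one text bar per interval. The function name `text_histogram` and the rows in `sample_table` are hypothetical; they are not the dataset from the Excel file. A charting tool would instead draw adjacent vertical bars, each as wide as its interval.

```python
def text_histogram(table, symbol="#"):
    """Render (lower, upper, count) rows as one text bar per interval."""
    lines = []
    for lower, upper, count in table:
        label = f"{lower:>3} - {upper:<3}"
        # The bar length equals the frequency of the interval
        lines.append(f"{label} | {symbol * count} ({count})")
    return lines

# Hypothetical frequency table: (lower bound, upper bound, frequency)
sample_table = [(1, 21, 2), (21, 41, 4), (41, 61, 3), (61, 81, 6), (81, 101, 5)]
lines = text_histogram(sample_table)
for line in lines:
    print(line)
```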

Two Variable Relationships

  • Until now we have dealt with single variables (categorical and numerical)
  • Now let's focus on how to visualize the relationship between two variables
  • Let's start with categorical variables:
    • Cross Tables (Contingency tables):
      • These are used to represent the relationship between two categorical variables.
      • One set of categories labels the rows and the other set labels the columns
    • Side-by-Side Bar Chart: A common way to represent the data from a cross table is a side-by-side bar chart
    • Refer Here for the Excel sheet
  • When we want to represent the relationship between two numerical variables in the same graph, we usually use a scatter plot
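
A cross table can be built by counting how often each pair of categories occurs. The following is a minimal sketch using only the Python standard library. The function name `cross_table` and the `observations` data are made up for illustration. Each row of the result corresponds to one group of bars in a side-by-side bar chart.

```python
from collections import Counter

def cross_table(pairs):
    """Count (row category, column category) pairs into a cross table."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # A Counter returns 0 for combinations that never occur
    cells = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, cells

# Hypothetical observations: (region, product)
observations = [
    ("North", "Tea"), ("North", "Coffee"), ("South", "Tea"),
    ("North", "Tea"), ("South", "Coffee"), ("South", "Coffee"),
    ("South", "Tea"),
]
rows, cols, cells = cross_table(observations)
print("Columns:", cols)
for name, row in zip(rows, cells):
    print(name, row)
```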
