Elements of Structured Data
- The most common form of structured data is a table with rows and columns.
- There are two basic types of structured data:
- Numerical: Comes in two forms
- Continuous
- Discrete
- Categorical: Takes only a fixed set of values:
- Examples:
- Types of TV screens (plasma, LED, LCD, etc.)
- State Names (Telangana, Andhra Pradesh, Tamil Nadu, Karnataka, Kerala)
- Binary data is an important special case of categorical data that takes one of two values (0/1, yes/no, true/false)
- Another form of categorical data is ordinal data, i.e. categories that are ordered
- Example: Ratings (1/5, 2/5, 3/5, 4/5, 5/5)
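The data types above can be sketched in pandas as follows. This is a minimal illustration, not a prescribed approach: the DataFrame, its column names, and its values are all made up for the example.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "price": [499.99, 649.50, 320.00],        # numerical, continuous
    "units_sold": [12, 7, 30],                # numerical, discrete
    "screen_type": ["LED", "LCD", "plasma"],  # categorical
    "in_stock": [True, False, True],          # binary
    "rating": [4, 2, 5],                      # ordinal (1 to 5)
})

# Nominal categorical: a fixed set of values with no inherent order.
df["screen_type"] = pd.Categorical(
    df["screen_type"], categories=["plasma", "LED", "LCD"]
)

# Ordinal categorical: the categories carry an explicit order.
df["rating"] = pd.Categorical(
    df["rating"], categories=[1, 2, 3, 4, 5], ordered=True
)

print(df.dtypes)
# Order-aware operations such as min/max only make sense for ordinal data.
print(df["rating"].min(), df["rating"].max())
```

Marking `rating` as ordered lets pandas compare and sort the categories, which it refuses to do for an unordered column such as `screen_type`.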
Rectangular Data
- This is the general term for a two-dimensional matrix with rows indicating records (cases) and columns indicating features (variables).
- The data frame is the format we generally use for this in Python (pandas) and R
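As a small sketch of rectangular data, the snippet below builds a pandas DataFrame in which each row is a record and each column is a feature. The records themselves are invented for illustration.

```python
import pandas as pd

# Each dict is one record (case); each key is one feature (variable).
# The values are made up for illustration.
records = [
    {"state": "Telangana", "screen_type": "LED", "rating": 4},
    {"state": "Kerala", "screen_type": "LCD", "rating": 5},
    {"state": "Karnataka", "screen_type": "plasma", "rating": 3},
]
frame = pd.DataFrame(records)

# shape is (number of records, number of features).
n_records, n_features = frame.shape
print(n_records, n_features)

print(list(frame.columns))  # the features
print(frame.iloc[0])        # the first record
```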
Estimates of Location
- Variables with measured or count data might have thousands of distinct values.
- A basic step in exploring your data is getting a typical value for each feature: an estimate of where most of the data is located (i.e. its central tendency)
- Exercise: Take a dataset from Kaggle (Refer Here) and calculate the mean, the median, the weighted mean of total with ratings as the weight, and the trimmed mean with any trim percentage.
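Before trying the exercise on a real dataset, the four estimates can be sketched on toy data. Everything here is an assumption made for illustration: the `total` and `rating` columns mirror the exercise wording but hold invented values, and `trimmed_mean` is a hypothetical helper written for this example, not a library function.

```python
import numpy as np
import pandas as pd

# Invented values; 200.0 is a deliberate outlier.
df = pd.DataFrame({
    "total": [10.0, 12.0, 11.0, 13.0, 200.0],
    "rating": [5, 4, 4, 3, 1],
})

def trimmed_mean(values, trim_fraction):
    """Hypothetical helper: drop trim_fraction of the sorted values
    from each end, then average what is left."""
    ordered = np.sort(np.asarray(values, dtype=float))
    k = int(trim_fraction * len(ordered))  # values dropped per end
    if 2 * k >= len(ordered):
        raise ValueError("trim_fraction removes every value")
    return ordered[k:len(ordered) - k].mean()

mean = df["total"].mean()
median = df["total"].median()
# Weighted mean: sum(weight * value) / sum(weight), rating as the weight.
weighted_mean = np.average(df["total"], weights=df["rating"])
# Trim 20% from each end, i.e. one value per end for five records.
trimmed = trimmed_mean(df["total"], 0.2)

print(f"mean={mean:.2f} median={median:.2f} "
      f"weighted={weighted_mean:.2f} trimmed={trimmed:.2f}")
```

On this toy data the outlier pulls the plain mean well above the median, while the trimmed mean stays close to the median because the extreme value is dropped before averaging.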