DataScience Classroomnotes 08/Jan/2022

Stringr Continued

  • Refer Here for the changes made, where we used the replace and split functions of stringr

Factors with forcats

  • In R, factors are used to work with categorical variables.
  • Prerequisites
library(tidyverse)
install.packages("forcats")  # the package name must be a quoted string
library(forcats)
  • Sorting ordinal categorical variables
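    One way this could look (a sketch, not necessarily the code used in class). The months vector is made-up sample data; month.abb is a constant built into base R.

```r
# Hypothetical sample data: month abbreviations stored as plain strings.
months <- c("Dec", "Apr", "Jan", "Mar")

# Sorting a character vector is alphabetical, which is not calendar order.
sort(months)

# Declare the valid levels in their natural (calendar) order.
months_fct <- factor(months, levels = month.abb)

# Sorting the factor follows the level order: Jan, Mar, Apr, Dec.
sort(months_fct)
```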
  • Parsing vectors into factors in the case of invalid values
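    A sketch of the difference between base factor() and readr::parse_factor() when a value is not among the declared levels. The values vector and its typo "Jam" are made up for illustration, and using readr here is an assumption about what the class showed.

```r
library(readr)

# Hypothetical sample data; "Jam" is a deliberate typo.
values <- c("Dec", "Apr", "Jam", "Mar")

# Base factor() silently turns values outside the levels into NA.
silent <- factor(values, levels = month.abb)
is.na(silent)

# parse_factor() also yields NA for the bad value, but raises a warning.
checked <- parse_factor(values, levels = month.abb)
is.na(checked)
```

    The warning from parse_factor() makes a typo visible early, whereas the NA produced by factor() is easy to miss.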
  • For the next set of examples, let's use the General Social Survey dataset, forcats::gss_cat
View(forcats::gss_cat)
glimpse(forcats::gss_cat)
  • Let's count the different races in gss_cat
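    A minimal sketch using dplyr::count(); the name race_counts is only for illustration.

```r
library(dplyr)
library(forcats)

# One row per race that actually occurs in the data, with its frequency in n.
race_counts <- gss_cat %>%
  count(race)
race_counts

# The factor may declare more levels than appear in the counts above.
levels(gss_cat$race)
```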
  • Summary by religion
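    One possible version of this summary (a sketch; the choice of age and tvhours as the summarised columns is an assumption). fct_reorder() reorders the relig levels by tvhours so the plot is easier to read.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Average age and TV hours per religion, plus the group size.
relig_summary <- gss_cat %>%
  group_by(relig) %>%
  summarise(
    age = mean(age, na.rm = TRUE),
    tvhours = mean(tvhours, na.rm = TRUE),
    n = n()
  )

# Order religions by average TV hours instead of the arbitrary level order.
relig_plot <- ggplot(relig_summary, aes(tvhours, fct_reorder(relig, tvhours))) +
  geom_point()
# print(relig_plot) draws the chart
```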
  • Let's summarize by income
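    A sketch of an income summary. Income brackets already have a meaningful order, so reordering them by another variable would scramble it; instead fct_relevel() is used here to move only the "Not applicable" level to the front. Summarising age is an assumption.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Average age per reported income bracket.
rincome_summary <- gss_cat %>%
  group_by(rincome) %>%
  summarise(age = mean(age, na.rm = TRUE), n = n()) %>%
  # Keep the bracket order, but pull the special level to the front.
  mutate(rincome = fct_relevel(rincome, "Not applicable"))

rincome_plot <- ggplot(rincome_summary, aes(age, rincome)) +
  geom_point()
# print(rincome_plot) draws the chart
```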
  • Let's summarize by age
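    A sketch of one way to summarize by age: the share of each marital status at every age. Breaking the summary down by marital is an assumption. fct_reorder2() orders the legend by the prop values at the largest age, so the legend lines up with the ends of the lines.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Proportion of each marital status within every age.
by_age <- gss_cat %>%
  filter(!is.na(age)) %>%
  count(age, marital) %>%
  group_by(age) %>%
  mutate(prop = n / sum(n))

age_plot <- ggplot(by_age, aes(age, prop, colour = fct_reorder2(marital, age, prop))) +
  geom_line() +
  labs(colour = "marital")
# print(age_plot) draws the chart
```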

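  • Explore the R statements below and compare the two bar charts they draw

# Order the marital levels by frequency with fct_infreq(), then reverse with
# fct_rev() so the bars run from least to most common.
gss_cat %>%
  mutate(marital = marital %>% fct_infreq() %>% fct_rev()) %>%
  ggplot(aes(marital)) +
  geom_bar()

# For comparison: this mutate() leaves marital unchanged, so the bars keep the
# factor's original level order.
gss_cat %>%
  mutate(marital = marital) %>%
  ggplot(aes(marital)) +
  geom_bar()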
  • Recoding factors
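    A sketch of recoding with fct_recode(), which renames levels using new = old pairs. Using the partyid column and these particular new labels is an assumption; levels that are not mentioned are left as they are.

```r
library(dplyr)
library(forcats)

# Rename selected partyid levels; the new name goes on the left.
recoded <- gss_cat %>%
  mutate(partyid = fct_recode(partyid,
    "Republican, strong"    = "Strong republican",
    "Republican, weak"      = "Not str republican",
    "Independent, near rep" = "Ind,near rep",
    "Independent, near dem" = "Ind,near dem",
    "Democrat, weak"        = "Not str democrat",
    "Democrat, strong"      = "Strong democrat"
  ))

recoded %>% count(partyid)
```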
  • Collapsing multiple categories into one
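    A sketch of collapsing with fct_collapse(), which maps a vector of old levels onto each new level. The partyid column and the group names other, rep, ind and dem are assumptions chosen for illustration.

```r
library(dplyr)
library(forcats)

# Merge the ten partyid levels into four broader groups.
collapsed <- gss_cat %>%
  mutate(partyid = fct_collapse(partyid,
    other = c("No answer", "Don't know", "Other party"),
    rep   = c("Strong republican", "Not str republican"),
    ind   = c("Ind,near rep", "Independent", "Ind,near dem"),
    dem   = c("Not str democrat", "Strong democrat")
  ))

collapsed %>% count(partyid)
```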

About continuous learner

devops & cloud enthusiastic learner