Data Science Classroom Notes – 17/Dec/2021

Reading Data into R

  • Reading CSV files:

    • A common way to read data from CSV files is read.table. Other functions such as read.csv are wrappers around read.table, with the sep argument preset to a comma and header preset to TRUE.
    • The result of read.table is a data.frame.
    • Download Employees.csv into your workspace.
    • Try to read it using both read.table and read.csv.
    • Different functions to read the data, with their default separator (sep) and decimal mark (dec):
    Function       sep                dec
    read.table     "" (whitespace)    .
    read.csv       ,                  .
    read.csv2      ;                  ,
    read.delim     \t                 .
    read.delim2    \t                 ,
    • The utils package provides a family of functions for reading text files
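    The points above can be tried without downloading anything. The following is a minimal sketch, assuming a small made-up employee table written to a temporary file; the column names and values are invented for illustration and are not the contents of Employees.csv.

```r
# Write a small, made-up employee table to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,salary",
             "1,Asha,5200.50",
             "2,Ravi,4100.75"), csv_path)

# read.table needs header and sep spelled out for a CSV file.
emp_table <- read.table(csv_path, header = TRUE, sep = ",")

# read.csv presets header = TRUE and sep = ",".
emp_csv <- read.csv(csv_path)

# Both calls return a data.frame.
print(class(emp_table))
print(emp_csv)

# read.csv2 expects ";" as the separator and "," as the decimal mark.
csv2_path <- tempfile(fileext = ".csv")
writeLines(c("id;name;salary",
             "1;Asha;5200,50",
             "2;Ravi;4100,75"), csv2_path)
emp_csv2 <- read.csv2(csv2_path)
print(emp_csv2$salary)
```

    To read the real file, replace csv_path with the path to Employees.csv in your workspace.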
  • Reading Excel Data

    • Excel is one of the most widely used tools for storing data.
    • To read Excel data we need a package called readxl.
    install.packages("readxl")
    library(readxl)
    
    • Now we can list the sheets of an Excel file with excel_sheets and read a sheet with read_excel.
    • Let's use the Excel Multisheet dataset Refer Here
    • Read the data from each sheet by passing the sheet name or position to read_excel.
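    A minimal sketch of the two functions follows. It assumes readxl is installed and uses the datasets.xlsx workbook bundled with the package (located through readxl_example) as a stand-in for the Excel Multisheet dataset, which is not reproduced here.

```r
library(readxl)

# readxl_example() returns the path to a workbook bundled with readxl.
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook.
sheets <- excel_sheets(xlsx_path)
print(sheets)

# Read one sheet by name; sheet also accepts a position such as 1.
mtcars_sheet <- read_excel(xlsx_path, sheet = "mtcars")
print(head(mtcars_sheet))

# Read every sheet into a named list, one element per sheet.
all_sheets <- lapply(setNames(sheets, sheets),
                     function(s) read_excel(xlsx_path, sheet = s))
print(names(all_sheets))
```

    read_excel returns a tibble, which behaves like a data.frame in most code. Point xlsx_path at your own downloaded workbook to repeat the steps on it.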
  • Reading from databases

    • Databases arguably store the vast majority of the data.
    • Most of this data sits in PostgreSQL, MySQL, Microsoft SQL Server or Microsoft Access.
    • These databases can be accessed either through a database-specific driver or, typically, via an ODBC connection.
    • For the most popular open-source databases we have packages such as RPostgreSQL and RMySQL.
    • Other databases without a specific package can use the generic RODBC package.
    • To handle database connections, the DBI package was written to provide a uniform experience when working with different databases.
    • Let's download a sample file Refer Here.
    • Extract this and you will have a .db file
    • SQLite:
      • RSQLite is a package we can use to work with SQLite databases.
      • Install and load the package in the console:
      install.packages('RSQLite')
      library('RSQLite')
      
      • To connect to the database, we first need to specify the driver with dbDriver. The function's main argument is the type of driver, for example "SQLite" or "ODBC". The name is case-sensitive.
      driver <- dbDriver('SQLite')
      
      • To establish the connection, use dbConnect with the driver and the path to the .db file.
      • To view all the tables in the database, use dbListTables.
      • To view all the fields in a particular table, use dbListFields.
      • To execute a query and get the result as a data.frame, use dbGetQuery.
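      The SQLite steps above can be sketched end to end as follows. This is a minimal sketch, assuming DBI and RSQLite are installed; it uses a throwaway in-memory database and a made-up employees table instead of the downloaded .db file, and it calls RSQLite::SQLite() directly to obtain the driver rather than going through dbDriver.

```r
library(DBI)
library(RSQLite)

# Connect to an in-memory database; pass a file path instead
# (for example the extracted .db file) to open a database on disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create a small, made-up table so there is something to query.
dbWriteTable(con, "employees",
             data.frame(id = 1:3,
                        name = c("Asha", "Ravi", "Meena"),
                        salary = c(5200, 4100, 6100)))

# Tables in the database and columns of one table.
tables <- dbListTables(con)
fields <- dbListFields(con, "employees")
print(tables)
print(fields)

# dbGetQuery runs the SQL and returns the result as a data.frame.
high_paid <- dbGetQuery(con,
  "SELECT name, salary FROM employees WHERE salary > 5000 ORDER BY salary")
print(high_paid)

# Always release the connection when done.
dbDisconnect(con)
```

      Because the calls after dbConnect are all DBI generics, the same code should carry over to other backends once the connection is created with the matching driver.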
  • To download the SQLite database Refer Here

  • Exercise: Try to read some database tables from MySQL or PostgreSQL (whichever is installed on your system).
