Getting Started in R

Josh Allen

Department of Political Science at Georgia State University

1/20/23

Research Data Services

Our Team

Get Ready Badges

How To Get the Badges

The Workshop

Why Use R?

Part of this first portion is just a sales pitch.

R is a vastly popular language in the data science industry. While it is less popular than SQL or Python, it is still one of the most heavily demanded languages in private industry.

This sort of makes sense once we dig into what they are used for and who uses them. SQL (often pronounced “sequel”) stands for Structured Query Language. You have probably heard of big data before. Think of the amount of data you generate from whatever apps you use every minute. Now think about the N of people in this workshop. Now think about all the people at Georgia State and the amount of data being generated in just a few minutes. The maximum size of an Excel worksheet is 1,048,576 rows by 16,384 columns. We could all pretty quickly overwhelm a single Excel file. Enter SQL: you can store large databases in SQL and, just as importantly, that is how you get that data back out.

Python is a general-purpose programming language. It is used for data analysis as well, but it has applications in everything from web development to game development. Dropbox is basically just a ton of Python code. Lots of people who grow up to be data scientists come from a CS background, where you are introduced to Python pretty early.

However, if you simply add up the jobs that ask for proprietary software, there are comparatively fewer industry jobs available for you or your students.

Why R and RStudio?(cont)

Alongside Python, R has become the de facto language for data science.
- See: The Impressive Growth of R, The Popularity of Data Science Software
Open-source (free!) with a global user-base spanning academia and industry.
The community is insanely nice
- Especially compared to Python and Stata
A great “first” language to learn
- Source: Google Data Analytics Professional Certificate
Supports all types of statistical methods and data collection

In the data science industry, R has enjoyed growth rates in popularity similar to Python's. R is kind of quirky compared to Python for a whole host of computer-sciency reasons and plain path dependency. Whatever the case, there is no denying that both are hugely in-demand skills, not just in Silicon Valley but anywhere that produces data analysis or data visualization.

R is also becoming wildly popular in econ and political science because not only is it free, it is also a great skill to have given the difficulties of the academic job market. I am a political scientist, so I will mostly be making these references. The RStudio team in particular has worked really hard on adding support for Python, Julia, and Java.

Much of this workshop's material is based on materials that people share publicly as long as they get the proper acknowledgment. Again, thank you Grant.

Lots of data is readily available in R: the Census API, the Twitter API. If you are into sports you can grab a ton of data to compute those fancy metrics you are into. I know there is also data on RuPaul’s Drag Race you can download.

R? RStudio? What's the Difference?

R is a statistical programming language
RStudio is a convenient interface for R (an Integrated Development Environment, IDE)
At its simplest:
- R is like a car’s engine
- RStudio is like a car’s dashboard

Navigating RStudio

project files are here

imported data
shows up here

code can go here

Setting Your Working Directory

Your working directory is where all your files live
You may know where your files are…
But R does not
If you want to use any data that does not come with a package you are going to need to tell R where it lives

Cats and Boxes

You can put a box inside a box.
You can put a cat inside a box
You can put a cat inside a box inside of a box
You cannot put a box inside a cat
You cannot put cat in a cat

Working directories are made up of files and folders. You need to let R know which file is in which folder. You can put a cat in a box, but you must never try to put a box in a cat. Boxes are like folders/directories; cats are like files. This sort of represents the hierarchy of it all: folders come first, then the last thing is the file itself.

We are basically just telling R where things live, kind of like how we put an unfamiliar address into the GPS. We are telling it exactly what street things live on and what house number they are.

When we organize our files into HW 1 or manuscript or whatever name, what we are doing is creating a new neighborhood on our computer. R will default to places it knows, most commonly where it lives. In order to do something as simple as loading our dataset, R needs directions to this neighborhood.
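
The box-and-cat hierarchy can be sketched with base R's path helpers. This is only a minimal illustration, assuming a throwaway folder inside tempdir(); the folder name hw_1 and the file name example.csv are made up for the demo.

```r
# Build a folder (the box) inside R's temporary directory; "hw_1" is a hypothetical name
box <- file.path(tempdir(), "hw_1")
dir.create(box, showWarnings = FALSE)

# Put a file (the cat) inside the folder; "example.csv" is a hypothetical name
cat_path <- file.path(box, "example.csv")
write.csv(data.frame(x = 1:3), cat_path, row.names = FALSE)

dirname(cat_path)     # the folder the file lives in
basename(cat_path)    # just the file name: "example.csv"
file.exists(cat_path) # TRUE once the file has been written
```

file.path() glues the pieces together with the right separator, so the same code works on Mac and Windows.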

Setting Your Working Directory(cont)

Seeing What Working Directory You are Using

getwd() ## The working directory where all the materials for the workshops live

[1] "/Users/josh/Dropbox/Research-Data-Services-Workshops/research-data-services-r-workshops/slides"

Setting Your Working Directory

setwd("your/working/directory/here/") ## sets the working directory on mac
setwd("your\\working\\directory\\here") ## sets the working directory on windows (backslashes must be doubled; forward slashes also work)

How To Make Your Life Easier

source: Jenny Bryan

How To Make Your Life Easier

Working Directory for My Laptop

"/Users/josh/Dropbox/Research-Data-Services-Workshops/research-data-services-r-workshops/slides"

Working Directory of My Office Computer

"/Volumes/6TB Raid 10/Dropbox/Research-Data-Services-Workshops/research-data-services-r-workshops/slides"

R Projects
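
Opening an R Project (.Rproj file) in RStudio sets the working directory to the project folder, so paths can be written relative to that folder instead of hard-coding a different path for each machine. The sketch below only imitates that with a temporary folder: my_project and data/example.csv are hypothetical names, and the setwd() call is a stand-in for what opening a project does for you.

```r
old_wd <- getwd()  # remember where we started

# Hypothetical project folder with a data/ subfolder
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

setwd(project_dir)  # stand-in for opening the project

# Relative paths now resolve against the project folder, whatever the machine
write.csv(data.frame(x = 1:3), file.path("data", "example.csv"), row.names = FALSE)
example <- read.csv(file.path("data", "example.csv"))
nrow(example)       # 3

setwd(old_wd)       # put the working directory back
```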

Objects

Everything is an object
Everything has a name
You do stuff with functions
Packages(i.e. libraries) are homes to pre-written functions.
- You can also write your own functions and in some cases should.

Everything that exists in R is an object in the sense that it is a kind of data structure that can be manipulated. I think this is better understood with functions and expressions.

Before we start: R is an object-oriented programming language (sort of). What this means is just that we define what things we have and how they relate to each other. A dog has various things associated with it: dogs are four-legged, have a good sense of smell, are members of the canine family, and eat a certain set of foods.

Once we define what those things are and how they relate to each other R will figure out what class it is.

What is this object to R? Once it figures this out, this sets strict limits on what R can do with those objects and, just as importantly, tells R what it can’t do with them. Think of it like a set of tricks or, in CS speak, methods to do things. There are things we can do with dogs or to dogs that are acceptable. This differs from cats. Cats and dogs have similar attributes but they are different classes.

While this is sometimes frustrating because you just want to do a thing, it helps protect you from yourself.

Returning to our pet metaphor: each pet has a name, and the things we want it to do have names. Sit, stay, come here, hey you what are you doing in there. These are sort of like functions. We are manipulating the object.
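
To make the class idea a bit more concrete, here is a small sketch with made-up pet data. The object names are hypothetical; the point is only that R checks what kind of object it has before deciding what it is allowed to do with it.

```r
dog_weights <- c(30.5, 12.2, 45.0)      # numbers
dog_names   <- c("Rex", "Fido", "Spot") # text

class(dog_weights)  # "numeric"
class(dog_names)    # "character"

sum(dog_weights)    # 87.7: adding numbers is allowed

# Adding up text is not allowed; tryCatch() captures the error message
# so the script keeps running
name_sum <- tryCatch(sum(dog_names), error = function(e) conditionMessage(e))
name_sum
```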

Install and loading packages

Console or Script install.packages("package-i-need-to-install")
- In the case of multiple packages you can do install.packages(c("Packages", "I", "don't","have"))
RStudio Click the “Packages” tab in the bottom-right window pane. Then click “Install” and search for these two packages.

Install and load(cont.)

Once the packages are installed we need to load them into our R session with the library() function

# We talk to ourselves using #
library(Package) 
library(I)
library(JustInstalled)

Notice too that you don’t need quotes around the package names anymore.

R now recognises these packages as defined objects with given names
Everything in R is an object and everything has a name
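
One pattern you may see in shared scripts is installing a package only when it is missing. This is a sketch, not the only way to do it; it uses the stats package as a placeholder because it ships with R, so nothing is downloaded when you run it.

```r
pkg <- "stats"  # placeholder; swap in the package you actually need

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# library() needs character.only = TRUE when the name is stored in a variable
library(pkg, character.only = TRUE)

"package:stats" %in% search()  # TRUE: the package is attached
```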

`R` Some Basics

Basic Maths

R is equipped with lots of mathematical operations

2+2 ## addition

[1] 4

4-2 ## subtraction

[1] 2

600*100 ##multiplication

[1] 60000

100/10 ##division

[1] 10

10*10/(3^4*2)-2 ## Pemdas

[1] -1.382716

log(100)

[1] 4.60517

sqrt(100)

[1] 10

Basic Maths

R is also equipped with modulo operations (integer division and remainders), matrix algebra, etc

100 %/% 60 # How many whole hours in 100 minutes?

[1] 1

100 %% 60 # How many minutes are left over?

[1] 40

m <- matrix(1:8, nrow=2) # Don't worry about the <- for now 
n <- matrix(8:15, nrow=4) # this is just me creating matrices 
mat <- matrix(1:15, ncol = 5)
m %*% n # Matrix multiplication

     [,1] [,2]
[1,]  162  226
[2,]  200  280

t(mat) # transpose a matrix

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
[5,]   13   14   15
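
Matrix multiplication only works when the dimensions line up. The sketch below reuses the m and n matrices from above to show why m %*% n works, and what happens when the shapes do not match.

```r
m <- matrix(1:8, nrow = 2)    # 2 rows, 4 columns
n <- matrix(8:15, nrow = 4)   # 4 rows, 2 columns

dim(m)          # 2 4
dim(n)          # 4 2
dim(m %*% n)    # 2 2: the columns of m match the rows of n

m * m           # element-wise multiplication, a different operation

# m %*% m would need m to have 4 rows, but it has 2, so this fails
bad <- tryCatch(m %*% m, error = function(e) "non-conformable")
bad
```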

Logical Statements & Booleans

Test	Meaning	Test	Meaning
`x < y`	Less than	`x %in% y`	In set
`x > y`	Greater than	`is.na(x)`	Is missing
`x == y`	Equal to	`!is.na(x)`	Is not missing
`x <= y`	Less than or equal to
`x >= y`	Greater than or equal to
`x != y`	Not equal to
`x | y`	Or
`x & y`	And

Booleans and Logicals in Action

1>2

[1] FALSE

1<2

[1] TRUE

1 == 2

[1] FALSE

1 < 2 | 3 > 4 ## only one test needs to be true to return true

[1] TRUE

1 < 2 & 3>4 ## both tests must be true to return true

[1] FALSE

Logicals, Booleans, and Precedence

R, like most other programming languages, will evaluate our logical operators (==, >, etc.) before our booleans (|, &, etc.).

1 > 0.5 & 2

[1] TRUE

What’s happening here is that R is evaluating two separate “logical” statements:
1 > 0.5, which is obviously TRUE.
2, which is TRUE(!) because R is “helpfully” converting it to as.logical(2).
It is way safer to make explicit what you are doing.
If your code is doing something weird it might just be because of precedence issues
- See R Cookbook 2.11

1 > 0.5 & 1 > 2

[1] FALSE
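
A slip that follows from the same precedence rule is writing x == 1 | 2 when you mean "x is 1 or 2". The sketch below uses a made-up value of x to show the difference between the shorthand and the explicit version.

```r
as.logical(2)   # TRUE: any non-zero number counts as TRUE
as.logical(0)   # FALSE

x <- 3

# This reads as (x == 1) | 2, and 2 is treated as TRUE
x == 1 | 2          # TRUE, even though x is neither 1 nor 2

# Spelling out both tests gives the intended answer
(x == 1) | (x == 2) # FALSE
```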

Other Useful Tricks

Value matching using %in%

To see whether an object is contained within (i.e. matches one of) a list of items, use %in%.

4 %in% 1:10

[1] TRUE

4 %in% 5:10

[1] FALSE
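
%in% also works when the left-hand side has several values, and with text as well as numbers. A short sketch with made-up values:

```r
c(1, 4, 12) %in% 1:10       # TRUE TRUE FALSE: one answer per element on the left

islands <- c("Biscoe", "Dream", "Torgersen")
"Dream" %in% islands        # TRUE
"Atlantis" %in% islands     # FALSE

# Negate the whole test with ! to ask "not in"
!("Atlantis" %in% islands)  # TRUE
```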

Cool Now What?

While this is boring, it opens up lots of possibilities
We may need to set up a group of tests to do something to data.
We may need all this math stuff to create new variables
However, we need to assign them to reuse them later in functions.
- Including datasets

Everything is an Object

Assignment

The most popular assignment operator in R is <- which is just < followed by -
- read aloud as “gets”

a <- 2 + 2

a * 2

[1] 8

h <- "harry potter" # note that text needs to be wrapped in quotes

You can also use -> but this is far less common and makes me uncomfortable

 a^2 -> b

Assignment(cont)

Using = as an assignment operator also works and is the one I tend to use
- Note: = is also used to pass arguments within function calls

b = b * 2

d = b/3

Tbh this is a matter of taste really.
- R added = in the 2000s to make it easier for people coming from other object-oriented programming languages
Just keep it consistent... or become ungovernable and use all three in one script.
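
One place where the choice is not purely a matter of taste is inside a function call. The sketch below, with made-up object names, shows that <- inside a call creates an object while = inside a call only names an argument.

```r
# <- inside a function call creates an object as a side effect
total <- sum(values <- c(1, 2, 3))
values   # 1 2 3
total    # 6

# = inside a function call only matches the argument named x;
# it does not create an object called x in your environment
avg <- mean(x = c(4, 5, 6))
avg      # 5
```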

Working with Objects

e = c(1,3,5,6,67,7) # creates a vector

length(e) ## How many things are in there?

[1] 6

sum(e)/length(e) # Hand calculate mean

[1] 14.83333

mean(e) # Why make our life hard when there is a built in function?

[1] 14.83333

e = data.frame(x = 1:22,
               y = 20:41)

mean(y)

Error in mean(y): object 'y' not found

Global Environment(cont)

Error in mean(y): object 'y' not found

Gives us a hint about what went wrong

Fixing Our Issue

To do this we need to index e to get to y

mean(e$y)

[1] 30.5

R will look for named objects in the environment
If the interpreter can’t find y or any other object it will give up because it does not think it exists
You need to tell the interpreter what to look for inside of the object
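
There are a few equivalent ways to tell R to look inside e for y. This sketch rebuilds the same data frame and shows four of them; which one you use is mostly a matter of preference.

```r
e <- data.frame(x = 1:22, y = 20:41)

mean(e$y)         # by name with $
mean(e[["y"]])    # by name with [[ ]]
mean(e[, 2])      # by column position
with(e, mean(y))  # with() lets R look for y inside e
```

All four lines return 30.5.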

What are Objects?

Objects are what we work with in R

 [1] "is.array"                "is.atomic"              
 [3] "is.call"                 "is.character"           
 [5] "is.complex"              "is.data.frame"          
 [7] "is.double"               "is.element"             
 [9] "is.environment"          "is.expression"          
[11] "is.factor"               "is.finite"              
[13] "is.function"             "is.infinite"            
[15] "is.integer"              "is.language"            
[17] "is.list"                 "is.loaded"              
[19] "is.logical"              "is.matrix"              
[21] "is.na"                   "is.na.data.frame"       
[23] "is.na.numeric_version"   "is.na.POSIXlt"          
[25] "is.na<-"                 "is.na<-.default"        
[27] "is.na<-.factor"          "is.na<-.numeric_version"
[29] "is.name"                 "is.nan"                 
[31] "is.null"                 "is.numeric"             
[33] "is.numeric_version"      "is.numeric.Date"        
[35] "is.numeric.difftime"     "is.numeric.POSIXt"      
[37] "is.object"               "is.ordered"             
[39] "is.package_version"      "is.pairlist"            
[41] "is.primitive"            "is.qr"                  
[43] "is.R"                    "is.raw"                 
[45] "is.recursive"            "is.single"              
[47] "is.symbol"               "is.table"               
[49] "is.unsorted"             "is.vector"              
[51] "isa"                     "isatty"                 
[53] "isBaseNamespace"         "isdebugged"             
[55] "isFALSE"                 "isIncomplete"           
[57] "isNamespace"             "isNamespaceLoaded"      
[59] "isOpen"                  "isRestart"              
[61] "isS4"                    "isSeekable"             
[63] "isSymmetric"             "isSymmetric.matrix"     
[65] "isTRUE"

Vectors

Come in two flavors
Atomic: all the stuff must be the same type
Lists: stuff can be different types

my_vec <- c(1:10)
is.vector(my_vec)

[1] TRUE

my_list <- list(a = c(1:4), b = "Hello World", c = data.frame(x = 1:10, y = 1:10))
is.vector(my_list)

[1] TRUE
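
"Must be the same type" has a practical consequence: if you mix types in an atomic vector, R quietly converts everything to one type. A small sketch with made-up values, compared against a list that keeps each type:

```r
mixed <- c(1, "a", TRUE)
mixed          # "1" "a" "TRUE": everything became text
typeof(mixed)  # "character"

typeof(c(1L, 2.5))   # "double": the integer is promoted
typeof(c(1, TRUE))   # "double": TRUE becomes 1

kept <- list(1, "a", TRUE)
sapply(kept, typeof) # "double" "character" "logical": a list keeps each type
```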

Atomic Vectors

Come in a variety of flavors
Numeric: Can contain whole numbers or decimals
Logicals: Can only take two values TRUE or FALSE
Factors: Can only contain predefined values. Used to store categorical data
- Ordered factors are a special kind of factor where the order of the levels matters.
Characters: Holds character strings
- Before R 4.0.0, base R would often convert characters to factors by default (stringsAsFactors = TRUE). That is bad because it will choose the levels for you
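
Here is a sketch of what "it will choose the levels for you" looks like in practice, using made-up size labels. By default the levels are sorted alphabetically, which is rarely the order you mean.

```r
sizes <- c("small", "large", "medium", "small")

default_levels <- factor(sizes)
levels(default_levels)   # "large" "medium" "small" (alphabetical)

# Setting the levels yourself, and marking the factor as ordered
ordered_sizes <- factor(sizes,
                        levels = c("small", "medium", "large"),
                        ordered = TRUE)
levels(ordered_sizes)                # "small" "medium" "large"
ordered_sizes[1] < ordered_sizes[2]  # TRUE: small comes before large
```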

Lists

Lists are everywhere in R

data_frame <- data.frame(a = rnorm(3),
                         b = rnorm(3))
typeof(data_frame)

[1] "list"

dats_wrong <- data.frame(a = 1:3,
                         b = 1:4)

Error in data.frame(a = 1:3, b = 1:4): arguments imply differing number of rows: 3, 4

example_mod <- lm(body_mass_g ~ bill_depth_mm, data = penguins)
typeof(example_mod)

[1] "list"

length(example_mod$residuals);length(example_mod$coefficients)

[1] 342

[1] 2

A Quick Aside on Naming Stuff

Things we can never name stuff

The reason we can’t use any of these is that they are reserved for R

if 
else 
while 
function 
for
TRUE 
FALSE 
NULL 
Inf 
NaN 
NA

A Quick Aside on Naming Stuff(cont)

Semi-reserved words

For simple things like assigning c = 4 and then doing d = c(1,2,3,4), R will be able to distinguish between the c that holds the value 4 and the c() function that concatenates, which is way more important in R.

However, it is generally a good idea, unless you know what you are doing, to avoid giving objects the names of functions in R because R will get confused.

my_cool_fun <- function(x){
  x <- x*5
  return(x)
}

datas <- c(1:10)

my_cool_fun(datas)

 [1]  5 10 15 20 25 30 35 40 45 50

my_cool_fun[1]

Error in my_cool_fun[1]: object of type 'closure' is not subsettable
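
The semi-reserved idea can be sketched like this. Naming a value c is survivable, but giving your own function the name of an existing one hides the original until you remove your copy. The replacement mean() below is deliberately silly and purely for illustration.

```r
c <- 4
d <- c(1, 2, 3, c)  # R still finds the function c() when it is called
d                   # 1 2 3 4

# Reusing a function's name for your own function hides the original
mean <- function(x) "not the real mean"
masked <- mean(1:10)
masked              # "not the real mean"

rm(mean, c)         # removing our copies brings the originals back
restored <- mean(1:10)
restored            # 5.5
```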

How and What to Name Objects

The best practice is to use concise descriptive names

When loading in data I typically do raw_my_dataset_name, and after all of my data cleaning I do clean_my_dataset_name

Object names must start with a letter but can contain letters, numbers, _, or .
- snake_case_like_this_is_what_I_use
- somePeopleUseCamelCase
- some_People.are_Do_not.like_Convention

Navigating Objects in R

The Data We are Working With

artwork by @allison_horst

Importing Data

You have the option of pointing and clicking via import dataset
I would recommend importing data via code
- You don’t have to remember what you named the object originally
- Saves future you time
This is a common error you will get

penguins = read.csv("peguins.csv")

Error in file(file, "rt"): cannot open the connection

penguins = read.csv("penguins.csv")

Error in file(file, "rt"): cannot open the connection

This happens most often when
- the file name is spelled wrong
- the file is in a subdirectory or your working directory is not set correctly
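
When you hit this error, it can help to ask R what it can actually see before trying to read the file. The sketch below is self-contained: it writes a tiny demo file into a temporary folder (import_demo and penguins_demo.csv are hypothetical names) and then checks for it.

```r
# Write a small demo file into a temporary folder (hypothetical names)
demo_dir <- file.path(tempdir(), "import_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(species = c("Adelie", "Gentoo")),
          file.path(demo_dir, "penguins_demo.csv"), row.names = FALSE)

list.files(demo_dir)                                   # what is actually there?
file.exists(file.path(demo_dir, "peguins_demo.csv"))   # FALSE: misspelled
file.exists(file.path(demo_dir, "penguins_demo.csv"))  # TRUE

demo <- read.csv(file.path(demo_dir, "penguins_demo.csv"))
nrow(demo)                                             # 2
```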

Your Turn

Create a vector in R named my_vec with “Game of Thrones” in it.
Create a vector in R named my_second_vec with 1:100 in it
Read in the data included on the website using read.csv
- What happens when you do not assign the dataset?
- If you are on a Windows machine right click on the zip file and then click extract all
Assign the penguins dataset to an object named penguins
Use View, head, and tail to inspect the dataset
Using install.packages() install ggplot2

Our Data

species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
Adelie	Torgersen	39.1	18.7	181	3750	male	2007
Adelie	Torgersen	39.5	17.4	186	3800	female	2007
Adelie	Torgersen	40.3	18.0	195	3250	female	2007
Adelie	Torgersen	NA	NA	NA	NA	NA	2007
Adelie	Torgersen	36.7	19.3	193	3450	female	2007
Adelie	Torgersen	39.3	20.6	190	3650	male	2007

Indexing `[]`

We can use column position to index objects.
There are two slots we can use in the brackets, rows and columns, if we are using a data frame like this.
object_name[row number, column number]
We can also subset our data by column position using : or c(column 1, column 2)

penguins[1,1]

species
Adelie

penguins[1,1:2]

species	island
Adelie	Torgersen

penguins[1,c(1,4)]

species	bill_depth_mm
Adelie	18.7

Knowing how to index stuff is important because oftentimes we need to tell R what to get, which is pretty critical, especially if we want to use all the flexibility of R. You will need to be able to work with the values of your dataset and you need to be able to navigate the software. The drop-down menus for R kind of stop here, so now you are in more coding territory.

R is a bit of an odd duck compared to other languages. There are three different ways to index an object. These all have a place in your workflow. It may be a little unclear at first where each one applies, but once you start using R more they are super helpful.

One thing to keep in mind is that in R indexing starts at 1. So if you want to get the first element of a vector you use 1, whereas in many other languages indexing starts at zero. What does that mean substantively? Well, to get the first element of a vector in one of those languages you would use zero.

I use lists a lot to report coefficients from a regression or to automatically update my syllabus when things change.
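
The same bracket rules apply to plain vectors, which only have one slot. A short sketch using letters, a vector of the alphabet that is built into R:

```r
first_five <- letters[1:5]     # "a" "b" "c" "d" "e"

first_five[1]                  # "a": the first element is at position 1
first_five[c(1, 3)]            # "a" "c"
first_five[-1]                 # everything except the first element
first_five[first_five != "c"]  # keep the elements that pass a test
first_five[0]                  # character(0): position 0 returns nothing
```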

Negative Indexing

We can also exclude various elements using - and/or tests that I showed you earlier

penguins[,-1]

island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
Torgersen	39.1	18.7	181	3750	male	2007
Torgersen	39.5	17.4	186	3800	female	2007
Torgersen	40.3	18.0	195	3250	female	2007
Torgersen	NA	NA	NA	NA	NA	2007
Torgersen	36.7	19.3	193	3450	female	2007
Torgersen	39.3	20.6	190	3650	male	2007

Negative Indexing(cont)

We can combine - with : or c() as well to subset stuff

penguins[,-(1:4)]

flipper_length_mm	body_mass_g	sex	year
181	3750	male	2007
186	3800	female	2007
195	3250	female	2007
NA	NA	NA	2007
193	3450	female	2007
190	3650	male	2007

penguins[,-c(2,3,5,8)]

species	bill_depth_mm	body_mass_g	sex
Adelie	18.7	3750	male
Adelie	17.4	3800	female
Adelie	18.0	3250	female
Adelie	NA	NA	NA
Adelie	19.3	3450	female
Adelie	20.6	3650	male

Indexing `[]` (cont)

We can also do the same thing with lists.
We can tell R what element of a list using a combo of [] and [[]]

my_list = list(a = 100:110, b = "Learning R was the best of times and the worst of times",
               c = data.frame(x = 1:3, y = 4:6))

my_list[[1]][2] ## get the first item in the list and the second element of that item

[1] 101

my_list[2]

$b
[1] "Learning R was the best of times and the worst of times"

my_list[[3]][[1]]

[1] 1 2 3

`[]` vs `[[]]`
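
A rough way to see the difference: single brackets hand back a smaller list, while double brackets hand back the thing stored inside. The sketch uses a cut-down version of my_list so it runs on its own.

```r
my_list <- list(a = 100:110, b = "text")

class(my_list[1])     # "list": single brackets return a smaller list
class(my_list[[1]])   # "integer": double brackets return the element itself

length(my_list[1])    # 1: a list holding one element
length(my_list[[1]])  # 11: the vector 100:110

my_list[c("a", "b")]  # single brackets can return several elements at once
```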

Subsetting By Tests

penguins[penguins["sex"] == "female", c("species", "sex")]

species	sex
Adelie	female
Adelie	female
NA	NA
Adelie	female
Adelie	female
NA	NA
NA	NA
NA	NA
NA	NA
Adelie	female
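
The NA rows in that output come from penguins whose sex is missing: the test returns NA for them, and R returns a row of NAs for each. Here is a sketch with a made-up three-row data frame showing the behaviour, and one way (which()) to keep only the rows where the test is TRUE.

```r
toy <- data.frame(species = c("Adelie", "Adelie", "Gentoo"),
                  sex     = c("female", NA, "male"))

toy$sex == "female"                         # TRUE NA FALSE

with_na <- toy[toy$sex == "female", ]       # the NA test yields a row of NAs
nrow(with_na)                               # 2

no_na <- toy[which(toy$sex == "female"), ]  # which() keeps only the TRUE positions
nrow(no_na)                                 # 1
```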

`$` Indexing

A really useful way of indexing in R is referencing stuff by name rather than position.
- The way we do this is through the $

my_list$a

 [1] 100 101 102 103 104 105 106 107 108 109 110

my_list$b

[1] "Learning R was the best of times and the worst of times"

my_list$c

Indexing(cont)

my_list[[3]][[2]] ## these are just returning the same thing

[1] 4 5 6

my_list$c$y

[1] 4 5 6

`$` in action

Combining $ with a logical test lets us subset rows

penguins[penguins$species == "Gentoo", c("species", "island", "bill_length_mm")]

species	island	bill_length_mm
Gentoo	Biscoe	46.1
Gentoo	Biscoe	50.0
Gentoo	Biscoe	48.7
Gentoo	Biscoe	50.0
Gentoo	Biscoe	47.6
Gentoo	Biscoe	46.5
Gentoo	Biscoe	45.4
Gentoo	Biscoe	46.7
Gentoo	Biscoe	43.3
Gentoo	Biscoe	46.8

`$` in action(cont)

summary(penguins)

      species          island    bill_length_mm  bill_depth_mm  
 Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
 Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
 Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
                                 Mean   :43.92   Mean   :17.15  
                                 3rd Qu.:48.50   3rd Qu.:18.70  
                                 Max.   :59.60   Max.   :21.50  
                                 NA's   :2       NA's   :2      
 flipper_length_mm  body_mass_g       sex           year     
 Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
 1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
 Median :197.0     Median :4050   NA's  : 11   Median :2008  
 Mean   :200.9     Mean   :4202                Mean   :2008  
 3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
 Max.   :231.0     Max.   :6300                Max.   :2009  
 NA's   :2         NA's   :2

mean(penguins$bill_depth_mm)

[1] NA

uh oh what happened?

Finding Help

Asking for help in R is easy. The most common ways are help(thingineedhelpwith) and ?thingineedhelpwith

?mean

?thingineedhelpwith is probably the most common because it requires less typing.

Fixing our issue

mean(penguins$bill_depth_mm, na.rm =TRUE)

[1] 17.15117
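
The same thing on a tiny made-up vector, so you can see where the NA comes from and how to count the missing values before deciding to drop them:

```r
depths <- c(18.7, 17.4, NA, 19.3)  # made-up values with one missing

mean(depths)                  # NA: one unknown value makes the mean unknown
sum(is.na(depths))            # 1 missing value
mean(depths, na.rm = TRUE)    # mean of the three observed values
mean(depths[!is.na(depths)])  # same result, dropping the NA by hand
```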

The quality of documentation fluctuates wildly because R is an open source language
If in doubt

Your Turn

Find the minimum value of bill_length_mm
Find the maximum value of body_mass_g
Subset the penguins data any way you want using column position or $
Assign each of them to an object
Create a vector from 1:10 index that vector using [] to return 2 and 4

Some additional useful stuff

Sometimes we want summary statistics per group
- What kind of penguins live where
- Are there any interesting patterns by group, etc.
Fortunately R comes with some handy functions to use
table counts each factor level
tapply will let you group stuff by a factor and get some useful balance statistics

Table

table(penguins$sex)


female   male 
   165    168

table(penguins$sex, useNA = "ifany")


female   male   <NA> 
   165    168     11

tapply and calculating descriptive statistics by groups

tapply(penguins$species,penguins$sex, table, useNA = "ifany")

$female

   Adelie Chinstrap    Gentoo 
       73        34        58 

$male

   Adelie Chinstrap    Gentoo 
       73        34        61

tapply(penguins$bill_depth_mm, penguins$species, mean, na.rm = TRUE)

   Adelie Chinstrap    Gentoo 
 18.34636  18.42059  14.98211
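
If you would rather get a data frame back than a named vector, base R's aggregate() is one alternative to tapply. This sketch uses a made-up four-row data frame rather than the penguins data, so the numbers are illustrative only.

```r
toy <- data.frame(species = c("A", "A", "B", "B"),
                  depth   = c(18, 19, 15, NA))

by_tapply <- tapply(toy$depth, toy$species, mean, na.rm = TRUE)
by_tapply     # named vector: A = 18.5, B = 15

# Formula interface: "depth by species"; rows with NA are dropped by default
by_aggregate <- aggregate(depth ~ species, data = toy, FUN = mean)
by_aggregate  # data frame with one row per species
```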

Plotting

plot(penguins$bill_length_mm,
   penguins$body_mass_g,
   xlab = "Bill Length(mm)",
   ylab = "Body Mass(g)")

Plotting(cont)

hist(penguins$bill_length_mm,
 xlim = c(30, 60))

Making New Things

To foreshadow our next workshop: often we need to do things with our data
- Like deal with all those pesky missing values
- Create new variables
- subset our data(kind of like we have been doing)
- recode our variables
To add new variables we can use what we know

penguins$range_body_mass = max(penguins$body_mass_g, na.rm = TRUE) - min(penguins$body_mass_g, na.rm = TRUE)

penguins$chinstrap[penguins$species == "Adelie" | penguins$species == "Gentoo"] <- "Not Chinstrap"

penguins$chinstrap[penguins$species == "Chinstrap"] <- "Chinstrap"

penguins[,c("species", "range_body_mass", "chinstrap")]

# A tibble: 344 × 3
   species range_body_mass chinstrap    
   <fct>             <int> <chr>        
 1 Adelie             3600 Not Chinstrap
 2 Adelie             3600 Not Chinstrap
 3 Adelie             3600 Not Chinstrap
 4 Adelie             3600 Not Chinstrap
 5 Adelie             3600 Not Chinstrap
 6 Adelie             3600 Not Chinstrap
 7 Adelie             3600 Not Chinstrap
 8 Adelie             3600 Not Chinstrap
 9 Adelie             3600 Not Chinstrap
10 Adelie             3600 Not Chinstrap
# … with 334 more rows
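
The two-step recode above can also be written in one line with ifelse(). This is just an alternative sketch on a made-up three-row data frame, not a replacement for the approach shown.

```r
toy <- data.frame(species = c("Adelie", "Chinstrap", "Gentoo"))

# ifelse(test, value if TRUE, value if FALSE) works element by element
toy$chinstrap <- ifelse(toy$species == "Chinstrap", "Chinstrap", "Not Chinstrap")
toy
```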

Cleaning up after yourself

rm(objectname) will remove the objects you created
rm(list=ls()) will remove all the objects you created
You can remove packages, sometimes, with detach(package:packageyouwanttoremove)
- This can be iffy for a variety of reasons
- Some packages automatically load another package or depend on another.
However, restarting your R session is generally best practice because it will do both
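
A quick sketch of removing a single object and checking that it is gone; scratch_value is a made-up name.

```r
scratch_value <- 42
exists("scratch_value")   # TRUE

rm(scratch_value)
exists("scratch_value")   # FALSE

ls()                      # lists whatever objects remain
```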

Getting Good at R

The only way to write good code is to write tons of shitty code first. Feeling shame about bad code stops you from getting to good code
— Hadley Wickham (@hadleywickham) April 17, 2015

Tell Us How We Did

https://gsu.qualtrics.com/jfe/form/SV_9nucJR3soZ9lkqO
