AnalyticsDojo

Introduction to R - Conditional Statements and Loops

rpi.analyticsdojo.com

Overview

  • What are conditional statements? Why do we need them?
  • If statements in R
  • Why, Why not Loops?
  • Loops in R

What are conditional statements? Why do we need them?

if Statements

  • Enables logical branching and recoding of data.
  • BUT, if statements can lead to long code branches and repeated code.
  • Best to keep if statements short.

Conditional Statements

  • if statements enable logic.
  • else gives what to do if the other conditions are not met.
  • else if behavior is achieved by nesting an if statement within an else clause.
# How long did the homework take?
# Imagine these are the hours for each assignment.
hours <- c(1, 3, 4, 3)
# This is the experience of the individual, which can be high, medium, or low.
experience <- 'low'
# experience <- 'high'

if (experience == 'high') {
  exp.hours <- hours / 2
} else {
  if (experience == 'low') {
    exp.hours <- hours * 2
  } else {
    exp.hours <- hours
  }
}
# Notice how the hours were adjusted for experience.
print(exp.hours)


[1] 2 6 8 6
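
The nested if/else above can be flattened. The sketch below shows two ways that should give the same adjustment: a flat else if chain and a switch() lookup. The helper name adjust_hours and the variable multiplier are illustrative and not part of the original lesson.

```r
# Hours for each assignment and the experience level, as in the example above.
hours <- c(1, 3, 4, 3)
experience <- "low"

# Option 1: a flat else if chain instead of an if nested inside else.
if (experience == "high") {
  exp.hours <- hours / 2
} else if (experience == "low") {
  exp.hours <- hours * 2
} else {
  exp.hours <- hours
}
print(exp.hours)

# Option 2: switch() picks a multiplier by name; the unnamed last value is the default.
# adjust_hours is a hypothetical helper used only for illustration.
adjust_hours <- function(hours, experience) {
  multiplier <- switch(experience, high = 0.5, low = 2, 1)
  hours * multiplier
}
print(adjust_hours(hours, "low"))
print(adjust_hours(hours, "medium"))
```

The switch() version keeps the branching in one line, which fits the advice to keep if statements short.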

R Logic and Conditions

  • < less than
  • <= less than or equal to
  • > greater than
  • >= greater than or equal to
  • == exactly equal to
  • != not equal to
  • !x This corresponds to not x.
  • x & y This corresponds to and. (This does an element-by-element comparison.)
  • x | y This corresponds to or. (This does an element-by-element comparison.)
# Simple example
x <- FALSE
y <- FALSE
if (!x) {
  print("X is False")
} else {
  print("X is True")
}




[1] "X is False"
x<-TRUE
y<-TRUE
if ((x == TRUE) | (y == TRUE)) {
  print("Either X or Y is True")
}

[1] "Either X or Y is True"
if ((x == TRUE) & (y == TRUE)) {
  print("X and Y are both True")
}

[1] "X and Y are both True"
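
Because & and | work element by element, they return a whole logical vector when given vectors, while if() expects a single TRUE or FALSE. The sketch below uses made-up vectors a and b to show the difference, and how any() and all() reduce a vector to one value.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# & and | compare element by element and return a logical vector.
both <- a & b
either <- a | b
print(both)
print(either)

# if() needs a single TRUE/FALSE, so reduce a vector with any() or all().
if (any(both)) {
  print("At least one position is TRUE in both vectors")
}
if (all(either)) {
  print("Every position is TRUE in at least one vector")
}

# && and || work on single values and stop evaluating once the result is known.
x <- TRUE
y <- FALSE
if (x || y) {
  print("Either x or y is TRUE")
}
```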

Conditionals and ifelse

  • ifelse(condition, value if TRUE, value if FALSE) can be used to recode variables.
  • ifelse can be nested.
  • Use the cut function as an alternative for more than 3 categories.
  • This can be really useful when
# create 2 age categories 
age<-c(18,15, 25,30)
agecat <- ifelse(age > 18, c("adult"), c("child"))
agecat

[1] "child" "child" "adult" "adult"
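
Note that the condition age > 18 labels an 18-year-old as a child. The sketch below repeats the recode, then shows an inclusive variant and a nested ifelse with three groups. The names agecat_inclusive and age_group, and the cutoff 26, are assumptions made for illustration.

```r
age <- c(18, 15, 25, 30)

# age > 18 is evaluated for every element, so 18 itself is labelled "child".
agecat <- ifelse(age > 18, "adult", "child")
print(agecat)

# Assumption: if 18-year-olds should count as adults, use >= instead.
agecat_inclusive <- ifelse(age >= 18, "adult", "child")
print(agecat_inclusive)

# Nested ifelse gives three categories; the cutoffs 18 and 26 are illustrative.
age_group <- ifelse(age < 18, "child",
                    ifelse(age < 26, "young adult", "adult"))
print(age_group)
```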
    df=read.csv(file="../../input/iris.csv", header=TRUE,sep=",")
    
    #Let's say we want to categorize sepal_length as short/long or short/medium/long.
    sl.med<-median(df$sepal_length)
    sl.sd<-sd(df$sepal_length)
    sl.max<-max(df$sepal_length)
    
    df$agecat2 <- ifelse(df$sepal_length > sl.med, c("long"), c("short"))
    df$agecat3 <- ifelse(df$sepal_length > (sl.med+sl.sd), c("long"), 
            ifelse(df$sepal_length < (sl.med-sl.sd), c("short"), c("medium")))
    
    
    #This sets the different cuts for the categories. 
    cuts<-c(0,sl.med-sl.sd,sl.med+sl.sd,sl.max)
    cutlabels<-c("short", "medium", "long") 
    
    df$agecat3altcut<-cut(df$sepal_length, breaks=cuts, labels=cutlabels)
    df[,c(1,6,7,8)]
    
    
    sepal_length  agecat2  agecat3  agecat3altcut
    5.1           short    medium   medium
    4.9           short    short    short
    4.7           short    short    short
    4.6           short    short    short
    5.0           short    medium   medium
    5.4           short    medium   medium
    4.6           short    short    short
    5.0           short    medium   medium
    4.4           short    short    short
    4.9           short    short    short
    5.4           short    medium   medium
    4.8           short    short    short
    4.8           short    short    short
    4.3           short    short    short
    5.8           short    medium   medium
    5.7           short    medium   medium
    5.4           short    medium   medium
    5.1           short    medium   medium
    5.7           short    medium   medium
    5.1           short    medium   medium
    5.4           short    medium   medium
    5.1           short    medium   medium
    4.6           short    short    short
    5.1           short    medium   medium
    4.8           short    short    short
    5.0           short    medium   medium
    5.0           short    medium   medium
    5.2           short    medium   medium
    5.2           short    medium   medium
    4.7           short    short    short
    6.9           long     long     long
    5.6           short    medium   medium
    7.7           long     long     long
    6.3           long     medium   medium
    6.7           long     long     long
    7.2           long     long     long
    6.2           long     medium   medium
    6.1           long     medium   medium
    6.4           long     medium   medium
    7.2           long     long     long
    7.4           long     long     long
    7.9           long     long     long
    6.4           long     medium   medium
    6.3           long     medium   medium
    6.1           long     medium   medium
    7.7           long     long     long
    6.3           long     medium   medium
    6.4           long     medium   medium
    6.0           long     medium   medium
    6.9           long     long     long
    6.7           long     long     long
    6.9           long     long     long
    5.8           short    medium   medium
    6.8           long     long     long
    6.7           long     long     long
    6.7           long     long     long
    6.3           long     medium   medium
    6.5           long     medium   medium
    6.2           long     medium   medium
    5.9           long     medium   medium
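
The cut() step can be tried without the iris file. The sketch below is a minimal version on a small made-up vector (sepal stands in for df$sepal_length; all names and values are assumptions) and uses the same median plus or minus one standard deviation breaks.

```r
# A small made-up vector standing in for df$sepal_length.
sepal <- c(4.4, 5.0, 5.8, 6.3, 6.9, 7.7)

len.med <- median(sepal)
len.sd <- sd(sepal)

# Breaks define the intervals (0, med - sd], (med - sd, med + sd], (med + sd, max].
cuts <- c(0, len.med - len.sd, len.med + len.sd, max(sepal))
cutlabels <- c("short", "medium", "long")

# cut() returns a factor; by default each interval includes its upper bound.
lencat <- cut(sepal, breaks = cuts, labels = cutlabels)
print(data.frame(sepal, lencat))

# table() counts how many values fall in each category.
print(table(lencat))
```

Because the intervals are closed on the right, a value exactly equal to a break falls in the lower category, which can differ from a nested ifelse written with strict < and > comparisons.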

Why, Why Not Loops?

  • Loops iterate over arrays or lists easily.
  • BUT, in many cases for loops don’t scale well and are slower than alternatives built on vectorized functions.
  • BUT, don’t worry about prematurely optimizing code.
  • Often if you are writing a loop, there is a function that is faster. For small data applications you might not care.
  • Here is a basic example of a for loop.
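    # Use total rather than sum so the built-in sum() function is not masked.
    total <- 0
    for (i in 1:8) {
        print(i)
        total <- total + i
    }
    print(total)
    # A single-statement loop body does not need braces.
    for (i in 1:8) print(i)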
    
    
    
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    [1] 6
    [1] 7
    [1] 8
    [1] 36
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    [1] 6
    [1] 7
    [1] 8
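
To illustrate the point that a function is often available in place of a loop, the sketch below compares an accumulating loop with the built-in sum(), cumsum(), and sapply(). The variable names total and squares are illustrative.

```r
# Loop version: accumulate a running total (named total to avoid masking sum()).
total <- 0
for (i in 1:8) {
  total <- total + i
}
print(total)

# Vectorized version: sum() adds the whole vector in one call.
print(sum(1:8))

# cumsum() returns every intermediate total the loop passed through.
print(cumsum(1:8))

# sapply() applies a function to each element and collects the results.
squares <- sapply(1:8, function(i) i^2)
print(squares)
```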
    
    

While Loop

  • Performs a loop while a conditional is TRUE.
  • Doesn’t auto-increment the counter; you must update it yourself.
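    # This produces the same output as the first for loop above.
    i <- 1
    total <- 0   # named total so the built-in sum() is not masked
    x <- TRUE
    while (x) {
      print(i)
      total <- total + i
      i <- i + 1                 # while does not increment i automatically
      if (i > 8) {x <- FALSE}    # stop once i passes 8
    }
    print(total)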
    
    
    [1] 1
    [1] 2
    [1] 3
    [1] 4
    [1] 5
    [1] 6
    [1] 7
    [1] 8
    [1] 36
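
The flag variable is one way to end a while loop. The sketch below shows two alternatives under the same setup: testing the counter directly in the condition, and leaving the loop early with break. The threshold of 20 and the names total and running are assumptions for illustration.

```r
# Testing the counter directly in the condition removes the need for a flag.
i <- 1
total <- 0
while (i <= 8) {
  total <- total + i
  i <- i + 1   # while does not increment i for us
}
print(total)

# break leaves the loop early: stop once the running total exceeds 20.
i <- 1
running <- 0
while (TRUE) {
  running <- running + i
  if (running > 20) break
  i <- i + 1
}
print(c(i, running))
```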
    

For Loops can be Nested

    # Nesting example
    x <- c(0, 1, 2)
    y <- c("a", "b", "c")
    # Nested for loops: the inner loop runs fully for each value of the outer loop.
    for (a in x) {
        for (b in y) {
            print(c(a, b), quote = FALSE)
        }
    }
    
    
    [1] 0 a
    [1] 0 b
    [1] 0 c
    [1] 1 a
    [1] 1 b
    [1] 1 c
    [1] 2 a
    [1] 2 b
    [1] 2 c
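
When the goal is only to enumerate every pair, a loop-free alternative is possible. The sketch below uses expand.grid() to build the same combinations as the nested loops; the names combos and pair_labels are illustrative.

```r
x <- c(0, 1, 2)
y <- c("a", "b", "c")

# expand.grid() builds every combination; its first argument varies fastest,
# so listing y first reproduces the order of the nested loops.
combos <- expand.grid(b = y, a = x, stringsAsFactors = FALSE)

# paste() joins the two columns element by element into "0 a", "0 b", ...
pair_labels <- paste(combos$a, combos$b)
print(pair_labels, quote = FALSE)
```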
    

Copyright AnalyticsDojo 2016. This work is licensed under the Creative Commons Attribution 4.0 International license agreement.