Introduction to R - 6th lesson (programming)

Nathalie Villa-Vialaneix - http://www.nathalievialaneix.eu
September 14-16th, 2015

Master TIDE, Université Paris 1

Conditions

if/else
switch

Using "if"

x <- sample(1:10, 1); x

[1] 4

if (x > 5) {
  print("I am large!")
}

Using "if" with "else"

x <- sample(letters[1:5], 1); x

[1] "d"

if (x %in% c("a","e")) {
  print("I am a vowel!")
} else {
  print("I am not a vowel. ")
}

[1] "I am not a vowel. "

Using "switch" for multiple cases

x <- sample(letters[1:5], 1); x

[1] "a"

res <- switch(x,
              "a"="b", "b"="c", "c"="d",
              "d"="e", "e"="f")
res

[1] "b"
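As a complement, switch() also accepts an unnamed final argument that acts as a default when no name matches. The sketch below is only an illustration: the helper name nextLetter and the fallback value "unknown" are hypothetical and not part of the lesson.

```r
# Hypothetical helper illustrating a default case in switch()
nextLetter <- function(x) {
  switch(x,
         "a" = "b", "b" = "c", "c" = "d",
         "d" = "e", "e" = "f",
         "unknown")  # unnamed last argument: returned when nothing matches
}
nextLetter("a")
nextLetter("z")
```

Without the unnamed last argument, switch() would invisibly return NULL for an unmatched value.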

Functions

syntax
arguments and default values
multiple inputs and outputs

A first function

myFirstFunction <- function(x) {
  res <- x^2+1
  return(res)
}
myFirstFunction; myFirstFunction(1)

function(x) {
  res <- x^2+1
  return(res)
}

[1] 2

sapply(seq(0,1,length=5),
       myFirstFunction)

[1] 1.0000 1.0625 1.2500 1.5625 2.0000

Error handling

myFirstFunction <- function(x) {
  if (!is.numeric(x)) {
    stop("argument 'x' must be numeric")
  }
  res <- x^2+1
  return(res)
}
myFirstFunction(1)

[1] 2

## Not run: calling myFirstFunction("A") raises an error
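To see the error without stopping a script, the call can be wrapped in tryCatch(). This is a minimal sketch that repeats the function definition so that it runs on its own; the variable name msg is illustrative.

```r
myFirstFunction <- function(x) {
  if (!is.numeric(x)) {
    stop("argument 'x' must be numeric")
  }
  res <- x^2 + 1
  return(res)
}
# tryCatch() intercepts the error and returns its message instead of halting
msg <- tryCatch(myFirstFunction("A"),
                error = function(e) conditionMessage(e))
msg
```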

Default argument

myFirstFunction <- function(x=0) {
  res <- x^2+1
  return(res)
}
myFirstFunction(); myFirstFunction(1)

[1] 1

[1] 2

Default argument (finite set)

mySecondFunction <- 
  function(x=c("A","B","C")) {
    x <- match.arg(x) # error handling: x must be one of "A", "B", "C"
    res <- paste("this is a", x)
    return(res)
}
mySecondFunction(); mySecondFunction("C")

[1] "this is a A"

[1] "this is a C"

## Not run: mySecondFunction(1) returns an error

Multiple inputs/outputs

Multiple objects can be returned using lists.

myThirdFunction <- function(x, y=2) {
    return(list("val"=x, "power"=x^y))
}
myThirdFunction(2); myThirdFunction(2, 3)

$val
[1] 2

$power
[1] 4

$val
[1] 2

$power
[1] 8
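The elements of the returned list can then be retrieved by name. The sketch below repeats the function so that it runs on its own; the variable name out is illustrative.

```r
myThirdFunction <- function(x, y = 2) {
  return(list("val" = x, "power" = x^y))
}
out <- myThirdFunction(2, 3)
out$val         # element accessed by name with $
out[["power"]]  # equivalent access with [[ ]]
```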

Loops

for
repeat/while

Using "for"

for (ind in 1:10) {
  cat(ind," ")
}

1  2  3  4  5  6  7  8  9  10

!! VERY IMPORTANT WARNING: do not use loops unless you have no other solution, because they are MUCH slower than vectorized expressions !!

Computational time comparison

system.time({for (ind in 1:100000) ind^2})

   user  system elapsed 
  0.016   0.000   0.016

system.time((1:100000)^2)

   user  system elapsed 
  0.000   0.000   0.001

For complicated operations, this can make a huge difference.
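To make the comparison concrete, the sketch below stores the results of the loop so that both approaches produce the same vector. The variable names are illustrative, and no timings are shown because they depend on the machine.

```r
n <- 100000
# Loop version: pre-allocate the result, then fill it element by element
squaresLoop <- numeric(n)
for (ind in 1:n) {
  squaresLoop[ind] <- ind^2
}
# Vectorized version: a single call on the whole vector
squaresVec <- (1:n)^2
all.equal(squaresLoop, squaresVec)
```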

Computational time comparison (continued)

data(USArrests)
system.time({for (ind in 1:100000) mean(USArrests)})

   user  system elapsed 
  7.442   0.000   7.447

system.time(apply(USArrests, 2, mean))

   user  system elapsed 
  0.001   0.000   0.001

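Note that the two calls do not perform the same work: the loop evaluates mean(USArrests) 100000 times (and, in recent versions of R, mean() applied to a data frame returns NA with a warning), whereas apply() computes the four column means once.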
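A loop that does the same work as the apply() call would iterate over the four columns. The sketch below, with illustrative variable names, checks that such a loop, apply() and colMeans() give the same result.

```r
data(USArrests)
# Loop over the columns, storing one mean per column
meansLoop <- numeric(ncol(USArrests))
names(meansLoop) <- colnames(USArrests)
for (j in seq_len(ncol(USArrests))) {
  meansLoop[j] <- mean(USArrests[, j])
}
# Same computation without an explicit loop
meansApply <- apply(USArrests, 2, mean)
meansCol <- colMeans(USArrests)
all.equal(meansLoop, meansApply)
all.equal(meansApply, meansCol)
```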

Repeat loop

ind <- 1
repeat {
  ind <- ind+1
  cat(ind, " ")

  if (ind > 10) break
}

2  3  4  5  6  7  8  9  10  11

While loop

ind <- 1
while (ind < 10) {
  ind <- ind+1
  cat(ind, " ")
}

2  3  4  5  6  7  8  9  10

Exercise 1

Write a function myFavoriteStats that takes a numeric vector as input and returns (in that order) its mean, median, standard deviation, minimum and maximum. Apply it to the columns of the dataset USArrests.

         Murder   Assault UrbanPop      Rape
mean    7.78800 170.76000 65.54000 21.232000
median  7.25000 159.00000 66.00000 20.100000
sd      4.35551  83.33766 14.47476  9.366385
min     0.80000  45.00000 32.00000  7.300000
max    17.40000 337.00000 91.00000 46.000000
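One possible solution sketch, among others: the function returns a named vector, and sapply() applies it to each column. The check on the input type is an added assumption, not a requirement of the exercise.

```r
myFavoriteStats <- function(x) {
  # Assumption: reject non-numeric input, as in the error handling example
  if (!is.numeric(x)) stop("argument 'x' must be numeric")
  c("mean" = mean(x), "median" = median(x), "sd" = sd(x),
    "min" = min(x), "max" = max(x))
}
data(USArrests)
# sapply() on a data frame applies the function column by column
res <- sapply(USArrests, myFavoriteStats)
res
```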

Exercise 2

Write a function sumCalc that calculates the sum of the digits of a given integer:

sumCalc(123)

[1] 6

Use the operators %% (remainder) and %/% (integer division) and a loop, noting that

123%%10; 123%/%10

[1] 3

[1] 12
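One possible solution sketch: the last digit is extracted with %% and removed with %/% until nothing is left. It assumes that the input is a non-negative integer.

```r
sumCalc <- function(n) {
  # Assumption: n is a non-negative integer
  total <- 0
  while (n > 0) {
    total <- total + n %% 10  # add the last digit
    n <- n %/% 10             # drop the last digit
  }
  return(total)
}
sumCalc(123)
```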

Exercise 2 (continued)

Using the same approach, write a function prodCalc that calculates the product of the digits of a given integer:

prodCalc(123)

[1] 6

Exercise 2 (continued)

Use the two previous functions to define a function checkEqual that returns TRUE if the sum of the digits is equal to the product of the digits and FALSE otherwise.

checkEqual(123)

[1] TRUE

Use this function to find all numbers between 1 and 100 such that the sum of their digits is equal to the product of their digits. Do not use a loop to perform this step:

 [1]  1  2  3  4  5  6  7  8  9 22
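One possible solution sketch for the whole exercise, assuming positive integer inputs. The helper digitFold is hypothetical and only factors out the digit loop shared by sumCalc and prodCalc; sapply() replaces the loop over the candidates.

```r
# Hypothetical helper: combine the digits of n with the binary function f
digitFold <- function(n, f, init) {
  res <- init
  while (n > 0) {
    res <- f(res, n %% 10)  # use the last digit
    n <- n %/% 10           # drop the last digit
  }
  return(res)
}
sumCalc <- function(n) digitFold(n, `+`, 0)
prodCalc <- function(n) digitFold(n, `*`, 1)
checkEqual <- function(n) sumCalc(n) == prodCalc(n)

candidates <- 1:100
# Logical indexing keeps the numbers for which checkEqual is TRUE
selected <- candidates[sapply(candidates, checkEqual)]
selected
```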