Exercise 1: Solutions third part

Task 3.1

Define a vector f1 containing 5 arbitrary elements of the type character.

f1 <- c("apple", "banana", "cherry", "date", "elderberry")

Task 3.2

Define a vector f2 containing 5 arbitrary elements of the type factor.

f2 <- factor(c("red", "green", "blue", "yellow", "purple"))
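
As a brief sketch of what distinguishes a factor from a character vector: a factor stores integer codes together with a set of levels, and the levels are sorted alphabetically by default. The definition of f2 is repeated here only so that the snippet runs on its own.

```r
# Same definition as in the solution, repeated so the snippet is self-contained
f2 <- factor(c("red", "green", "blue", "yellow", "purple"))

levels(f2)       # the distinct categories, sorted alphabetically by default
as.integer(f2)   # the integer codes, each pointing into levels(f2)
class(f2)        # "factor"
```

The integer codes are the numbers that str() prints for a factor.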

Task 3.3

Define a vector f3 containing 5 other arbitrary elements of the type numeric.

f3 <- c(10.5, 23.3, 42.7, 5.8, 12.1)

Task 3.4

Create a list L containing the vectors f1, f2, f3.

L <- list(f1 = f1, f2 = f2, f3 = f3)
L
$f1
[1] "apple"      "banana"     "cherry"     "date"       "elderberry"

$f2
[1] red    green  blue   yellow purple
Levels: blue green purple red yellow

$f3
[1] 10.5 23.3 42.7  5.8 12.1

Task 3.5

Look at the structure of the list.

str(L)
List of 3
 $ f1: chr [1:5] "apple" "banana" "cherry" "date" ...
 $ f2: Factor w/ 5 levels "blue","green",..: 4 2 1 5 3
 $ f3: num [1:5] 10.5 23.3 42.7 5.8 12.1
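
Once the structure is known, individual elements can be pulled out of the list. The following is a minimal sketch of the usual access forms; the list is rebuilt inline so the snippet is self-contained.

```r
# Rebuild the list from the earlier tasks so the snippet runs on its own
L <- list(
  f1 = c("apple", "banana", "cherry", "date", "elderberry"),
  f2 = factor(c("red", "green", "blue", "yellow", "purple")),
  f3 = c(10.5, 23.3, 42.7, 5.8, 12.1)
)

L$f3               # element by name: the numeric vector itself
L[["f3"]]          # the same element, using double brackets
L[3]               # single brackets return a list of length 1
sapply(L, class)   # class of each element, one entry per list element
```

Double brackets (or $) return the element, while single brackets return a sub-list that still wraps it.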

Task 3.6

Create a data.frame df1 using L. Look at the structure again.

df1 <- data.frame(L)
str(df1)
'data.frame':   5 obs. of  3 variables:
 $ f1: chr  "apple" "banana" "cherry" "date" ...
 $ f2: Factor w/ 5 levels "blue","green",..: 4 2 1 5 3
 $ f3: num  10.5 23.3 42.7 5.8 12.1

The structure looks like that of the list, but str() additionally reports the number of observations (rows) and variables (columns).
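
The similarity is not a coincidence: a data.frame is a list whose elements all have the same length. A small sketch to check this, with the data frame rebuilt inline so the snippet runs on its own:

```r
# Rebuild the data frame from the earlier tasks
df1 <- data.frame(
  f1 = c("apple", "banana", "cherry", "date", "elderberry"),
  f2 = factor(c("red", "green", "blue", "yellow", "purple")),
  f3 = c(10.5, 23.3, 42.7, 5.8, 12.1)
)

is.list(df1)    # TRUE: a data.frame is a list of columns
length(df1)     # number of columns, as for any list
dim(df1)        # number of rows and columns
names(df1)      # column names, the same as the list names
```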

Task 3.7

What are the elements in the second row?

second_row <- df1[2, ]
second_row
      f1    f2   f3
2 banana green 23.3

Task 3.8

What are the elements in the second column?

second_column <- df1[, 2]
second_column
[1] red    green  blue   yellow purple
Levels: blue green purple red yellow
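
Note that selecting a single column with df1[, 2] returns a plain vector (here a factor), not a data.frame. The sketch below compares the common ways to select a column; the data frame is rebuilt inline so the snippet is self-contained.

```r
# Rebuild the data frame from the earlier tasks
df1 <- data.frame(
  f1 = c("apple", "banana", "cherry", "date", "elderberry"),
  f2 = factor(c("red", "green", "blue", "yellow", "purple")),
  f3 = c(10.5, 23.3, 42.7, 5.8, 12.1)
)

df1[, 2]                 # a factor: the data.frame dimension is dropped
df1[[2]]                 # the same vector, using list-style access
df1$f2                   # the same vector, selected by name
df1[, 2, drop = FALSE]   # stays a data.frame with one column
df1[2]                   # also a one-column data.frame
```

Use drop = FALSE when later code expects a data.frame regardless of how many columns are selected.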

Task 3.9

What are the values in rows 2 to 4?

rows_2_to_4 <- df1[2:4, ]
rows_2_to_4
      f1     f2   f3
2 banana  green 23.3
3 cherry   blue 42.7
4   date yellow  5.8
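
Row and column indices can also be combined, and rows can be selected by a logical condition instead of by position. A minimal sketch, with the data frame rebuilt inline; the threshold of 10 is an arbitrary value chosen for illustration.

```r
# Rebuild the data frame from the earlier tasks
df1 <- data.frame(
  f1 = c("apple", "banana", "cherry", "date", "elderberry"),
  f2 = factor(c("red", "green", "blue", "yellow", "purple")),
  f3 = c(10.5, 23.3, 42.7, 5.8, 12.1)
)

df1[2:4, c("f1", "f3")]   # rows 2 to 4, two columns selected by name
df1[2, 3]                 # a single cell: row 2, column 3
df1[df1$f3 > 10, ]        # all rows where f3 exceeds 10 (illustrative threshold)
```

The subset keeps the original row names, which is why the output of df1[2:4, ] is labelled 2, 3, 4 rather than 1, 2, 3.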

Task 3.10

Save the data set as a CSV file.

write.csv(df1, file = "data_set.csv", row.names = FALSE)

Task 3.11

Load the data set into R as a new object df2 and compare it with the original one df1.

df2 <- read.csv("data_set.csv")
all.equal(df1, df2)
[1] "Component \"f2\": 'current' is not a factor"

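The second variable is no longer a factor: a CSV file stores only text, and read.csv() reads string columns as character by default (since R 4.0.0). We can convert it back to a factor: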

df2$f2 <- as.factor(df2$f2)
all.equal(df1, df2)
[1] TRUE
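
As an alternative to converting after loading, the column types can be set while reading the file. The sketch below shows two options of read.csv(); it writes to a temporary file (csv_path is a name introduced here for illustration) so that it does not touch data_set.csv.

```r
# Rebuild the data frame from the earlier tasks
df1 <- data.frame(
  f1 = c("apple", "banana", "cherry", "date", "elderberry"),
  f2 = factor(c("red", "green", "blue", "yellow", "purple")),
  f3 = c(10.5, 23.3, 42.7, 5.8, 12.1)
)

# Illustrative temporary file, used instead of "data_set.csv"
csv_path <- tempfile(fileext = ".csv")
write.csv(df1, file = csv_path, row.names = FALSE)

# Option 1: turn every string column into a factor while reading.
# This also converts f1, which was a character vector in df1.
df_a <- read.csv(csv_path, stringsAsFactors = TRUE)
str(df_a)

# Option 2: state the class of each column explicitly
df_b <- read.csv(
  csv_path,
  colClasses = c(f1 = "character", f2 = "factor", f3 = "numeric")
)
all.equal(df1, df_b)   # compare with the original

unlink(csv_path)       # remove the temporary file
```

Option 2 is the more precise of the two here, because only f2 should become a factor.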