R’s subsetting operators are fast and powerful, and mastering them allows you to perform complex operations concisely.
Subsetting in R is easy to learn but hard to master because you need to internalize a number of interrelated concepts:
There are three subsetting operators: [[, [, and $.
The subsetting operators interact differently with different vector types (e.g., atomic vectors, lists, factors, matrices, and data frames).
Subsetting can be combined with assignment.
#> [1] 9 8 5
#> [1] 9 10 5
#> [1] 9 8 10
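The code behind the output above is not shown here. A minimal sketch of subsetting combined with assignment, assuming a starting vector of 9, 8, 5 (the names x, y, and z are illustrative, not taken from the original):

```r
# Assumed starting vector, chosen to match the printed values
x <- c(9, 8, 5)
x

# Replace the second element on a copy, leaving x unchanged
y <- x
y[2] <- 10
y

# Replace the third element on another copy
z <- x
z[3] <- 10
z
```

Working on copies keeps the original vector intact, which is one way the third result could still contain the original second element.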
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
Select specific elements of a matrix
#> [1] 7
#> [1] 7 8
Select rows or columns of a matrix
#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8
#> [3,]    6    9
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    3    6    9
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
#> [1] 1 4 7
#> [1] 1 2 3
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#>      [,1]
#> [1,]    1
#> [2,]    2
#> [3,]    3
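The matrix output above can be reproduced with calls along the following lines. This is a sketch, assuming the matrix was built with matrix(1:9, nrow = 3) and named m; neither the name nor the exact calls appear in the original.

```r
# Assumed 3 x 3 matrix, filled column by column
m <- matrix(1:9, nrow = 3)

# Specific elements: [row, column]
m[1, 3]        # single element
m[1:2, 3]      # rows 1 and 2 of column 3

# Whole rows or columns: leave the other index empty
m[, 2:3]       # columns 2 and 3
m[c(1, 3), ]   # rows 1 and 3

# A single row or column is simplified to a vector by default
m[1, ]
m[, 1]

# drop = FALSE keeps the result as a matrix
m[1, , drop = FALSE]
m[, 1, drop = FALSE]
```

The drop = FALSE argument matters when later code expects a matrix and would fail on a plain vector.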
#>   x y  z
#> 1 1 a 11
#> 2 2 b 12
#> 3 3 c 13
#> 4 4 d 14
#> [1] "x" "y" "z"
#> 'data.frame': 4 obs. of 3 variables:
#> $ x: int 1 2 3 4
#> $ y: chr "a" "b" "c" "d"
#> $ z: int 11 12 13 14
#> [1] "1" "2" "3" "4"
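A data frame matching the output above could be created and inspected as follows. This is a sketch: the name df is an assumption, and the chr type shown for y assumes R 4.0 or later, where strings are not converted to factors by default.

```r
# Assumed data frame matching the printed values
df <- data.frame(
  x = 1:4,
  y = c("a", "b", "c", "d"),
  z = 11:14
)
df

names(df)      # column (variable) names
str(df)        # compact summary of structure and types
rownames(df)   # row names, stored as character strings
```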
Positional Indexing
#>   y  z
#> 1 a 11
#> 2 b 12
#>   x y  z
#> 2 2 b 12
#> 3 3 c 13
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
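Positional indexing on a data frame uses the same [row, column] form as a matrix. A sketch that yields the three results above, assuming the data frame is named df and holds the values printed earlier:

```r
# Assumed data frame from the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

df[1:2, 2:3]   # rows 1-2, columns 2-3
df[2:3, ]      # rows 2-3, all columns
df[, 1:2]      # all rows, columns 1-2
```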
Extract a specific variable x from the data frame
#> [1] 1 2 3 4
#> [1] 1 2 3 4
#> [1] 1 2 3 4
#>   x
#> 1 1
#> 2 2
#> 3 3
#> 4 4
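The output above shows two kinds of result: a plain vector (three times) and a one-column data frame. A sketch of calls that behave this way, assuming the data frame is named df (the exact calls used in the original are not shown):

```r
# Assumed data frame from the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

# Each of these returns the column as a vector
df$x
df[["x"]]
df[, "x"]

# Single-bracket indexing by name keeps the data frame structure
df["x"]
```

The distinction is the same as for lists: [[ and $ extract an element, while [ returns a smaller object of the same kind.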
#> [1] "x" "y" "z"
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
#>   x y
#> 1 1 a
#> 2 2 b
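Columns can also be selected by name instead of position. A sketch consistent with the output above, again assuming a data frame named df:

```r
# Assumed data frame from the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

names(df)

# Select columns by name, with or without the row index
df[c("x", "y")]
df[, c("x", "y")]

# Combine a row selection with named columns
df[1:2, c("x", "y")]
```

Selecting by name is usually more robust than by position, since it keeps working if columns are reordered.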
Logical Indexing
#>   x y  z
#> 1 1 a 11
#> 2 2 b 12
#> 3 3 c 13
#> 4 4 d 14
#>   x y  z
#> 1 1 a 11
#> 3 3 c 13
#> [1] FALSE FALSE TRUE TRUE
#>   x y  z
#> 3 3 c 13
#> 4 4 d 14
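Logical indexing keeps the rows where the index is TRUE. A sketch that reproduces the output above, assuming a data frame named df and a condition of x > 2 (the condition is inferred from the printed logical vector, not stated in the original):

```r
# Assumed data frame from the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

# An explicit logical vector: keep rows 1 and 3
keep <- c(TRUE, FALSE, TRUE, FALSE)
df[keep, ]

# A comparison produces a logical vector of the same length
df$x > 2

# Use the comparison directly as the row index
df[df$x > 2, ]
```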
How many variables are in mtcars? Show the list of these variables.
Extract the vector mpg from mtcars, and calculate its mean and standard deviation.
Check whether there are any missing values in wt of mtcars.
Obtain a data frame containing the rows of mtcars with mpg > 22.
Obtain a data frame from mtcars with gear = 5 and cyl = 4, and keep only the variables gear and cyl.
A list, created with list(), is the most flexible data structure in R: vectors of different lengths and even a data frame can be included in the same list.
A data frame is a special case of a list, and [[ ]] is useful for extracting elements of a list.
A list is considered a heterogeneous vector because its elements can be of different types.
The operators [[, [, and $ can be used to select elements from a list.
Create a list
#> List of 4
#> $ : int [1:3] 1 2 3
#> $ : chr "a"
#> $ : logi [1:3] TRUE FALSE FALSE
#> $ : int [1:3] 2 5 9
#> [1] 1 2 3
#> [[1]]
#> [1] 1 2 3
#> [1] "integer"
#> [1] "list"
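The output above illustrates the difference between [[ and [ on a list. A sketch of code that produces it, assuming the list is named l and built from the values shown by str() (the name and the integer literals are assumptions):

```r
# Assumed unnamed list with four elements of different types
l <- list(1:3, "a", c(TRUE, FALSE, FALSE), c(2L, 5L, 9L))
str(l)

# [[ extracts the element itself
l[[1]]

# [ returns a list containing that element
l[1]

class(l[[1]])   # type of the extracted element
class(l[1])     # still a list
```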
#> $x
#> [1] 1 2 3 4 5
#>
#> $y
#> [1] TRUE FALSE
#>
#> $z
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
Extracting x
#> [1] 1 2 3 4 5
#> [1] 1 2 3 4 5
#> [1] 1 2 3 4 5
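With a named list, the same element can be reached by name or by position. A sketch matching the output above, assuming the list is named l and its z element was built with matrix(1:4, nrow = 2):

```r
# Assumed named list matching the printed structure
l <- list(
  x = 1:5,
  y = c(TRUE, FALSE),
  z = matrix(1:4, nrow = 2)
)
l

# Three equivalent ways to extract element x
l$x
l[["x"]]
l[[1]]
```

The $ form is convenient interactively, while [["x"]] also works when the name is stored in a variable.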