5 Subsetting

Subsetting

  • R’s subsetting operators are fast and powerful, and mastering them allows you to concisely perform complex operations

  • Subsetting in R is easy to learn but hard to master because you need to internalize a number of interrelated concepts

    • There are three subsetting operators: [[, [, and $

    • Subsetting operators interact differently with different vector types (e.g. atomic vectors, lists, factors, matrices, and data frames)

    • Subsetting can be combined with assignment
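As a quick illustration of the three operators and of subsetting combined with assignment, here is a minimal sketch. The list `person` and its contents are hypothetical and are used only for this example.

```r
# Hypothetical list, used only for illustration
person <- list(name = "Ada", scores = c(90, 85, 77))

person["scores"]     # `[` keeps the container: the result is a list of length 1
person[["scores"]]   # `[[` extracts the element itself: a numeric vector
person$scores        # `$` extracts an element by name, like `[[`

# Subsetting combined with assignment: replace the second score
person$scores[2] <- 88
person$scores
```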

Subsetting atomic vectors

  • We have already discussed how to select elements from an atomic vector using numerical and logical indexing
#> [1] 9 8 5
#> [1]  9 10  5
#> [1]  9  8 10
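A minimal sketch that reproduces the three results above. The vector `x` and the particular index expressions are assumptions, since other inputs could give the same output.

```r
# Assumed example vector
x <- c(9, 8, 10, 5)

x[c(1, 2, 4)]   # positive integers select by position
x[-2]           # a negative integer drops that position
x[x > 7]        # a logical vector keeps elements where the condition is TRUE
```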

Subsetting matrices

#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9

Select specific elements of a matrix

#> [1] 7
#> [1] 7 8
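The matrix and the two selections shown above are consistent with the following sketch. It assumes a 3 x 3 matrix named `m` filled column-wise with 1 to 9; the name is an assumption.

```r
# Assumed matrix: 1..9 filled column by column
m <- matrix(1:9, nrow = 3)
m

m[1, 3]     # single element: row 1, column 3
m[1:2, 3]   # rows 1 and 2 of column 3, returned as a vector
```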

Select rows or columns of a matrix

#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8
#> [3,]    6    9
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    3    6    9
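Leaving one index empty selects every row or every column. A minimal sketch that yields the two results above, assuming the same matrix `m` (redefined here so the block runs on its own):

```r
# Assumed matrix: 1..9 filled column by column
m <- matrix(1:9, nrow = 3)

m[, 2:3]       # all rows, columns 2 and 3
m[c(1, 3), ]   # rows 1 and 3, all columns
```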

Subsetting matrices

#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
#> [1] 1 4 7
#> [1] 1 2 3
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#>      [,1]
#> [1,]    1
#> [2,]    2
#> [3,]    3
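The results above show that selecting a single row or column returns a plain vector by default, while `drop = FALSE` keeps the matrix shape. A sketch under the same assumption about `m`:

```r
# Assumed matrix: 1..9 filled column by column
m <- matrix(1:9, nrow = 3)
m

m[1, ]                 # first row, simplified to a vector
m[, 1]                 # first column, simplified to a vector
m[1, , drop = FALSE]   # first row, kept as a 1 x 3 matrix
m[, 1, drop = FALSE]   # first column, kept as a 3 x 1 matrix
```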

Subsetting data frames

#>   x y  z
#> 1 1 a 11
#> 2 2 b 12
#> 3 3 c 13
#> 4 4 d 14
#> [1] "x" "y" "z"
#> 'data.frame':    4 obs. of  3 variables:
#>  $ x: int  1 2 3 4
#>  $ y: chr  "a" "b" "c" "d"
#>  $ z: int  11 12 13 14
#> [1] "1" "2" "3" "4"
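A minimal sketch that would produce the output above. The name `df` is an assumption, and the `chr` column type shown by `str()` assumes R 4.0 or later, where strings are not converted to factors by default.

```r
# Assumed data frame matching the printed output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)
df

names(df)      # column (variable) names
str(df)        # compact structure: dimensions and column types
rownames(df)   # default row names are "1", "2", ...
```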

Subsetting data frames

Positional Indexing

#>   y  z
#> 1 a 11
#> 2 b 12
#>   x y  z
#> 2 2 b 12
#> 3 3 c 13
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
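Positional indexing on a data frame uses `df[rows, columns]`, as with matrices. A sketch consistent with the three results above, assuming the data frame `df` from the earlier slide (redefined here so the block runs on its own):

```r
# Assumed data frame matching the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

df[1:2, 2:3]   # rows 1-2, columns 2-3
df[2:3, ]      # rows 2-3, all columns
df[, 1:2]      # all rows, columns 1-2
```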

Subsetting data frames

Extract a specific variable, x, from a data frame

#> [1] 1 2 3 4
#> [1] 1 2 3 4
#> [1] 1 2 3 4
#>   x
#> 1 1
#> 2 2
#> 3 3
#> 4 4
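The first three results above are plain vectors and the last is a one-column data frame. One set of calls that gives this pattern, assuming the same `df`, is sketched below; the original slide may have used different but equivalent expressions.

```r
# Assumed data frame matching the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

df$x        # vector
df[["x"]]   # vector
df[, "x"]   # vector (a single column is simplified by default)
df["x"]     # data frame with one column
```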

Subsetting data frames

#> [1] "x" "y" "z"
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
#>   x y
#> 1 1 a
#> 2 2 b
#> 3 3 c
#> 4 4 d
#>   x y
#> 1 1 a
#> 2 2 b
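Columns can also be selected by name. A sketch that matches the output above, under the same assumption about `df`:

```r
# Assumed data frame matching the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)

names(df)

df[, c("x", "y")]      # matrix-style: all rows, named columns
df[c("x", "y")]        # list-style: named columns only
df[1:2, c("x", "y")]   # first two rows of the named columns
```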

Subsetting data frames

Logical Indexing

#>   x y  z
#> 1 1 a 11
#> 2 2 b 12
#> 3 3 c 13
#> 4 4 d 14
#>   x y  z
#> 1 1 a 11
#> 3 3 c 13
#> [1] FALSE FALSE  TRUE  TRUE
#>   x y  z
#> 3 3 c 13
#> 4 4 d 14
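With logical indexing, rows are kept wherever the logical vector is `TRUE`. A minimal sketch consistent with the output above; the specific conditions are assumptions, chosen because they select rows 1 and 3, then rows 3 and 4.

```r
# Assumed data frame matching the earlier output
df <- data.frame(x = 1:4, y = c("a", "b", "c", "d"), z = 11:14)
df

df[c(TRUE, FALSE, TRUE, FALSE), ]   # explicit logical vector: rows 1 and 3

df$x > 2                            # a comparison returns a logical vector
df[df$x > 2, ]                      # use it to keep the matching rows
```

Note the trailing comma in `df[df$x > 2, ]`: the condition goes in the row position and the empty column position keeps all variables.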

Exercise 5

  • How many variables are in mtcars? Show the list of these variables.

  • Extract the vector mpg from mtcars, and calculate its mean and standard deviation.

  • Check whether there are any missing values in wt of mtcars

  • Obtain a data frame containing only the rows of mtcars with mpg > 22

  • Obtain a data frame from mtcars with gear equal to 5 and cyl equal to 4, keeping only the variables gear and cyl

Subsetting lists

  • A list, created with list(), is the most flexible data structure in R: vectors of different lengths and even data frames can be included in a list

  • A data frame is a special case of a list, and [[ ]] is useful for extracting a single element of a list

  • A list is considered a heterogeneous vector because its elements can be of different types

  • The operators [[, [, and $ can be used to select elements from a list

Subsetting lists

Create a list

#> List of 4
#>  $ : int [1:3] 1 2 3
#>  $ : chr "a"
#>  $ : logi [1:3] TRUE FALSE FALSE
#>  $ : int [1:3] 2 5 9
#> [1] 1 2 3
#> [[1]]
#> [1] 1 2 3
#> [1] "integer"
#> [1] "list"
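The output above contrasts `[[`, which returns the element itself, with `[`, which returns a shorter list. A sketch that reproduces it, assuming an unnamed list called `l`:

```r
# Assumed list matching the str() output above
l <- list(1:3, "a", c(TRUE, FALSE, FALSE), c(2L, 5L, 9L))
str(l)

l[[1]]           # the first element itself: an integer vector
l[1]             # a list of length 1 containing that element

typeof(l[[1]])   # "integer"
typeof(l[1])     # "list"
```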

Subsetting lists

#> $x
#> [1] 1 2 3 4 5
#> 
#> $y
#> [1]  TRUE FALSE
#> 
#> $z
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4

Extracting x

#> [1] 1 2 3 4 5
#> [1] 1 2 3 4 5
#> [1] 1 2 3 4 5
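For a named list, the same element can be reached by name or by position. A sketch consistent with the output above, assuming the list is called `l`:

```r
# Assumed named list matching the printed output
l <- list(x = 1:5, y = c(TRUE, FALSE), z = matrix(1:4, nrow = 2))
l

l$x        # by name with $
l[["x"]]   # by name with [[
l[[1]]     # by position with [[
```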