my_vec <- c(2, 3, 1, 6, 4, 3, 3, 7)
my_vec
[1] 2 3 1 6 4 3 3 7
(AST230) R for Data Science
Up until now we’ve been creating simple objects by directly assigning a single value to an object.
It’s very likely that you’ll soon want to progress to creating more complicated objects. Happily, R has a multitude of functions to help you do this.
The first function we will learn about is the c() function.
The c() function is short for concatenate. We use it to join together a series of values and store them in a data structure called a vector or “atomic vector”.
Now that we’ve created a vector, we can use other functions to do useful things with this object.
For example, we can calculate the mean, variance, standard deviation and number of elements in our vector by using the mean(), var(), sd() and length() functions.
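As a minimal sketch of the summary functions named above, the following recreates the my_vec vector from these notes and applies each function to it. The result names (vec_mean, vec_var, vec_sd, vec_n) are chosen here only for illustration.

```r
# Recreate the example vector with c()
my_vec <- c(2, 3, 1, 6, 4, 3, 3, 7)

vec_mean <- mean(my_vec)    # arithmetic mean: 3.625
vec_var  <- var(my_vec)     # sample variance (divides by n - 1)
vec_sd   <- sd(my_vec)      # sample standard deviation, i.e. sqrt(var(my_vec))
vec_n    <- length(my_vec)  # number of elements: 8

vec_mean
vec_var
vec_sd
vec_n
```

Note that var() and sd() use the sample (n - 1) formula, so vec_sd is the square root of vec_var.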
Important
A scalar is a vector of length one.
The six types of atomic vector are:
logical
integer
double
character
complex
raw
Every vector has two key properties: its type, which you can determine with typeof(), and its length, which you can determine with length().
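A short sketch of typeof() and length() applied to a few of the atomic vector types may help here. The vector names (lgl, int, dbl, chr) and their values are made up for illustration.

```r
# One small example vector per type (illustrative values)
lgl <- c(TRUE, FALSE, TRUE)
int <- c(1L, 2L, 3L)
dbl <- c(1.5, 2, 3.25)
chr <- c("a", "b")

typeof(lgl)  # "logical"
typeof(int)  # "integer"
typeof(dbl)  # "double"
typeof(chr)  # "character"

length(lgl)  # 3
length(chr)  # 2

# A single value (a "scalar") is just a vector of length one
length(42)   # 1
```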
Logical vectors are the simplest type of atomic vector because they can take only two possible values: FALSE and TRUE.
comparison operator | symbol in R |
---|---|
equal to | == |
greater than, greater than or equal to | >, >= |
less than, less than or equal to | <, <= |
not equal to | != |
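Logical vectors usually arise as the result of comparisons like those in the table. The sketch below applies each operator to an illustrative numeric vector x (the same values as my_vec) and shows that the result is a logical vector of the same length.

```r
x <- c(2, 3, 1, 6, 4, 3, 3, 7)

is_three  <- x == 3  # FALSE TRUE FALSE FALSE FALSE TRUE TRUE FALSE
above     <- x > 3   # FALSE FALSE FALSE TRUE TRUE FALSE FALSE TRUE
at_most   <- x <= 3  # TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE
not_three <- x != 3  # TRUE FALSE TRUE TRUE TRUE FALSE FALSE TRUE

typeof(above)  # "logical"
length(above)  # 8, one TRUE/FALSE per element of x

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
sum(above)     # 3
```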
Integer and double vectors are known collectively as numeric vectors.
In R, numbers are doubles by default. To make an integer, place an L after the number:
[1] 1.000 5.500 20.134 0.320
[1] "double"
[1] TRUE
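The following sketch contrasts doubles and integers. The values in num_vec are taken from the output shown above, but the calls themselves are an assumption about what might have produced that output, not the original code.

```r
typeof(1)    # "double": plain numbers are doubles by default
typeof(1L)   # "integer": the L suffix makes an integer

# Values copied from the printed output above; the name num_vec is illustrative
num_vec <- c(1, 5.5, 20.134, 0.32)
num_vec
typeof(num_vec)      # "double"
is.numeric(num_vec)  # TRUE

# Integers are numeric too, but a plain 1 is not stored as an integer
is.numeric(1L)       # TRUE
is.integer(1)        # FALSE
```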
Character vectors are used to represent string values. You can think of character strings as something like a word (or multiple words).
It is represented by a collection of characters between double quotes (").
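As a small illustration, the sketch below builds a character vector and inspects it. The name fruit and its values are invented for this example, and nchar() is a base R function not otherwise covered in these notes.

```r
# Each quoted string is one element, even if it contains several words
fruit <- c("apple", "banana", "cherry pie")

typeof(fruit)  # "character"
length(fruit)  # 3 (number of strings, not number of letters)
nchar(fruit)   # 5 6 10 (characters in each string, spaces included)

# A number inside quotes is stored as text
typeof("3")    # "character"
```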
NULL is often used to represent the absence of a vector. NULL typically behaves like a vector of length 0.
NA is used to represent the absence of a value in a vector.
integer and double $\rightarrow$ quantitative data
character $\rightarrow$ qualitative data
logical $\rightarrow$ binary data
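Returning to NULL and NA, the difference between them can be sketched as follows. The vector y is an invented example, and is.na() and the na.rm argument are base R features that go slightly beyond what these notes cover.

```r
# NULL: no vector at all, behaves like length 0
length(NULL)  # 0
is.null(c())  # TRUE: c() with no arguments returns NULL

# NA: a missing value that still occupies a position in the vector
y <- c(4, NA, 6)
length(y)     # 3
is.na(y)      # FALSE TRUE FALSE

# Summaries involving NA are NA unless the missing values are removed
mean(y)                # NA
mean(y, na.rm = TRUE)  # 5
```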
Sometimes it can be useful to create a vector that contains a regular sequence of values in steps of one.
Here we can make use of a shortcut using the : (colon) symbol.
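A minimal sketch of the colon shortcut follows. The names up, down and mid are illustrative.

```r
up <- 1:10    # 1 2 3 4 5 6 7 8 9 10
down <- 10:1  # 10 9 8 7 6 5 4 3 2 1 (counts down when the first value is larger)
mid <- 3:7    # 3 4 5 6 7

up
down
length(mid)   # 5
typeof(up)    # "integer": sequences of whole numbers made with : are integers
```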
seq()
Other useful functions for generating vectors of sequences include the seq() and rep() functions.
For example, to generate a sequence from 1 to 5 in steps of 0.5:
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
[1] 1.000000 1.571429 2.142857 2.714286 3.285714 3.857143 4.428571 5.000000
Here we’ve used the arguments from = and to = to define the limits of the sequence and the by = argument to specify the increment of the sequence.
Play around with other values for these arguments to see their effect.
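The two printed outputs above are consistent with the calls sketched below. The first follows directly from the text. The second, using length.out = 8, is an assumption inferred from the printed values rather than the original code. The last call is an extra illustration of a negative increment.

```r
# From 1 to 5 in steps of 0.5 (9 values)
by_half <- seq(from = 1, to = 5, by = 0.5)
by_half

# Assumed call: 8 equally spaced values from 1 to 5 (step of 4/7)
eight <- seq(from = 1, to = 5, length.out = 8)
eight

# A negative by = counts downwards
down3 <- seq(from = 10, to = 1, by = -3)
down3  # 10 7 4 1
```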
rep()
The rep() function allows you to replicate (repeat) values a specified number of times, for example to repeat the value 2, 10 times.
The arguments times, each and length.out are used in rep() to obtain different vectors.
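The effect of the three arguments can be sketched as follows. The input 1:3 and the result names are chosen only for illustration.

```r
# Repeat the value 2, 10 times
twos <- rep(2, times = 10)
twos

# times: repeat the whole vector
r_times <- rep(1:3, times = 2)      # 1 2 3 1 2 3

# each: repeat every element in turn
r_each <- rep(1:3, each = 2)        # 1 1 2 2 3 3

# length.out: keep recycling until the result has the requested length
r_len <- rep(1:3, length.out = 7)   # 1 2 3 1 2 3 1
```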
We can also repeat non-numeric values, e.g.
[1] "boy" "boy" "boy" "girl" "girl" "girl"
[1] "boy" "boy" "girl" "girl" "boy" "boy" "girl" "girl" "boy" "boy"
[11] "girl" "girl"
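The calls that produced the two outputs above are not shown. One set of calls that gives the same results is sketched here, as an assumption inferred from the printed values.

```r
# Each element repeated 3 times: "boy" x 3, then "girl" x 3
kids_each <- rep(c("boy", "girl"), each = 3)
kids_each

# Each element repeated twice, and that whole pattern repeated 3 times (12 values)
kids_both <- rep(c("boy", "girl"), each = 2, times = 3)
kids_both
```

When each and times are combined, each is applied first and the resulting pattern is then repeated times times.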
c(), :, and seq()
c() and once seq()
c(), seq() with a by argument, and seq() with a length.out argument.
rep() or seq() functions.
Create the vector (101, 102, 103, 200, 205, 210, 1000, 1100, 1200) using a combination of the c() and seq() functions.
Create a vector that repeats the integers from 1 to 5, 10 times, i.e. $(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, \ldots)$. The length of the vector should be 50.
Create the same vector as before, but this time repeat 1, 10 times, then 2, 10 times, etc., i.e. $(1, 1, 1, \ldots, 2, 2, 2, \ldots, \ldots, 5, 5, 5)$. The length of the vector should also be 50.