R and RStudio
(AST230) R for Data Science
Introduction to R
- R is an extremely powerful programming language for statistical computing and graphics generation
- In 1993, R was created at the University of Auckland, New Zealand by two statistics professors
- Ross Ihaka (University of Auckland, New Zealand)
- Robert Gentleman (University of Waterloo, Canada)
- R is a dialect of the S programming language
R is flexible and free to download (under the GNU General Public License), and has been widely used in academic environments over the last two decades
R is open source and is supported by an extensive user community
R is currently maintained by the R Core Team
CRAN (the Comprehensive R Archive Network) is a repository of additional R packages, contributed by the R user community
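As a rough illustration of how contributed packages are obtained from CRAN, the sketch below installs a package only when it is missing. The helper name install_if_missing is hypothetical (it is not part of R); install.packages() and requireNamespace() are standard R functions. Installing a contributed package needs an internet connection.

```r
# Hypothetical helper: install a package from CRAN only if it is not
# already available, then report whether it can be loaded.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads the package from a CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with every R installation, so nothing is downloaded here.
install_if_missing("stats")

# For a contributed package the call would look like this:
# install_if_missing("ggplot2")
# library(ggplot2)   # attach the package before using its functions
```

A package is installed once per machine, whereas library() is called in every session that uses it.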
Why should we learn R
- R is open source and freely available.
- R is available for Windows, Mac and Linux operating systems.
- R has an extensive and highly flexible graphical facility capable of producing publication-quality figures.
- R has an expanding set of freely available ‘packages’ to extend R’s capabilities.
Scripts
R is a command-line-driven program. All the steps of an analysis (e.g. from reading the data to producing the final results) can be saved in a script, so the analysis can be redone without much effort
- E.g. rerunning the script is all that is needed; there is no need to remember the sequence of clicks used for the analysis
Working with scripts makes the steps of the analysis explicit, and the code can be inspected by others (which helps to improve the code and remove any mistakes!)
Working with scripts helps to understand the associated statistical methods more clearly
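A minimal sketch of what such a script might look like is given below. The file name analysis.R and the analysis itself are assumptions made for illustration; the built-in mtcars data set, written to a temporary CSV file, stands in for a real data file.

```r
# analysis.R: a hypothetical script covering an analysis from start to finish.

# Step 1: read the data. A temporary CSV file built from the built-in
# 'mtcars' data set stands in for a real data file here.
data_file <- tempfile(fileext = ".csv")
write.csv(mtcars, data_file, row.names = FALSE)
cars_data <- read.csv(data_file)

# Step 2: analyse. Mean fuel economy (miles per gallon) for each
# number of cylinders.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars_data, FUN = mean)

# Step 3: save the final results so they can be regenerated at any time.
results_file <- tempfile(fileext = ".csv")
write.csv(mpg_by_cyl, results_file, row.names = FALSE)
print(mpg_by_cyl)
```

Saved as analysis.R, the whole analysis could be rerun with source("analysis.R") in the console, or with Rscript analysis.R from a terminal.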
Reproducibility
An analysis is reproducible when someone else (including your future self!) can obtain the same results from the same data set using the same analysis (coded in scripts)
Nowadays, funding agencies and peer-reviewed journals expect analyses to be reproducible (journals often ask for the data and code before publishing an accepted manuscript)
R has become an integral part of reproducible research: it can be used to generate (dynamic) documents (e.g. manuscripts, reports, etc.) from code, so a small change in the data, analysis, or organization is propagated automatically by running the scripts again
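One small ingredient of reproducibility, not covered above and shown here only as an illustrative sketch, is fixing the random seed whenever an analysis involves random numbers. The seed value 230 is arbitrary.

```r
# Fixing the seed makes the random draws, and hence the results, repeatable.
set.seed(230)                    # arbitrary seed value
first_run <- mean(rnorm(100))    # mean of 100 simulated normal values

set.seed(230)                    # same seed, same analysis
second_run <- mean(rnorm(100))

identical(first_run, second_run) # TRUE: both runs give the same result
```

Without the set.seed() calls, the two runs would almost surely differ, and someone rerunning the script could not match the reported numbers exactly.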
Download R
Download the R installer from the Comprehensive R Archive Network (CRAN): https://cran.r-project.org
To download the R installer for Windows:
- Click on Download R for Windows
- Click on Install R for the first time
R interface
Download RStudio
Go to the page https://posit.co/download/rstudio-desktop/ to download RStudio
RStudio is an integrated development environment (IDE) for R. An IDE provides a graphical user interface (GUI) in which you can write your code, see the results, and inspect the variables created during the course of programming.
R is the language
RStudio is software created to facilitate our use of R
RStudio interface
The RStudio user interface has 4 primary panes:
- Source pane: used to write and edit R code and other related documents
- Console pane: the workhorse of R, where R evaluates all the code you write
- Environment pane: contains the Environment, History, Connections, Build, VCS, and Tutorial tabs
- Output pane: contains the Files, Plots, Packages, Help, Viewer, and Presentation tabs
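To give a feel for how the Console and Environment panes work together, here is a small sketch of commands that could be typed at the Console prompt. The object name x and its values are made up for illustration.

```r
# Each line typed at the Console prompt is evaluated immediately.
2 + 3               # simple arithmetic; prints 5
x <- c(4, 8, 15)    # creates an object named x (hypothetical example)
mean(x)             # prints 9
ls()                # lists the objects created so far, including "x"
```

After the second line is run, x should also appear in the Environment tab, which shows the objects held in the current session.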
Set the working directory in RStudio
R is always pointed at a directory on our computer, called the working directory. You can check the file path of your working directory by looking at the bar at the top of the Console pane.
We can also find out the working directory by running the getwd() function in the console.
We can set the working directory manually in two ways:
- The first way is to use the console and run the command setwd("directory/path"), giving the path of the directory that you want to be the working directory, in double quotes.
- The second way is to set the working directory from the GUI. In the Files tab of the Output pane, click the three-dots (...) button, which opens a file browser where you can choose your working directory. Once you have chosen your working directory, click the More (settings) button in the Files tab and select "Set As Working Directory" from the popup menu.
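The console route can be sketched as follows. The temporary directory returned by tempdir() is only a stand-in for a real project folder, and the Windows path in the comment is hypothetical.

```r
old_wd <- getwd()    # remember the current working directory
new_wd <- tempdir()  # stand-in for a real project folder

setwd(new_wd)        # change the working directory
getwd()              # confirm where R is now pointed

setwd(old_wd)        # restore the original working directory

# On Windows, write paths with forward slashes or doubled backslashes,
# e.g. setwd("C:/Users/yourname/Documents")   (hypothetical path)
```

Relative file names passed to functions such as read.csv() are resolved against the working directory, which is why it is worth checking it before reading or writing files.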