Getting help with functions and features

R has a built-in help facility. To get more information on a specific function, for example matrix, use the command help(matrix); an alternative is ?matrix. The examples in a help topic can normally be run with example("matrix"). apropos("matrix") returns a character vector giving the names of the objects on the search list whose names contain "matrix".
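
As a minimal sketch, the help commands above can be tried in a session as follows. The variable name hits is only illustrative, and the exact set of names returned by apropos() depends on which packages are attached.

```r
# Open the help page for matrix() (two equivalent forms)
help(matrix)
?matrix

# Run the examples from the help page
example("matrix")

# Names of objects on the search list that contain "matrix"
hits <- apropos("matrix")
is.character(hits)       # TRUE: the result is a character vector
"as.matrix" %in% hits    # TRUE in a standard session (base is always attached)
```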

R commands, case sensitivity, etc.

  • R is an expression language
  • R is case sensitive (A and a are different symbols)
  • Commands are separated either by a semicolon (‘;’) or by a newline
  • Comments can be put almost anywhere, starting with a hashmark (‘#’); everything to the end of the line is a comment
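
A short sketch of these rules in practice; the object names a, A and b are arbitrary.

```r
# Symbols are case sensitive: a and A are distinct objects
a <- 1
A <- 2
a == A          # FALSE

# Two commands on one line, separated by a semicolon
b <- a + A; b   # the second command prints 3

# A bare expression is evaluated and its value printed
a + A           # a comment may follow a command on the same line
```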

Simple manipulations; numbers and vectors

  • Vectors and assignment
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

The following further assignment creates a vector y with 11 entries consisting of two copies of x with a zero in the middle place.

y <- c(x, 0, x)
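
For comparison, here is a sketch of other standard ways to write the same assignment in R. The names x1, x2 and x3 are introduced only for this illustration.

```r
# Three further ways to make the same assignment as x <- c(...)
assign("x1", c(10.4, 5.6, 3.1, 6.4, 21.7))   # function form
c(10.4, 5.6, 3.1, 6.4, 21.7) -> x2           # right-pointing arrow
x3 = c(10.4, 5.6, 3.1, 6.4, 21.7)            # '=' also works at top level

identical(x1, x2) && identical(x2, x3)       # TRUE

# y holds x, a zero, then x again: 5 + 1 + 5 = 11 entries
y <- c(x1, 0, x1)
length(y)                                    # 11
y[6]                                         # 0, the middle entry
```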

Different types of vectors

x<-c(0.5,0.6)       ##numeric
x<-c(TRUE,FALSE)    ##logical
x<-c("a","b","c")   ##character
x<-9:29             ##integer

Using the vector() function

x<-vector("numeric",length=10)
x
##  [1] 0 0 0 0 0 0 0 0 0 0
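
The storage type of a vector can be checked with typeof(), and vector() works for other modes too. The following is a small sketch rather than an exhaustive list.

```r
typeof(c(0.5, 0.6))               # "double" (numeric)
typeof(9:29)                      # "integer"
typeof(9)                         # "double": a bare number is not an integer
typeof(9L)                        # "integer": the L suffix requests one

# vector() fills with the default value of the requested mode
v_lgl <- vector("logical", length = 2)     # FALSE FALSE
v_chr <- vector("character", length = 2)   # "" ""
v_lgl
v_chr
```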

Vector Arithmetic

  • The elementary arithmetic operators are the usual +, -, *, / and ^ for raising to a power.

  • In addition, all of the common arithmetic functions are available, e.g., log, exp, sin, cos, tan, sqrt, and max and min (which select the largest and smallest elements of a vector respectively).
  • range is a function whose value is a vector of length two, namely c(min(x), max(x)); length(x) is the number of elements in x; sum(x) gives the total of the elements in x and prod(x) their product.
  • Two statistical functions are mean(x), which calculates the sample mean and is the same as sum(x)/length(x), and var(x), which gives the sample variance sum((x - mean(x))^2)/(length(x) - 1).
  • If the argument to var() is an n-by-p matrix, the value is the p-by-p sample covariance matrix, obtained by treating the rows as independent p-variate sample vectors.
  • sort(x) returns a vector of the same size as x with the elements arranged in increasing order.
  • seq(-5, 5, by=.2) -> x generates in x the vector c(-5.0, -4.8, -4.6, ..., 4.6, 4.8, 5.0).
  • y <- rep(x, times=5) puts five copies of x end-to-end in y.
  • Another useful version is z <- rep(x, each=5), which repeats each element of x five times before moving on to the next.
  • x[i] gives the i-th element of the vector x, and x > 0 gives a logical vector whose i-th element is TRUE if x[i] > 0 and FALSE otherwise.
  • For a vector x, table(x) returns the frequency distribution of the elements in x.
  • sample(x, size) takes a random sample of the specified size from the elements of x, either with or without replacement, optionally weighted by a probability vector.
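
The functions listed above can be checked against each other in a few lines. This is a sketch using the vector x from the earlier section; the seed, the sample sizes and the probability weights are arbitrary choices made for illustration.

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# Arithmetic is performed element by element
x * 2                                   # 20.8 11.2 6.2 12.8 43.4

# range() is c(min(x), max(x))
identical(range(x), c(min(x), max(x)))  # TRUE

# mean() and var() agree with their defining formulas
all.equal(mean(x), sum(x) / length(x))                      # TRUE
all.equal(var(x), sum((x - mean(x))^2) / (length(x) - 1))   # TRUE

sort(x)                                 # 3.1 5.6 6.4 10.4 21.7

# seq() and the two forms of rep()
s <- seq(-5, 5, by = 0.2)
length(s)                               # 51
rep(1:3, times = 2)                     # 1 2 3 1 2 3
rep(1:3, each = 2)                      # 1 1 2 2 3 3

# A logical vector can be used to select elements
x > 6                                   # TRUE FALSE FALSE TRUE TRUE
x[x > 6]                                # 10.4 6.4 21.7

# Frequency distribution of the elements
table(c("a", "b", "a"))

# Random sampling; set.seed() only makes the draw reproducible
set.seed(1)
s1 <- sample(x, size = 3)               # without replacement (the default)
s2 <- sample(x, size = 10, replace = TRUE,
             prob = c(0.5, 0.2, 0.1, 0.1, 0.1))
```

Sampling more elements than x contains is only possible with replace = TRUE.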

Mixing Objects: coercion occurs

y<-c(1.7,"a") ##character
y<-c(TRUE,2)  ##numeric
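
When types are mixed, c() converts every element to the most general type present (logical, then integer, then numeric, then character). A brief sketch:

```r
class(c(1.7, "a"))      # "character": 1.7 becomes "1.7"
class(c(TRUE, 2))       # "numeric":   TRUE becomes 1
class(c("a", TRUE))     # "character": TRUE becomes "TRUE"
class(c(1L, 2.5))       # "numeric":   1L becomes 1

c(1.7, "a")             # "1.7" "a"
c(TRUE, 2)              # 1 2

# Logical-to-numeric coercion makes counting TRUE values easy
n_true <- sum(c(TRUE, FALSE, TRUE))
n_true                  # 2
```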

Explicit Coercion

Objects of one class can be converted to another class using the as.* functions.

x<-1:12
class(x)
## [1] "integer"
as.numeric(x)
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
as.logical(x)
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
as.character(x)
##  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12"

Nonsensical coercion produces NAs.

x<-c("a","b","c")
as.numeric(x)
## Warning: NAs introduced by coercion
## [1] NA NA NA
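
In practice only some elements may fail to convert. The sketch below uses a made-up input vector; suppressWarnings() is used only to keep the output quiet once the NAs are being checked explicitly.

```r
x <- c("1", "2.5", "three")

n <- suppressWarnings(as.numeric(x))    # silence the coercion warning
n                                       # 1.0 2.5 NA
is.na(n)                                # FALSE FALSE TRUE
x[is.na(n)]                             # "three": the entry that failed

# as.logical() recognises only a few spellings of true and false
as.logical(c("TRUE", "false", "yes"))   # TRUE FALSE NA
as.logical(0:2)                         # FALSE TRUE TRUE (non-zero is TRUE)
```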

Lists

A list is a special type of vector that can contain elements of different classes.

x<-list(1,"a",TRUE)
x
## [[1]]
## [1] 1
## 
## [[2]]
## [1] "a"
## 
## [[3]]
## [1] TRUE
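
The double brackets in the printed output are also how single elements are extracted. A small sketch of the difference between [[ ]] and [ ]:

```r
x <- list(1, "a", TRUE)

x[[2]]             # "a": the element itself
x[2]               # a list of length one that contains "a"
class(x[[2]])      # "character"
class(x[2])        # "list"

length(x)          # 3
sapply(x, class)   # "numeric" "character" "logical"
```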

Matrices

m<-matrix(1:10,ncol=5,nrow=2)
m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
attributes(m)
## $dim
## [1] 2 5

or, by setting the dim attribute of a vector

m<-1:25
dim(m)<-c(5,5)
m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    6   11   16   21
## [2,]    2    7   12   17   22
## [3,]    3    8   13   18   23
## [4,]    4    9   14   19   24
## [5,]    5   10   15   20   25
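
Both constructions above fill the matrix column by column. As a sketch, the byrow argument of matrix() fills by row instead, and single entries are addressed as m[row, column]:

```r
m <- matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)
m                  # first row is 1 2 3 4 5, second row is 6 7 8 9 10
m[2, 3]            # 8
m[1, ]             # the whole first row: 1 2 3 4 5

# t() transposes, so the dimensions swap
dim(t(m))          # 5 2
```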

Matrices can also be created by combining vectors with the cbind() (column-wise) or rbind() (row-wise) functions.

x<-1:6
y<-11:16
rbind(x,y)
##   [,1] [,2] [,3] [,4] [,5] [,6]
## x    1    2    3    4    5    6
## y   11   12   13   14   15   16
cbind(x,y)
##      x  y
## [1,] 1 11
## [2,] 2 12
## [3,] 3 13
## [4,] 4 14
## [5,] 5 15
## [6,] 6 16

Factors

Factors are used to represent categorical data.

x<-factor(c("yes","yes","no","yes","no"))
x
## [1] yes yes no  yes no 
## Levels: no yes
table(x)
## x
##  no yes 
##   2   3

The order of the levels can be set using the levels argument to factor(); by default the levels are sorted alphabetically.

x<-factor(c("yes","yes","no","yes","no"), levels=c("yes","no"))
levels(x)
## [1] "yes" "no"
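
Internally a factor stores integer codes that index into its levels, which is why the level order matters. A sketch (the name x2 is introduced only for this illustration):

```r
x <- factor(c("yes", "yes", "no", "yes", "no"))
levels(x)          # "no" "yes" (alphabetical by default)
as.integer(x)      # 2 2 1 2 1
nlevels(x)         # 2

# With explicit levels, "yes" becomes level 1
x2 <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
as.integer(x2)     # 1 1 2 1 2
table(x2)          # counts are reported in level order: yes 3, no 2
```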
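
Missing values

Missing values are denoted by NA; NaN ("not a number") denotes the result of an undefined mathematical operation. A NaN value is also treated as NA, but the converse is not true.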

x<-c(1,2,NA,10,3)
is.na(x)
## [1] FALSE FALSE  TRUE FALSE FALSE
is.nan(x)
## [1] FALSE FALSE FALSE FALSE FALSE
y<-c(1,2,NaN,NA,10,3)
is.na(y)
## [1] FALSE FALSE  TRUE  TRUE FALSE FALSE
is.nan(y)
## [1] FALSE FALSE  TRUE FALSE FALSE FALSE
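
Missing values propagate through most computations, so they usually have to be handled explicitly. A minimal sketch using the same x as above:

```r
x <- c(1, 2, NA, 10, 3)

mean(x)                   # NA: the missing value propagates
mean(x, na.rm = TRUE)     # 4: computed from the four observed values
sum(is.na(x))             # 1: number of missing values
x[!is.na(x)]              # 1 2 10 3

0 / 0                     # NaN
NA == NA                  # NA, so test with is.na() rather than ==
is.na(NaN)                # TRUE
is.nan(NA)                # FALSE
```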

Data frames

Data frames are usually created by calling read.table() or read.csv(). Data frames can be converted to a matrix by calling data.matrix().

data<-read.csv("TobaccoVars.csv",sep=",",header=TRUE)
data[1:4,1:4]
##              X lymphInvaIndicator hsa.let.7b hsa.let.7d
## 1 TCGA-BA-5555                  0   2.310488  1.0057871
## 2 TCGA-BA-5556                  0   2.283006  0.9996545
## 3 TCGA-BA-6869                  0   2.031786  1.0039997
## 4 TCGA-BA-6873                  1   2.349883  0.9900227
dim(data)
## [1] 148  37
nrow(data); ncol(data)
## [1] 148
## [1] 37
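
A data frame can also be built directly with data.frame(), which is convenient when no file is at hand. The sketch below uses made-up toy values and hypothetical column names (id, flag, value); it is not the TobaccoVars.csv data.

```r
# Toy data frame: each column is a vector, and columns may differ in type
df <- data.frame(
  id    = c("s1", "s2", "s3"),
  flag  = c(0L, 0L, 1L),
  value = c(2.31, 2.28, 2.03)
)

dim(df)                   # 3 3
str(df)                   # compact summary of the columns and their types
df$value                  # one column, as a vector
df[df$flag == 1, ]        # rows where flag equals 1

# Convert the numeric columns to a matrix
m <- data.matrix(df[, c("flag", "value")])
is.matrix(m)              # TRUE
```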

Names attribute

R objects can also have names

x<-1:3
names(x)<-c("foo","bar","norf")
x
##  foo  bar norf 
##    1    2    3

Lists can also have names

x<-list(a=1,b="charac",c=TRUE)
x
## $a
## [1] 1
## 
## $b
## [1] "charac"
## 
## $c
## [1] TRUE

Row and column names of matrices

m<-matrix(1:6,nrow=3)
dimnames(m)<-list(c("a","b","k"),c("c","d"))
m
##   c d
## a 1 4
## b 2 5
## k 3 6
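
Once names are set, elements can be selected by name instead of by position. A sketch reusing the objects from this section (the list is called lst here only to avoid overwriting x):

```r
x <- 1:3
names(x) <- c("foo", "bar", "norf")
x["bar"]           # the element named "bar" (value 2, name kept)
x[["bar"]]         # 2, without the name

lst <- list(a = 1, b = "charac", c = TRUE)
lst$b              # "charac"
lst[["c"]]         # TRUE

m <- matrix(1:6, nrow = 3)
dimnames(m) <- list(c("a", "b", "k"), c("c", "d"))
m["b", "d"]        # 5
rownames(m)        # "a" "b" "k"
colnames(m)        # "c" "d"
```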