R arithmetic operations - Boston University



Introduction to R: Data Analysis and Calculations
Katia Oleinik (koleinik@bu.edu)
Scientific Computing and Visualization, Boston University

R arithmetic operations

Operation       Description
x + y           addition
x - y           subtraction
x * y           multiplication
x / y           division
x ^ y           exponentiation
x %% y          x mod y
x %/% y         integer division

Variable name rules

Case sensitive: Party ≠ party
Letters, digits, underscores and dots can be used: DNA.data.2012
Cannot start with a digit, an underscore, or a dot followed by a digit: 2012.DNA
Should not use reserved words (if, else, repeat, etc.)

R atomic constant types:

Integer:    n <- as.integer(1) or n <- 1L (a plain n <- 1 is stored as numeric)
Numeric:    a <- 2.5
Complex:    d <- 3 + 12i
Logical:    ans <- TRUE
Character:  name <- "Katia" or name <- 'Katia'
Special:    NULL, NA, Inf, NaN

R operators:

Operations              Description
+  -  *  /  %%  ^       Arithmetic
>  >=  <  <=  ==  !=    Relational
!  &  |                 Logical
~                       Model formulas
->  <-                  Assignment
$                       List indexing
:                       Sequence

R built-in constants:

Constants       Description
LETTERS         26 upper-case letters of the Roman alphabet
letters         26 lower-case letters of the Roman alphabet
month.abb       3-letter abbreviations of month names
month.name      month names
pi              π: ratio of circle circumference to diameter
T, F            TRUE, FALSE

R math functions for scalars and vectors:

Function                            Description
sin, cos, tan, asin, acos, atan, atan2, log, log10, log(x,base), exp, sinh, cosh, …
                                    Various standard trig, log and exp. functions
min(x), max(x), range(x), abs(x)    Minimum/maximum, range and absolute value
sum(x), diff(x), prod(x)            Sum, difference and product of vector elements
mean(x), median(x), sd(x), var(x)   Mean, median, standard deviation, variance
weighted.mean(x,w)                  Mean of x with weights w
quantile(x,probs=)                  Sample quantiles corresponding to the given probabilities
                                    (defaults to 0,.25,.5,.75,1)
round(x, n)                         Rounds the elements of x to n decimals
Re(x), Im(x), Conj(x)               Real part, imaginary part and conjugate of a complex number
Arg(x)                              Angle in radians of the complex number
fft(x)                              Fast Fourier Transform of an array
pmin(x,y,…), pmax(x,y,…)            A vector whose ith element is the min/max of (x[i],y[i],…)
cumsum(x), cumprod(x)               A vector whose ith element is the sum/product of x[1] to x[i]
cummin(x), cummax(x)                A vector whose ith element is the min/max of x[1] to x[i]
var(x,y) or cov(x,y)                Covariance between 2 vectors
cor(x,y)                            Linear correlation between x and y
length(x)                           Get the length of the vector
factorial(n)                        Calculate n!
choose(n,k)                         Combination function: n! / (k! * (n - k)!)

* Note: Many math functions have a logical parameter na.rm (default FALSE) that controls whether missing data (NA) are removed.

Directories and Workspace:

Function                        Description
getwd()                         Get working directory
setwd("/projects/myR/")         Set current directory
ls()                            List objects in the current workspace
rm(x,…)                         Remove objects from the current workspace
list.files()                    List files in the current directory
list.dirs()                     List directories
file.info("myfile.xls")         Get file properties
file.exists("myfile.xls")       Check if file exists
file.remove("myfile.xls")       Delete file
file.append(file1, file2)       Append file2 to file1
file.copy(from, to, …)          Copy file
system("ls -la")                Execute command in the operating system
save.image()                    Save contents of the current workspace in the default file .RData
save.image(file="myR.Rdata")    Save contents of the current workspace in the file
save(a,b, file = "ab.Rdata")    Save a and b in the file
load("myR.Rdata")               Restore workspace from the file

Loading and Saving Data:

Function                                      Description
read.table(file="myData.txt", header=TRUE)    Read text file
read.csv(file="myData.csv")                   Read csv file ("," is the default separator)
list.files(); dir()                           List all files in current directory
file.show("myData.csv")                       Show file content
write.table(file="myData.txt",…)              Save data into a file
write.csv(file="myData.csv",…)                Save data into csv formatted file

Performance Tip: For large data files, specify optional parameters if known:
    read.table(file, nrows=10000, colClasses=c("integer",…), comment.char="")
When reading matrices, use the scan() function instead of read.table()

Exploring the data:

Function            Description
class(x)            Get class attribute of an object
names(x)            Get or set names of an object
head(x), tail(x)    Returns the first/last parts of a vector, matrix, data frame or function
str(x)              Structure of an object
dimnames(x)         Retrieve or set dimnames of an object
length(x)           Get or set the length of a vector or factor
summary(x)          Generic function: produces summary of the data
attributes(x)       List object's attributes
dim(x)              Retrieve or set the dimension of an object
nrow(x), ncol(x)    Return the number of rows or columns of a matrix or data frame
row.names(x)        Retrieve or set the names of the rows

R script file

An R script is usually saved in a file with extension .R (or .r).
# serves as a comment indicator (every character on the line after the # sign is ignored)
source("myScript.R") will load the script into the R workspace and execute it
source("myScript.R", echo=TRUE) will load and execute the script and also show the content of the file

R script example (weather.R)

    # This script loads data from a table and explores the data
    # Script is written for Introduction to R tutorial

    # Load datafile
    weather <- read.csv("BostonWeather_sept2012.csv")

    # Get header names
    names(weather)

    # Get class of the loaded object
    class(weather)

    # Get attributes
    attributes(weather)

    # Get dimensions of the loaded data
    dim(weather)

    # Get structure of the loaded object
    str(weather)

    # Summary of the data
    summary(weather)

Installing and loading R packages

To install an R package from the CRAN website: install.packages("package")
library(package) loads the package into the workspace. The library has to be loaded every time you open a workspace.
Another way to load a package into the workspace is require(package). It is usually used inside functions.
It returns FALSE and gives a warning (rather than an error) if the package does not exist.
installed.packages() retrieves details about all installed packages
library() lists all available packages
search() lists all loaded packages
library(help = package) provides information about all the functions in a package

Getting help

Function                Description                                       Example
?topic                  Get R documentation on topic                      ?mean
help(topic)             Get R documentation on topic                      help(mean)
help.search("topic")    Search the help for topic                         help.search("mean")
example(topic)          Get example of function usage                     example(mean)
apropos("topic")        Get the names of all objects in the search list   apropos("mean")
                        that match string "topic"
methods(function)       List all methods of the function                  methods(mean)
function_name           Printing a function name without parentheses      mean
                        in most cases will show its code

R object types:

Vector: a set of elements of the same type.
Matrix: a set of elements of the same type organized in rows and columns.
Data Frame: a set of elements organized in rows and columns, where columns can be of different types.
List: a collection of data objects (possibly of different types); a generalization of a vector.

Vector creation (examples):

    #Create a vector using concatenation of elements: c()
    v1 <- c( 5,8,3,9)
    v2 <- c( "One", "Two", "Three" )

    #Generate sequence (from:to)
    s1 <- 2:5

    #Sequence function: seq(from, to, by, length.out)
    seq(0,1,length.out=5)
    [1] 0.00 0.25 0.50 0.75 1.00
    seq(1, 6, by = 3)
    [1] 1 4
    seq(4)
    [1] 1 2 3 4

    #Generate vector using repeat function: rep(x,times)
    rep(7, 3)
    [1] 7 7 7

Accessing vector elements:

Indexing vectors    Description
x[n]                nth element
x[-n]               all but nth element
x[1:n]              first n elements
x[-(1:n)]           elements starting from n+1
x[c(1,3,6)]         specific elements
x[x>3 & x<7]        all elements greater than 3 and less than 7
x[x<3 | x>7]        all elements less than 3 or greater than 7

Performance Tip: R is designed to work with vectors very efficiently. Avoid using loops to perform the same operation on each element; rather apply the function to the whole vector!
For large arrays avoid dynamic expansion if possible. Allocate memory to hold the result and then fill in the values.

Useful vector operations:

Operation           Description
sort(x)             Returns sorted vector (in increasing order)
rev(x)              Reverses elements of x
which.max(x)        Returns index of the largest element
which.min(x)        Returns index of the smallest element
which(x == a)       Returns vector of indices i for which x[i]==a
na.omit(x)          Removes the observations with missing data
x[is.na(x)] <- 0    Replace all missing elements with zeros

Matrix creation (examples):

    #Create a matrix using function: matrix(data,nrow,ncol,byrow=F)
    matrix( 1:6, nrow=2)
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6

    #Create a diagonal matrix: diag( )
    diag( 3 )
         [,1] [,2] [,3]
    [1,]    1    0    0
    [2,]    0    1    0
    [3,]    0    0    1
    diag( 4, 2, 2 )
         [,1] [,2]
    [1,]    4    0
    [2,]    0    4

    #Combine arguments by column: cbind()
    cbind(c(1,2,3), c(4,5,6))
         [,1] [,2]
    [1,]    1    4
    [2,]    2    5
    [3,]    3    6

    #Combine arguments by row: rbind()
    rbind(c(1,2,3), c(4,5,6))
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    4    5    6

    #Create matrix using array(x, dim) function
    array(1:6, c(2,3))
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6

Accessing matrix elements:

Indexing matrices   Description
x[i,j]              Element at row i, column j
x[i,]               Row i (output is a vector)
x[,j]               Column j (output is a vector)
x[c(1,5),]          Rows 1 and 5 (output is a matrix)
x[,c(2,3,6)]        Columns 2, 3 and 6 (output is a matrix)
x["name",]          Row named "name"
x[,"name"]          Column named "name"

Performance Tip: When calculating the mean or the sum of row/column elements use the rowSums(), rowMeans(), colSums(), colMeans() functions. They perform faster for matrices than the sum() and mean() functions.
For large matrices avoid dynamic expansion (using cbind() and rbind()) if possible.
Allocate memory to hold the result and then fill in the values.

Useful matrix operations:

Operation               Description
t(x)                    Transpose
x * y                   Multiply elements of 2 matrices (element by element)
x %*% y                 Perform "normal" matrix multiplication
diag(x)                 Returns a vector of diagonal elements
det(x)                  Returns determinant of matrix
solve(x)                Returns inverse matrix (if it exists), error otherwise
solve(a,b)              Returns solution vector x for the system a %*% x = b
rowSums(), colSums()    Returns vector with the sum of each row/column
rowMeans(), colMeans()  Returns vector with the mean value of each row/column

Data frames:

Elements organized in rows and columns, where columns can be of different types
All elements in the same column must have the same data type
Usually obtained by reading a data file.
Can be created using the data.frame() function

    #Create a data frame using function: data.frame()
    name <- c("Paul", "Simon", "Robert")
    age <- c(8, 12, 3)
    height <- c(53.5, 64.8, 35.2)
    family <- data.frame(Name = name, Age = age, Height = height)
    family
        Name Age Height
    1   Paul   8   53.5
    2  Simon  12   64.8
    3 Robert   3   35.2

    #To sort data frame using one column
    family[order(family$Age),]
        Name Age Height
    3 Robert   3   35.2
    1   Paul   8   53.5
    2  Simon  12   64.8

Accessing data frame elements:

Indexing data frames    Description
x[[i]]                  Accessing column i (returns vector)
x[["name"]]             Accessing column named "name" (returns vector)
x$name                  Accessing column named "name" (returns vector)
x[,i]                   Accessing column i (returns vector)
x[j,]                   Accessing row j (returns data frame!)
x[i:j,]                 Accessing rows from i to j
x[i,j]                  Accessing element in row i and column j
x[i, "name"]            Accessing element in row i and column "name"

Lists:

Generalization of vector: ordered collection of components
Elements can be of any mode or type
Many R functions return a list as their output object
Can be created using the list() function

    #Create a list using function: list()
    lst <- list(name="Fred", no.children=3, child.ages=c(12,8,3))

    #Create a list using concatenation: c()
    list.ABC <- c(list.A, list.B, list.C)

    #List can be created from different R objects
    list.misc <- list(e1 = c(1,2,3), e2 = list.B, e3 = matrix(1:4,2) )

Accessing list elements:

Indexing lists      Description
x[[i]]              Accessing component i
x[["name"]]         Accessing component named "name"
x$name              Accessing component named "name"
x[i:j]              Accessing components from i to j (returns a list)

Factors:

A factor is stored as an integer vector of level codes together with the set of distinct levels. It provides an easy way to store character strings common for categorical variables.

Performance Tip: Use factors to store vectors (especially character vectors) that take only a few values (categorical variables). Factors can take less memory and be faster to process than character vectors.

Factor operations:

Operation                       Description
factor(x)                       Convert vector to a factor
relevel(x, ref=…)               Rearrange the order of levels in a factor
levels(x)                       List levels in a factor
attributes(x)                   Inspect attributes of a factor
table(x)                        Get count of elements in each level
is.factor(x)                    Checks if x is a factor. Returns TRUE or FALSE
cut(x, breaks)                  Divide x into intervals (factors)
gl(n,k,length=n*k,labels=1:n)   Generate factors by specifying pattern

Regression analysis

Function        Description
lm()            Linear regression
glm()           Generalized linear regression
nls()           Non-linear regression
residuals()     The difference between observed values and fitted values
deviance()      Returns the deviance
gls()           Fit linear model using generalized least squares (nlme package)
gnls()          Fit nonlinear model using generalized least squares (nlme package)

Miscellaneous functions for data analysis

Function            Description
optim()             General purpose optimization
nlm()               Minimize function
spline()            Spline interpolation
kmeans()            k-means clustering on a data matrix
ts()                Create a time series
t.test()            Student's t-test
binom.test()        Binomial test
merge()             Merge 2 data frames
sample()            Sampling
density()           Kernel density estimates of x
logLik(fit)         Computes the logarithm of the likelihood
predict(fit,…)      Predictions from fit based on input data
anova()             Analysis of variance (or deviance)
aov(formula)        Analysis of variance model

Distributions

Function                            Description
rnorm(n, mean=0, sd = 1)            Gaussian
runif(n, min=0, max = 1)            Uniform
rexp(n, rate=1)                     Exponential
rgamma(n, shape, scale=1)           Gamma
rpois(n, lambda)                    Poisson
rcauchy(n, location=0, scale=1)     Cauchy
rbeta(n, shape1, shape2)            Beta
rchisq(n, df)                       Chi-squared (Pearson)
rbinom(n, size, prob)               Binomial
rgeom(n, prob)                      Geometric
rlogis(n, location=0, scale=1)      Logistic
rlnorm(n, meanlog=0, sdlog=1)       Lognormal
rt(n, df)                           Student
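
The random-number generators in the Distributions table can be tried out with a short sketch like the one below. The seed value, the sample size and the distribution parameters are arbitrary choices made for this illustration, and set.seed() is a base R function that does not appear in the handout itself.

```r
# Fix the random seed so the draws are reproducible (the value 42 is arbitrary).
set.seed(42)

# Draw 1000 values from three of the distributions listed in the table.
x.norm  <- rnorm(1000, mean = 10, sd = 2)        # Gaussian
x.unif  <- runif(1000, min = 0, max = 1)         # Uniform
x.binom <- rbinom(1000, size = 20, prob = 0.3)   # Binomial

# Sample statistics should land close to the theoretical values.
mean(x.norm)     # expected to be near 10
sd(x.norm)       # expected to be near 2
range(x.unif)    # both ends lie inside [0, 1]
mean(x.binom)    # expected to be near size * prob = 6
```

Re-running the script with the same seed reproduces the same draws, which is convenient when checking results.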
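
A minimal sketch of how the regression functions listed in this handout (lm(), residuals(), deviance(), predict()) fit together, using the small family data frame from the data frame example. The choice of model (Height as a linear function of Age) and the 10-year-old used for prediction are hypothetical, and coef() is a standard stats function that the handout does not list.

```r
# Data frame from the data frame example in this handout.
family <- data.frame(Name   = c("Paul", "Simon", "Robert"),
                     Age    = c(8, 12, 3),
                     Height = c(53.5, 64.8, 35.2))

# Fit a straight line: Height as a linear function of Age.
fit <- lm(Height ~ Age, data = family)

coef(fit)        # intercept and slope of the fitted line
residuals(fit)   # observed values minus fitted values
deviance(fit)    # for an lm fit this is the residual sum of squares

# Predict the height of a hypothetical 10-year-old.
predict(fit, newdata = data.frame(Age = 10))
```

With only three observations the fit is purely illustrative; summary(fit) prints the usual table of estimates for a real data set.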

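
The performance tips on vectors in this handout (avoid element-by-element loops, avoid dynamic expansion) can be sketched as follows. The vector x and the squaring operation are stand-ins chosen for this example; the three variable names are hypothetical.

```r
x <- 1:10000

# Slow pattern: grow the result one element at a time inside a loop.
grown <- c()
for (v in x) {
  grown <- c(grown, v^2)     # builds a new, longer vector on every iteration
}

# Better: allocate the result once, then fill in the values.
filled <- numeric(length(x))
for (i in seq_along(x)) {
  filled[i] <- x[i]^2
}

# Preferred: apply the operation to the whole vector at once.
vectorized <- x^2

# All three approaches produce the same values.
all.equal(grown, filled)
all.equal(filled, vectorized)
```

Wrapping each variant in system.time() is one way to compare their running times on a given machine.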