
Data structures in R – Part 1

August 11, 2020


R offers a wide range of structures for holding data, including scalars, vectors, matrices, arrays, data frames, and lists. This post covers the first four: scalars, vectors, matrices, and arrays.

Scalars

Scalars are one-element vectors. These are used to hold constants.

Example

a <- 1
b <- "Phone"
c <- TRUE
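
Because a scalar is simply a vector of length one, the usual vector functions work on it. The short sketch below illustrates this; the variable names are made up for the example.

```r
# A scalar in R is a vector with exactly one element
price <- 1
label <- "Phone"
in_stock <- TRUE

length(price)      # 1: the vector has a single element
is.vector(label)   # TRUE: a scalar is still a vector
class(in_stock)    # "logical"
price[1]           # 1: indexing works just as on longer vectors
```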

Vectors

Vectors are one-dimensional arrays that hold numbers, characters, or logical data. The combine function c() is used to form a vector. A vector can hold only one data type; you cannot mix numbers with characters in the same vector. Let’s look at some examples.

Numeric vector

a <- c(2,10,-5,15)

Character vector

b <- c("Male", "Female", "Neutral")

Logical vector

c <- c(TRUE, FALSE, FALSE, TRUE)
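
A vector holds a single data type, so when values of different types are passed to c(), R converts them to a common type rather than raising an error. A minimal sketch, with illustrative variable names:

```r
# Numbers and logicals combined with a string are all converted to character
mixed <- c(1, "two", TRUE)
class(mixed)     # "character"
mixed            # "1" "two" "TRUE"

# Logicals combined with numbers are converted to numeric
num_lgl <- c(1, TRUE, FALSE)
class(num_lgl)   # "numeric"; TRUE becomes 1 and FALSE becomes 0
```

Checking class() after building a vector is a quick way to catch an accidental conversion.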

To refer to elements of a vector, use square brackets. For example,

> a <- c(2,4,6,8,10,12,14,16,18,20)
> a[6]
[1] 12
> a[3:6]
[1]  6  8 10 12
> a[c(1,7)]
[1]  2 14
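
Square brackets accept more than positive positions. The sketch below reuses the same vector to show a few other standard indexing forms in base R; it is meant as a quick illustration rather than a complete list.

```r
a <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

a[1]           # 2: R indexes from 1, not 0
a[-1]          # every element except the first
a[a > 15]      # 16 18 20: a logical condition keeps matching elements
a[length(a)]   # 20: the last element
```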

Matrices

A matrix is a two-dimensional array where each element has the same data type. Matrices are created with the matrix function. The syntax for the matrix function is

a <- matrix(vector, nrow=number_of_rows, ncol=number_of_columns,
byrow=logical_value, dimnames=list( char_vector_rownames, char_vector_colnames))

where vector contains the elements for the matrix, nrow and ncol specify the row and column dimensions, and dimnames contains optional row and column labels stored in character vectors. The option byrow indicates whether the matrix should be filled in by row (byrow=TRUE) or by column (byrow=FALSE). The default is by column.

Let’s see some examples of matrices now.

Creating a 5×2 matrix

> a<-matrix(1:10, nrow=5,ncol=2)
> a
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

Let’s create a 2x2 matrix with row and column labels

> cells <- c(2,8,12,16)
> row_names <- c("A1","A2")
> col_names <- c("X1","X2")
> b <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE, dimnames = list(row_names, col_names))
> b
   X1 X2
A1  2  8
A2 12 16

In the example above, the matrix was filled with byrow = TRUE. Try the same call with byrow = FALSE and compare the results.
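
To make the difference concrete, the sketch below builds the same four values both ways. The names by_row and by_col are illustrative.

```r
cells <- c(2, 8, 12, 16)

# Fill across each row first
by_row <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE)
# Fill down each column first (the default)
by_col <- matrix(cells, nrow = 2, ncol = 2, byrow = FALSE)

by_row   # first row is 2 8, second row is 12 16
by_col   # first row is 2 12, second row is 8 16

# For this square matrix, one layout is the transpose of the other
identical(by_col, t(by_row))   # TRUE
```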

Subscripts in matrix

You can also subscript a matrix using square brackets.

> x<-matrix(11:20, nrow=2)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]   11   13   15   17   19
[2,]   12   14   16   18   20
> m <-x[,3]
> m
[1] 15 16
> n <-x[1,4]
> n
[1] 17
> o <-x[2,c(3,4,5)]
> o
[1] 16 18 20

First, we created a 2x5 matrix, then we selected elements with square brackets, giving the row index first and the column index second. Leaving an index blank selects every row or every column.
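
One detail worth knowing when subscripting is that selecting a single row or column returns a plain vector by default. The sketch below, using the same 2x5 matrix, shows this and the standard drop = FALSE argument that keeps the matrix shape.

```r
x <- matrix(11:20, nrow = 2)

x[1, ]                      # 11 13 15 17 19, returned as a plain vector
x[, 2:3]                    # columns 2 and 3, still a 2 x 2 matrix
x[1, , drop = FALSE]        # first row kept as a 1 x 5 matrix
dim(x[1, , drop = FALSE])   # 1 5
```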

Arrays

Arrays are similar to matrices but can have more than two dimensions. For example, an array of dimension (2, 3, 4) consists of 4 rectangular matrices, each with 2 rows and 3 columns. Like matrices, arrays can store only one data type. They are created with the array function. The syntax for the function is

myarray <- array(vector, dimensions, dimnames)

Here vector contains the data for the array, dimensions is a numeric vector giving the maximal index for each dimension, and dimnames is an optional list of dimension labels. Arrays are useful when programming new statistical methods.

Let’s see this with the following example.

> row_names <- c("ROW1","ROW2","ROW3")
> col_names <- c("COL1","COL2","COL3")
> mat_names <- c("Matrix1","Matrix2")
> a <- array(1:18, c(3,3,2), dimnames = list(row_names, col_names, mat_names))
> a
, , Matrix1

     COL1 COL2 COL3
ROW1    1    4    7
ROW2    2    5    8
ROW3    3    6    9

, , Matrix2

     COL1 COL2 COL3
ROW1   10   13   16
ROW2   11   14   17
ROW3   12   15   18
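
Arrays are subscripted like matrices, with one index per dimension. The sketch below builds the (2, 3, 4) array mentioned earlier and pulls out a slice and a single element; the name arr is illustrative.

```r
# 2 rows, 3 columns, 4 slices, filled column by column with 1 to 24
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)          # 2 3 4
arr[, , 1]        # the first 2 x 3 matrix, holding 1 to 6
dim(arr[, , 1])   # 2 3
arr[2, 3, 4]      # 24: last row, last column, last slice
```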

Keep reading about data structures: Data structures in R – Part 2
