# Descriptive Statistics in R


This tutorial is part of a series on learning data analysis with R. It looks at different ways to compute descriptive statistics in R, including measures of central tendency, variability, and distribution shape for continuous variables.

For this tutorial, we shall use the built-in dataset `mtcars`, which consists of 32 observations and 11 variables. We shall use only three of these variables (`mpg`, `hp`, and `wt`) for calculating descriptive statistics.

Refine the data

```
> data("mtcars")
> data <- c("mpg", "hp", "wt")
> head(mtcars[data])
                   mpg  hp    wt
Mazda RX4         21.0 110 2.620
Mazda RX4 Wag     21.0 110 2.875
Datsun 710        22.8  93 2.320
Hornet 4 Drive    21.4 110 3.215
Hornet Sportabout 18.7 175 3.440
Valiant           18.1 105 3.460
```
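Before summarising, it can help to confirm that the subset has the expected shape. The following is a minimal sketch using the same three columns; `vars` and `cars_subset` are names chosen here for illustration and are not part of the session above.

```r
# Load the built-in dataset and keep three continuous variables.
data("mtcars")
vars <- c("mpg", "hp", "wt")     # illustrative name for the column selection
cars_subset <- mtcars[vars]      # illustrative name; still a data frame

dim(mtcars)                      # rows and columns of the full dataset
dim(cars_subset)                 # rows and columns of the three-variable subset
sapply(cars_subset, is.numeric)  # check that every selected column is numeric
```

Indexing with a character vector inside single brackets keeps the result as a data frame, which is what `summary()` and similar functions expect when summarising several columns at once.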

Base R provides the `summary()` function, which can be used to obtain descriptive statistics.

Example for descriptive statistics

```
> summary(mtcars[data])
      mpg              hp              wt
 Min.   :10.40   Min.   : 52.0   Min.   :1.513
 1st Qu.:15.43   1st Qu.: 96.5   1st Qu.:2.581
 Median :19.20   Median :123.0   Median :3.325
 Mean   :20.09   Mean   :146.7   Mean   :3.217
 3rd Qu.:22.80   3rd Qu.:180.0   3rd Qu.:3.610
 Max.   :33.90   Max.   :335.0   Max.   :5.424
```
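Each figure in this kind of summary can also be obtained individually with base R functions. Here is a small sketch for the `mpg` column only; the variable name `mpg` is introduced just for this example.

```r
# Compute the pieces of a numeric summary one at a time.
data("mtcars")
mpg <- mtcars$mpg   # illustrative name: a plain numeric vector

min(mpg)                               # smallest value
quantile(mpg, probs = c(0.25, 0.75))   # first and third quartiles
median(mpg)                            # middle value
mean(mpg)                              # arithmetic mean
max(mpg)                               # largest value
```

`summary()` rounds its output for display, so the values returned here may show more digits than the table above.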

The `summary()` function provides the minimum, maximum, quartiles, and mean for numerical variables, and frequencies for factors and logical vectors. The above results don't include the standard deviation, variance, skewness, or kurtosis. What if you need these statistics? For this you may use the `stat.desc()` function in the `pastecs` package. By default it reports the variance and standard deviation; calling it with `norm = TRUE` adds skewness and kurtosis.

```
> install.packages("pastecs")
> library(pastecs)
> stat.desc(mtcars[data])
                     mpg           hp          wt
nbr.val       32.0000000   32.0000000  32.0000000
nbr.null       0.0000000    0.0000000   0.0000000
nbr.na         0.0000000    0.0000000   0.0000000
min           10.4000000   52.0000000   1.5130000
max           33.9000000  335.0000000   5.4240000
range         23.5000000  283.0000000   3.9110000
sum          642.9000000 4694.0000000 102.9520000
median        19.2000000  123.0000000   3.3250000
mean          20.0906250  146.6875000   3.2172500
SE.mean        1.0654240   12.1203173   0.1729685
CI.mean.0.95   2.1729465   24.7195501   0.3527715
var           36.3241028 4700.8669355   0.9573790
std.dev        6.0269481   68.5628685   0.9784574
coef.var       0.2999881    0.4674077   0.3041285
```
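Some rows in this table are derived from the others. The sketch below recomputes `SE.mean`, `CI.mean.0.95`, and `coef.var` for `mpg` in base R. It assumes the confidence interval row is the half-width of a 95% t-interval, which is consistent with the numbers shown; the variable names are illustrative.

```r
# Recompute the derived rows of the table for one variable using base R.
data("mtcars")
x <- mtcars$mpg
n <- length(x)

se_mean  <- sd(x) / sqrt(n)                   # standard error of the mean
ci_half  <- qt(0.975, df = n - 1) * se_mean   # assumed: half-width of the 95% t-interval
coef_var <- sd(x) / mean(x)                   # coefficient of variation

c(SE.mean = se_mean, CI.mean.0.95 = ci_half, coef.var = coef_var)
```

Under that assumption, the 95% confidence interval for the mean is `mean(x) - ci_half` to `mean(x) + ci_half`.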

Many other packages are available; for example, you may try the `describe()` function in the `psych` package. Let me know which function or package you prefer for calculating descriptive statistics.
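If you would rather not install a package, skewness and kurtosis can also be sketched in base R from the sample moments. The helpers `skewness_moment()` and `excess_kurtosis_moment()` below are hypothetical names written for this example. They use the simple moment-based definitions, so their values may differ slightly from what a given package reports, because packages apply different small-sample corrections. The `psych::describe()` call at the end runs only if the package is already installed.

```r
data("mtcars")

# Hypothetical helper: moment-based skewness, m3 / m2^(3/2).
skewness_moment <- function(x) {
  x <- x[!is.na(x)]
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  m3 / m2^1.5
}

# Hypothetical helper: moment-based excess kurtosis, m4 / m2^2 - 3.
excess_kurtosis_moment <- function(x) {
  x <- x[!is.na(x)]
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  m4 / m2^2 - 3
}

# Apply both helpers to the three variables; the result is a 2 x 3 matrix.
shape <- sapply(mtcars[c("mpg", "hp", "wt")], function(col) {
  c(skewness = skewness_moment(col), kurtosis = excess_kurtosis_moment(col))
})
shape

# Optional comparison with psych::describe(), skipped if psych is not installed.
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(mtcars[c("mpg", "hp", "wt")]))
}
```

A positive skewness indicates a longer right tail, and an excess kurtosis near zero indicates tails similar to those of a normal distribution.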
