What is Pearson Correlation?

The Pearson product-moment correlation coefficient (often shortened to Pearson correlation or simply the correlation coefficient) is a measure of the linear correlation between two variables. It is represented by the symbol “r” and ranges from -1 to 1. It is the most commonly used correlation coefficient and is appropriate when both variables are continuous and the relationship between them is roughly linear. Approximate normality matters mainly when testing whether r is statistically significant.

The sign of “r” gives the direction of the linear association and its magnitude gives the strength. A positive correlation means that as the value of one variable increases, the value of the other variable tends to increase as well. A negative correlation means that as the value of one variable increases, the value of the other variable tends to decrease. A value of 1 indicates a perfect positive correlation and a value of -1 a perfect negative correlation; in both cases all data points lie exactly on a straight line. A value of 0 indicates no linear correlation, although the variables may still be related in a nonlinear way.

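It is calculated by dividing the covariance of the two variables by the product of their standard deviations. Dividing by the standard deviations removes the units of measurement, so r is a unitless number. Note that r itself is not the proportion of variation explained. That proportion is given by its square, r², which is the share of the variance in one variable that is accounted for by a linear relationship with the other.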
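The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name pearson_r and the small data sets are made up for this example, and the sample (n - 1) form of the covariance and standard deviations is an assumption. Using n throughout would give the same r, because the denominators cancel.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov_xy / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # y rises with x along a straight line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # y falls with x along a straight line
print(r_up, r_down)
```

The two toy data sets lie exactly on straight lines, so the results should come out at (or within rounding error of) the two extremes of the range of r.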

The Pearson correlation coefficient is a useful tool for identifying a linear relationship between two variables, but it is important to note that it only captures linear association: a strong relationship that follows a curve can produce an r close to 0. The usual significance tests for r additionally assume that the data are approximately normally distributed.

Example to understand Pearson Correlation

An example of the Pearson correlation coefficient in action would be a study of the relationship between the number of hours of exercise per week and the level of cholesterol in a group of individuals. In this study, the researchers collect data on the number of hours of exercise each individual engages in per week (the independent variable) and their cholesterol level (the dependent variable). They then calculate the Pearson correlation coefficient between the two variables. The coefficient itself is symmetric: it is the same whichever variable is treated as independent.

Let’s say the correlation coefficient (r) is calculated to be -0.6. This means that there is a moderate negative correlation between the number of hours of exercise per week and cholesterol level: individuals who exercise more hours tend to have lower cholesterol levels. For comparison, a value of -1 would indicate a perfect negative correlation and a value of 0 would indicate no linear correlation.

It’s important to note that this correlation coefficient tells us that there is an association between the number of hours of exercise and cholesterol level, but it does not imply causation. Other factors such as diet, genetics, or age could also play a role. Further analysis and research are therefore needed to determine the underlying cause of this correlation.

Author and Assistant Professor in Finance, Ardent fan of Arsenal FC. Always believe "The only good is knowledge and the only evil is ignorance - Socrates"