Definition
Correlation is a measure of the relationship between two variables. It is widely applied in business and statistics. Correlation analysis is used to describe the strength and direction of the linear relationship between two variables.
Do the students who spend the most time studying achieve the highest marks in examinations and do those who spend the least time studying get the lowest marks? What we are asking here is whether the variable study time correlates with the variable examination performance. If we found that this was the case then we would say that there is positive correlation between the variables, that is, as a score on one variable increases so the corresponding score on the other variable does the same. Sometimes we find a correlation between two variables where as one goes up the other goes down. This is termed a negative correlation. We are likely to find a negative correlation between smoking and health as the more a person smokes the less healthy that person tends to be.
Examples of Correlation
- Marketing: The marketing manager wants to know if price reduction has any relationship with increasing sales.
- Production: The production department wants to know if the number of defective items produced has anything to do with the age of the machine.
- Human Resources: The HR department wants to know if the productivity of its workers decreases with the number of hours they put in.
- Social Sciences: A social activist wants to know if increasing female literacy has any association with increasing the age of marriage of the girl child.
- Research: An educationist wants to know if enforcing stricter attendance rules is related to students performing better in their studies.
Note that correlation always involves two different variables. Some hypotheses that can be examined with a correlation test are listed below:
- H1: Employee Job Satisfaction is associated with Employee Intention to Quit
- H2: Organizational Learning is related to Innovation Capability
- H3: Advertising is associated with sales of a product
Note that each of the three hypotheses involves two variables, and it is the relationship between them that we need to check.
On a simple level, the basic question being dealt with by correlation can be answered in one of three possible ways. Within any bivariate data set, it may be the case that the high scores on the first variable tend to be paired with the high scores on the second variable (implying, of course, that low scores on the first variable tend to be paired with low scores on the second variable). I refer to this first possibility as the high-high, low-low case.
The second possible answer to the basic correlational question represents the inverse of our first case. In other words, it may be the case that high scores on the first variable tend to be paired with low scores on the second variable (implying, of course, that low scores on the first variable tend to be paired with high scores on the second variable). My shorthand summary phrase for this second possibility is high-low, low-high.
Finally, it is possible that little systematic tendency exists in the data at all. In other words, it may be the case that some of the high and low scores on the first variable are paired with high scores on the second variable, whereas other high and low scores on the first variable are paired with low scores on the second variable. I refer to this third possibility simply by the three-word phrase little systematic tendency.
A number of different correlation statistics are available in SPSS, depending on the level of measurement and the nature of your data. Two of the most common are the Pearson product-moment correlation coefficient (r) and the Spearman rank-order correlation (rho). Pearson r is designed for interval and ratio level (continuous) variables. It can also be used if you have one continuous variable (e.g. scores on a measure of self-esteem) and one dichotomous variable (e.g. sex: M/F). Spearman rho is designed for use with ordinal level or ranked data and is particularly useful when your data do not meet the criteria for Pearson correlation.
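To make the difference between the two statistics concrete, the sketch below computes Spearman rho as Pearson r applied to the ranks of the data, which is a standard way of describing rho. It is written in plain Python rather than SPSS, and the function names (`average_ranks`, `pearson_r`, `spearman_rho`) and the data are hypothetical, chosen only for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

r = pearson_r(x, y)       # below 1, because the relationship is curved
rho = spearman_rho(x, y)  # 1, because the rank orders agree perfectly
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because rho only looks at rank order, it reaches 1 for any strictly increasing relationship, whereas r reaches 1 only when the points fall on a straight line.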
Correlation Coefficient
The correlation coefficient gives a mathematical value for measuring the strength of the linear relationship between two variables. It can take values from -1 to +1, with:
- +1 representing a perfect positive linear relationship (as X increases, Y increases).
- 0 representing no linear relationship (X and Y show no linear pattern).
- -1 representing a perfect inverse (negative) linear relationship (as X increases, Y decreases).
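As an illustration of how such a coefficient is obtained, the following is a minimal sketch of the Pearson r calculation in plain Python: the sum of the products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the study-time data are hypothetical and are not taken from the references.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each score from its own variable's mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))  # sum of cross-products
    ss_x = sum(a * a for a in dx)               # sum of squares for x
    ss_y = sum(b * b for b in dy)               # sum of squares for y
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours of study and examination marks for five students.
study_hours = [1, 2, 3, 4, 5]
marks_rising = [52, 58, 61, 70, 74]   # high-high, low-low pairing
marks_falling = [74, 70, 61, 58, 52]  # high-low, low-high pairing

r_positive = pearson_r(study_hours, marks_rising)
r_negative = pearson_r(study_hours, marks_falling)
print(f"r (rising marks)  = {r_positive:.2f}")
print(f"r (falling marks) = {r_negative:.2f}")
```

The first pairing gives a value close to +1 and the second a value close to -1; a constant variable has no spread, so the sketch raises an error rather than returning a coefficient.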
Interpretation of Coefficient of Correlation
The coefficient of correlation is usually interpreted with a verbal description. A rule of thumb for interpreting the size of a correlation coefficient is presented below:
| Size of Correlation | Interpretation |
| --- | --- |
| +/- 1.00 | Perfect Positive/Negative Correlation |
| +/- .90 to +/- .99 | Very High Positive/Negative Correlation |
| +/- .70 to +/- .90 | High Positive/Negative Correlation |
| +/- .50 to +/- .70 | Moderate Positive/Negative Correlation |
| +/- .30 to +/- .50 | Low Positive/Negative Correlation |
| +/- .10 to +/- .30 | Very Low Positive/Negative Correlation |
| +/- .00 to +/- .10 | Markedly Low and Negligible Positive/Negative Correlation |
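The rule of thumb above can be sketched as a small lookup function. This is only an illustration in plain Python: the name `describe_correlation` is hypothetical, and because adjacent bands share a boundary value (for example .90), the sketch assumes that a boundary value belongs to the stronger band. It also assumes that a coefficient of exactly 0 is reported as having no linear correlation, since it has no direction.

```python
def describe_correlation(r):
    """Return the rule-of-thumb verbal description for a coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "No Linear Correlation"  # assumption: zero has no direction

    size = abs(r)
    # Assumption: a shared boundary value (e.g. .90) goes to the stronger band.
    if size == 1.0:
        strength = "Perfect"
    elif size >= 0.90:
        strength = "Very High"
    elif size >= 0.70:
        strength = "High"
    elif size >= 0.50:
        strength = "Moderate"
    elif size >= 0.30:
        strength = "Low"
    elif size >= 0.10:
        strength = "Very Low"
    else:
        strength = "Markedly Low and Negligible"

    direction = "Positive" if r > 0 else "Negative"
    return f"{strength} {direction} Correlation"


for value in (0.95, -0.62, 0.05):
    print(value, "->", describe_correlation(value))
```

Such labels describe only the size and direction of a coefficient; they are a convention for reporting, not a test of statistical significance.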
References:
- Gaur, A., & Gaur, S. (2009). Statistical Methods for Practice and Research: A guide to data analysis using SPSS (2nd ed.). New Delhi: Response Books.
- Huck, S. (2012). Reading Statistics and Research (6th ed.). Boston: Pearson.
- Pallant, J. (2011). SPSS Survival Manual: A step by step guide to data analysis using SPSS (4th ed.). New South Wales: Allen & Unwin.