Friday, May 8, 2009

May 9, 2009 - Correlation Notes

A. Correlation

A correlation is a relationship between two variables.

The correlated variables can be represented by an ordered pair (x, y), where x is the independent, or explanatory, variable and y is the dependent, or response, variable.

A scatter plot is the graphical representation of the ordered pairs as points in a coordinate plane.

The correlation coefficient is a mathematical measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The range of r is –1 to 1.
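The notes do not give a formula for r, so the following is only a minimal sketch that assumes the usual Pearson definition (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name pearson_r and the hours/scores data are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data (Pearson definition assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)
print(round(r, 3))
```

For this made-up data r comes out close to 1, which matches the upward trend you would see in a scatter plot of the same points.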

Types of correlation:

• Negative linear correlation: A negative linear correlation implies that when x increases, y tends to decrease. When r approaches –1, x and y are said to have a strong negative, or inverse, linear correlation.

• Positive linear correlation: A positive linear correlation implies that when x increases, y tends to increase. When r approaches 1, x and y are said to have a strong positive, or direct, linear correlation.

• No correlation or weak correlation: No correlation or a weak linear correlation means that knowing the value of x tells you little or nothing about the value of y along a line. When r is near 0, x and y are said to have no linear correlation or a weak linear correlation.

• Nonlinear correlation: A scatter plot of the data shows a pattern, but it is not that of a line. It might resemble a U, an arc, or some other shape.
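To make the types of correlation listed above concrete, here is a small sketch that computes r for three made-up data sets: one falling, one rising, and one U-shaped. It assumes the usual Pearson definition of r. The helper name r_value and all of the data are hypothetical.

```python
import math


def r_value(xs, ys):
    """Sample correlation coefficient r (Pearson definition assumed)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets sharing the same x-values.
xs = [-3, -2, -1, 0, 1, 2, 3]
falling = [9, 7, 6, 4, 3, 1, 0]   # y tends to decrease as x increases
rising = [0, 1, 3, 4, 6, 7, 9]    # y tends to increase as x increases
u_shape = [x * x for x in xs]     # clear pattern, but not a line

r_falling = r_value(xs, falling)  # close to -1
r_rising = r_value(xs, rising)    # close to 1
r_u = r_value(xs, u_shape)        # 0 for this symmetric U

print(r_falling, r_rising, r_u)
```

The U-shaped set is the interesting case: y is completely determined by x, yet r is 0, because r only measures how well the points follow a line. This is why it helps to look at the scatter plot as well as the value of r.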

A cause-and-effect relationship exists when two variables, x and y, are related in such a way that changing one variable causes the other to change. Statisticians are careful to avoid claiming that x causes y merely because x and y are correlated. In other words, correlation does not imply causation.

B. Linear Regression

The technique of fitting a linear equation to real data points gives a line called a regression line. This line is used to predict the value of y, the response variable, for a given value of x, often called the predictor variable.

The equation of a regression line for an independent variable x and a dependent variable y is written as ŷ = mx + b, where m is the slope of the line, b is the y-intercept, and ŷ is the predicted y-value for a given x-value.
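The notes do not say how m and b are found. A common choice, assumed here, is the least-squares fit, which picks the line that minimizes the sum of squared vertical distances between the data points and the line. The sketch below uses that method. The names fit_line and predict and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b (method assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x-value is the same")
    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return m, b


def predict(m, b, x):
    """Predicted y-value (y-hat) for a given x-value."""
    return m * x + b


# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
m, b = fit_line(hours, scores)
print(m, b, predict(m, b, 6))
```

For this made-up data the fitted line is ŷ = 6x + 46, so the predicted score for 6 hours is 82. Predictions for x-values far outside the range of the data (here, 1 to 5 hours) should be treated with caution.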
