A. Measures of Central Tendency
Everyday terms such as “the most common” or “average” refer to the typical or middle value of a data set. In descriptive statistics, this value is called a measure of central
tendency.
The three measures used most commonly to describe central tendency are mean, median, and
mode.
Mean (also called arithmetic average): The sum of the data entries divided by the number of
entries.
Median: The middle value of an ordered data set.
Outlier: A data entry that lies far from the other entries in the data set. An outlier pulls the mean toward itself but has little effect on the median.
Mode: The data value that occurs most frequently in a data set.
When explaining the mode, pay attention to two special cases:
• No entry is repeated: the data set has no mode.
• Two entries occur with the same highest frequency: each is a mode, and the data set is called bimodal.
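The three measures, including the two special cases for the mode, can be sketched in a few lines of Python. This is only an illustration: the helper name describe_center and the sample data are made up for this example, and treating a data set with no repeated entry as having no mode follows the convention in these notes rather than the default behavior of Python's statistics module.

```python
from statistics import mean, median, multimode

def describe_center(data):
    """Return (mean, median, modes) for a list of numbers.

    describe_center is a hypothetical helper written for this example.
    """
    # multimode returns every value tied for the highest frequency.
    modes = multimode(data)
    # Convention used in these notes: if no entry is repeated,
    # the data set has no mode, so report an empty list.
    if len(data) > 1 and len(set(data)) == len(data):
        modes = []
    return mean(data), median(data), modes

print(describe_center([2, 3, 3, 5, 7]))   # one mode
print(describe_center([1, 2, 3, 4]))      # no repeated entry: no mode
print(describe_center([1, 1, 2, 2, 3]))   # two modes: bimodal
```

The first call has a single mode (3), the second has none, and the third reports both 1 and 2.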
Weighted mean: The mean of a data set whose entries carry different weights. It is the sum of each entry multiplied by its weight, divided by the sum of the weights:
x̄ = Σ(x · w) / Σw
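As a rough sketch, the weighted mean can be computed as the sum of each entry times its weight, divided by the sum of the weights. The function name weighted_mean and the grade example are assumptions made for illustration only.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(x * w) / sum(w).

    weighted_mean is a hypothetical helper written for this example.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight

# Illustrative data: a quiz score of 80 with weight 1
# and an exam score of 90 with weight 3.
print(weighted_mean([80, 90], [1, 3]))
```

Because the exam carries three times the weight of the quiz, the result (87.5) sits closer to 90 than the unweighted mean of 85 does.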
In a symmetric frequency distribution with a single peak, the mean, median, and mode are equal and fall at the same value on the x-axis.
A distribution that is not symmetric, with a tail stretching farther to one side, is called a skewed distribution; its mean, median, and mode are generally unequal.
A distribution whose graph has a tail stretching to the left is called skewed left. In this
distribution, typically mean < median < mode. If the graph of the distribution has a tail stretching to the
right, the distribution is called skewed right. In this distribution, typically mode < median < mean.
Outliers can create a skewed distribution.
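A small numeric experiment can make this concrete. The sketch below uses made-up data and a hypothetical helper named center; it adds one large outlier to a tightly clustered data set and compares the three measures before and after.

```python
from statistics import mean, median, mode

def center(data):
    """Return (mean, median, mode); a hypothetical helper for this example."""
    return mean(data), median(data), mode(data)

# Illustrative data: five closely clustered entries.
base = [20, 21, 22, 22, 23]
# The same data with one entry far to the right of the rest.
with_outlier = base + [65]

print(center(base))
print(center(with_outlier))
```

With the outlier included, the median and mode stay at 22 while the mean is pulled well to the right of both, which matches the pattern described for a right-skewed distribution.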
B. Measures of Variation
Range: The difference between the largest and the smallest data entries.
Deviation: The difference between a data entry x in a population and the population mean μ, or
the difference between a data entry x in a sample and the sample mean x̄.
Variance: A measure of how far the entries of a population or sample data set deviate from its
mean. Population variance is represented by the symbol σ², pronounced “sigma squared.”
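The measures of variation defined so far can be sketched as follows. The function names and data are made up for illustration. The population variance is computed here as the mean of the squared deviations, and the sample variance shown for comparison uses the usual n − 1 divisor from Python's statistics module, which these notes do not spell out.

```python
from statistics import mean, pvariance, variance

def data_range(data):
    """Range: largest entry minus smallest entry (hypothetical helper)."""
    return max(data) - min(data)

def deviations(data):
    """Deviation of each entry from the mean of the data (hypothetical helper)."""
    m = mean(data)
    return [x - m for x in data]

data = [1, 2, 3, 4, 5]  # illustrative data

print(data_range(data))
print(deviations(data))

# Population variance: mean of the squared deviations.
manual_pvar = sum(d ** 2 for d in deviations(data)) / len(data)
print(manual_pvar, pvariance(data))

# Sample variance divides by n - 1 instead of n.
print(variance(data))
```

Note that the deviations always sum to zero, which is why they are squared before being averaged.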
Saturday, March 28, 2009