Boise Statistics: 2009

Friday, May 8, 2009

May 9, 2009 - Extra Credit Questions

In your work groups, you will receive extra credit for each question you answer correctly (post answers under your work group):

1. In a school, 50 boys and 50 girls are randomly selected to test the claim that the mean weights for 10-year-old boys and 10-year-old girls are the same. Which of the following tests should be used in this scenario?

a) One-sample test
b) Two-sample test

2. Which of the following is a valid pair of hypotheses?
a) H0: μ1 = μ2; Ha: μ1 ≠ μ2
b) H0: μ1 > μ2; Ha: μ1 ≤ μ2
c) H0: μ1 ≠ μ2; Ha: μ1 = μ2
d) H0: μ1 < μ2; Ha: μ1 ≥ μ2

3. To test the claim that μ1 = μ2 , two samples are randomly selected from each population. If a hypothesis test is performed, how should you interpret a decision that rejects the null hypothesis?

a) There is sufficient evidence to support the claim μ1 = μ2.
b) There is sufficient evidence to reject the claim μ1 = μ2.
c) There is not sufficient evidence to support the claim μ1 = μ2.
d) There is not sufficient evidence to reject the claim μ1 = μ2.

4. For two-sample t-tests for small independent samples, the following condition or
conditions have to be met:

a) The samples must be randomly selected.
b) The samples must be independent.
c) Each population must have a normal distribution.
d) Each population must have a t-distribution.

5. For a two-sample t-test, if the test statistic is outside the rejection region, or regions, which of the following decisions should be made?

a) Fail to reject the alternative hypothesis.
b) Reject the alternative hypothesis.
c) Fail to reject the null hypothesis.
d) Reject the null hypothesis.

6. Two samples are considered independent if the sample selected from one population is not related to the sample selected from the second population.

a) True
b) False

7. For a two-sample hypothesis test, the alternative hypothesis is a statistical hypothesis that is reasonable when the null hypothesis is not accepted.

a) True
b) False

8. When doing two-sample hypothesis tests, which of the following tests would you conduct if the sample sizes are less than 30, the populations are normal, the standard deviations are unknown, and the population variances are equal?

a) z-Test
b) t-Test

9. Given H0: μ1 ≥ μ2, critical value t0 = −1.350, and standardized test statistic
t = −4.038, you will:

a) Reject the null hypothesis.
b) Fail to reject the null hypothesis.

10. A crash test claims that the mean bumper repair cost for small cars is less than that for midsize cars. After conducting a t-test, a decision is made to reject the null hypothesis at the 5% level. The decision implies that:

a) There is enough evidence at the 5% level to support the claim that the mean bumper
repair cost is less for small cars than it is for midsize cars.
b) There is insufficient evidence at the 5% level to support the claim that the mean bumper repair cost is less for small cars than it is for midsize cars.
c) There is insufficient evidence at the 5% level to reject the claim that the mean bumper repair cost is less for small cars than it is for midsize cars.
d) There is enough evidence at the 5% level to reject the claim that the mean bumper repair cost is less for small cars than it is for midsize cars.

May 9, 2009 - Content Covered

CONTENT COVERED

Elementary Statistics—Picturing the World:
o Chapter 9, Section 9.1, “Correlation,” pp. 458–473
o Chapter 9, Section 9.2, “Linear Regression,” pp. 474–483

May 9, 2009 - Animation, follow this www link

http://media.pearsoncmg.com/ph/esm/esm_larson_statlet_questions_2e/Leverage_Statlet/leverage.html

This animation illustrates the concept of leverage: not all points have equal influence on a fitted regression line.

May 9, 2009 - Problem One

Problem 1:

Excel has built-in functionalities that facilitate graphing of scatter plots and calculation of correlation coefficients r.

Follow the instructions listed in Student’s Solution & Technology Manual,
pages 341–344, to construct a scatter plot and calculate the correlation coefficient.

May 9, 2009 - Correlation Notes

A. Correlation

A correlation is a relationship between two variables.

The correlated variables can be represented by an ordered pair (x, y) where x is the
independent, or explanatory, variable, and y is the dependent, or response, variable.
A scatter plot is the graphical representation of the ordered pairs in the form of points in a coordinate plane.

The correlation coefficient is a mathematical measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The range of r is –1 to 1.

Types of correlation:

• Negative linear correlation: A negative linear correlation implies that when x
increases, y tends to decrease. When r approaches –1, x and y are said to have a strong negative, or inverse, linear correlation.

• Positive linear correlation: A positive linear correlation implies that when x
increases, y tends to increase. When r approaches 1, x and y are said to have a strong positive, or direct, linear correlation.

• No correlation or weak correlation: No correlation or a weak linear correlation
implies that the magnitude of x has little or no effect on the magnitude of y. When r is near 0, x and y are said to have no correlation or a weak linear correlation.

• Nonlinear correlation: A scatter plot of the data shows a pattern, but it is not that of a line. It might resemble a U, an arc, or some other shape.

A cause-and-effect relationship exists when two variables, x and y, are related in such a way that changing one variable causes the other to change. Statisticians are careful to avoid claiming that because x and y are correlated, x causes y. In other words, correlation does not imply causation.
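
The class computes r in Excel. As a cross-check, here is a minimal Python sketch of the sample correlation coefficient using the usual computational formula. The function name and the data (hours studied versus quiz score) are hypothetical and are not taken from the textbook.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r for paired data (x, y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = correlation_coefficient(hours, scores)
print(round(r, 3))  # close to 1, so a strong positive linear correlation
```

A value of r near 1 here describes the strength of the linear pattern only; it does not show that studying causes the higher scores.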

B. Linear Regression

The technique of fitting a linear equation to real data points gives a line called a regression line. This line is used to predict the value y—the response variable—for a given x, often called the predictor variable.

The equation of a regression line for an independent variable x and a dependent variable y is written as ŷ = mx + b, where m is the slope, b is the y-intercept, and ŷ is the predicted y-value for a given x-value.
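
The slope and intercept of ŷ = mx + b can be found with the least-squares formulas. The sketch below assumes that method (it is what Excel's trendline uses) and works on hypothetical data; the function names are illustrative only.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)  # the line passes through (x-bar, y-bar)
    return m, b

def predict(m, b, x):
    """Predicted y-value for a given x-value."""
    return m * x + b

# Hypothetical paired data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
m, b = regression_line(hours, scores)
y_hat = predict(m, b, 3.5)
print(m, b, y_hat)
```

Predictions are only meaningful for x-values inside (or close to) the range of the original data, and only when the correlation is significant.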

May 9, 2009 - Review and Analysis - Quick T/F Quiz

We have learned basic concepts and formulas to decide whether a relationship exists
between two variables. We learned to present paired data in a scatter plot and to calculate and interpret a correlation coefficient. If the data are found to be correlated, we can find a linear equation that best models the relationship and draw a regression line using Excel. We also learned to use the equation of a regression line to predict a y-value for a given x-value.

True or False - Answer in your work groups and post:

1. If there is a strong correlation between two variables, you can conclude that one
variable caused the other.

2. Correlation coefficient r close to –1 implies that there is no correlation between the two variables.

3. A correlation is a relationship between more than two variables.

4. A regression line is the line that maximizes the residuals.

5. The equation of a regression line can be used to predict the independent variable x value for a given y-value.

May 9, 2009 - CORRELATION AND REGRESSION

Overview - Key Concepts

A. Introduction to Correlation
a) Scatter plots
b) Correlation coefficient
B. Linear Regression
a) Defining regression line
b) Graphing regression line

Friday, May 1, 2009

May 2, 2009 - Extra Credit Questions

BONUS CREDIT QUESTIONS - For each question you post in your work groups correctly, you will be given 1 extra point, for a possibility of 10 extra points.

1. A random sample of 200 high school seniors is given the SAT-V test. The mean score for
this sample is 483. Assuming that the population is normally distributed, the mean score
μ for all high school seniors will be:

a) 2.415
b) 200
c) 483

2. Given the same sample statistics, which level of confidence will produce the narrowest
confidence interval?

a) 75%
b) 85%
c) 90%
d) 99%

3. The margin of error is present only when the sample size is less than 30.
a) True
b) False

4. When a t-distribution is used to estimate a population mean, the degrees of freedom are
equal to one less than the sample size.

a) True
b) False

5. A t-distribution is used when the random variable is distributed normally, the sample size
is < 30 and the value of σ (sigma) is unknown.

a) True
b) False

6. The critical value that corresponds to a 95% confidence level is:

a) ±1.645
b) ±1.96
c) ±2.33
d) ±2.575

7. The point estimate is a single value estimate for a sample statistic.

a) True
b) False

8. The most unbiased point estimate of the population mean μ is the sample mean.

a) True
b) False

9. Which of the following statements describe the properties of a t-distribution?

a) The t-distribution is bell-shaped and symmetric about the mean.
b) The total area under a t-curve is 1 or 100%.
c) The mean, median, and mode of the t-distribution are equal to zero.
d) As the degrees of freedom decrease, the t-distribution approaches the normal distribution.

10. The critical value, tc, for c = 0.99 and n = 10 is:

a) 1.833
b) 2.262
c) 2.281
d) 3.250

May 2, 2009 - Homework, Chapter 8

Complete the following 4 questions in your work groups and post as a team. *** I realize these questions are extremely difficult - and to that end, Dustin is in the LRC (Stats help) and you are encouraged to work with him for additional help. I am also distributing PowerPoint notes to help.

Complete the following exercises from your textbook
Elementary Statistics—Picturing the World:

1. Section 8.1, Exercise #4, p. 409
2. Section 8.1, Exercise #18, p. 411
3. Section 8.2, Exercise #14, p. 421
4. Section 8.2, Exercise #20, p. 422

May 2, 2009 - Final Project Outline (optional)

For those of you interested in working on a final project, the following is an example outline. The final project must be in PowerPoint or MS Word format.

Steps in a Statistical Study: Project Outline (a guide)

I. Identifying the "subject" of your study

What is the question? (What are my hypotheses?)- 1 slide
Is the data obtainable? (birth weight, socio economic, drugs, alcohol)- 1 slide
Is it ethical to obtain such data?
If not, is there a reasonable substitute?
Are the assumptions reasonable?- 1 slide

II. Designing

Identify the population of interest- 1 slide
Survey- several slides (how would you design the survey, you do not have to actually do the survey) * * * although, for extra credit (5 points) you can do a survey
- Obtain a representative sample of that population- 1 slide
  - Simple Random Sampling
  - Stratified Sampling (M-F, Age groups)
  - Systematic Sampling (class roster, census list)
  - Multi-Stage Sampling
- Sources of Bias- 1-2 slides
  - Voluntary Response
  - Non-response bias (day phone)
  - Response bias (people lie)
  - Undercoverage
Observational Studies- 1-3 slides
- Used when a designed experiment is not ethical
- Subjects studied over a period of time in natural setting
- Case/Control – Control must match
- Record Variables of interest
- Confounding is a major issue
Designing an Experiment- 1-5 slides
- Researcher has control over the subjects or units in the study
- An intervention takes place that otherwise would not occur
- Randomization used to assign treatments
- Strongest case for causality
EDA – Exploratory Data Analysis (trends, relationships, differences) - optional, 1 slide
Pilot Study

III. Collecting Data 1-3 slides

Identify variables
Identify types of variables
- Qualitative
- Quantitative
Identify Limits of measurement or observation

IV. Analyze the data 1-3 slides

Use proper procedures and techniques.
Check the assumptions behind the procedures and techniques.

V. Make Conclusions and Discuss Limitations 1-3 slides

What are the answers to the original hypotheses?
What are the limitations of the study?
What conclusions does the study not make?
What new questions arise from this study?

May 2, 2009 - Homework

Next week - please read Chapter 9.

CORRELATION AND
REGRESSION

May 2, 2009 - Summary

Class Summary:

Topics covered continue the previous discussion on hypothesis testing presented in
Unit 8 in which we learn to test a claim about a population mean and the difference of
means between two populations.

We learned to identify when we can conduct a z-test or a t-test for large and
small sample sizes and make a decision based on testing results.

May 2, 2009 - Class Notes

A. Testing the Difference Between Means (Large Independent Samples)

A null hypothesis, H0, is a statistical hypothesis that usually states that there is no
difference between the parameters of two populations. The null hypothesis always contains
the symbol ≤, =, or ≥.

An alternative hypothesis, Ha, is a statistical hypothesis that is true when H0 is false. The
alternative hypothesis always contains the symbol >, ≠, or <.

The Central Limit Theorem states that the difference of the sample means is normally
distributed when the following conditions are satisfied:
• The samples are randomly selected.
• The samples are independent.
• Each sample size is at least 30, or, if not, each population has a normal distribution
with a known standard deviation σ.
These three conditions are often called the assumptions of the statistical test.
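When the difference of sample means is normally distributed, you can use a two-sample z-test. The standardized test statistic is $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$.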
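
A two-sample z-test standardizes the difference of the sample means by its standard error. The following is a minimal Python sketch of that calculation for a two-tailed test. The summary statistics are hypothetical, and for large samples the sample standard deviations are assumed to stand in for σ1 and σ2.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Standardized test statistic for the difference between two means."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

# Hypothetical summary statistics for two independent samples of size 50.
z = two_sample_z(mean1=75.0, mean2=72.0, sigma1=8.0, sigma2=6.0, n1=50, n2=50)

# Two-tailed test of H0: mu1 = mu2 at the 5% level of significance.
alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical_value
print(round(z, 3), round(critical_value, 3), reject_h0)
```

For a left-tailed or right-tailed test, the critical value would instead come from `inv_cdf(alpha)` or `inv_cdf(1 - alpha)` and be compared with z directly.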

B. Testing the Difference Between Means (Small Independent Samples)
When small samples—n < 30—are used and the population standard deviation is unknown,
the Central Limit Theorem does not apply. In this case, you can use a t-test to test the
difference between two population means μ1 and μ2 if the following conditions are met:

• The samples are randomly selected.
• The samples are independent.
• Each population has a normal distribution.
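
When the three conditions above hold, the t statistic can be computed as in the sketch below. It additionally assumes equal population variances, so it uses a pooled standard deviation with n1 + n2 − 2 degrees of freedom; that assumption, the data, and the function name are illustrative and not part of the notes.

```python
import math
from statistics import mean, stdev

def pooled_t_statistic(sample1, sample2):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = stdev(sample1), stdev(sample2)
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2)
                          / (n1 + n2 - 2))
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t = (mean(sample1) - mean(sample2)) / standard_error
    degrees_of_freedom = n1 + n2 - 2
    return t, degrees_of_freedom

# Hypothetical small independent samples.
group_a = [12, 15, 11, 14, 13]
group_b = [10, 9, 12, 8, 11]
t, df = pooled_t_statistic(group_a, group_b)

# Critical value read from a t-table: two-tailed, alpha = 0.05, d.f. = 8.
t_critical = 2.306
reject_h0 = abs(t) > t_critical
print(round(t, 3), df, reject_h0)
```

The critical value is typed in from the t-distribution table because the Python standard library has no t-distribution; it must be looked up again if α or the sample sizes change.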

May 2, 2009 - HYPOTHESIS TESTING WITH TWO SAMPLES

Key Objectives for the class:

A. Testing the Difference Between Means (Large Independent Samples)
a) Defining null and alternative hypothesis
b) The Central Limit Theorem
c) Guidelines for a two-sample z-test for the difference between means

B. Testing the Difference Between Means (Small Independent Samples)
a) Guidelines for a two-sample t-test for the difference between means

May 2, 2009 - PowerPoint Notes

I will be distributing a CD-ROM with copies of my notes and PowerPoint presentations for final exam or project study purposes. The PowerPoints are an excellent review.

Friday, April 24, 2009

April 25, 2009 - In-Class Project, "21"

In class project review, and "21".

Think about how you could use statistics in your career, personal life, and so on -- we will discuss how statistics can play an important role in the decision-making process regarding your career, financial future (after you graduate from ITT), or in your own business.

April 25, 2009 - Final Exam sample questions to review (do not answer)

Examples of final exam questions to review: (*** please note, you can answer these questions, for extra credit - in your work groups -- if you have extra time)

The salaries of employees in a government agency can be classified as:
a) Quantitative data
b) Qualitative data

"In a recent study of 400 randomly selected adults, the researchers found there is a
relationship between smoking cigarettes and developing emphysema." Which type of
statistics does the statement describe?
a) Inferential statistics
b) Descriptive statistics

What is the level of measurement for the data that can be classified according to color?
a) Ordinal
b) Nominal
c) Interval
d) Ratio

If the null hypothesis is rejected when in fact it is true, which type of error has been
committed?
a) Type I error
b) Type II error

Which of the following values represents a correlation coefficient?
a) –1.45
b) 0.0001
c) 1.05
d) 100

What is the type of correlation for the regression line ŷ = 5.12x − 0.3?
a) Positive linear correlation
b) Negative linear correlation

Assume that the variables, x and y, have a significant correlation. Given that the equation of
a regression line is y = –2x + 10, what is the best predicted value for y if x = 3?
a) –10
b) 4
c) 5
d) 10

April 25, 2009 - Chapter 7 Group Questions, quick multiple choice

1. A probability distribution curve is always bell-shaped and symmetric about the mean.
a) True
b) False

2. A z-score indicates a position of probability value under the normal curve.
a) True
b) False

3. The cumulative area for z = 0 is 0.5000.
a) True
b) False

4. Which of the following describes the properties of a normal distribution?
a) Mean, median, and mode are equal.
b) The normal curve is bell-shaped.
c) The normal curve is symmetric about the mean.
d) The normal curve touches the x-axis.

5. For any population distribution and any sample size n, the mean of the sampling
distribution of sample means is equal to the population mean.
a) True
b) False

6. According to the Central Limit Theorem, as the sample size increases, n ≥ 30, the
sampling distribution of sample means gets closer to a normal distribution.
a) True
b) False

7. The area under the standard normal curve between z = 0 and z = 3 is:
a) 0.0010
b) 0.4987
c) 0.9987
d) 1.0000

8. One of the properties of sampling distributions of sample means states that the mean of
the sample means, $\mu_{\bar{x}}$, is equal to the population mean.
a) True
b) False

9. The standard deviation of the sampling distribution of the sample means is called:
a) Standard error of the mean
b) Margin of error
c) Sampling error
d) Standard error of the median

April 25, 2009 - Homework, Reading

Next week, please review Chapter 8, HYPOTHESIS TESTING WITH
TWO SAMPLES.

April 25, 2009 - Summary

In this unit, we were introduced to the basic concepts and techniques for hypothesis testing
and for identifying type I and type II errors.

We learned to conduct hypothesis tests for large samples and certain small samples and to
make a decision on a claim about a population parameter μ.

April 25, 2009 - Homework I

Please complete the following questions, in your groups:

1. Section 7.1, Exercise #10, p. 343
2. Section 7.1, Exercise #38, p. 344
3. Section 7.1, Exercise #48, p. 345
4. Section 7.2, Exercise #30, p. 359
5. Section 7.3, Exercise #20, p. 371

April 25, 2009 - Quick T/F in class quiz (in class work groups)

1. In a hypothesis test, you assume that the alternative hypothesis is true.

2. Statistical hypotheses are statements about the sample.

3. A type I error is committed when you fail to reject a null hypothesis when it is false.

4. If you want to support a claim, write it as a null hypothesis.

5. When using a P-value to make a conclusion in a hypothesis test with α as significance
level and P ≤ α , you should fail to reject H0.

6. When conducting z-test for mean μ, you should reject the null hypothesis if the P-value
falls within the rejection region.

7. The lower the P-value, the more evidence there is in favor of rejecting H0.

8. The degrees of freedom for a t-test, when n < 30, is equal to the sample size.

April 25, 2009 - Class Notes

A. Introduction to Hypothesis Testing

A hypothesis test is a process that uses sample statistics to test a claim about the value of a
population parameter.

A statistical hypothesis is a verbal statement, or claim, about a population parameter.
A null hypothesis H0 is a claim that contains a statement of equality such as ≤, =, or ≥.
An alternative hypothesis Ha is the complement of the null hypothesis. It is a statement that
must be true if H0 is false and the statement contains a statement of inequality such as >, ≠, or <.

A type I error occurs if the null hypothesis is rejected when it is true. For example, the null
hypothesis H0 claims that the new allergy drug lasts for 36 hours. The decision made from a
hypothesis testing is to reject H0, although H0 is true.

A type II error occurs if the null hypothesis is not rejected when it is false. In the given example,
the pharmaceutical company claims that the new allergy drug lasts for 36 hours. If the hypothesis
testing failed to reject the claim, although the actual truth is that the new drug does not last for 36
hours, we are making a type II error.
If the alternative hypothesis contains a less-than inequality symbol (<), the hypothesis test is a
left-tailed test. It can be mathematically expressed as:
H0: μ ≥ k
Ha: μ < k
If the alternative hypothesis contains a greater-than symbol (>), the hypothesis test is a right-tailed
test. It can be mathematically expressed as:
H0: μ ≤ k
Ha: μ > k

If the alternative hypothesis contains the not-equal-to symbol (≠), the hypothesis test is a two-tailed
test. In a two-tailed test, each tail has an area of one-half P. The hypotheses can be
mathematically expressed as:
H0: μ = k
Ha: μ ≠ k
B. Hypothesis Testing for the Mean (Large Samples)
The P-value of a hypothesis test is the probability, assuming the null hypothesis is true, of obtaining
a sample statistic with a value as extreme as or more extreme than the value obtained from the sample
data. We reject the null hypothesis if the P-value is less than or equal to the level of significance α.
A rejection region, or critical region, of the sampling distribution is the range of values for
which the null hypothesis is not probable. If a test statistic falls in this region, the null hypothesis
is rejected.
A critical value z0 separates the rejection region from the no-rejection region.
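
The P-value decision rule for a large-sample z-test can be sketched in Python as follows. The numbers are hypothetical (they loosely echo the 36-hour allergy drug claim above but are not from the textbook), and the `tail` argument is an illustrative way of selecting a left-, right-, or two-tailed test.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n, tail):
    """Return (z, P-value) for a z-test of the mean; tail is 'left', 'right', or 'two'."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value

# Hypothetical test of H0: mu = 36 against Ha: mu != 36 (two-tailed).
z, p_value = z_test_p_value(sample_mean=35.2, mu0=36, sigma=2.4, n=36, tail="two")
alpha = 0.05
reject_h0 = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```

The rejection-region approach gives the same decision: compare z with the critical value z0 for the chosen α instead of comparing the P-value with α.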
C. Hypothesis Testing for the Mean (Small Samples)
When the sample size n is less than 30, the random variable x is normally distributed, and σ is unknown,
the standardized test statistic t = (x̄ − μ)/(s/√n) follows a t-distribution with n – 1 degrees of freedom.
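
For a small sample, the test statistic is t = (x̄ − μ)/(s/√n) with n − 1 degrees of freedom. A minimal Python sketch, using hypothetical data and a critical value typed in from a t-table, might look like this:

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t-test of the mean."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # s / sqrt(n)
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

# Hypothetical durations (hours) for 9 patients; H0: mu >= 36, Ha: mu < 36.
durations = [34, 36, 33, 35, 32, 34, 35, 33, 34]
t, df = one_sample_t(durations, mu0=36)

# Critical value read from a t-table: left-tailed, alpha = 0.05, d.f. = 8.
t_critical = -1.860
reject_h0 = t < t_critical
print(round(t, 3), df, reject_h0)
```

Because this is a left-tailed test, the rejection region lies to the left of the negative critical value.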

April 25, 2009 - Key Concepts

KEY CONCEPTS

A. Introduction to Hypothesis Testing
a) Defining hypothesis testing
o Null and alternate hypothesis
o Type I and type II errors
b) Types of alternate hypothesis
o Left-tailed test
o Right-tailed test
o Two-tailed test

B. Hypothesis Testing for the Mean (Large Samples)
a) Hypothesis testing for large samples using the P-value
b) Hypothesis testing for large samples using rejection region
C. Hypothesis Testing for the Mean (Small Samples)
a) Hypothesis testing for small samples using t-test

Tuesday, April 14, 2009

April 18, 2009 - Summary (2nd Half of Class), & Homework Assignment

SUMMARY

In this chapter, we studied inferential statistics, finding a point estimate of
the mean and margin of error. We learned to construct and interpret confidence intervals for
the population mean for both large and small samples.

NEXT WEEK: HYPOTHESIS TESTING WITH ONE SAMPLE

Homework: Read Chapters 7 & 8 in the textbook.

April 18, 2009 - Homework, Part II

Complete the following homework activity and post as a group:

From your textbook
Elementary Statistics—Picturing the World:

1. Section 6.1, Exercise #24, p. 288

2. Section 6.1, Exercise #34, p. 288

3. Section 6.2, Exercise #8, p. 300

4. Section 6.2, Exercise #12, p. 300

April 18, 2009 - Quick T/F Group Activity

In your work groups, complete the following T/F questions and post as a group:

1. The sample mean x̄ is a reliable point estimate of the population mean μ.

2. We can always estimate a population parameter using sample statistics, regardless of the
sample size or the type of population distribution.

3. The sample size is considered large when it reaches 100.

4. The degrees of freedom are equal to the sample size if the sample size is small.

5. A t-distribution is used when random variable is normally distributed and sample size is less
than 30.

6. The larger the sample size (n ≥ 30), the more skewed is the t-distribution.

April 18, 2009 - Class Notes (2nd Half of Class)

“Until now, you have focused on the first branch of statistics—descriptive statistics and probability.
You have learned to describe and graph data, calculate probabilities, and use properties of normal
distributions. In this unit, you will learn about inferential statistics.
This chapter will focus on how to estimate a population parameter and state how confident you are
about your estimate.”

* * * * * * *
A. Confidence Intervals for the Mean (Large Samples)

A point estimate is a single value estimate for a population parameter.
An interval estimate is an interval, or range of values, that is used to estimate the population
parameter.

The level of confidence c is the probability that the interval estimate contains the population
parameter. It states how confident we are that the interval estimate contains the population
parameter.

The difference between the point estimate and the actual population parameter value is called the sampling error and is denoted by x̄ − μ.
Given a level of confidence, the greatest possible sampling error is called the margin of error. It is sometimes also called the maximum error of estimate or error tolerance and is denoted by E.
When the population standard deviation is known, E can be calculated using the formula:

$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$

A c-confidence interval for the population mean μ is written as:
x̄ − E < μ < x̄ + E

Here, c is the probability that the confidence interval contains μ.
The steps for determining the confidence interval for a population mean, when the sample size
n ≥ 30 or the sample comes from a normally distributed population, can be listed as follows:

Step 1: Find the sample statistics.
Step 2: Calculate standard deviation, s.
Step 3: Find the critical values.
Step 4: Calculate the margin of error.
Step 5: Form the confidence interval x̄ − E < μ < x̄ + E.
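
The steps above can be sketched in Python as follows. The sample statistics are hypothetical, and the critical value z_c is taken from the standard normal distribution rather than from a printed table.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (left endpoint, right endpoint, margin of error E)."""
    # z_c leaves (1 - c)/2 in each tail of the standard normal curve.
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_c * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Hypothetical large sample: x-bar = 50, sigma (or s) = 10, n = 100, c = 0.95.
left, right, margin = z_confidence_interval(50, 10, 100, 0.95)
print(round(left, 2), round(right, 2), round(margin, 2))
```

Raising the level of confidence increases z_c and therefore widens the interval; increasing n narrows it.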

B. Confidence Intervals for the Mean (Small Samples)

A t-distribution is used when the sample size is less than 30, the random variable x is
approximately normally distributed, and σ is unknown. The properties of the t-distribution are as follows:

1. It is bell-shaped and symmetric about the mean.
2. The mean, median, and mode of the t-distribution are equal to zero.
3. The total area under a t-curve is 1, or 100%.
4. The t-distribution is a family of curves, each determined by a parameter called the degrees
of freedom, also referred to as d.f. The degrees of freedom are the number of free choices
left after a sample statistic such as x̄ is calculated. When you use a t-distribution to estimate
a population mean, the degrees of freedom are equal to one less than the sample size.
d.f. = n – 1
5. As the degrees of freedom increase, the t-distribution approaches the normal distribution.
After 30 degrees of freedom, the t-distribution is very close to the standard normal z distribution.

Constructing a confidence interval using the t-distribution involves using a point estimate and a
margin of error. The following steps can be used for constructing a confidence interval for the mean using the t-distribution.

Step 1: Assuming that the sample comes from a normally distributed population, identify the
sample statistics n, x̄, and the sample standard deviation s. Use the formulas
$\bar{x} = \frac{\sum x}{n}$ and $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$.

Step 2: Identify the degrees of freedom, the level of confidence c, and the critical value tc using the t-distribution table. Remember, d.f. = n – 1.

Step 3: Find the margin of error E using the formula $E = t_c \frac{s}{\sqrt{n}}$.

Step 4: Find the left and right endpoints and form the confidence interval. Use the following
formulas:
• Left endpoint: x̄ − E
• Right endpoint: x̄ + E
• Interval: x̄ − E < μ < x̄ + E
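
The four steps can be sketched in Python as shown below. The data are hypothetical, and the critical value t_c is passed in by hand because it comes from the t-distribution table (the standard library has no t-distribution).

```python
import math
from statistics import mean, stdev

def t_confidence_interval(sample, t_c):
    """Return (left endpoint, right endpoint) of the interval for the mean."""
    n = len(sample)
    x_bar = mean(sample)                # Step 1: sample mean
    s = stdev(sample)                   # Step 1: sample standard deviation
    margin = t_c * s / math.sqrt(n)     # Step 3: margin of error E
    return x_bar - margin, x_bar + margin  # Step 4: endpoints

# Hypothetical sample of n = 9 values, so d.f. = 8.
sample = [20, 22, 19, 21, 23, 20, 21, 22, 21]

# Step 2: t_c read from a t-table for c = 0.95 and d.f. = 8.
left, right = t_confidence_interval(sample, t_c=2.306)
print(round(left, 3), round(right, 3))
```

Looking up t_c is the step most likely to go wrong: use the row for d.f. = n − 1, not n.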

April 18, 2009 - CONFIDENCE INTERVALS (Chapter 6)

KEY CONCEPTS TO REMEMBER:

A. Confidence Intervals for the Mean (Large Samples)

• Point and interval estimate
• Level of confidence
• Sampling error
• Margin of error
• Confidence interval

B. Confidence Intervals for the Mean (Small Samples)
• t-Distribution
• Properties of t-distribution

April 18, 2009 - Unit Summary

SUMMARY

Key Points to remember:

• z-score

• Area under the standard normal curve

• Probabilities for normally distributed variables

• Probability for given data values

In this unit, we learned how to interpret the most fundamental concept in inferential
statistics, the Central Limit Theorem.

April 18, 2009 - Homework Assignment 1

In your class work groups (the groups you have been working in over the last few classes) - complete the following homework assignment and post as a group:
Elementary Statistics: Picturing the World:
1. Section 5.1, Exercise #26, p. 225

2. Section 5.1, Exercise #60, p. 228

3. Section 5.2, Exercise #12, p. 232

4. Section 5.3, Exercise #44, p. 244

5. Section 5.4, Exercise #18, p. 255; do not sketch
the graph

April 18, 2009 - Class Notes

Class Lecture Notes for April 18, 2009 (talking points)

A. Introduction to Normal Distributions

A continuous probability distribution is the probability distribution of a continuous random variable.

A normal distribution is a continuous probability distribution describing the behavior of a normal
random variable. A normal probability distribution has a graph that is symmetric and bell shaped. Its mean, median, and mode are equal and determine the axis of symmetry. The graph of a normal distribution is defined for all numbers on the real number line. As the random variable x moves further and further from the mean, in either direction, the graph of the normal distribution approaches but never touches the x-axis.

Between the points x = μ – σ and x = μ + σ, the graph is curved downward. To the left of x = μ – σ and to the right of x = μ + σ, the graph is curved upward. The points x = μ – σ and x = μ + σ are called inflection points.

The normal curve, or the bell-curve, is the graph of a normal distribution.
Properties of a normal distribution can be listed as follows:
• The mean, median, and mode are equal.
• The normal curve is bell-shaped and symmetric about the mean, μ.
• The total area under the curve is equal to 1.
• The normal curve approaches the x-axis but never touches the axis as it extends further and
further away from the mean.
• At the center of the curve, between (μ − σ) and (μ + σ), the graph curves downward. The graph
curves upward to the left of (μ − σ) and to the right of (μ + σ).

Properties of a standard normal distribution can be listed as follows:
• The standard normal curve is bell shaped and symmetric about 0.
• The total area under the curve is equal to 1.
• The standard normal curve approaches the x-axis but never touches the axis as it extends
further and further away from the mean.
• At the center of the curve, between –1 and 1, the graph curves downward. The graph
curves upward to the left of –1 and to the right of 1.
• The cumulative area is close to 0 for z-scores close to z = −3.49.
• The cumulative area increases as the z-scores increase, but it never exceeds 1.
• The cumulative area for z = 0 is 0.5000.
• The cumulative area is close to 1 for z-scores close to z = 3.49.

B. Normal Distributions: Finding Probabilities
The probability of a normally distributed random variable x can be calculated using the following
guidelines:

Step 1: Find the x-values of the upper and lower bounds of the given interval.
Step 2: Convert the x-values to z-scores using the formula: z = (x − μ) / σ
Step 3: Sketch the standard normal curve and shade the appropriate area under the curve.
Step 4: Find the area by following the same directions as given in the table for the standard normal probability distribution.
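
The guidelines above can be sketched in Python, with the cumulative area function standing in for the standard normal table. The mean, standard deviation, and interval are hypothetical.

```python
from statistics import NormalDist

def normal_probability(lower, upper, mu, sigma):
    """P(lower < x < upper) for a normally distributed random variable x."""
    # Step 2: convert the x-values to z-scores.
    z_lower = (lower - mu) / sigma
    z_upper = (upper - mu) / sigma
    # Step 4: subtract cumulative areas under the standard normal curve.
    std_normal = NormalDist()
    return std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

# Hypothetical example: mu = 100, sigma = 15, interval from 85 to 115.
probability = normal_probability(85, 115, mu=100, sigma=15)
print(round(probability, 4))
```

Here the bounds are exactly one standard deviation on either side of the mean, so the z-scores are −1 and 1.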

C. Normal Distributions: Finding Values of the Random Variable x, Given the Standard Normal
Random Variable z
Find variable x-values within areas under the standard normal curve:
Step 1: Determine the position of the area corresponding to the given probability.
Step 2: Find the corresponding z-scores for the area using the standard normal distribution table.
Here, you may have two cases:

• Area to the left of z
• Area to the right of z
Step 3: Transform the z-score to an x-value, using the formula: x = μ + zσ
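
Going from an area back to an x-value can be sketched in Python as below. The inverse cumulative function replaces the reverse table lookup, and the mean and standard deviation are hypothetical.

```python
from statistics import NormalDist

def x_for_left_area(area, mu, sigma):
    """x-value with the given cumulative area to its left."""
    z = NormalDist().inv_cdf(area)  # Step 2: z-score for the area
    return mu + z * sigma           # Step 3: x = mu + z * sigma

def x_for_right_area(area, mu, sigma):
    """x-value with the given area to its right."""
    return x_for_left_area(1 - area, mu, sigma)

# Hypothetical example: mu = 100, sigma = 15.
x_top_5_percent = x_for_right_area(0.05, mu=100, sigma=15)
x_median = x_for_left_area(0.5, mu=100, sigma=15)
print(round(x_top_5_percent, 1), round(x_median, 1))
```

An area to the right is handled by converting it to the equivalent area to the left, which mirrors how the cumulative table is used.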
D. Sampling Distributions and the Central Limit Theorem
A sampling distribution is the probability distribution of a sample statistic that is formed when
samples of size n are repeatedly taken from a population.
If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample
means.
The properties of sampling distributions of sample means can be listed as follows:
1. The mean of the sample means, μx̄, is equal to the population mean:
μx̄ = μ
2. The standard deviation of the sample means, σx̄, is calculated by dividing the population
standard deviation σ by the square root of the sample size n:
σx̄ = σ / √n
σx̄ is also known as the standard error of the mean.

The Central Limit Theorem is an important concept in inferential statistics. It enables you to make inferences about a population mean based upon sample statistics.
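
The two properties of the sampling distribution of sample means can be explored with a small simulation. This is only a sketch: the population (the six faces of a fair die), the sample size of 36, the 5,000 repetitions, and the seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard deviation of the sample means: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical population: the six faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

rng = random.Random(2009)  # fixed seed so the run is repeatable
n = 36                     # sample size
sample_means = [
    statistics.mean(rng.choices(population, k=n))  # sample with replacement
    for _ in range(5000)
]

print("population mean:", mu)
print("mean of sample means:", round(statistics.mean(sample_means), 3))
print("standard error (formula):", round(standard_error(sigma, n), 3))
print("std dev of sample means:", round(statistics.pstdev(sample_means), 3))
```

The mean of the simulated sample means should land near the population mean, and their standard deviation should land near σ / √n.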

Saturday, April 11, 2009

April 11, 2009 - Reading Assignment

Please have read through chapters 5 & 6 by April 18, 2009.

April 11, 2009 - WWW Links

Links that may be of interest:

http://www.bea.gov/

http://www.fedstats.gov/

http://www.commerce.gov/

http://www.census.gov/

http://www.dol.gov/

Friday, April 10, 2009

April 11, 2009 - Normal Probability Distributions

NORMAL PROBABILITY DISTRIBUTIONS

KEY CONCEPTS

A. Introduction to Normal Probability Distributions
a) Continuous probability distribution
o Normal distribution
o Properties of a normal distribution
o Standard normal distribution
o Properties of a standard normal distribution

B. Normal Distributions: Finding Probabilities
a) Probability of a random variable

C. Normal Probabilities: Finding Values
a) Find variable x-values within areas under the standard normal curve

D. Sampling Distributions and the Central Limit Theorem
a) Defining sampling distribution
b) Properties of sampling distributions of sample means
c) Central Limit Theorem

April 11, 2009 - Homework, Part II

Complete the following exercises from your textbook
Elementary Statistics—Picturing the World (and post them on the blog):

1. Section 4.1, Exercise #12, p. 179

2. Section 4.1, Exercise #24, p. 181

3. Section 4.2, Exercise #8, p. 194

4. Section 4.2, Exercise #10, p. 194

April 11, 2009 - Assignment, Homework, The Poll

Assume the following facts:

A political polling organization conducted a survey. As a part of the survey, the organization
calls 1,012 people and asks, “Do you approve, disapprove, or have no opinion of the way the president is handling his job?”

The random variable x represents the number of people who approve of the way the president is
handling his job.

* * * * * *

1. Is the given experiment a
binomial experiment?

2. Assume that the polling question
is revised to: “Do you approve or disapprove
of the way the president is
handling his job?”

The random variable x represents
the number of people who
approve of the way the president
is handling his job.

Now, this question represents a
binomial experiment. Determine
which of the following outcomes
will denote “success” for this
experiment.

• No opinion
• Approve
• Disapprove

April 11, 2009 - Quick "T & F" Quiz

True or False

1. The expected value of a discrete random variable is equal to the mean of the random variable.

2. Continuous random variables represent countable data, and discrete random variables represent
uncountable data.

3. It is possible for the sum of all probabilities of a random variable to exceed 1.

4. A binomial experiment is repeated for a fixed number of trials, and each trial is dependent on
the other trials.

5. There are only two possible outcomes—success and failure—from a binomial experiment.

April 11, 2009 - Quick "WWW" exercise

The following link leads to an animation that performs the parking lot simulation described in the text.

Experiment with the animation.
http://media.pearsoncmg.com/ph/esm/esm_larson_statlet_questions_2e/Pick_a_Lane_Statlet/pick.html

In your opinion, which strategy, “Pick a Row” or “Cycling,” saves the most time? After comparing the time to walk and drive, which strategy are you likely to choose?

Explain your answer.

April 11, 2009 - Lecture Notes

Certain applications, such as those used for weather forecasting and space research, require the
collection and analysis of large amounts of data.

For such applications, data is often collected using probability experiments, and the outcomes of these experiments are organized to form probability distributions. The shape, central tendency, and variability of probability distributions help analysts find patterns in the data set and make predictions and decisions.

You will be introduced to discrete probability distributions—a specific type of probability
distribution. Continuous probability distributions will be covered later in the course.

* * * * * * * *

A. Probability Distributions

A random variable, x, represents a numerical value associated with each outcome of a probability experiment.

A discrete random variable is a random variable with countable possible outcomes that can be listed. A continuous random variable is a random variable with an uncountable number of possible outcomes represented by an interval on the number line.

B. Binomial Distributions
Binomial experiments produce only two outcomes per trial, often called Success S and Failure F.
Examples include the possible outcomes when flipping a coin or answering a question with two answer options.

A binomial experiment is a discrete probability experiment that must satisfy the four conditions given below:

Condition 1: The experiment is repeated for a fixed number of trials, n. For example, a coin is
flipped 10 times. Each trial is independent of the other trials.

Condition 2: There are only two possible outcomes of interest for each trial. One of these outcomes is classified as a success (S) and the other as a failure (F). For example, flipping a coin has two possible outcomes—heads or tails. You may consider the occurrence of heads as a success and tails as a failure.

Condition 3: The probability of a success P(S) and the probability of a failure P(F) are the same for
each trial. For example, the probability of getting a six when rolling a fair die is 1/6
and the probability of not getting a six is 5/6, and these probabilities are the same regardless of how many times the die is rolled.

Condition 4: The random variable x counts the number of successful trials in the total number of
trials, n: x = 0, 1, 2, 3, …, n. If six is the result two times in 10 rolls, x = 2.
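
The four conditions can be seen in a short simulation of the die example. This is a sketch under assumptions: the function name count_sixes, the seed, and the 10,000 repetitions are hypothetical choices, not part of the notes.

```python
import random

def count_sixes(n_trials, rng):
    """Roll a fair die n_trials times and count the sixes (the successes)."""
    x = 0
    for _ in range(n_trials):          # Condition 1: fixed number of trials
        roll = rng.randint(1, 6)       # each roll is independent of the others
        if roll == 6:                  # Conditions 2 and 3: success has P(S) = 1/6
            x += 1
    return x                           # Condition 4: x is one of 0, 1, ..., n_trials

rng = random.Random(42)                # fixed seed for a repeatable run
n = 10
results = [count_sixes(n, rng) for _ in range(10000)]

# Relative frequency of each possible value of x across the repetitions.
for x in range(n + 1):
    print(x, results.count(x) / len(results))
```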

April 10, 2009 - DISCRETE PROBABILITY DISTRIBUTIONS

KEY CONCEPTS

A. Probability Distributions

a) Defining random variable
b) Types of random variable
o Discrete random variable
o Continuous random variable
c) Discrete probability distribution

B. Binomial Distributions

a) Defining binomial experiment
b) Conditions for binomial experiment

Saturday, April 4, 2009

April 4, 2009 - Homework Reading

Please review Chapter 4 for next week.

Friday, April 3, 2009

April 4, 2009 - Homework, Probability

Complete the following exercises from your textbook
Elementary Statistics—Picturing the World:

1. Section 3.1, Exercise #14, p. 125

2. Section 3.1, Exercise #20, p. 126

3. Section 3.2, Exercise #16, p. 136

4. Section 3.3, Exercise #18, p. 146

April 4, 2009 - Probability & Cards

A standard deck of cards contains a total of 52 cards. Each of the four suits (Spade, Heart,
Diamond, and Club) contains 13 cards: nine numbered from 2 to 10 and one each of an
A (ace), a J (jack), a Q (queen), and a K (king). The two jokers are excluded.

The probability of selecting a card from the standard deck and drawing a Queen of Hearts is:
• 0.5
• 0.0192
• 1.0

Answer: 0.0192

The probability of drawing a Queen from the deck is:
• 1
• 0
• 0.45
• 0.0769

Answer: 0.0769

What’s the probability of not selecting a Queen from the standard deck of cards?
• 0.9231
• 0.222
• 0.0769
• 0.126

Answer: 0.9231

Tip:
The key here is to know that not selecting a Queen is the complement of selecting a Queen. In addition, from Problem 2, we know that the probability of selecting a Queen is 0.0769.
Therefore, this problem can be solved using the:
1. Probability of selecting a Queen
2. Formula for finding the probability of the complement event, P(E’) = 1 – P(E)
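
The three card answers can be reproduced by counting outcomes directly. The following is a minimal sketch; the helper classical_probability and the way the deck is represented are assumptions made for illustration.

```python
from fractions import Fraction

# Build the 52-card deck: 4 suits x 13 ranks, jokers excluded.
suits = ["Spade", "Heart", "Diamond", "Club"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for suit in suits for rank in ranks]

def classical_probability(event):
    """Outcomes in the event divided by outcomes in the sample space."""
    favorable = [card for card in deck if event(card)]
    return Fraction(len(favorable), len(deck))

p_queen_of_hearts = classical_probability(lambda c: c == ("Q", "Heart"))
p_queen = classical_probability(lambda c: c[0] == "Q")
p_not_queen = 1 - p_queen  # complement rule: P(E') = 1 - P(E)

for label, p in [("Queen of Hearts", p_queen_of_hearts),
                 ("Queen", p_queen),
                 ("not a Queen", p_not_queen)]:
    print(f"P({label}) = {p} = {float(p):.4f}")
```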

April 4, 2009 - True or False Questions

True, False, or Subjective Probability

1. There is a 200% chance of thunderstorm tonight.
2. Classical probability of an event is the relative frequency of the event.
3. The complement of voting for Democratic Party is voting for other parties plus not voting
for any party.
4. John expects a very high chance of winning the poker game. What type of probability is it?
5. Conditional probability is the probability of a single event.
6. If events A and B are dependent, then P(A and B) = P(A) · P(B).
7. If two events are mutually exclusive, they have no outcomes in common.
8. If two events are independent, then they are also mutually exclusive.
9. The addition rule is used to find the probability of at least one of the two events occurring.

April 4, 2009 - Probability (www) Interactive

The following link leads to an animation that illustrates the additive and multiplicative laws of
probability. The amount of flow through each pipe represents probability.

http://media.pearsoncmg.com/ph/esm/esm_larson_statlet_questions_2e/Probability_Statlet/probability.html

April 4, 2009 - Class Notes

A. Basic Concepts of Probability
a) Defining probability
b) Probability experiment
c) Types of probability
d) Fundamental concepts related to probability
o The law of large numbers
o Range of probabilities rule
o Subjective probability
o Complement of an event E
B. Conditional Probability and the Multiplication Rule
a) Defining conditional probability
b) Defining multiplication rule
C. The Addition Rule
a) Mutually exclusive events

In units 2 and 3, you learned about collecting and describing data. In this unit, you will
strengthen your foundation in descriptive statistics by learning the concepts of probability.
In this unit, you will learn about various types of probability and the method of calculating
probability using various rules.

A. Basic Concepts of Probability

Probability refers to the likelihood of the occurrence of uncertain events.
A probability experiment is a trial through which specific results or outcomes are obtained.
An event consists of one or more outcomes and is a subset of the sample space. A simple event
has a single outcome, such as rolling a die and obtaining 4.
Classical or theoretical probability refers to the type of probability when each outcome in a
sample space is equally likely to occur.
The classical probability of occurrence of an event E is given by:
P(E) = (Number of outcomes in event E) / (Total number of outcomes in the sample space)
Empirical or statistical probability is based on observations obtained from probability
experiments.
The empirical probability that an event E will occur is given by:
P(E) = f / n
where,
f is the frequency of the event E occurring.
n is the total frequency of the experiment. n is sometimes denoted as Σf.
The law of large numbers
According to the law of large numbers, if an experiment is performed repeatedly, the empirical
probability of an event will be close to its theoretical or actual probability.
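
The law of large numbers can be illustrated by rolling a simulated die more and more times and comparing the empirical probability f / n with the classical probability of 1/6. This is a sketch only; the seed, the trial counts, and the function names are hypothetical.

```python
import random

def empirical_probability(event, outcomes):
    """P(E) = f / n: frequency of the event over the number of trials."""
    f = sum(1 for outcome in outcomes if event(outcome))
    n = len(outcomes)
    return f / n

def rolled_four(outcome):
    """The simple event 'the die shows 4'."""
    return outcome == 4

rng = random.Random(7)   # fixed seed for a repeatable run
classical = 1 / 6        # one favorable outcome out of six equally likely ones

for n in (10, 100, 1000, 100000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    p = empirical_probability(rolled_four, rolls)
    print(f"n = {n:6d}  empirical = {p:.4f}  classical = {classical:.4f}")
```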
Range of probabilities rule
According to this rule, the probability of an event E is always between 0 and 1. Mathematically, it
is expressed as: 0 ≤ P(E) ≤ 1
Subjective probability
It describes an individual's personal judgment about the likelihood of the occurrence of an event.
It is based on estimates, intuition, and educated guess.
Complement of an event E
It refers to the set of all outcomes in a sample space that are not included in an event E. It is
denoted as E’—pronounced E prime. The probability of the complement of an event E is
calculated as follows:
P(E’) = 1 – P(E)
B. Conditional Probability and the Multiplication Rule
Conditional probability is the probability of an event B occurring, given that another event A
has already occurred. It is denoted by P(B|A).
Two events are independent if the occurrence of one does not affect the probability of the
occurrence of the other. For example, getting a 2 after rolling a die and getting a 2 on the
next roll are independent events.
When two events A and B are independent, then:
P(B|A) = P(B) and P(A|B) = P(A)
In other words, if two events are independent, then
P(A and B) = P(A) · P(B)
Events that are not independent are dependent: the occurrence of one changes the probability of the other.
The multiplication rule is used to determine the probability of the occurrence of two events A
and B in sequence.
The formula for multiplication rule is represented as follows:
P(A and B) = P(A) · P(B|A). However, if the events are independent, this formula reduces to
P(A and B) = P(A) · P(B).
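
A short worked example may help separate the dependent and independent cases of the multiplication rule. The card and die scenarios below are hypothetical illustrations, and p_and is just a helper name.

```python
from fractions import Fraction

def p_and(p_a, p_b_given_a):
    """Multiplication rule: P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a

# Dependent events: draw two cards without replacement, both Queens.
p_first_queen = Fraction(4, 52)
p_second_queen_given_first = Fraction(3, 51)  # one Queen is already gone
p_two_queens = p_and(p_first_queen, p_second_queen_given_first)

# Independent events: roll a 2, then roll a 2 again.
# Here P(B|A) = P(B), so the rule reduces to P(A) * P(B).
p_two_twos = p_and(Fraction(1, 6), Fraction(1, 6))

print("P(two Queens) =", p_two_queens)
print("P(two 2s)     =", p_two_twos)
```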
C. Addition Rule
Two events are mutually exclusive if they have no outcomes in common. In other words, when
events are mutually exclusive, they cannot occur at the same time.
The addition rule is used to find the probability of occurrence of event A or B. Mathematically,
the addition rule is represented as follows:
P(A or B) = P(A) + P(B) – P(A and B), where P(A and B) is the probability of events A and B
occurring at the same time. If the events A and B are mutually exclusive P(A and B) = 0, and the
formula reduces to P(A or B) = P(A) + P(B).
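
The addition rule can be checked against a direct count on a standard 52-card deck. This is a minimal sketch; the events chosen (Queen, King, Heart) and the helper names are assumptions for illustration.

```python
from fractions import Fraction

suits = ["Spade", "Heart", "Diamond", "Club"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for suit in suits for rank in ranks]

def prob(event):
    """Classical probability of an event over the 52-card deck."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_queen(card):
    return card[0] == "Q"

def is_king(card):
    return card[0] == "K"

def is_heart(card):
    return card[1] == "Heart"

# Not mutually exclusive: the Queen of Hearts belongs to both events,
# so P(A and B) must be subtracted to avoid counting it twice.
p_queen_or_heart = (prob(is_queen) + prob(is_heart)
                    - prob(lambda c: is_queen(c) and is_heart(c)))
direct_count = prob(lambda c: is_queen(c) or is_heart(c))

# Mutually exclusive: no card is both a Queen and a King, so P(A and B) = 0.
p_queen_or_king = prob(is_queen) + prob(is_king)

print(p_queen_or_heart, direct_count, p_queen_or_king)
```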

Saturday, March 28, 2009

March 28, 2009 - Class Notes

A. Measures of Central Tendency

Terms such as “the most common” or “average” used in regular vocabulary refer to the typical or middle value of a data set. In descriptive statistics, this value is called a measure of central
tendency.

The three measures used most commonly to describe central tendency are mean, median, and
mode.

Mean (also called arithmetic average): The sum of the data entries divided by the number of
entries.

Median: The middle value of an ordered data set.

Outlier: A data entry that is “very different” from the other entries in the data set.

Mode: The data value that occurs most frequently in a data set.

While explaining mode, pay attention to the two special cases:
• No repeat entry
• Two entries that occur with the same highest frequency
Weighted mean: It calculates the mean of a data set by taking into consideration the weight
assigned to each data entry.
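
The measures defined above can be computed with Python's statistics module. This is a sketch with made-up data: the list below includes one outlier (40) to show how it pulls the mean but not the median, and weighted_mean is a hypothetical helper.

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 40]   # hypothetical data; 40 is an outlier

print("mean  :", statistics.mean(data))
print("median:", statistics.median(data))
print("modes :", statistics.multimode(data))

# Special cases for the mode:
# no repeated entry, so every value ties and there is no single mode
print(statistics.multimode([1, 2, 3]))
# two entries occur with the same highest frequency
print(statistics.multimode([1, 1, 2, 2, 3]))

def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(weighted_mean([80, 90], [3, 1]))
```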

If in a frequency distribution graph, the mean, median, and mode are equal and located on the
same value of the x-axis, the distribution is symmetric.

A distribution in which the mean, median, and mode are unequal is called a skewed distribution.

A distribution where the graph has a tail stretching to the left is called skewed left. In this
distribution, mean < median < mode. If the graph of the distribution has a tail stretching to the
right, the distribution is called skewed right. In this distribution, mode < median < mean.
Outliers can create a skewed distribution.
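
As a rough rule of thumb, the direction of skew can be guessed by comparing the mean and the median. The sketch below uses that comparison on three small made-up data sets; describe_shape is a hypothetical helper, and the comparison is a heuristic that may mislead on unusual data.

```python
import statistics

def describe_shape(data):
    """Guess the shape of a distribution by comparing mean and median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "skewed right"   # long tail on the right pulls the mean up
    if mean < median:
        return "skewed left"    # long tail on the left pulls the mean down
    return "symmetric"

right_tail = [1, 2, 2, 2, 3, 3, 4, 10]    # mode 2 < median 2.5 < mean 3.375
left_tail = [1, 7, 8, 8, 9, 9, 9, 10]     # mean 7.625 < median 8.5 < mode 9
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]    # mean = median = mode = 3

for name, data in [("right_tail", right_tail),
                   ("left_tail", left_tail),
                   ("balanced", balanced)]:
    print(name, "->", describe_shape(data))
```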

B. Measures of Variation

Range: The difference between the largest and the smallest data entries.

Deviation: The difference between a data entry x in a population and the population mean μ, or
the difference between a data entry x in a sample and the sample mean x̄.

Variance: A measure of the deviation of the population data set or sample data set from its
mean. Population variance is represented using the symbol σ², pronounced sigma squared.
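
Range, deviation, and variance can be worked through on a small data set. This is a sketch on made-up numbers; it assumes the usual definitions, in which the population variance divides the sum of squared deviations by n and the sample variance divides by n − 1.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set

data_range = max(data) - min(data)          # largest minus smallest entry
mean = statistics.mean(data)
deviations = [x - mean for x in data]       # each entry minus the mean

sum_of_squares = sum(d ** 2 for d in deviations)
population_variance = sum_of_squares / len(data)        # divide by n
sample_variance = sum_of_squares / (len(data) - 1)      # divide by n - 1

print("range:", data_range)
print("deviations:", deviations)
print("population variance:", population_variance)
print("sample variance:", round(sample_variance, 3))
```

The deviations always sum to zero, which is why they are squared before being averaged.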

March 28, 2009 - Descriptive Statistics (Part II)

Outline

A. Measures of Central Tendency
o Mean
o Median
o Mode
B. Measures of Variation
o Variance
o Standard deviation
C. Measures of Position
o Percentiles

March 28, 2009 - Take Home Quiz Assignment

You will have 45 minutes in class to complete the following assignment/quiz (11:45 to 12:30).

Title: Frequency Distributions and Their Graphs

Introduction: This set of exercises will help you read and construct
a frequency distribution and organize data using a graph.

Tasks:

Complete the following exercises from your textbook
Elementary Statistics—Picturing the World:

1. Section 2.1, Exercise #10, p. 43

2. Section 2.1, Exercise #24, p. 45

3. Section 2.2, Exercise #20, p. 58

GOOD LUCK!

March 28, 2009 - Using Excel

Excel has built-in functionalities that facilitate quick and easy graphing of histograms, pie
charts, and scatter plots.

Follow the instructions listed in Student Solution & Technology Manual,
pages 207–212, to construct a pie chart for the given data set.

The data represents the top seven American Kennel Club registrations, in thousands, in 2003.
(Source: American Kennel Club)

Breed                  Registrations (thousands)
Labrador Retriever     145
Golden Retriever       53
Beagle                 45
German Shepherd        44
Dachshund              39
(breed name missing)   38
(breed name missing)   34

March 28, 2009 - Histogram

The following link leads to an animation that illustrates the effect of a sample size and the
number of classes on a histogram:

http://media.pearsoncmg.com/ph/esm/esm_larson_statlet_questions_2e/Sample_Size_Histogram_Statlet/histogram.html

Experiment with the animation, using the:

• Horizontal scrollbar to change the sample size

• Vertical scrollbar to change the number of classes

Respond to this question: “How do you think the sample size and number of classes affect a histogram?”

March 28, 2009, Quick Quiz

Quiz - True or False

1. The midpoint of a class is the sum of its lower and upper limits.

2. The cumulative frequency of a class is the sample size divided by the frequency of the
class.

3. Relative frequency is the portion of the data that falls in that class.

Friday, March 20, 2009

March 21, 2009, Class Notes

A. Frequency Distributions

Classes or intervals are units used to group data entries.

A frequency distribution table shows the number of data entries—frequency (f)—in each
class.

The class width is the distance between the lower (or upper) limits of consecutive classes. It
can be estimated using the following formula:
Class width = Range / Number of classes
Range of a data set is the difference between the maximum and minimum data entries in the
set. Range is calculated as: Maximum data entry − Minimum data entry
The class boundary is the half-way point between two classes.

Midpoint = [(Lower limit) + (Upper limit)] / 2

Cumulative frequency of a class is the sum of the frequencies for that class and all the
previous classes.

Relative frequency is the percentage of the data in a particular class. It can be calculated
using the formula:

Relative frequency = f / n
where f is the class frequency and n is the sample size.
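
The pieces of a frequency distribution can be assembled in a few lines. This is a sketch under assumptions: the data set is made up, the data are whole numbers, and the class width is rounded up to the next whole number (a common textbook convention, assumed here).

```python
import math

data = [7, 10, 12, 15, 18, 21, 22, 25, 28, 30, 31, 35, 38, 40, 44]  # hypothetical
num_classes = 5

data_range = max(data) - min(data)
# Range / number of classes, rounded up to the next whole number (assumed).
class_width = math.floor(data_range / num_classes) + 1

rows = []
cumulative = 0
for i in range(num_classes):
    lower = min(data) + i * class_width
    upper = lower + class_width - 1          # whole-number data assumed
    f = sum(1 for x in data if lower <= x <= upper)
    cumulative += f                          # this class plus all previous ones
    rows.append({
        "class": f"{lower}-{upper}",
        "midpoint": (lower + upper) / 2,     # halfway between the class limits
        "frequency": f,
        "relative": f / len(data),           # f / n
        "cumulative": cumulative,
    })

for row in rows:
    print(f"{row['class']:>6}  mid={row['midpoint']:5.1f}  f={row['frequency']}  "
          f"rel={row['relative']:.3f}  cum={row['cumulative']}")
```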

B. Graphs and Displays

A histogram is a bar graph often used to display quantitative data. The horizontal scale
displays the class boundaries or midpoints. The vertical scale indicates the frequencies of each
class. In a histogram, the bars touch each other.

A pie chart is a circular graph divided into sectors that represent qualitative data categories,
such as colors, races, and genders. The area of each sector is proportional to the relative
frequency of each data category.

A scatter plot is a graph that represents the relationship between paired data, where each entry
in a data set corresponds to an entry in the second data set. The pair of data entries is shown as
a point or dot in the coordinate plane.

March 21, 2009, Descriptive Statistics Outline

KEY CONCEPTS THAT MUST BE COVERED IN CLASS

A. Frequency distributions
o Meaning of a Frequency distribution
o Constructing a Frequency distribution
B. Graphs and displays
o Histogram
o Pie chart
o Scatter Plot

March 21, 2009, Case Study #2

Assume that you are conducting a study to determine the number of years of education of the
teachers in your college. Here are some situations related to this study.

1. You randomly select two departments and survey each teacher in those departments.
Each department is a naturally occurring subdivision of a college, sharing similar
characteristics. All departments are not surveyed. Which of the sampling techniques will
you use in this situation?

2. Assume that for the survey, you select only the teachers who are instructing you in the
current semester. Which of the sampling techniques will you use in this situation?

3. You categorize the teachers according to their departments and then survey some teachers
in each department. Which of the sampling techniques will you use in this situation?

March 21, 2009 - Case Study #2, 80's Hair Bands! Class Activity

The following table lists the top five music bands according to the box office ranks and the ticket
prices for their concerts as of July 23, this year.
Rank   Band (Data Set #1)     Average Ticket Price (Data Set #2)
1      The Eagles             $104
2      Dave Matthews Band     $85
3      The Dixie Chicks       $68
4      Fleetwood Mac          $60
5      Cher                   $42
Questions:
1. Identify the level of measurement for the first data set—or the top five bands.
2. Identify the level of measurement for the second data set.

Thursday, March 12, 2009

March 14, 2009 - Case Study #1

Case 1:

A survey of 2,104 households in the United States found that 65% of them subscribe to cable
television. The survey also found that the households without Internet connection are twice as likely to subscribe to a daily newspaper as the households with Internet connection. The
households with Internet connection often get their news from Web sites instead of daily
newspapers.

Questions:

1. Identify the population and the sample for the survey.

2. Of the 2,104 households surveyed, how many households subscribe to cable television?

3. A follow-up survey of a sample of 1,200 U.S. households found that 360 households have
high-speed Internet connection. What does the number 360 represent?

4. Which statement in the given data represents the descriptive branch of statistics?

5. Which statement in the given data represents the inferential branch of statistics?

March 14, 2009 - Statistics T/F Exercise

TRUE OR FALSE (Post true or false for each question)

1. A statistic is a measure that describes a population characteristic.

2. The two main branches of statistics are population and sample.

3. Inferential statistics involves using a population to draw a conclusion about a
corresponding sample.

4. Data at the ordinal level is quantitative only.

5. Data at the ratio level cannot be put in order.

6. For data at the interval level, you cannot calculate meaningful differences between data entries.

7. Using a systematic sample guarantees that members of each group within a
population will be sampled.

8. A census is a count of part of a population.

9. Performing an experiment is the only way to collect reliable data.

March 14, 2009 - Overview of Statistics I

A. Overview of Statistics

Statistics is defined as the science of collecting, organizing, analyzing, and interpreting data in
order to make decisions. It uses mathematical formulas for analysis, and it involves the
understanding and interpretation of the results.
Statistics involves studying two types of data sets—population and sample.

Population:
• It is a collection of all outcomes, responses, measurements, and counts that are of interest.
• The numerical description of a characteristic of a population is called a parameter.
• A count or measure of an entire population is called a census.
Sample:
• A subset of a population is called a sample.
• The numerical description of a characteristic of a sample is called a statistic.
• A count or measure of a part of a population is called a sampling.

The study of statistics is divided into two major branches:

Descriptive statistics: The branch of statistics that involves organization, summarization, and
display of data

Inferential statistics: The branch of statistics that involves the use of a sample to draw
conclusions about a population

B. Data Classification

A clear understanding of the meaning of the term “data” is central to the study of statistics. Data
consists of information related to observations, counts, measurements, or responses.

Class Activity

Qualitative: It consists of attributes, labels, or nonnumeric entries. For example, gender—male
or female—refers to qualitative data.

Quantitative: It consists of numerical measurements or counts. For example, age—1, 2, 10, or
20—refers to quantitative data.

Another way to classify data is by its level of measurement, which determines the relevance of
statistical calculations. Using this categorization, we have nominal, ordinal, interval, and ratio
data.

C. Experimental Design

In the real world, statistical results can mislead or misrepresent the facts if the research
conducted does not use proper procedures. Therefore, while making a decision based on
statistical analysis, we should be aware of the process used to obtain the data and the potential
misuse of the data. Given here are some guidelines for designing a statistical study:

1. Identify the variables of interest and the population of the study.
2. Develop a detailed plan for collecting data.
3. Collect the data using any of the following methods:
i. Doing an observational study
ii. Performing an experiment
iii. Using a simulation
iv. Using a survey
4. Describe the data using appropriate descriptive statistics or graphs or both.
5. Interpret the data and make decisions about the population by using inferential statistics.
6. Identify any possible errors.

Surveys can be done by taking a census or using a sample. To ensure an accurate representation
of the population, appropriate sampling techniques should be used; otherwise, the results from
the study may be considered invalid. Some commonly used sampling techniques are random
sampling, stratified sampling, cluster sampling, systematic sampling, and convenience sampling.
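
Four of the sampling techniques named above can be sketched with Python's random module. Everything here is hypothetical: the population of 60 teachers, the four department names, the sample sizes, and the seed. Convenience sampling is left out because it involves no random selection.

```python
import random

rng = random.Random(381)   # fixed seed for a repeatable run

# Hypothetical population: 60 teachers spread evenly across 4 departments.
departments = ["Math", "English", "Science", "History"]
population = [(f"teacher_{i:02d}", departments[i % 4]) for i in range(60)]

# Random sample: every member has an equal chance of being selected.
random_sample = rng.sample(population, 8)

# Stratified sample: select some members from every department (stratum).
stratified_sample = []
for dept in departments:
    stratum = [p for p in population if p[1] == dept]
    stratified_sample.extend(rng.sample(stratum, 2))

# Cluster sample: select whole departments and survey everyone in them.
chosen_departments = rng.sample(departments, 2)
cluster_sample = [p for p in population if p[1] in chosen_departments]

# Systematic sample: random starting point, then every k-th member.
k = 10
start = rng.randrange(k)
systematic_sample = population[start::k]

print(len(random_sample), len(stratified_sample),
      len(cluster_sample), len(systematic_sample))
```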

HOMEWORK: DESIGN YOUR STUDY & POST FOR EACH DESIGN NUMBER

March 14, 2009 - Why Statistics

Discuss the fact that every day we are bombarded with data and information on several issues.
These include a variety of social, economic, and political issues.

A few examples are listed below. For each bullet point, discuss how and why statistics is important.

• Impact of violent TV programs on children

• Outsourcing and its effects

• Claims made by presidential candidates

• U.S. nuclear policy

March 14, 2009 - Introductions

Welcome to Statistics -- please introduce yourself to the other students and complete the following:

• Name

• Educational experience

• Program of study pursued at the ITT Technical Institute

• Work experience

• Expectations from the course

• Career aspirations

March 14, 2009 - Syllabus

SYLLABUS
Instructor: ________________________________________
Office hours: ________________________________________
Class hours: ________________________________________

COURSE DESCRIPTION
This course is designed to offer students the skills necessary to interpret and critically
evaluate statistics commonly used to describe, predict and evaluate data in an
information-driven environment. The focus is on the conceptual understanding of how
statistics can be used and on how to evaluate statistical data.

MAJOR INSTRUCTIONAL AREAS
1. Experimental Design and Collecting Data
2. Describing Data
3. Determining Probabilities
4. Probability Distributions
5. Confidence Testing
6. Hypothesis Testing
7. Correlation and Regression

COURSE OBJECTIVES
1. Explain the fundamentals of a statistical study.
2. Describe data sets and their measures in different forms.
3. Use statistics to conduct and summarize an observation that has both qualitative and
quantitative components.
4. Calculate probabilities by using counting principles.
5. Identify various discrete probability distributions and calculate corresponding
probabilities.
6. Interpret a normal distribution and make calculations using standard scores.
7. Construct confidence intervals and use them to interpret population means.
8. Formulate null and alternative hypotheses for claims made about population means.
9. Use an appropriate statistical technique to test a hypothesis.
10. Describe the linear association for a set of paired data.
11. Utilize the ITT Tech Virtual Library to enhance understanding of statistics.

Related SCANS Objectives
1. Comprehend and use efficient learning techniques to acquire and apply knowledge.
2. Create documents including graphs and flowcharts to illustrate point.
3. Communicate ideas to justify positions; persuade and convince others.
4. Retrieve and organize data from a variety of sources, including computerized
databases, reference books, books, and periodicals.
5. Demonstrate the ability to utilize both traditional and electronic library sources.
6. Acquire and evaluate relevant information, and organize, maintain, analyze, interpret,
communicate, and use applicable information.
7. Develop and reinforce critical thinking processes.
8. Participate cooperatively as a team member, teaching, learning from, and negotiating
with diverse members making a contribution to team success.
9. Identify need for data; select, retrieve, and analyze information; and communicate
results to others in written, graphic, and pictorial format.

TEACHING STRATEGIES
This section details a strategy to help you run this course.
In-class time will be utilized as follows:
1. Review of concepts covered in previous unit: This review is included in all
units so that there are periodic checkpoints for all work to be completed and
instructors can track progress and follow up immediately with students who
are not completing assignments on time. Review involves:
a. A discussion of homework assignments that students have completed
b. Graded quiz including a set of objective type questions (true/false and
MCQs)
2. Explanation and application of new concepts: EG381 has a dual focus—
building basic concepts/vocabulary and developing statistical skills. To
address these focal points, each unit involves:
a. Concept explanation: so that key concepts are covered in-class and
students are aware of the important formulas. To reinforce the
important concepts, concept explanation is followed by a review quiz
consisting of a set of 5–10 true/false statements.
b. Concept application: is done using the following tools:
i. A case study in most of the units. Each case study consists of
3–4 objective type questions, all based on a data set/scenario,
aimed at improving students’ calculation and interpretation
skills.

ii. Statlets from Course Compass: Course Compass provides about
seven statlets (Java applets) for this course. These Java applets
consist of a graphic display and corresponding input values that
can be changed in real time. Observing and commenting on the
changes enables development of students’ statistical analysis
skills. In most units, a statlet is included and students are
encouraged to share their interpretation/analysis of the statlet.
For two units where statlets are not present, the discussion will
be based on a static graphic.

iii. Excel hands-On Sessions (when applicable): The purpose of
this component is to help students gain familiarity in the use
of MS Excel for solving certain statistical problems. These
sessions will utilize the student technology manual to
demonstrate application of built-in functionalities and
graphing utilities of MS Excel. Students will be provided with
a data set so that they can practice along with the excel
session.

Homework assignments in this course take the form of:
1. Writing Assignments: primarily consisting of 3–4 problems from the
textbook with an objective to provide more objective and mathematically
based assignments to students.

2. Preparing for the graded quiz

The following graphic illustrates instructional strategy for this course.

Review
• Discuss the homework assignment from the previous unit and address students’ problems and concerns (15 minutes)
• Conduct a graded quiz to assess students’ understanding of the concepts covered in the previous unit (30 minutes)

In-Class Activities
• Concept explanation with a focus on key concepts (1 hour)
• Concept reinforcement using the review quiz (15-20 minutes)
• Concept application and data interpretation using case study (1–1.25 hours)
• Excel hands-on session* (0.75 hours)

Homework
• Writing assignment (2 hours)
• Preparation for the graded quiz (0.5-0.75 hours)

Note: Most units have been designed and developed consistently using a combination of
the components depicted above. A consistent organization is intended to help students
gain familiarity with content organization sooner and have clear expectations. Please feel
free to deviate from this strategy by using different examples or assignments based on
your teaching experiences and interactions with your class.

COURSE RESOURCES
Student Textbook Package
• Textbook: Larson, Ron, and Betsy Farber. Elementary Statistics: Picturing the World.
  Indianapolis: Pearson Custom Publishing, 2006.
• Solutions and Technology Manual: Larson, Ron, and Betsy Farber. Student’s Solutions
  & Technology Manual for Elementary Statistics: Picturing the World. Indianapolis:
  Pearson Custom Publishing, 2006.
• CD-ROMs:
  o Chapter quiz prep, videos, and data files to accompany Elementary Statistics:
    Picturing the World
  o Data files for use with Student’s Solutions & Technology Manual for Elementary
    Statistics: Picturing the World
References and Resources
ITT Tech Virtual Library
Log on to the ITT Tech Virtual Library at http://www.library.itt-tech.edu/ to access
online books, journals, and other reference resources selected to support ITT Tech
curricula.
• General References
  o Reference Resources > Statistics
  o Program Links > General Education/Technical Basics > Link Library > EG381
    Statistics
All links to web references outside of the ITT Tech Virtual Library are always subject to change
without prior notice.

EVALUATION & GRADING
COURSE REQUIREMENTS
1. Attendance and Participation
Regular attendance and participation are essential for satisfactory progress in this course.
2. Completed Assignments
Each student is responsible for completing all assignments on time.
3. Team Participation (if applicable)
Each student is responsible for participating in team assignments and for completing
their delegated task. Each team member must honestly evaluate the contributions of all
members of their team.

Evaluation Criteria Table

The final grade will be based on the following weighted categories:

CATEGORY               WEIGHT
Participation          10%
Writing Assignments    45%
Quizzes                30%
Final Exam             15%
Total                  100%
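
As an illustration of how the category weights combine, the following is a minimal Python sketch of the weighted-average calculation. The weights are taken from the table above; the function name `final_percentage` and the example student scores are hypothetical and are not part of the course materials.

```python
# Category weights copied from the evaluation criteria table.
WEIGHTS = {
    "Participation": 0.10,
    "Writing Assignments": 0.45,
    "Quizzes": 0.30,
    "Final Exam": 0.15,
}


def final_percentage(scores):
    """Return the weighted final percentage.

    `scores` maps each category name to the percentage (0-100)
    earned in that category.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    # Weighted sum: each category score times its weight.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "Participation": 100,
    "Writing Assignments": 80,
    "Quizzes": 70,
    "Final Exam": 90,
}
print(round(final_percentage(example), 1))  # 10 + 36 + 21 + 13.5 = 80.5
```

Because Writing Assignments carry 45% of the grade, a change of 10 points in that category moves the final percentage by 4.5 points, three times the effect of the same change on the Final Exam.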
Grade Conversion Table

Final grades will be calculated from the percentages earned in class as follows:

Grade   Percentage   Credit
A       90–100%      4.0
B+      85–89%       3.5
B       80–84%       3.0
C+      75–79%       2.5
C       70–74%       2.0
D+      65–69%       1.5
D       60–64%       1.0
F       <60%         0.0
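
The conversion table can be read as a simple lookup, sketched below in Python. The function name `letter_grade` is hypothetical. The table does not say how fractional percentages such as 89.5 are handled, so this sketch assumes the lower bound of each band is a strict cutoff with no rounding; the actual policy may differ.

```python
# (minimum percentage, letter grade, credit) rows from the grade
# conversion table, ordered from highest band to lowest.
GRADE_BANDS = [
    (90, "A", 4.0),
    (85, "B+", 3.5),
    (80, "B", 3.0),
    (75, "C+", 2.5),
    (70, "C", 2.0),
    (65, "D+", 1.5),
    (60, "D", 1.0),
]


def letter_grade(percentage):
    """Return (letter, credit) for a final percentage between 0 and 100.

    Assumption: no rounding is applied, so 89.5 falls in the B+ band.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for minimum, letter, credit in GRADE_BANDS:
        if percentage >= minimum:
            return letter, credit
    # Anything below the lowest band is a failing grade.
    return "F", 0.0


print(letter_grade(80.5))  # ('B', 3.0)
```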