PROBLEM 1
Self-efficacy is a general concept that measures how well we think we can control
different situations. A multimedia program designed to improve dietary behavior
among low-income women was evaluated by comparing women who were randomly assigned to
intervention and control groups. The intervention was a 30-minute session in a computer
kiosk in the Food Stamp office. In this study, the participants were asked, “How sure are
you that you can eat foods low in fat over the next month?” The response was measured on
a five-point scale with 1 corresponding to “not sure at all” and 5 corresponding to “very
sure.” Here is a summary of the self-efficacy scores obtained about 2 months after the
intervention:
PART A: Is it appropriate to use the two-sample t-procedures? Give reasons for your
answer. Choose one correct answer.
A. It is not appropriate to use the two-sample t-procedures since the distribution is not
Normal.
B. It is not appropriate to use the two-sample t-procedures since we cannot assess
Normality without knowing the detailed data.
C. The use of the t-procedures is appropriate, because the proximity of the means and
standard deviations show that the two distributions are similar.
D. Assume that there are no outliers. The t-procedures should be appropriate, because we
have two large samples.
PART B: Let μ1 denote the mean self-efficacy score for the “Control” group
and μ2 denote the mean self-efficacy score for the “Intervention” group.
If the research question is whether the data provide significant evidence that women
in the control group (no intervention) have lower self-efficacy scores, on average, than
women in the intervention group, which one of the following gives the appropriate null
hypothesis and alternative hypothesis?
A. H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0
B. H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 > 0
C. H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 < 0
D. H0: μ1 − μ2 ≤ 0 versus Ha: μ1 − μ2 > 0
PART C: Suppose that you are not certain whether the two populations of interest have
the same standard deviations. Calculate the standard error of x̄1 − x̄2 (denoted by SE_{x̄1−x̄2}).
ONLINE STAT 2000 ONLINE MODULE 8 REVIEW LESSONS 1‐3
PART D: By hand, calculate the appropriate test statistic associated with the hypotheses
set up in PART B. What are the degrees of freedom for the test statistic? Use the
conservative approach (i.e., the smaller of n1 − 1 and n2 − 1) to determine the associated
degrees of freedom.
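The summary table for this problem is not reproduced in this copy, so the numbers below are placeholders, not the study's values. As a minimal sketch of the calculation that PART C and PART D ask for, assuming the unpooled standard error sqrt(s1²/n1 + s2²/n2) and the conservative degrees of freedom min(n1, n2) − 1, a hypothetical helper `two_sample_t` might look like this:

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic with conservative degrees of freedom."""
    # Standard error of (x-bar1 - x-bar2) without assuming equal population SDs
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Test statistic for H0: mu1 - mu2 = 0
    t = (mean1 - mean2) / se
    # Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1
    df = min(n1, n2) - 1
    return se, t, df


# Placeholder summary statistics (hypothetical, NOT taken from the study)
se, t, df = two_sample_t(mean1=3.0, sd1=1.0, n1=100,
                         mean2=3.5, sd2=1.0, n2=100)
print(f"SE = {se:.4f}, t = {t:.3f}, df = {df}")
```

Substituting the group means, standard deviations, and sample sizes from the summary table gives the values to look up in the t table. For a "less than" alternative, the P-value is the area to the left of t.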
PART E: Draw a sketch of a t-distribution curve and mark the location of your test
statistic found in PART D. Shade the appropriate area that represents the P-value. Using
an appropriate statistical table, approximate the P-value.
PART F: At the significance level of 0.001, do the data provide significant evidence to
conclude that the population mean self-efficacy score of women participating in the
control group (no intervention) is less than the population mean self-efficacy score of
women participating in the intervention group? Justify your answer (i.e., how do you reach
your conclusion?).
PROBLEM 2
The study of 584 longleaf pine trees in the Wade Tract in Thomas County, Georgia, had
several purposes. Are trees in one part of the tract more or less like trees in any other
part of the tract or are there differences? In this exercise we will examine the sizes of
the trees, measured as diameter at breast height (DBH), by dividing the tract into
eastern and western halves and taking random samples of 30 trees from each half. Here are
the diameters in centimeters (cm) of the sampled trees:
The eastern distribution is right-skewed and the western distribution is left-skewed as
shown in each histogram below.
[Figure: two histograms of DBH, one for the West sample and one for the East sample; vertical axis: Percent.]
Below are the summary statistics for DBH in each half of the tract:
PART A: Is it appropriate to use the methods of this section to compare the mean DBH of
the trees in the east half of the tract with the mean DBH of trees in the west half? Give
reasons for your answer.
A. The t-methods are not appropriate because the two distributions are skewed.
B. The t-methods are not appropriate because the distributions of the northern and
southern halves are not similar.
C. The t-methods are appropriate because the sample sizes are relatively large even
though the two distributions are skewed, and there are no outliers in either
distribution.
D. The t-methods are appropriate because they are robust against all kinds of deviations
from Normality.
PART B: Suppose that we use a two-sided significance test to investigate
whether there is significant evidence that the mean DBH of the trees in the eastern
half of the tract is NOT the same as the mean DBH of the trees in the western half of the
tract.
What is (are) the most appropriate reason(s) for your choice of using the two-sided test?
Choose one correct answer.
A. The data shows us it is more likely that the western trees and the eastern trees have different DBHs.
B. We have no reason to expect a difference in a particular direction.
C. The results from the two-sided test give more information.
D. A and B are correct.
E. A, B and C are correct.
PART C: Suppose that we use a two-sided significance test to investigate
whether there is significant evidence that the mean DBH of the trees in the eastern
half of the tract is NOT the same as the mean DBH of the trees in the western half of the
tract.
Let μ1 denote the mean DBH of the trees in the west and μ2 denote the mean DBH of the
trees in the east.
State the appropriate null and alternative hypotheses in terms of μ1 and μ2.
PART D: By hand, calculate the appropriate test statistic associated with the hypotheses
set up in PART C. What are the degrees of freedom for the test statistic? Use the
conservative approach (i.e., the smaller of n1 − 1 and n2 − 1) to determine the associated
degrees of freedom.
PART E: Draw a sketch of a t-distribution curve and mark the location of your test
statistic found in PART D. Shade the appropriate area that represents the P-value. Using
an appropriate statistical table, approximate the P-value.
PART F: Which one of the following summarizes your conclusion based on the P-value
found in PART D and the appropriate hypotheses stated in PART C?
A. At the 5% level of significance, we have significant evidence to suggest that the mean DBH of the western trees is smaller than the mean DBH of the eastern trees.
B. At the 1% level of significance, we have significant evidence to suggest that the mean DBHs of the two populations are different.
C. At the 5% level of significance, we have significant evidence to suggest that the mean DBHs of the two populations are different.
MODULE 8: LESSON 2
PROBLEM 3
An industrial psychologist is investigating the effects of work environment on employee
attitudes. A group of 28 recently hired sales trainees were randomly assigned to one of 7
different “home rooms” – four trainees per room. Each room is identical except for wall
color, with 7 different colors used. The psychologist wants to know whether room color
has an effect on attitude, and, if so, wants to compare the mean attitudes of the
trainees assigned to the 7 room colors. At the end of the training program, the attitude
of each trainee was measured on a 100-pt. scale (the lower the score, the poorer the
attitude).
PART A: Identify the response variable in this study.
PART B: Identify the treatments for this study.
PART C: How many treatments are in this study?
PART D: How many replications are in each treatment group?
PART E: What is the total sample size in this study?
PROBLEM 4
In a completely randomized design experiment, 10 experimental units were randomly chosen
for each of three treatment groups and a quantity was measured for each unit within each
group. In the first steps of testing whether the means of the three groups are the same,
the sum of squares for treatments was calculated to be 3,110 and the sum of squares for
error was calculated to be 27,000.
PART A: Complete the ANOVA table. That is, find the values of a, b, c, d, e, f, g, h, and
i.
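With a treatment sum of squares of 3,110, an error sum of squares of 27,000, three groups, and 30 units in total, the remaining entries follow from df = k − 1 and n − k, MS = SS/df, and F = MST/MSE. A minimal sketch of that arithmetic follows. The function name and dictionary keys are illustrative only, and how the results map to the letters a through i depends on the table layout, which is not reproduced here.

```python
def one_way_anova_table(ss_treat, ss_error, k, n_total):
    """Fill in a one-way ANOVA table from the two sums of squares."""
    df_treat = k - 1            # treatments (between groups)
    df_error = n_total - k      # error (within groups)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "df_treat": df_treat,
        "df_error": df_error,
        "df_total": n_total - 1,
        "ss_total": ss_treat + ss_error,
        "ms_treat": ms_treat,
        "ms_error": ms_error,
        "F": ms_treat / ms_error,
    }


# Values stated in the problem: 3 groups of 10 units each
table = one_way_anova_table(ss_treat=3110, ss_error=27000, k=3, n_total=30)
for name, value in table.items():
    print(f"{name}: {value}")
```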
PART B: Which of the following is not a condition required for a valid One-Way ANOVA F-
test for a completely randomized experiment?
A) The sampled populations all have distributions that are approximately normal.
B) The samples are chosen from each population in an independent manner.
C) The variances of all the sampled populations are different.
PROBLEM 5
A partially completed ANOVA table for a completely randomized design is shown here.
PART A: Complete the ANOVA table.
PART B: How many treatments are involved in the experiment?
PART C: Do the data provide sufficient evidence to indicate a difference among the
population means? Test using α = .05.
PROBLEM 6
An industrial psychologist is investigating the effects of work environment on employee
attitudes. A group of 20 recently hired sales trainees were randomly assigned to one of
four different “home rooms” – five trainees per room. Each room is identical except for
wall color. The four colors used were light green, light blue, gray, and red. The
psychologist wants to know whether room color has an effect on attitude, and, if so,
wants to compare the mean attitudes of the trainees assigned to the four room colors. At
the end of the training program, the attitude of each trainee was measured on a 60-pt.
scale (the lower the score, the poorer the attitude). The data was subjected to a one-way
analysis of variance.
PART A: Give the null hypothesis for the one-way ANOVA F-test shown on the computer
output above.
A) H0: [expression missing], where the [symbol]'s represent mean attitudes for the four rooms
B) H0: [expression missing], where the [symbol]'s represent the proportion with the corresponding attitude
C) H0: [expression missing], where the [symbol]'s represent attitude means for the person in each room
D) H0: [expression missing], where the [symbol]'s represent the room colors
PART B: Using the one-way ANOVA table given above, identify the following values:
Sum of Squares for Groups (SSG)=
Sum of Squares for Error (SSE)=
Degrees of Freedom in the numerator (DFG)=
Degrees of Freedom in the denominator (DFE)=
Mean Squares for Groups (MSG)=
Mean Squares for Error (MSE)=
PROBLEM 7
Four different leadership styles used by Big-Six accountants were investigated. As part of
a designed study, 15 accountants were randomly selected from each of the four leadership
style groups (a total of 60 accountants). Each accountant was asked to rate the degree to
which their subordinates performed substandard field work on a 10-point scale called the
“substandard work scale”. The objective is to compare the mean substandard work scales of
the four leadership styles. The data on substandard work scales for all 60 observations
were subjected to an analysis of variance.
ONE-WAY ANOVA FOR SUBSTAND BY STYLE
SOURCE     DF         SS         MS        F       P
BETWEEN     3    2728.17    909.390    5.210   0.003
WITHIN     56    9774.63    174.547
TOTAL      59  12,502.80
Interpret the results of the one-way ANOVA F-test shown on the printout for α = 0.05.
A) At α = .05, there is sufficient evidence of differences among the substandard work scale means for the four leadership styles.
B) At α = .05, nothing can be said.
C) At α = .05, there is no evidence of interaction.
D) At α = .05, there is insufficient evidence of differences among the substandard work
scale means for the four leadership styles.
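The printout can be checked for internal consistency: each mean square should equal SS/DF, F should equal MS(between)/MS(within), and the decision at α = 0.05 comes from comparing the reported P-value with α. A short sketch using only the numbers on the printout (the variable names are illustrative):

```python
# Values copied from the ANOVA printout
df_between, ss_between = 3, 2728.17
df_within, ss_within = 56, 9774.63
ss_total_reported = 12502.80
p_value, alpha = 0.003, 0.05

# Mean squares and F statistic recomputed from SS and DF
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# The between and within sums of squares should add up to the total
ss_total = ss_between + ss_within

# Reject H0 (all four means equal) when the P-value is below alpha
reject_h0 = p_value < alpha

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
print("SS total:", round(ss_total, 2), "| reject H0:", reject_h0)
```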
PROBLEM 8
There have been numerous studies investigating the effects of restaurant ambience on
consumer behavior. A recent study investigated the effects of musical genre on consumer
spending. At a single high-end restaurant in England over a 3-week period, there were a
total of 141 participants; 49 of them were subjected to background pop music (for
example, Britney Spears, Culture Club, and Ricky Martin) while dining, 44 to background
classical music (for example, Vivaldi, Handel, and Strauss), and 48 to no background
music. For each participant, the total food bill, adjusted for time spent dining, was
recorded. The following table summarizes the means and standard deviations:
PART A: State a set of hypotheses that is appropriate for investigating the effects of
musical genre on consumer spending.
Let the following be denoted:
μC = the mean bill from the population subjected to background classical music.
μP = the mean bill from the population subjected to background pop music.
μN = the mean bill from the population subjected to no background music.
PART B: Is it reasonable to assume that the variances are equal in order to be pooled?
A. It is reasonable, because the sample variances are highly similar.
B. It is not reasonable, because the ratio of largest variance 11.10 to smallest variance
5.03 is greater than 2.
C. It is reasonable, because the ratio of largest standard deviation √(11.10/48) = 0.481 to
smallest standard deviation √(5.03/44) = 0.338 is less than 2.
D. It is reasonable, because the ratio √11.10 / √5.03 = 1.49 is less than 2.
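A common rule of thumb for pooling in one-way ANOVA compares the largest sample standard deviation with the smallest and looks for a ratio below 2. A small sketch of that check, assuming the largest and smallest sample variances are the 11.10 and 5.03 quoted in option B (the variable names are illustrative):

```python
import math

# Sample variances quoted in option B
largest_var, smallest_var = 11.10, 5.03

# The rule of thumb is stated for standard deviations, so take square roots
sd_ratio = math.sqrt(largest_var) / math.sqrt(smallest_var)
variance_ratio = largest_var / smallest_var

print(f"SD ratio = {sd_ratio:.2f}, variance ratio = {variance_ratio:.2f}")
print("SD ratio below 2:", sd_ratio < 2)
```

Note that the variance ratio is the square of the standard deviation ratio, so a threshold of 2 means different things for the two quantities.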
PART C: Give the numerator degrees of freedom and the denominator degrees of
freedom associated with the F-statistic for the hypotheses set up in PART A.
PART D: Statistical software computed a sum of squares for groups of 164.66 and a
sum of squares for error of 1,069.39. Calculate the F statistic for the hypotheses set
up in PART A.
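The degrees of freedom and the F statistic follow directly from the group sizes (49, 44, and 48) and the two sums of squares given above: DFG = k − 1, DFE = N − k, and F = (SSG/DFG) / (SSE/DFE). A minimal sketch, with illustrative variable names:

```python
# Group sizes stated in the problem
group_sizes = {"pop": 49, "classical": 44, "none": 48}

# Sums of squares reported by the software
ssg, sse = 164.66, 1069.39

k = len(group_sizes)                  # number of groups
n_total = sum(group_sizes.values())   # total number of participants

dfg = k - 1          # numerator degrees of freedom
dfe = n_total - k    # denominator degrees of freedom

msg = ssg / dfg      # mean square for groups
mse = sse / dfe      # mean square for error
f_stat = msg / mse

print(f"N = {n_total}, DFG = {dfg}, DFE = {dfe}")
print(f"MSG = {msg:.2f}, MSE = {mse:.3f}, F = {f_stat:.2f}")
```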
PART E: Given that the software reports a P-value of 0.00005, which of the
following is the most appropriate conclusion?
A. We have strong evidence that the means are all the same.
B. We have strong evidence that the means are not all the same.
C. We have strong evidence that the mean is higher for classical music.
D. We have strong evidence that the mean for classical music is different from the other two means.
MODULE 8: LESSON 3
PROBLEM 9
A study of 865 college students found that 42.5% had student loans. A single random
sample of 865 college students was selected from the approximately 30,000
undergraduates enrolled in a large public university. The overall purpose of the study
was to examine the effects of student loan burdens on the choice of a career. A student
with a large debt may be more likely to choose a field where starting salaries are high
so that the loan can more easily be repaid. The following table classifies the students
by field of study and whether or not they have a loan:
PART A: Identify the explanatory variable and the response variable in this study.
PART B: Suppose that we are investigating to determine whether there is significant
evidence that “Student loan” and “Field of study” are dependent. State an appropriate set
of hypotheses.
PART C: If it is true that “Student loan” and “Field of study” are independent, how
many students who study Science and who have student loans would you expect to see from
the sample?
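Under independence, the expected count in a cell is (row total × column total) / n, and each cell contributes (observed − expected)² / expected to the chi-square statistic. The two-way table for this study is not reproduced in this copy, so the sketch below uses a hypothetical 2 × 3 table purely to illustrate the arithmetic. The function names are illustrative as well.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_contributions(observed, expected):
    """Per-cell contributions (O - E)^2 / E to the chi-square statistic."""
    return [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
            for o_row, e_row in zip(observed, expected)]


# Hypothetical counts (NOT the study's data): 2 rows by 3 columns
observed = [[20, 30, 50],
            [30, 30, 40]]

expected = expected_counts(observed)
contributions = chi_square_contributions(observed, expected)
chi_square = sum(sum(row) for row in contributions)

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected:", expected)
print("chi-square:", round(chi_square, 3), "df:", df)
```

Replacing `observed` with the counts from the two-way table gives the expected counts and contributions that the later parts of this problem ask about.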
PART D: The partial Excel output from the MegaStat add-in for the above table is given
below. The output includes the observed cell counts (in yellow), the expected cell counts
(in green), and the contributions to the chi-square statistic (in orange). Some
expected cell counts and some contributions to the chi-square statistic are missing.
Recall that a contribution to the chi-square statistic is calculated by
(observed − expected)² / expected, which is represented by (O − E)² / E in the MegaStat
output below.
Answer the following questions:
(1) What is the expected cell count for students who study Science and who have student loans?
(2) What is the expected cell count for students who study Agriculture and who do not have student loans?
(3) Is the chi-square test appropriate for performing the significance test of the hypotheses in PART B? Why or why not?
(4) What is the value of the chi-square statistic?
(5) How many values of the response variable are there? That is, how many rows (excluding the total row) are there in the two-way table (given above PART A)?
(6) How many values of the explanatory variable are there? That is, how many columns (excluding the total column) are there in the two-way table (given above PART A)?
(7) What are the degrees of freedom for the chi-square statistic?
PART E: The area (in red on the right) in the right tail of the chi-square distribution
with the degrees of freedom found in PART D represents the P-value associated with the
chi-square statistic. Using Table F, approximate the P-value.
PART F: Using the significance level of 0.05, what do the results of the significance
test show? Choose one correct answer.
A. There is no significant evidence that having a loan and field of study are related.
B. There is significant evidence that having a loan and field of study are related.
C. There is significant evidence that having a loan and field of study are not related.
PROBLEM 10
Because statistical software plays such an important role in modern statistical
applications, many studies have encouraged the use of technology in statistics courses.
The Guidelines for the Assessment and Instruction in Statistics Education (GAISE) (Aliaga
et al., 2005) project was funded by the American Statistical Association to examine needs
for college level statistics courses. One of the six recommendations from GAISE is the
use of technology for developing conceptual understanding and analyzing data. Suppose a
survey was sent to 115 students at different universities across the United States to
assess the relationship between ease of learning the statistical software program SAS and
a student’s current level of SAS proficiency. The results of the survey are shown
below.
To assess the relationship between ease of learning the statistical software program SAS and a student's current level of SAS proficiency, the appropriate hypotheses are
H0: Ease of learning the statistical software program SAS and a student's current level of SAS proficiency are independent.
Ha: Ease of learning the statistical software program SAS and a student's current level of SAS proficiency are dependent.
PART A: If the null hypothesis is true, how many students who find SAS to be
somewhat easy to learn and who are somewhat proficient in SAS would you expect to see in
the sample?
PART B: The partial Excel output from the MegaStat add-in for the above table is given
below. The output includes the observed cell counts (in yellow), the expected cell counts
(in green), and the contributions to the chi-square statistic (in orange). Some
expected cell counts and some contributions to the chi-square statistic are missing.
Recall that a contribution to the chi-square statistic is calculated by
(observed − expected)² / expected, which is represented by (O − E)² / E in the MegaStat
output below.
Answer the following questions:
(1) What is the contribution to the chi-square statistic for students who find SAS to be
somewhat easy to learn and who are somewhat proficient?
(2) Which cell contributes most to the chi-square statistic?
(3) What is the value of the chi-square statistic?
PART C: Determine if the following statement is appropriate. Give your reasons.
At the 1% level of significance, there is sufficient evidence to suggest that ease of
learning the statistical software program SAS and a student’s current level of SAS
proficiency are dependent.
PROBLEM 11
Following complaints about the working conditions in some apparel factories both in the
United States and abroad, a joint government and industry commission recommended in 1998
that companies that monitor and enforce proper standards be allowed to display a “No
Sweat” label on their products. Does the presence of these labels influence consumer
behavior? A survey of U.S. residents aged 18 or older asked a series of questions about
how likely they would be to purchase a garment under various conditions. For some
conditions, it was stated that the garment had a “No Sweat” label; for others, there was
no mention of such a label. On the basis of the responses, each person was classified as
a “label user” or a “label nonuser.” There were 296 women surveyed. Of these, 63 were
label users. On the other hand, 27 of 251 men were classified as users.
PART A: Display the data in a two-way table. Use the columns for the values of the
explanatory variable in the study and the rows for the values of the response variable in
the study.
PART B
Suppose we want to perform an appropriate significance test to determine whether or not
there is a relationship between gender and use of No Sweat labels.
Give an appropriate set of hypotheses, an appropriate test statistic, its associated
degrees of freedom, and its associated P-value. Use TABLE F to approximate the P-value.
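The counts in the problem statement (63 of 296 women and 27 of 251 men classified as label users) are enough to build the two-way table and compute the chi-square statistic. The sketch below assumes the standard Pearson chi-square statistic without a continuity correction. It also uses the fact that, for one degree of freedom only, the right-tail area equals erfc(sqrt(x/2)), which gives an exact P-value to compare against the Table F approximation. The variable names are illustrative.

```python
import math

# Counts from the problem statement
women_total, women_users = 296, 63
men_total, men_users = 251, 27

# Rows: response (label user / nonuser); columns: explanatory (women / men)
observed = [
    [women_users, men_users],
    [women_total - women_users, men_total - men_users],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (O - E)^2 / E over all cells
chi_square = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        expected = r * c / n
        chi_square += (observed[i][j] - expected) ** 2 / expected

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print("observed:", observed)
print(f"chi-square = {chi_square:.3f}, df = {df}, P-value = {p_value:.5f}")
```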
PART C
Using the significance level of 0.0025, which of the following is the appropriate
conclusion?
A. There is no significant evidence that gender and label use are dependent.
B. There is significant evidence that gender and label use are dependent.
C. There is significant evidence that gender and label use are independent.

