Statistics assignment

Chs 1 – 3

1. A type of variable for which arithmetic operations do not make sense is called _______.

A) quantitative B) categorical C) distributions D) cases

2. When using a pie chart, the sum of all the percentages should be _____.

A) 0 B) 1 C) 100 D) 50

3. What method is useful when comparing two distributions using a stemplot?

A) Splitting the stem B) Trimming the leaves C) Back-to-back stemplots D) None of the above

4. The histogram at right shows data from 30 students who were asked,

“How much time do you spend on the Internet in minutes?” What are some features about the data? A) There is a potential outlier. B) Most values are around 800. C) The range of values is between 0 and 400. D) None of the above

5. In a statistics class with 136 students, the professor records how

much money each student has in their possession during the first class of the semester. The histogram shown below represents the data he collected. What is approximately the percentage of students with under $10 in their possession? A) 35% B) 40% C) 44% D) 50%

6. A study is being conducted on air quality at a small college in the

South. As part of this study, monitors were posted at every entrance to this college from 6:00 a.m. to 10:00 p.m. on a randomly chosen day. The monitors recorded the mode of transportation used by each person as they entered the campus. Based on the information recorded, the following bar graph was constructed. Approximately what percentage of people entering campus on this particular day arrived by car? A) 9% B) 31% C) 53% D) 62%

7. The Insurance Institute for Highway Safety publishes data on the total damage suffered by compact automobiles in a series of controlled, low-speed collisions. The cost for a sample of nine cars, in hundreds of dollars, is provided below

10 6 8 10 4 3.5 7.5 8 9

What is the median cost of the total damage suffered for this sample of cars? A) $400 B) $730 C) $800 D) $1000

8. What is the interquartile range of the above data?

A) $300 B) $350 C) $400 D) $450

9. In a statistics class with 136 students, the professor records how much money each student has in their possession during the first class of the semester. The histogram shown below represents the data he collected. From the histogram, which of the following is true? A) The mean is larger than the median. B) The mean is smaller than the median. C) The mean and median are approximately equal. D) It is impossible to compare the mean and median for these data.

10. The following boxplot is of the birth weights (in ounces) of 160 infants born

in a local hospital. About 40 of the birth weights were below A) 92 ounces B) 102 ounces C) 112 ounces D) 122 ounces

11. This is a standard deviation contest. Which of the following sets of four

numbers has the largest possible standard deviation? A) 7, 8, 9, 10 B) 5, 5, 5, 5 C) 0, 0, 10, 10 D) 0, 1, 2, 3
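As a quick check on the comparison in question 11, the sample standard deviation of each candidate set can be computed directly. This is only a sketch using Python's standard library; the dictionary name `sets` is introduced here for illustration.

```python
from statistics import stdev

# The four candidate sets from question 11.
sets = {
    "A": [7, 8, 9, 10],
    "B": [5, 5, 5, 5],
    "C": [0, 0, 10, 10],
    "D": [0, 1, 2, 3],
}

# Sample standard deviation (divisor n - 1) of each set.
spreads = {label: stdev(values) for label, values in sets.items()}
for label, s in spreads.items():
    print(f"{label}: s = {s:.3f}")
```

Sets whose values sit as far from their mean as possible produce the largest spread, regardless of where the mean itself lies.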

12. Agricultural fairs often hold competitions for produce grown by local gardeners. The following data are the weights (in pounds) of tomatoes entered into an annual fair in Roland, Manitoba, Canada, in 2007.

2.48, 1.52, 1.15, 1.13, 1.00, 0.99, 0.96, 0.94, 0.75

Apply the 1.5 × IQR rule to the data to check for outlier values. In this case, A) there are no outliers B) the value 0.75 is the only outlier C) the values 0.75 and 2.48 are both outliers D) the value 2.48 is the only outlier E) the values 1.52 and 2.48 are both outliers
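The 1.5 × IQR rule in question 12 can be sketched as follows. The quartile convention here (medians of the observations below and above the position of the overall median) is an assumption; statistical software often uses other conventions and may give slightly different fences. The helper name `quartiles` is hypothetical.

```python
from statistics import median

# Tomato weights (pounds) from question 12.
weights = [2.48, 1.52, 1.15, 1.13, 1.00, 0.99, 0.96, 0.94, 0.75]

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves,
    leaving out the middle observation when the count is odd."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])

q1, q3 = quartiles(weights)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Any observation outside the fences is flagged as a suspected outlier.
outliers = [w for w in weights if w < low_fence or w > high_fence]
print(f"Q1 = {q1:.3f}, Q3 = {q3:.3f}, IQR = {iqr:.3f}")
print(f"fences = ({low_fence:.4f}, {high_fence:.4f}), outliers = {outliers}")
```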

13. The number of Facebook friends students at a university have are Normally distributed with a mean

of 1200 and a standard deviation of 200. What percentage of students has exactly 1000 Facebook friends? A) 84.13% B) 15.86% C) 42.07% D) None of the above

[Scatterplot for question 19: volume versus height, with the height axis running from 60 to 90]

14. Many residents of suburban neighborhoods own more than one car but consider one of their cars to be the main family vehicle. The age of these family vehicles can be modeled by a Normal distribution with a mean of 2 years and a standard deviation of 6 months. What is the standardized value (Z score) for a family vehicle that is 3 years and 3 months old? A) 0.22 B) 2.5 C) 2.6 D) 2.92
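The standardization in question 14 only works once the age and the standard deviation are in the same unit. A minimal sketch, converting everything to years (the variable names are illustrative):

```python
mean_years = 2.0
sd_years = 6 / 12          # 6 months expressed in years
age_years = 3 + 3 / 12     # 3 years and 3 months expressed in years

# z = (observation - mean) / standard deviation
z = (age_years - mean_years) / sd_years
print(f"z = {z:.2f}")
```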

15. Using the standard Normal distribution tables, what is the area under the standard Normal curve

corresponding to Z< 1.1? A) 0.1357 B) 0.2704 C) 0.8413 D) 0.8643

16. Using the standard Normal distribution tables, what is the area under the standard Normal curve

corresponding to –0.5 <Z< 1.2? A) 0.3085 B) 0.8849 C) 0.5764 D) 0.2815

17. Chocolate bars produced by a certain machine are labeled with 8.0 ounces. The distribution of the actual weights of these chocolate bars is Normal with a mean of 8.1 ounces and a standard deviation of 0.1 ounces. A chocolate bar is considered underweight if it weighs less than 8.0 ounces. What proportion of chocolate bars weighs less than 8.0 ounces? A) 0.159 B) 0.341 C) 0.500 D) 0.841
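Table lookups such as those in questions 15 to 17 can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch of that check, not a replacement for the table method the questions ask for; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

area_below = Z.cdf(1.1)                     # P(Z < 1.1)
area_between = Z.cdf(1.2) - Z.cdf(-0.5)     # P(-0.5 < Z < 1.2)

# Chocolate bars: weights ~ Normal(mean 8.1 oz, sd 0.1 oz).
underweight = NormalDist(mu=8.1, sigma=0.1).cdf(8.0)   # P(weight < 8.0)

print(f"{area_below:.4f} {area_between:.4f} {underweight:.4f}")
```

Small differences from table values in the fourth decimal place come from rounding z to two decimals before the lookup.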

18. Which of the following statements about the standardized z-score of a value of a variable X, which has a mean of m and a standard deviation of s, is/are TRUE? A) The z-score has a mean equal to 0. B) The z-score has a standard deviation equal to 1. C) The z-score tells us how many standard deviation units the original observation falls away from the mean. D) The z-score tells us the direction the observation falls away from the mean. E) All of the above statements about the z-score are true.

19. A researcher measured the height (in feet) and volume of usable

lumber (in cubic feet) of 32 cherry trees. The goal is to determine if the volume of usable lumber can be estimated from the height of a tree. The results are plotted at right. Select all descriptions that apply to the scatterplot. A) There is a positive association between height and volume. B) There is a negative association between height and volume. C) There is an outlier in the plot. D) The plot is skewed to the left. E) Both A and C

20. John’s parents recorded his height at various ages between 36 and 66 months. Below is a record of the results:

Age (months)      36   48   54   60   66
Height (inches)   34   38   41   43   45

John’s parents decide to use the least-squares regression line of John’s height on age to predict his height at age 21 years (252 months). What conclusion can we draw? A) John’s height, in inches, should be about half his age, in months. B) The parents will get a fairly accurate estimate of his height at age 21 years, because the data are clearly correlated. C) Such a prediction could be misleading, because it involves extrapolation. D) All of the above.
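To see what the least-squares line in question 20 would actually predict, the slope and intercept can be computed from the five data points with the usual formulas. A minimal sketch (variable names are illustrative):

```python
ages = [36, 48, 54, 60, 66]        # months
heights = [34, 38, 41, 43, 45]     # inches

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n

# Least-squares slope = Sxy / Sxx; the line passes through (mean_x, mean_y).
sxx = sum((x - mean_x) ** 2 for x in ages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = intercept + slope * 252   # 252 months lies far outside 36 to 66
print(f"height = {intercept:.2f} + {slope:.4f} * age")
print(f"predicted height at 252 months: {predicted:.1f} inches")
```

The predicted value is well over nine feet, which illustrates how a line fitted on ages 36 to 66 months can behave far outside that range.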

Questions 21 – 23 Colorectal cancer (CRC) is the third most commonly diagnosed cancer among Americans (with nearly 147,000 new cases), and the third leading cause of cancer death (with over 50,000 deaths annually). Research was done to determine whether there is a link between obesity and CRC mortality rates among African Americans in the United States by county. Below are the results of a least-squares regression analysis from the software StatCrunch.

Simple linear regression results:
Dependent Variable: Mortality.rate
Independent Variable: Obesity.rate
Mortality.rate = 13.458199 – 0.21749489 Obesity.rate
Sample size: 3098
R (correlation coefficient) = –0.0067
R-sq = 4.5304943E-5
Estimate of error standard deviation: 111.20661

Parameter estimates:
Parameter   Estimate      Std. Err.    Alternative   DF     T-Stat        P-Value
Intercept   13.458199     15.9797735   ≠ 0           3096   0.84220207    0.3997
Slope       –0.21749489   0.5807189    ≠ 0           3096   –0.37452698   0.708

Analysis of variance table for regression model:
Source   DF     SS            MS          F-stat       P-value
Model    1      1734.7122     1734.7122   0.14027046   0.708
Error    3096   3.8287952E7   12366.91
Total    3097   3.8289688E7

21. What is the equation to predict mortality rates from obesity rates?

A) Mortality.rate = 13.458199 – 0.21749489 Obesity.rate B) Obesity.rate = 13.458199 – 0.21749489 Mortality.rate C) Mortality.rate = 13.458199 + 0.21749489 Obesity.rate D) Mortality.rate = 13.458199 – 0.0067 Obesity.rate

22. What fraction of the variation in mortality rates is explained by the least-squares regression?

A) 0.000045 B) 111.201 C) –0.0067 D) 13.45

23. A study of the salaries of full professors at a small university shows that the median salary for female professors is considerably less than the median male salary. Further investigation shows that the median salaries for male and female full professors are about the same in every department (English, physics, etc.) of the university. Which phenomenon explains the reversal in this example? A) extrapolation B) Simpson’s paradox C) causation D) correlation

Questions 24 – 26 Is age a good predictor of salary for CEOs? Sixty CEOs between the ages of 32 and 74 were asked their salary (in thousands). The results of a statistical analysis are shown below:

Simple linear regression results:
Dependent Variable: SALARY
Independent Variable: AGE
SALARY = 242.70212 + 3.1327114 AGE
Sample size: 59
R (correlation coefficient) = 0.1276
R-sq = 0.016270384
Estimate of error standard deviation: 220.64246

Parameter estimates:
Parameter   Estimate    Std. Err.   Alternative   DF   T-Stat      P-Value
Intercept   242.70212   168.7604    ≠ 0           57   1.4381461   0.1559
Slope       3.1327114   3.2264276   ≠ 0           57   0.9709536   0.3357

Analysis of variance table for regression model:
Source   DF   SS          MS          F-stat      P-value
Model    1    45896.027   45896.027   0.9427509   0.3357
Error    57   2774936.2   48683.094
Total    58   2820832.2

24. Suppose a CEO is 57 years old. What do you predict his/her salary to be?

A) over $400,000 B) between $100,000 and $400,000 C) under $100,000 D) None of the above.
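A prediction from the fitted line in the CEO output is a single substitution. The sketch below uses the reported coefficients; the function name `predict_salary` is hypothetical, and salary is in thousands of dollars as stated in the problem.

```python
INTERCEPT = 242.70212   # from the regression output
SLOPE = 3.1327114       # change in salary (thousands) per year of age

def predict_salary(age):
    """Predicted salary in thousands of dollars for a CEO of the given age."""
    return INTERCEPT + SLOPE * age

salary_57 = predict_salary(57)
print(f"predicted salary at age 57: ${salary_57 * 1000:,.0f}")
```

The sampled ages run from 32 to 74, so age 57 is inside the range of the data; a prediction at an age outside that range would be an extrapolation.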

25. Suppose you wanted to predict the salary of the CEO of Facebook, Mark Zuckerberg, based on the information here. How good do you think your prediction would be, assuming Mr. Zuckerberg was 23 when he started Facebook and became CEO? A) The prediction would be accurate and around $300,000. B) The prediction would require extrapolation and therefore would not be accurate. C) The prediction would be accurate and around $240,000. D) None of the above.

26. What are possible reasons for a correlation around 0.13 for the above data?

A) Age is a very strong predictor of CEO salary. B) Age is not a good predictor and something else may be a better predictor. C) There is not enough data to accurately estimate the correlation. D) The range of ages is too small.


27. Consider the scatterplot at right. What do we call the point indicated by the plotting symbol O? A) a residual B) influential C) a z-score

Questions 28 – 29 The 94 students in a statistics class are categorized by gender and by the year in school. The numbers obtained are displayed below:

                          Year in school
Gender   Freshman   Sophomore   Junior   Senior   Graduate   Total
Male     1          2           9        17       2          31
Female   23         17          13       7        3          63
Total    24         19          22       24       5          94

28. What proportion of the statistics students in this class are sophomores, given they are female?

A) 0.11 B) 0.202 C) 0.27 D) 19

29. What proportion of the statistics students in this class are male?

A) 0.065 B) 0.105 C) 0.33 D) 31

30. What is the best way to control for lurking variables?

A) Compare two or more treatments B) Randomize to assign experimental units to treatments C) Repeat each treatment on many units D) None of the above

Extra Credit

1. In order to determine if drinking from plastic water bottles causes cancer, researchers surveyed a large sample of adults. For each adult they recorded whether the person drank regularly from plastic water bottles at any period in their life and whether the person had cancer. They then compared the proportion of cancer cases in those who drank from plastic water bottles regularly at some time in their lives with the proportion of cases in those who never drank from plastic water bottles at any point in their lives. The researchers found a higher proportion of cancer cases among those who drank from plastic water bottles regularly than among those who never drank from plastic water bottles. What type of study is this? A) An observational study B) An experiment but not a double-blind experiment C) A double-blind experiment D) A block design

2. Which of the following best describes a simple random sample (SRS) of size n?

A) It is a random sample of size n selected so that everyone in the population has a known probability of being included in the sample.

B) It is a random sample of size n selected so that everyone in the population has the same chance of being included in the sample.

C) It is a probability sample of size n with known probabilities of selection.

D) It is a sample selected from the population in such a way that every set of n individuals has an equal chance of being in the sample actually selected.

E) It is a sample of n individuals selected in such a way that only chance determines who is included in the sample.

A common fear for incoming freshmen in college is the dreaded “freshman fifteen.” The combination of being in a new environment away from home, a high stress level, alcohol consumption, and eating dining hall food can cause weight gain in college students. A study examined weight gained during the first year of college and what factors contribute to it. A 27-question survey was sent to 252 students at over 50 universities in the United States. Questions included information on demographics, weight gain, diet, family relationships, etc. Ninety-five survey responses were received from students across 37 United States colleges and universities, with 32 respondents from Rose-Hulman Institute of Technology.

3. What is the sample in this study?

A) U.S. college students B) All college students C) The survey respondents D) The 50 universities

4. What is the response rate?

A) 50/252 B) 95/252 C) 32/50 D) 32/25

5. What type of sample is this?

A) Simple random sample B) Probability sample C) Stratified random sample D) Voluntary response sample

6. Does the survey suffer from nonresponse?

A) No, everyone chosen for the survey participated. B) No, this was an experiment so nonresponse is not an issue. C) Yes, because not everyone chosen for the survey participated. D) Yes, because the survey contained too many questions and it is likely participants did not answer all the questions.

7. What could you do to improve the study?

A) Increase the coverage of universities that were selected for the study. B) Conduct a matched-pairs design instead. Weigh students on the first day of class and at the end

of their freshman year. C) Follow up with students who did not respond to the study to improve the response rate. D) All of the above could be done to improve the study.

8. One of the questions asked was, “How much weight did you gain after your freshman year?” This

would be an example of ______. A) poor wording of a question because some students may have lost weight. B) response bias because the question does not allow for a valid response from students who lost

weight. C) voluntary response because the students can write whatever they want. D) only A and B. E) None of the above

Chs 4 – 5

1. A penny is tossed. We observe whether it lands heads up or tails up. Suppose the penny is a fair coin, i.e., the probability of heads is ½ and the probability of tails is ½. What does this mean?

A) Every occurrence of a head must be balanced by a tail in one of the next two or three tosses.

B) If the coin is tossed many, many times, the proportion of tosses that land heads will be approximately ½, and this proportion will tend to get closer and closer to ½ as the number of tosses increases.

C) Regardless of the number of flips, half will be heads and half tails. D) All of the above.

2. Which of the following is (are) appropriate statements about randomness and/or probability?

A) A phenomenon is called random if individual outcomes are uncertain, but in a large number of repetitions, there is a regular distribution of outcomes.

B) The word random in statistics is a description of a kind of order that emerges in the long run. C) Probability describes only what happens in the long run. D) In a small or moderate number of repetitions, the observed proportion of an outcome can be far

from the probability of the outcome. E) All of the above are appropriate statements.

3. Suppose a fair coin is flipped twice and the number of heads is counted. Which of the following is a valid probability model for the number of heads observed in two flips?

A) Number of heads   0   1   2
   Probability       ¼   ½   ½

B) Number of heads   0   1   2
   Probability       ⅓   ½   ⅓

C) Number of heads   0   1   2
   Probability       ¼   ¼   ¼

D) None of the above.

Use the following to answer questions 4–5: Ignoring twins and other multiple births, assume babies born at a hospital are independent events with the probability that a baby is a boy and the probability that a baby is a girl both equal to 0.5.

4. What is the probability that the next three babies are of the same sex?

A) 0.125 B) 0.375 C) 0.250 D) 0.500

5. Define event B = {at least one of the next two babies is a boy}. What is the probability of the complement of event B? A) 0.125 B) 0.375 C) 0.250 D) 0.500

6. The American Veterinary Association claims that the annual cost of medical care for dogs averages $100 with a standard deviation of $30. The cost for cats averages $120 with a standard deviation of $35. Some basic algebraic and statistical steps show us that the average of the difference in the cost of medical care for dogs and cats is then $100 – $120 = –$20. The standard deviation of that same difference equals $46. If the difference in costs follows a Normal distribution, what is the probability that the cost for someone’s dog is higher than for the cat? A) 0.2839 B) 0.3319 C) 0.6618 D) 0.7161
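The "basic algebraic and statistical steps" in question 6 can be sketched numerically. The sketch assumes the dog and cat costs are independent, which is what makes the variances add; the problem itself only states the resulting $46. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_dog, sd_dog = 100, 30
mean_cat, sd_cat = 120, 35

# Difference D = dog cost - cat cost.
mean_diff = mean_dog - mean_cat
sd_diff = sqrt(sd_dog**2 + sd_cat**2)   # assumes independence; about 46.1

# The question rounds the standard deviation to 46, so that value is used here.
p_dog_higher = 1 - NormalDist(mu=mean_diff, sigma=46).cdf(0)   # P(D > 0)
print(f"sd of difference = {sd_diff:.1f}, P(D > 0) = {p_dog_higher:.4f}")
```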

Use the following situation to answer questions 7–8: A study was conducted in a large population of adults concerning eyeglasses for correcting reading vision. Based on an examination by a qualified professional, the individuals were judged as to whether or not they needed to wear glasses for reading. In addition it was determined whether or not they were currently using glasses for reading. The following table provides the proportions found in the study:

                          Used glasses for reading
Judged to need glasses    Yes     No
Yes                       0.42    0.18
No                        0.04    0.36

7. If a single adult is selected at random from this large population, what is the probability that the adult is judged to need eyeglasses for reading? A) 0.46 B) 0.42 C) 0.78 D) 0.60

8. What is the probability that the selected adult is judged to need eyeglasses but does not use them for reading? A) 0.42 B) 0.18 C) 0.54 D) 0.60

9. Consider the following probability distribution for a discrete random variable X:

X          3      4      5      6      7
P(X = x)   0.15   0.10   0.20   0.25   0.30

What is P(X ≤ 5.5)? A) 0.45 B) 0.75 C) 0.20 D) 0

10. Customers arrive at a Vineyard Vines at an average of 15 per hour (0.25/min). What is the probability that the manager must wait at least 5 minutes for the first customer? A) 0.2865 B) 0.7135 C) 0.6836 D) 0.1232
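Question 10 is usually modeled with an exponential waiting time. The sketch below assumes customers arrive as a Poisson process at the stated rate, so that the wait T for the first customer satisfies P(T ≥ t) = e^(−rate·t); that modeling assumption is not spelled out in the question.

```python
from math import exp

rate_per_min = 0.25   # 15 customers per hour
wait_min = 5

# Survival function of the exponential distribution: P(T >= t) = exp(-rate * t)
p_wait_at_least = exp(-rate_per_min * wait_min)
print(f"P(wait >= {wait_min} min) = {p_wait_at_least:.4f}")
```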

Use the following to answer questions 11–14: Consider the following probability histogram for a discrete random variable X: number of hot dogs Capt Jim eats at a cookout.

11. This probability histogram corresponds to which of the following distributions for X?

A) Value of X    1      2      3      4      5
   Probability   0.06   0.25   0.38   0.25   0.06

B) Value of X    1      2      3      4      5
   Probability   0.10   0.25   0.30   0.20   0.15

C) Value of X    1      2      3      4      5
   Probability   0.10   0.25   0.30   0.25   0.10

D) None of the above.

12. What is P(X = 3)?

A) 0 B) 0.20 C) 0.25 D) 0.30

13. What is P(X < 3)?

A) 0.10 B) 0.25 C) 0.35 D) 0.65

14. What is P(X ≤ 3)?

A) 0.10 B) 0.25 C) 0.35 D) 0.65

15. Central Limit Theorem only applies when sampling from Normal populations.

A) True B) False

Use the following to answer questions 16–18: The Department of Animal Regulations released information on pet ownership for the population consisting of all households in a particular county. Let the random variable X = the number of licensed dogs per household. The distribution for the random variable X is given below:

Value of X    0      1      2      3      4      5
Probability   0.52   0.22   0.13   ____   0.03   0.01

16. The probability for X = 3 is missing. What is it?

A) 0.07 B) 0.09 C) 0.1 D) 0.0

17. What is the probability that a randomly selected household from this community owns at least one licensed dog? A) 0.22 B) 0.26 C) 0.48 D) 0.52

18. What is the average number of licensed dogs per household in this county?

A) 0 dogs B) 0.92 dogs C) 1 dog D) 1.22 dogs
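The three calculations in questions 16 to 18 all follow from the fact that the probabilities of a discrete distribution sum to 1. A minimal sketch (the dictionary names are illustrative):

```python
# Known probabilities for X = number of licensed dogs; X = 3 is missing.
known = {0: 0.52, 1: 0.22, 2: 0.13, 4: 0.03, 5: 0.01}

# The probabilities must sum to 1, which pins down the missing entry.
p_three = 1 - sum(known.values())
dist = {**known, 3: p_three}

p_at_least_one = 1 - dist[0]                       # complement of "no dogs"
mean_dogs = sum(x * p for x, p in dist.items())    # expected value of X

print(f"P(X = 3) = {p_three:.2f}")
print(f"P(X >= 1) = {p_at_least_one:.2f}")
print(f"mean = {mean_dogs:.2f} dogs")
```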

Use the following to answer questions 19–21: In a large city, 72% of the people are known to own a cell phone, 38% are known to own a pager, and 29% own both a cell phone and a pager.

19. What proportion of people in this large city own either a cell phone or a pager?

A) 0.29 B) 0.67 C) 0.81 D) 1.1

20. What is the probability that a randomly selected person from this city owns a pager, given that the person owns a cell phone? A) 0.266 B) 0.38 C) 0.403 D) 0.528

21. Are the events “owns a pager” and “owns a cell phone” independent?

A) Yes. B) No, because P(owns a pager) and P(owns a cell phone) are not equal. C) No, because P(owns a pager) and P(owns a pager|owns a cell phone) are not equal. D) Cannot be determined.
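The addition rule, conditional probability, and the independence check behind questions 19 to 21 fit in a few lines. A sketch with illustrative variable names:

```python
p_cell = 0.72
p_pager = 0.38
p_both = 0.29

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_cell + p_pager - p_both

# Conditional probability: P(pager | cell) = P(both) / P(cell)
p_pager_given_cell = p_both / p_cell

# Independence would require P(pager | cell) to equal P(pager).
independent = abs(p_pager_given_cell - p_pager) < 1e-9

print(f"P(either) = {p_either:.2f}")
print(f"P(pager | cell) = {p_pager_given_cell:.3f}, independent: {independent}")
```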

Use the following to answer questions 22–23: Chocolate bars produced by a certain machine are labeled 8.0 oz. The distribution of the actual weights of these chocolate bars is claimed to be Normal with a mean of 8.1 oz and a standard deviation of 0.1 oz.

22. A quality control manager initially plans to take a simple random sample of size n from the production line. If he were to double his sample size (to 2n), by what factor would the standard deviation of the sampling distribution of x̄ change?

A) 1/2 B) 1/√2 C) √2 D) 2

23. If the quality control manager takes a simple random sample of ten chocolate bars from the production line, what is the probability that the sample mean weight of the 10 sampled chocolate bars will be less than 8.0 oz? A) 0 B) 0.00078 C) 0.0316 D) 0.1587
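For questions 22 and 23, the key fact is that the sample mean of n observations from a Normal population is itself Normal with standard deviation σ/√n. A sketch under the claimed population values (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.1, 0.1, 10

# Standard deviation of the sample mean for n and for 2n observations.
sd_mean_n = sigma / sqrt(n)
sd_mean_2n = sigma / sqrt(2 * n)
factor = sd_mean_2n / sd_mean_n       # effect of doubling the sample size

# P(sample mean of 10 bars < 8.0 oz)
p_mean_below = NormalDist(mu=mu, sigma=sd_mean_n).cdf(8.0)

print(f"sd of sample mean (n = 10): {sd_mean_n:.4f}")
print(f"factor when n doubles: {factor:.4f}")
print(f"P(mean < 8.0) = {p_mean_below:.5f}")
```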

24. A sample of size n is selected at random from a population that has mean µ and standard deviation σ. The sample mean x̄ will be determined from the observations in the sample. Which of the following statements about the sample mean, x̄, is (are) TRUE?

A) The mean of x̄ is the same as the population mean, i.e., µ.

B) The variance of x̄ is σ²/n.

C) The standard deviation of x̄ decreases as the sample size grows larger. D) All of the above are true. E) Only A and B are true.

25. From the central limit theorem, we know that if we draw an SRS from any population the sampling distribution of the sample mean will be EXACTLY Normal. A) True B) False

26. The average batch of chocolate at Schakolad Chocolate Factory will be ready to serve in 1 hour. What is the chance the next batch will be ready in less than 50 min? A) 0.3681 B) 0.5654 C) 0.6836 D) 0.1232

27. The following histogram shows the distribution of 1000 sample observations from a population with mean µ = 4 and variance σ² = 8. Suppose a simple random sample of 100 observations is to be selected from the population and the sample average, x̄, calculated. Which of the following statements about the distribution of x̄ is (are) FALSE?

A) The distribution of x̄ will have a mean of 4.

B) The distribution of x̄ will be approximately Normal.

C) Because the distribution shown in the histogram above is clearly skewed to the right, the shape of the distribution of x̄ will also show skewness to the right.

D) Even though the distribution of the population variable appears to be skewed to the right, the distribution of x̄ will be approximately symmetric around µ = 4.

E) The standard deviation of the distribution of x̄ will be 0.283.

Use the following to answer questions 28–30: In a test of extrasensory perception (ESP), the experimenter looks at cards that are hidden from the subject. Each card contains either a star, a circle, a wavy line, or a square. An experimenter looks at each of 100 cards in turn, and the subject tries to read the experimenter’s mind and name the shape on each. A subject who is just guessing has probability 0.25 of guessing correctly on each card.

28. What is the probability that the subject gets more than 30 correct if the subject does not have ESP and is just guessing? (Use the continuity correction.) A) Less than 0.0001 B) 0.1038 C) 0.25 D) 0.31

29. What is the probability of the subject obtaining his/her first correct guess on the 4th question?

A) 0.1055 B) 0.8945 C) 0.4219 D) 0.5781

30. What is the probability of the subject obtaining his/her first correct guess within the first 4 questions?

A) 0.3164 B) 0.8945 C) 0.6836 D) 0.5781
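The ESP questions combine a Normal approximation to a binomial count with two geometric-type probabilities. The sketch below shows one way to set them up; the continuity correction treats "more than 30" as "30.5 or more" on the Normal scale. Results from a printed table can differ slightly in the third decimal place because z is rounded before lookup.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.25

# Normal approximation to Binomial(n, p): mean np, sd sqrt(np(1 - p)).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
p_more_than_30 = 1 - approx.cdf(30.5)     # continuity correction

# First correct guess exactly on the 4th card: three misses, then a hit.
p_first_on_4th = (1 - p) ** 3 * p

# First correct guess within the first 4 cards: not all four are misses.
p_within_4 = 1 - (1 - p) ** 4

print(f"{p_more_than_30:.4f} {p_first_on_4th:.4f} {p_within_4:.4f}")
```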

Extra Credit

1. What is the probability it will take exactly 6 rolls of two fair dice to make a 7?

A) 0.9330 B) 0.0670 C) 0.0804 D) 0.9196

2. Let X be a binomial random variable with distribution B(10, 0.6). What is the probability that X equals 8?

A) (0.6)⁸(0.4)²

B) (10!/8!)(0.6)⁸(0.4)²

C) 45(0.6)⁸(0.4)²

D) 45(0.6)²(0.4)⁸

E) None of the above.

3. A college basketball player makes 5/6 of her free throws. Assume free throws are independent. What is the probability that she makes exactly three of her next four free throws?

A) ( ) ( )1

6 14

C) ( ) ( )3

6 14

B) ( ) ( )1

6 1

D) ( ) ( )3

6 1

4. A batch of 100 chocolate chip cookies contains 10 burnt cookies. Five cookies are chosen at random, without replacement. Find the probability that the sample contains at least one burnt cookie. A) 0.3349 B) 0.6606 C) 0.4162 D) 0.5839
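Because the cookies are drawn without replacement, the count of burnt cookies follows a hypergeometric distribution, and "at least one" is easiest through the complement. A minimal sketch using `math.comb`:

```python
from math import comb

total, burnt, drawn = 100, 10, 5

# P(no burnt cookies): all 5 come from the 90 good cookies.
p_none = comb(total - burnt, drawn) / comb(total, drawn)
p_at_least_one = 1 - p_none

print(f"P(at least one burnt) = {p_at_least_one:.4f}")
```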

5. The exponential distribution is ______. A) symmetric B) bell-shaped C) All of the above D) None of the above

6. Although cities encourage carpooling to reduce traffic congestion, most vehicles carry only one person. For example, nationally 75.5% of the people drive to work alone.

a) If you choose 12 vehicles driving to work at random, what is the probability that more than half (that is, 7 or more) carry just one person?

b) If you choose 80 vehicles at random, what is the probability that more than half (that is, 41 or more) carry just one person?
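Both parts of the carpooling question are upper-tail binomial probabilities. The sketch below sums the exact binomial terms; it assumes the vehicles are chosen independently, each with probability 0.755 of carrying one person. The helper name `binom_upper_tail` is hypothetical.

```python
from math import comb

def binom_upper_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p), summed term by term."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

p_alone = 0.755
part_a = binom_upper_tail(12, p_alone, 7)     # 7 or more of 12
part_b = binom_upper_tail(80, p_alone, 41)    # 41 or more of 80

print(f"a) {part_a:.4f}")
print(f"b) {part_b:.6f}")
```

With 80 vehicles a Normal approximation is the usual hand method; the exact sum is shown here only as a cross-check.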

Chs 6 – 8

1. The lifetime (in hours) of a 60-watt light bulb is a random variable that has a Normal distribution with σ = 30. A random sample of 25 bulbs put on test produced a sample mean lifetime of x̄ = 1038. If in a study of the lifetime of 60-watt light bulbs it was desired to have a margin of error no larger than 6 hours with 99% confidence, how many randomly selected 60-watt light bulbs should be tested to achieve this result? A) 13 B) 97 C) 165 D) 42 E) Not within ±2 of any of the above.
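The sample-size calculation in question 1 solves m = z*·σ/√n for n and rounds up. A sketch using the exact Normal critical value (a table value of 2.576 gives the same result); variable names are illustrative:

```python
from math import ceil
from statistics import NormalDist

sigma = 30
margin = 6
confidence = 0.99

# Two-sided critical value z* with (1 - confidence)/2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# n = (z* * sigma / m)^2, rounded up so the margin is not exceeded.
n_required = ceil((z_star * sigma / margin) ** 2)
print(f"z* = {z_star:.3f}, n = {n_required}")
```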

Use the following to answer questions 2 and 3: A manufacturer of a specific part used in the operation of a gas turbine engine is concerned because the part is designated as critical-to-quality (CQT) and is costly to produce. The process that is used to produce the part has been studied extensively and has been shown to be stable and predictable for some lengthy period of time. When the process is in the stable state, the crucial measurement on the CQT part is known to be Normally distributed with a mean of µ = 1.58 cm and with a standard deviation of σ = 0.10 cm. In order to check on the status of the production process, a monitoring plan has been established, which requires that a sample of four manufactured parts should be selected at random each hour; if the mean x̄ of the sample exceeds 1.71 cm then the process is stopped, examined for possible problems, and repairs made to the process if needed.

2. If the process is operating as it should, what is the probability that this rule regarding x̄ results in a shutdown when in fact there is nothing wrong with the process (this is called a false alarm)? A) 0.0013 B) 0.0094 C) 0.0986 D) 0.0047 E) Not within ± 0.001 of any of the above.

3. If, in fact, the process has changed and the process mean has shifted to µ = 1.73, what is the probability that this rule regarding x̄ will fail to detect the shift? A) 0.3446 B) 0.6554 C) 0.9987 D) 0.0013 E) Not within ± 0.001 of any of the above.
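Both monitoring-rule probabilities come from the sampling distribution of the mean of four parts, which has standard deviation σ/√4. A sketch, assuming σ stays at 0.10 cm after the shift (the question does not say otherwise):

```python
from math import sqrt
from statistics import NormalDist

sigma, n, cutoff = 0.10, 4, 1.71
sd_mean = sigma / sqrt(n)

# False alarm: process mean is still 1.58 but the sample mean exceeds 1.71.
false_alarm = 1 - NormalDist(mu=1.58, sigma=sd_mean).cdf(cutoff)

# Missed shift: process mean is 1.73 but the sample mean stays at or below 1.71.
missed_shift = NormalDist(mu=1.73, sigma=sd_mean).cdf(cutoff)

print(f"false alarm = {false_alarm:.4f}, missed shift = {missed_shift:.4f}")
```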

4. In the last mayoral election in a large city, 47% of the adults over the age of 65 voted Republican. A

researcher wishes to determine if the proportion of adults over the age of 65 in the city who plan to vote Republican in the next mayoral election has changed. Let p represent the proportion of the population of all adults over the age of 65 in the city who plan to vote Republican in the next mayoral election. In terms of p, the researcher should test which of the following null and alternative hypotheses? A) Ho: p = 0.47 vs. Ha: p < 0.47 B) Ho: p = 0.47 vs. Ha: p ≠ 0.47 C) Ho: p = 0.47 vs. Ha: p > 0.47

Use the following to answer questions 5 and 6: The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures the motivation, attitude, and study habits of college students. Scores range from 0 to 200 and follow (approximately) a Normal distribution with mean 115 and standard deviation 25. You suspect that incoming freshmen at your school have a mean µ which is different from 115 because they are often excited yet anxious about entering college. To test your suspicion, you decide to test the hypotheses Ho: µ = 115 versus Ha: µ ≠ 115. You give the SSHA to 25 incoming freshmen and find their mean score to be 116.2.

5. What is the value of the test statistic?

A) z = 0.048 B) z = 0.24 C) z = 1.2 D) z = 1.96

6. What is the value of the p-value?

A) 0.1151 B) 0.2302 C) 0.4052 D) 0.8104
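The one-sample z statistic and two-sided p-value for the SSHA example can be sketched as follows (variable names are illustrative; σ = 25 is treated as known, as in the problem):

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 115, 25, 25
sample_mean = 116.2

# z = (sample mean - hypothesized mean) / (sigma / sqrt(n))
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Two-sided alternative: double the tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```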

Use the following to answer questions 7 and 8: The attention span of little kids (ages 3–5) is claimed to be Normally distributed with a mean of 15 minutes and a standard deviation of 4 minutes. A test is to be performed to decide if the average attention span of these kids is really this short or if it is longer. You decide to test the hypotheses Ho: µ = 15 versus Ha: µ > 15 at the 5% significance level. A sample of 10 children will watch a TV show they have never seen before, and the time until they walk away from the show will be recorded.

7. Fill in the blank. At a significance level of 5%, the decision rule would be to reject the null hypothesis if the observed sample mean is greater than _________ minutes. A) 15.66 B) 17.08 C) 17.48 D) 19

8. If, in fact, the true mean attention span of these kids is 18 minutes, what is the probability of a Type II error? A) 0.0107 B) 0.2335 C) 0.3405 D) 0.7665
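The rejection cutoff and the Type II error probability for the attention-span test can be sketched in two steps: find the sample mean that marks the upper 5% under Ho, then ask how often the sample mean falls below that cutoff when µ = 18. Table-based answers may differ in the third or fourth decimal because of rounding.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 15, 4, 10, 0.05
sd_mean = sigma / sqrt(n)

# Reject Ho when the sample mean exceeds this cutoff (one-sided test).
z_star = NormalDist().inv_cdf(1 - alpha)
cutoff = mu_0 + z_star * sd_mean

# Type II error: fail to reject (mean <= cutoff) when the true mean is 18.
beta = NormalDist(mu=18, sigma=sd_mean).cdf(cutoff)

print(f"cutoff = {cutoff:.2f} minutes, P(Type II error) = {beta:.4f}")
```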

9. A study has been completed involving a test of significance of the null hypothesis Ho: µ = 0. The

researchers have discovered that the power of the test is too small. What can the researchers try to do in order to increase the power of their test procedure? A) Increase the level of significance, α . B) Increase the sample size, n. C) Decrease the population standard deviation, σ . D) All of the above. E) None of the above; nothing can be done to increase the power.

Use the following to answer questions 10 and 11: The time needed for college students to complete a certain paper-and-pencil maze follows a Normal distribution with a mean of 30 seconds and a standard deviation of 3 seconds. You wish to see if the mean time µ is changed by vigorous exercise, so you have a group of nine college students exercise vigorously for 30 minutes and then complete the maze. Assume that σ remains unchanged at 3 seconds. The hypotheses you decide to test are Ho: µ = 30 versus Ha: µ ≠ 30.

10. Suppose it takes the nine students an average of x̄ = 32.05 seconds to complete the maze. At the 1% significance level, what can you conclude? A) Ho should be rejected because the p-value is less than 0.01. B) Ho should not be rejected because the p-value is greater than 0.01. C) Ha should be rejected because the p-value is less than 0.01. D) Ha should not be rejected because the p-value is greater than 0.01.

11. Suppose you compute the average time x̄ that it takes these students to complete the maze and you find that the results are significant at the 5% level. What can you conclude? A) The test would also be significant at the 10% level. B) The test would also be significant at the 1% level. C) Both of the above. D) None of the above.

Use the following to answer questions 12–13: A 95% confidence interval (using the conservative value for the degrees of freedom) for µ1 − µ2, based on two independent samples of sizes 18 and 20, respectively, gives us (45.6, 56.7).

12. What was the observed difference between the two sample means x̄1 and x̄2? A) 11.1 B) 45.6 C) 51.15 D) 56.7

13. What would be the margin of error for a 99% confidence interval for µ1 − µ2? A) 2.63 B) 2.898 C) 5.55 D) 7.62

14. When the sample size is very large, the corresponding t distribution is very close to the normal distribution. A) True B) False

Use the following to answer questions 15–18: You wish to compare the prices of apartments in two neighboring towns. You take a simple random sample of 12 apartments in town A and calculate the average price of these apartments. You repeat this for 15 apartments in town B. Let µ1 represent the true average price of apartments in town A and µ2 the average price in town B.

15. What would be the hypotheses for this problem?

A) Ho: µ1 = µ2 versus Ha: µ1 < µ2

B) Ho: µ1 = µ2 versus Ha: µ1 > µ2

C) Ho: µ1 = µ2 versus Ha: µ1 ≠ µ2

16. If we were to use the pooled t test, what would be the degrees of freedom?

A) 11 B) 12 C) 14 D) 25

17. If we were to use the unpooled t test, what would be the conservative estimate for the degrees of freedom? A) 11 B) 12 C) 14 D) 25

18. Suppose we were to use the unpooled t test with the conservative estimate for the degrees of freedom. The t statistic for comparing the mean prices is 2.1. What can we say about the value of the p-value? A) p-value < 0.01 B) 0.01 < p-value < 0.05 C) 0.05 < p-value < 0.10 D) p-value > 0.10

Use the following to answer questions 19 and 20: Ten couples are participating in a small study on cholesterol. Neither the man nor the woman in each couple is known to have any problems with high cholesterol. The researcher conducting the study wishes to use the t test for matched pairs to determine if there is evidence that the cholesterol level for the husband tends to be higher than the cholesterol level for the wife. The cholesterol measurements for the ten couples are given below:

Couple                   1    2    3    4    5    6    7    8    9   10
Husband’s cholesterol  224  310  266  332  244  178  280  276  242  260
Wife’s cholesterol     200  270  288  296  270  180  268  244  210  236

19. What are the hypotheses the researcher wishes to test?

A) Ho: µD = 0 versus Ha: µD > 0, where µD = the mean of the differences in cholesterol levels (Husband – Wife) for all couples without cholesterol problems.

B) Ho: p = ½ versus Ha: p ≠ ½, where p = the proportion of cholesterol levels of the husband that are higher than those of the wife.

C) Ho: p = ½ versus Ha: p < ½, where p = the proportion of cholesterol levels of the husband that are higher than those of the wife.

D) Ho: population median = 0 versus Ha: population median > 0, where the differences for which the median is calculated are measured as Husband – Wife.

20. What is the (approximate) value of the p-value?

A) 0.039 B) 0.055 C) 0.079 D) 0.172
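The matched-pairs t statistic for the cholesterol data can be computed directly from the Husband – Wife differences. The sketch below uses only Python's standard library, which has no t distribution, so it stops at the statistic and its degrees of freedom. The one-sided p-value would then be read from a t table or from statistical software.

```python
from math import sqrt
from statistics import mean, stdev

husband = [224, 310, 266, 332, 244, 178, 280, 276, 242, 260]
wife = [200, 270, 288, 296, 270, 180, 268, 244, 210, 236]

# Matched pairs: reduce each couple to a single difference (Husband - Wife).
diffs = [h - w for h, w in zip(husband, wife)]

d_bar = mean(diffs)          # mean difference
s_d = stdev(diffs)           # sample standard deviation of the differences
n = len(diffs)
t_stat = d_bar / (s_d / sqrt(n))
df = n - 1

print(f"mean difference = {d_bar}, s_d = {s_d:.2f}")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```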

Use the following to answer questions 21–24: A sportswriter wished to see if a football filled with helium travels farther, on average, than a football filled with air. To test this, the writer used 18 adult male volunteers. These volunteers were randomly divided into two groups of nine subjects each. Group 1 kicked a football filled with helium to the recommended pressure. Group 2 kicked a football filled with air to the recommended pressure. The mean yardage for Group 1 was x̄1 = 30 yards with a standard deviation s1 = 8 yards. The mean yardage for Group 2 was x̄2 = 26 yards with a standard deviation s2 = 6 yards. Assume that the two groups of kicks are independent. Let µ1 and µ2 represent the mean yardage we would observe for the entire population represented by the volunteers if all members of this population kicked, respectively, a helium-filled football and an air-filled football. Let σ1 and σ2 be the corresponding population standard deviations.

21. Assuming two-sample t procedures are safe to use, what is a 99% confidence interval for µ1 – µ2? (Use the conservative value for the degrees of freedom.) A) 4 ± 4.7 yards B) 4 ± 6.2 yards C) 4 ± 7.7 yards D) 4 ± 11.2 yards

22. Suppose we wish to test the hypothesis that the groups are equivalent in how variable their kicks are. To do this, we wish to test the hypotheses Ho: σ1 = σ2 versus Ha: σ1 ≠ σ2. Assume the distribution of the lengths of these kicks is Normal. What can we say about the value of the p-value? A) p-value < 0.025 B) 0.025 < p-value < 0.05 C) 0.05 < p-value < 0.10 D) p-value > 0.10

23. Based on the confidence interval for the difference in means and the test for equality of the standard deviations, we can draw a conclusion about the distribution of the lengths of the kicks with the two different kinds of footballs. Determine which of the following statements is true.

A) Both the means and standard deviations of the air and helium groups seem to be the same.

B) The means of the air and helium groups seem to be the same. However, the standard deviations seem to be different.

C) The standard deviations of the air and helium groups seem to be the same. However, the means seem to be different.

D) Both the means and standard deviations of the air and helium groups seem to be different from one another.

24. If we had used the more accurate software approximation to the degrees of freedom, what would be the number of degrees of freedom for the two-sample t procedures? A) 8 B) 9.374 C) 14 D) 14.837

Use the following to answer questions 25–27: There is substantial interest in the health benefits of the consumption of high amounts of fiber in diets. A market research team is interested in the public acceptance of a new high-fiber cereal (more than 8 gm of fiber per serving) that is to be marketed. To that end, the researchers selected a random sample of subjects from one region of the country. The selected subjects were provided with two bowls of cereal. One bowl contained the new cereal and the other bowl a well-known and popular cereal. The bowls were presented in random order and subjects asked which cereal they preferred. The study was repeated independently in a second region. In region 1, of the 400 subjects, 220 preferred the new cereal; in region 2, 195 of the 300 subjects indicated a preference for the new cereal.

25. The researchers wanted to test whether the proportions of consumers who preferred the new high-fiber cereal are the same or different in the two regions. What null and alternative hypotheses should they establish?

A) Ho: p1 = p2 against Ha: p1 > p2

B) Ho: p̂1 = p̂2 against Ha: p̂1 ≠ p̂2

C) Ho: p1 = p2 against Ha: p1 ≠ p2

D) Ho: p̂1 = p̂2 against Ha: p̂1 > p̂2

E) Ho: p1 − p2 = 0 against Ha: p1 − p2 = 0.5

26. What is the value of the test statistic?

A) t = −2.67 B) z = −2.70 C) z = −3.84 D) z = −2.66 E) t = −2.70

27. What is the p-value for this test?

A) 0.0077 B) 0.0035 C) 0.0070 D) 0.0038 E) < 0.0002
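A pooled two-proportion z test for the cereal data might look like the sketch below. It assumes the usual large-sample Normal approximation and a two-sided alternative. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for Ho: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under Ho
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = two_proportion_z_test(x1=220, n1=400, x2=195, n2=300)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```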

Use the following to answer questions 28 – 30: A study was conducted at the University of Waterloo on the impact characteristics of football helmets used in competitive high school programs. There were three types of helmets considered, classified according to liner type: suspension, padded-suspension, and padded. In the study, a measurement called the Gadd Severity Index (GSI) was obtained on each helmet using a standardized impact test. A helmet was deemed to have failed if the GSI was greater than 1200. Of the 81 helmets tested, 29 failed the GSI 1200 criterion.

28. Assume that the suspension helmets tested were selected at random. What are the point estimates of the proportion of suspension helmets that fail and the standard error of the estimate, respectively? A) 0.36; 0.0028 B) 0.64; 0.053 C) 0.36; 0.053 D) 0.64; 0.0028 E) 0.36; 0.089

29. Based on the sample results, what is the 90% confidence interval estimate for the true population proportion of suspension helmets that would fail the test? A) (0.304, 0.416) B) (0.256, 0.464) C) (0.213, 0.507) D) (0.272, 0.448) E) (0.553, 0.737)

30. If the test was to be conducted again, how many suspension-type helmets should be tested so that the margin of error does not exceed 0.05 with 95% confidence? A) 355 B) 20 C) 271 D) 250 E) 82

Extra Credit

1. The scores on the Wechsler Intelligence Scale for Children (WISC) are thought to be Normally distributed with a standard deviation of σ = 10. A simple random sample of 25 children is taken, and each is given the WISC. The mean of the 25 scores is x̄ = 104.32. Based on these data, what is a 95% confidence interval for µ? A) 104.32 ± 0.78 B) 104.32 ± 3.29 C) 104.32 ± 3.92 D) 104.32 ± 19.60

2. A sample of size n = 27 is used to conduct a significance test for Ho: µ = 75 versus Ha: µ > 75. The test statistic is t = 3.45. What are the degrees of freedom for this test statistic? A) 26 B) 27 C) 74 D) 75

3. A simple random sample of five female basketball players is selected. Their heights (in cm) are 170, 175, 169, 183, and 177. What is the standard error of the mean of these height measurements? A) 2.538 B) 2.837 C) 5.075 D) 5.675

Chs 9 – 12

For Questions 1 – 2 A sample of 50 male and 50 female infants was put on an experimental infant formula. After two weeks the parents of each infant were asked to fill out a questionnaire concerning infant satisfaction on the formula. One of the questions was “Did Baby Seem to Like the Formula?” with possible responses

1 – Like Very Much; 2 – Like Somewhat; 3 – Neutral; 4 – Dislike Somewhat; 5 – Dislike Very Much

The resulting data are presented below.

                         Gender
                     Male   Female
Like Very Much        16      14
Like Somewhat         13      18
Neutral               17      16
Dislike Somewhat       3       2
Dislike Very Much      1       0

1. This is a

a) 2 x 2 table. b) 2 x 5 table. c) 5 x 2 table. d) 5 x 5 table

2. The appropriate null hypothesis for this data is that

a) the distribution of parents’ responses on this question is the same for male and female infants. b) the distribution of gender is the same for each parents’ response to this question. c) gender and parents’ responses on this question are independent. d) gender and parents’ responses on this question are dependent

3. A study to compare two types of infant formula was run at two sites, one in Atlanta and the second in Denver. The study was run over a three-week period. Subjects at both sites were classified as dropouts if they left the study before the conclusion, or completers if they finished the study. The following table gives the number of dropouts and completers at each site. A chi-square test was performed and the result was X² = 5.101 with p-value = 0.024.

Responder   Dropout   Completer
Atlanta        16        134
Denver         21        379

The correct conclusion is a) we found evidence to suggest that Atlanta had a greater dropout rate. b) we found evidence to suggest that Denver had a greater dropout rate. c) any differences can be explained by sampling variability. d) there is no association between responder and dropout rate.

For Questions 4 – 6 A study was performed to examine the personal goals of children in grades 4, 5, and 6. A random sample of students was selected for each of the grades from schools in Georgia. The students received a questionnaire regarding personal goals. They were asked what they would most like to do at school: make good grades, be popular, or be good at sports. Results are presented in the table below by the sex of the child.

         Make good grades   Be popular   Be good in sports
Boys            96              32               94
Girls          295              45               40

4. The proportion of boys who chose the goal “be good in sports” and the proportion of girls who chose the goal “be good in sports” are a) proportion of boys = 0.42, proportion of girls = 0.07. b) proportion of boys = 0.70, proportion of girls = 0.30. c) proportion of boys = 0.16, proportion of girls = 0.07. d) proportion of boys = 0.42, proportion of girls = 0.11.

5. Suppose we wish to test the null hypothesis that there are no differences among the proportion of boys and the proportion of girls choosing each of the three personal goals. Under the null hypothesis, the expected number of boys that would select “be good in sports” is a) 49.4 b) 67 c) 74 d) 33

6. Suppose we wish to test the null hypothesis that there are no differences among the proportion of boys and the proportion of girls choosing each of the three personal goals. The value of the chi-square statistic X² is

a) 0.2893 b) 1.2644 c) 90.0266 d) 45.4335
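The expected counts and the chi-square statistic for the personal-goals table follow from the row and column totals. The sketch below is a plain-Python illustration with hypothetical variable names. Its result may differ slightly from a tabulated answer choice, depending on how intermediate values were rounded.

```python
observed = {
    "Boys": [96, 32, 94],
    "Girls": [295, 45, 40],
}
goals = ["Make good grades", "Be popular", "Be good in sports"]

row_totals = {sex: sum(counts) for sex, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count under Ho = (row total * column total) / grand total.
expected = {
    sex: [row_totals[sex] * col / grand_total for col in col_totals]
    for sex in observed
}

chi_square = sum(
    (obs - exp) ** 2 / exp
    for sex in observed
    for obs, exp in zip(observed[sex], expected[sex])
)
df = (len(observed) - 1) * (len(goals) - 1)

print(f"expected boys choosing sports: {expected['Boys'][2]:.1f}")
print(f"chi-square = {chi_square:.2f} with {df} degrees of freedom")
```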

For Questions 7 – 13 At what age do babies learn to crawl? Does it take longer to learn in the winter when babies are often bundled in clothes that restrict their movement? Data were collected at the University of Denver Infant Study Center where parents and their babies participated in one of a number of experiments between 1988 and 1991. Parents reported the age (in weeks) at which their child was first able to creep or crawl a distance of four feet within one minute. The researchers also recorded the average outdoor temperature (in °F) six months after each baby’s birthdate. For each month of the year, the researchers selected one baby, at random, born in that month. If we fit the least-squares line to the 12 data points (one for each month) we obtain the following results from a software package. Notice that temperature is taken as the explanatory variable and crawling age as the response.

s = 1.319

Variable      Parameter estimate   Standard error of estimate
Intercept     35.6781              1.318
Temperature   -0.077739            0.0251

Here is a scatterplot of average crawling age versus average outdoor temperature six months after birth followed by a plot of the residuals versus average outdoor temperature six months after birth.

Suppose the researchers test the hypotheses

Ho: the slope of the least-squares regression line = 0

Ha: the slope of the least-squares regression line ≠ 0.

7. The explanatory variable in this study is

a) crawling age

b) the age (in weeks) at which a baby was first able to creep or crawl a distance of four feet within one minute.

c) the extent to which parents honestly reported the age at which their baby was first able to crawl and didn’t exaggerate in order to make their baby appear gifted.

d) the average outdoor temperature six months after a baby’s birthdate

8. The slope of the least-squares regression line is (approximately)

a) 35.68. b) 1.32. c) -0.08. d) -0.80

9. The quantity s = 1.319 is an estimate of the standard deviation of the deviations in the simple linear regression model. The degrees of freedom for s² are

a) 1.74. b) 10. c) 11. d) 12

10. The value of the t statistic for this test is

a) -0.06. b) -3.10. c) 27.07. d) 3.10

11. Which of the following statements is supported by these plots?

a) There is no striking evidence in these plots that the assumptions for regression are violated. b) There is evidence in these plots that the assumptions for regression are violated. c) There is an influential observation in the plot, which should be deleted. d) There is an outlier in the plots suggesting that our above results must be interpreted with caution.

12. A 90% confidence interval for the slope of the least-squares regression line is (approximately) a) -0.078 ± 0.041. b) -0.078 ± 0.045. c) -0.078 ± 0.056. d) -0.078 ± 0.059.

13. Suppose we wish to determine the mean crawling age for all babies born when the average outdoor temperature is 25°F six months after birth. We use computer software to do the prediction and obtain the following output.

Temp.   Predict   Stdev. Mean Predict   95% C.I. for Mean Predict   95% Predict Interval
25° F   33.735    0.739                 (32.087, 35.382)            (30.364, 37.105)

A 95% interval for mean crawling age is a) (32.087, 35.382). b) (30.364, 37.105). c) 33.735 ± 0.739. d) 33.735 ± 6.741

For Questions 14 – 16 Is there a relationship between brain size and intelligence? The Full Scale IQ scores (FSIQ) and brain sizes (in pixels, as measured by MRI scans) of 39 subjects were measured. Researchers wished to study the relationship between FSIQ and brain size, using brain size to predict FSIQ. However, the researchers believed that brain size is also dependent on body size and that some adjustment for body size might be necessary in order to understand the relation between brain size and intelligence. Therefore, the researchers also measured the heights (in inches) of the 39 subjects and used height as a measure of body size. They then used a multiple regression model to predict FSIQ from brain size and height. They obtained the following results.

Analysis of Variance

Source   df   Sum of squares   Mean Squares   F
Model     2       5861.7
Error    36      15805.2
Total    38      21666.9

Variable     Parameter estimate   Standard error
Intercept    117.22               59.09
Brain size   0.00020957           0.00005816
Height       -2.824               1.065

s = 20.95, R² = 0.271

The researchers assume that the statistical model for the relation between FSIQ, brain size, and height is the multiple linear regression model

FSIQ_i = β0 + β1 (brain size)_i + β2 (height)_i + ε_i

for i = 1, 2,…, 39. The deviations ε_i are assumed to be independent and normally distributed with mean 0 and standard deviation σ.

Complete the above ANOVA Table and then answer questions 14 – 16.

14. One of the subjects had FSIQ = 130, brain size = 866,662, and height = 66.5. The residual for this subject is a) 18.95. b) 20.95. c) 111.05. d) 5.35.

15. The p-value of the analysis of variance F test of the hypothesis Ho: β1 = β2 = 0 is a) less than 0.01. b) between 0.01 and 0.05. c) between 0.05 and 0.10 d) greater than 0.10.
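Completing the multiple-regression ANOVA table and computing a residual involve only arithmetic on the reported sums of squares and coefficients. The sketch below shows one way to organize it. The function name `predict_fsiq` is hypothetical, and the F statistic would still need to be compared with an F(2, 36) table to bracket the p-value.

```python
from math import sqrt

# Sums of squares and degrees of freedom from the ANOVA table.
ss_model, ss_error, ss_total = 5861.7, 15805.2, 21666.9
df_model, df_error = 2, 36

ms_model = ss_model / df_model     # mean square for the model
ms_error = ss_error / df_error     # mean square error, estimates sigma^2
f_stat = ms_model / ms_error
s = sqrt(ms_error)                 # regression standard error
r_squared = ss_model / ss_total

# Fitted coefficients from the parameter-estimate table.
b0, b_brain, b_height = 117.22, 0.00020957, -2.824

def predict_fsiq(brain_size, height):
    """Fitted FSIQ for a given brain size (pixels) and height (inches)."""
    return b0 + b_brain * brain_size + b_height * height

# Residual = observed response minus fitted value.
residual = 130 - predict_fsiq(brain_size=866_662, height=66.5)

print(f"MSM = {ms_model:.2f}, MSE = {ms_error:.2f}, F = {f_stat:.2f}")
print(f"s = {s:.2f}, R^2 = {r_squared:.3f}, residual = {residual:.2f}")
```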

16. Based on the above results, we may conclude that

a) the proportion of the variation of FSIQ that is explained by brain size in a multiple linear regression is 0.271.

b) the proportion of the variation of FSIQ that is explained by the variables brain size and height in a multiple linear regression is 0.271.

c) the proportion of the variation of FSIQ that is explained by the variables brain size and height in a multiple linear regression is 0.52.

d) None of the above

For Questions 17 – 20 A manufacturer of infant formula is running an experiment using the standard (control) formula, and two new formulas, A and B. The goal is to boost the immune system in infants. 120 infants in the study are randomly assigned to each of three groups: group A, group B, and a control group. There are 40 infants per group, and the study is run for 12 weeks. At the end of the study the variable measured is total IGA (in mg per dl), with higher values being more desirable. We are going to run a one-way ANOVA on these data. A partial ANOVA table is given below.

One-Way Analysis of Variance

Analysis of Variance

Source DF SS MS F p

Group

Error 0.00252

Total 0.31841

17. Complete the ANOVA table above. What is the value of the F-statistic? a) 5.469 b) 0.002 c) 0.118 d) 4.679

18. The hypotheses tested by the one-way ANOVA F test are

a) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is higher for both treatment groups A and B than the control group.

b) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is higher for at least one of the two treatment groups than the control group.

c) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is not the same for all three formulas.

d) Ho: the mean IGA score is the same for all three formulas. Ha: the mean IGA score is lower for at least one of the two treatment groups than the control group.

19. The mean square for groups in this table is a) 0.00786. b) 0.01179. c) 0.02357. d) 0.31841

20. The p-value for testing

Ho: the mean IGA score is the same for all three formulas
Ha: the mean IGA score is not the same for all three formulas

a) is less than 0.001. b) is between 0.001 and 0.025. c) is between 0.05 and 0.1 d) is greater than 0.10.
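A partial one-way ANOVA table can be completed from the design alone. The sketch below assumes, because the flattened table does not show column positions, that 0.00252 is the mean square for error and 0.31841 is the total sum of squares. If those entries belong to other columns, the numbers change accordingly.

```python
# Design: 3 groups of 40 infants each.
n_groups, n_per_group = 3, 40
n_total = n_groups * n_per_group

# Assumed readings of the two entries in the partial table (see lead-in).
ms_error = 0.00252
ss_total = 0.31841

df_group = n_groups - 1        # degrees of freedom between groups
df_error = n_total - n_groups  # degrees of freedom within groups
df_total = n_total - 1

ss_error = ms_error * df_error
ss_group = ss_total - ss_error  # sums of squares add up to the total
ms_group = ss_group / df_group
f_stat = ms_group / ms_error

print(f"Group  DF={df_group:<4} SS={ss_group:.5f}  MS={ms_group:.5f}  F={f_stat:.3f}")
print(f"Error  DF={df_error:<4} SS={ss_error:.5f}  MS={ms_error:.5f}")
print(f"Total  DF={df_total:<4} SS={ss_total:.5f}")
```

The resulting F statistic would then be compared with an F distribution on (2, 117) degrees of freedom, using a table or software.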

Extra Credit

Use the following to answer questions 1–5: A study was conducted to monitor the emissions of a noxious substance from a chemical plant and the concentration of the chemical at a location in close proximity to the plant at various times throughout the year. A total of 14 measurements were made. Computer output for the simple linear regression least-squares fit is provided (some entries have been omitted and replaced with *****):

Linear Fit

Concentration = 1.5429211 + 1.8247687 Emissions

Summary of Fit

RSquare                      0.793919
RSquare Adj                  0.776745
Root Mean Square Error       1.513979
Mean of Response             8.810714
Observations (or Sum Wgts)   14

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      **   105.96390        *********     46.229    <.0001
Error      **   ***********      *********
C. Total   **   133.46949

Parameter Estimates

Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   1.5429211   1.142937    ****      0.2019
Emissions   1.8247687   0.268379    ****      <.0001

1. The degrees of freedom for SSM and SSE are, respectively:

a) DFM = 2, DFE = 12. b) DFM = 1, DFE = 12. c) DFM = 1, DFE = 13. d) DFM = 1, DFE = 14.

2. What is the value for the SSE?

a) 27.50559 b) 10.26688 c) 1.142937 d) 2.292

3. What is the estimate of σ²? a) 1.514 b) 1.143 c) 0.794 d) 2.292

4. What is the test statistic and its value to test Ho: β1 = 0 against Ha: β1 ≠ 0? a) F = 46.2294 b) t = 6.80 c) t = 1.35 d) Either a or b.

5. What is the 95% confidence interval estimate for β0? A) (1.24, 2.41) B) (-0.49, 3.58) C) (-0.95, 4.03) D) (1.35, 2.39)
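The omitted entries in the regression output can be recovered from the values that are shown. The sketch below is one way to do that in plain Python. The critical value 2.179 is the standard table value of t* for 95% confidence with 12 degrees of freedom and is hard-coded here as an assumption, because the standard library has no t distribution.

```python
from math import sqrt

n = 14
ss_model, ss_total = 105.96390, 133.46949

df_model = 1                # one explanatory variable
df_error = n - 2            # simple linear regression
ss_error = ss_total - ss_model
ms_error = ss_error / df_error  # estimate of sigma^2
f_ratio = (ss_model / df_model) / ms_error

# t ratios are estimate / standard error.
b0, se_b0 = 1.5429211, 1.142937
b1, se_b1 = 1.8247687, 0.268379
t_intercept = b0 / se_b0
t_slope = b1 / se_b1        # in simple regression, t_slope**2 equals F

T_STAR_95_DF12 = 2.179      # assumed table value, t(12), 95% confidence
ci_b0 = (b0 - T_STAR_95_DF12 * se_b0, b0 + T_STAR_95_DF12 * se_b0)

print(f"SSE = {ss_error:.5f}, MSE = {ms_error:.3f}, root MSE = {sqrt(ms_error):.4f}")
print(f"F = {f_ratio:.3f}, t(slope) = {t_slope:.2f}, t(intercept) = {t_intercept:.2f}")
print(f"95% CI for the intercept: ({ci_b0[0]:.2f}, {ci_b0[1]:.2f})")
```

The computed root MSE and F ratio should agree with the "Root Mean Square Error" and "F Ratio" entries in the output, which is a useful check on the reconstruction.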

6. What is the goal of statistics? a) Creating sampling distributions of the sample mean that are approximately normal b) Maximize systematic variance, minimize error variance. c) A density curve has an area underneath it of 1 d) P-values are more informative than the reject-or-not result of a fixed level α test.
