# LINEAR REGRESSION

The shelf life of packaged food depends on many factors. Dry cereal is considered to be a moisture-sensitive
product, with the shelf life determined primarily by moisture content. In a study of the shelf life of one particular
brand of cereal, x = time on shelf (stored at 73 °F and 50% relative humidity) and y = moisture content were
recorded.
The resulting data is from “Computer Simulation Speeds Shelf Life Assessments.”
x 0 3 6 8 10 13 16 20 24 27 30 34 37 41
y 2.8 3.0 3.1 3.2 3.4 3.4 3.5 3.1 3.8 4.0 4.1 4.3 4.4 4.9
1. Construct a scatter plot of moisture content (y) vs. time on shelf (x) below.
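[Scatter plot grid: x-axis (time on shelf) marked from 3 to 42 in steps of 3; y-axis (moisture content) marked from 0.5 to 5.0 in steps of 0.5.]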

2. Does the scatter plot reveal anything about the relationship between the time on the shelf and the
moisture content, i.e., does knowing the time on the shelf help in predicting moisture content?
…………………………………………………………………………………………………………………………………………………………..
3. Two variables are positively associated when larger values of one variable tend to occur with larger
values of the other, so the fitted line has a positive slope. Two variables are negatively associated when
larger values of one variable tend to occur with smaller values of the other, so the fitted line has a
negative slope. What type of association do time on the shelf and moisture content seem to have?
…………………………………………………………………………………………………………………………………………………………..
4. Enter the time on the shelf (x) in L1 and the moisture content (y) in L2. Calculate a regression
line using your calculator, STAT-CALC 4: LinReg. Record your linear regression equation below.
Regression Equation:
…………………………………………………………………………………………………………………………………………………………..
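As a cross-check on the calculator's LinReg output, the slope and intercept can also be computed directly from the data. The following is a minimal Python sketch, assuming the form $\hat{y} = ax + b$; the function name `least_squares_line` is illustrative and not part of the worksheet.

```python
# Shelf-life data from the worksheet: x = time on shelf, y = moisture content.
x = [0, 3, 6, 8, 10, 13, 16, 20, 24, 27, 30, 34, 37, 41]
y = [2.8, 3.0, 3.1, 3.2, 3.4, 3.4, 3.5, 3.1, 3.8, 4.0, 4.1, 4.3, 4.4, 4.9]


def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    a = s_xy / s_xx          # slope
    b = y_bar - a * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


a, b = least_squares_line(x, y)
print(f"y-hat = {a:.4f}x + {b:.4f}")
```

The printed coefficients should agree with the calculator's values up to rounding.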
5. Find a 99% confidence interval for the mean moisture content (y) of this brand of cereal based on the data.
You can find the mean and standard deviation by analyzing the data in L2.
$\bar{y} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} =$
…………………………………………………………………………………………………………………………………………………………..
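The interval formula in question 5 can be sketched in Python as follows. The critical value `t_crit = 3.012` is an assumption taken from a standard t-table (two-tailed, 99% confidence, n − 1 = 13 degrees of freedom); it is not given in the worksheet.

```python
import math
import statistics

# Moisture content values (the L2 list in the worksheet).
y = [2.8, 3.0, 3.1, 3.2, 3.4, 3.4, 3.5, 3.1, 3.8, 4.0, 4.1, 4.3, 4.4, 4.9]

n = len(y)
y_bar = statistics.mean(y)
s = statistics.stdev(y)  # sample standard deviation (divides by n - 1)

# Assumption: t_{alpha/2} for 99% confidence and 13 degrees of freedom,
# read from a t-table rather than computed.
t_crit = 3.012

margin = t_crit * s / math.sqrt(n)
lower, upper = y_bar - margin, y_bar + margin
print(f"{y_bar:.3f} +/- {margin:.3f} = ({lower:.3f}, {upper:.3f})")
```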
6. The scatter plot has a high linear correlation if the points lie close to a straight line and the correlation
coefficient, r, read from the calculator, is close to 1.0 or −1.0. The correlation coefficient, r,
measures how strongly the x and y values in a sample of pairs are related to one another.
Does this line seem to fit the data reasonably well?
7. What is the calculated correlation coefficient, r, for time on the shelf vs. moisture content?
On the calculator: CATALOG, DiagnosticOn, ENTER, ENTER. r =
…………………………………………………………………………………………………………………………………………………………..
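For readers without the calculator at hand, the correlation coefficient can be computed from its definition. This is a minimal sketch using the standard sample formula $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$; the helper name `correlation` is illustrative.

```python
import math

x = [0, 3, 6, 8, 10, 13, 16, 20, 24, 27, 30, 34, 37, 41]
y = [2.8, 3.0, 3.1, 3.2, 3.4, 3.4, 3.5, 3.1, 3.8, 4.0, 4.1, 4.3, 4.4, 4.9]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(x, y)
print(f"r = {r:.4f}")
```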
8. If the two variables have a high correlation between them, does it follow that there must be a
“cause-and-effect” relationship between them? Can you say “x causes y” or “y causes x?”
…………………………………………………………………………………………………………………………………………………………..
9. The coefficient of determination, denoted by “$r^2$,” is the proportion of variation in y that can be
attributed to an approximate linear relationship between x and y in the sample. Multiplying $r^2$ by 100
gives the percentage of variation in “moisture content” that can be explained by or attributed to the “time on the shelf.”
Make a general statement relating “moisture content” and “time on the shelf” for this brand of cereal.
…………………………………………………………………………………………………………………………………………………………..
10. What is the calculated coefficient of determination, $r^2$, for time on the shelf vs. moisture content?
$r^2 =$
…………………………………………………………………………………………………………………………………………………………..
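The "proportion of variation explained" reading of $r^2$ can be made concrete by comparing the residual variation around the fitted line with the total variation in y. The sketch below assumes the identity $r^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ for a least squares line; the variable names are illustrative.

```python
x = [0, 3, 6, 8, 10, 13, 16, 20, 24, 27, 30, 34, 37, 41]
y = [2.8, 3.0, 3.1, 3.2, 3.4, 3.4, 3.5, 3.1, 3.8, 4.0, 4.1, 4.3, 4.4, 4.9]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope (a) and intercept (b) for y-hat = a*x + b.
a = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b = y_bar - a * x_bar

# SST: total variation in y; SSE: variation left over after fitting the line.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.4f} ({100 * r_squared:.1f}% of variation explained)")
```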
Match each correlation coefficient, r, and each coefficient of determination, $r^2$, with the appropriate graph.
11. r = 0.18        15. $r^2 = 0.9801$
12. r = −0.8        16. $r^2 = 0.0324$
13. r = 0.8         17. $r^2 = 0.64$
14. r = 0.99
…………………………………………………………………………………………………………………………………………………………..
18. Given the following data, choose the equation of the least squares prediction line, $\hat{y} = ax + b$.
Refer to your help sheet for formulas for the slope, a, and the y-intercept, b.
$r = 0.8920$, $s_x = 3.918$, $s_y = 4.698$, $\bar{x} = 25.9$, $\bar{y} = 86.2$
A. $\hat{y} = 1.070x + 58.498$
B. $\hat{y} = -1.070x + 58.498$
C. $\hat{y} = -27.702x + 58.498$
D. $\hat{y} = 58.498x + 1.070$
…………………………………………………………………………………………………………………………………………………………..
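When only summary statistics are available, as in question 18, the line can be recovered from them. The sketch below assumes the usual help-sheet formulas $a = r \cdot s_y / s_x$ and $b = \bar{y} - a\bar{x}$; the worksheet's own help sheet is not reproduced here, so treat these as the standard textbook forms.

```python
# Summary statistics given in question 18.
r = 0.8920
s_x = 3.918
s_y = 4.698
x_bar = 25.9
y_bar = 86.2

# Slope from the correlation and the two standard deviations.
a = r * s_y / s_x
# Intercept chosen so that the line passes through (x_bar, y_bar).
b = y_bar - a * x_bar

print(f"y-hat = {a:.3f}x + {b:.3f}")
```

Because r is positive, the slope must be positive as well, which already rules out any option with a negative coefficient on x.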
19. A least squares regression line, $\hat{y} = 56.9x - 6924$, is used to predict the profit of a company
where 56.9 is the net gain for each item sold and – 6924 represents the fixed costs. What is the
predicted profit on sales of 767 units?
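Using a fitted line for prediction amounts to substituting the x-value into the equation. A minimal sketch, with the hypothetical helper name `predict_profit`:

```python
def predict_profit(units_sold):
    """Evaluate the regression line y-hat = 56.9x - 6924 from question 19."""
    return 56.9 * units_sold - 6924


profit = predict_profit(767)
print(f"Predicted profit on 767 units: {profit:.2f}")
```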
One classic application of correlation involves the association between the temperature and the number of
times a cricket chirps in a minute. The table lists the number of chirps in 1 minute and the corresponding
temperature in °F. Is there sufficient evidence at the 5% level of significance to conclude that there is a linear
correlation between the number of chirps and the temperature?
20. $H_0$:              n =              Critical Values:
    $H_1$:              r =              p-value =
                        $r^2$ =          $\alpha$ =
21. Test Statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} =$
22. Decision:
Conclusion:
………………………………………………………………………………………………………………………………………………………….
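The test in questions 20 to 22 can be traced step by step with the cricket data listed in this worksheet. The sketch below computes r and the test statistic $t = r / \sqrt{(1 - r^2)/(n - 2)}$. The critical value `t_crit = 2.447` is an assumption taken from a standard t-table (two-tailed, α = 0.05, n − 2 = 6 degrees of freedom).

```python
import math

# Cricket data from the worksheet table.
chirps = [882, 1188, 1104, 864, 1200, 1032, 960, 900]
temps = [69.7, 93.3, 84.3, 76.3, 88.6, 82.6, 71.6, 79.6]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


n = len(chirps)
r = correlation(chirps, temps)
t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))

# Assumption: table value for a two-tailed test, alpha = 0.05, 6 df.
t_crit = 2.447
reject_h0 = abs(t_stat) > t_crit

print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The decision compares |t| with the critical value; a p-value from the calculator's LinRegTTest should lead to the same decision.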
23. Four data sets are given. Enter x in L1 and y in L2 in the Lists of your calculator and graph the data
sets. Note the shapes and any outstanding features, such as curves and outliers, below each data set.
Set 3 x 10 8 13 9 11 14 6 4 12 7 5
y 7.46 7.77 12.74 7.11 7.81 8.84 6.08 5.39 8.15 6.42 5.73
………………………………………………………………………………………………………………………………………………..
24. Calculate the least squares regression line, $\hat{y} = ax + b$, and the correlation coefficient, r, for Data Set 4.
Data Set 1: $\hat{y} = 0.468x + 3.096$, $r = 0.808$
Data Set 2: $\hat{y} = 0.500x + 3.001$, $r = 0.816$
Data Set 3: $\hat{y} = 0.491x + 3.175$, $r = 0.807$
Data Set 4:
………………………………………………………………………………………………………………………………………………..
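Data Set 4 is a useful case to examine numerically, because ten of its eleven x-values are identical. The sketch below fits the line and then checks what happens to the spread in x when the single point at x = 19 is set aside; the variable names are illustrative.

```python
import math

# Data Set 4 from the worksheet.
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]

n = len(x4)
x_bar = sum(x4) / n
y_bar = sum(y4) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x4, y4))
s_xx = sum((xi - x_bar) ** 2 for xi in x4)
s_yy = sum((yi - y_bar) ** 2 for yi in y4)

a = s_xy / s_xx                    # slope
b = y_bar - a * x_bar              # intercept
r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient
print(f"y-hat = {a:.3f}x + {b:.3f}, r = {r:.3f}")

# Without the point at x = 19 every x equals 8, so the spread in x is zero
# and neither the slope nor r is defined: one point determines the whole fit.
x_rest = [xi for xi in x4 if xi != 19]
rest_mean = sum(x_rest) / len(x_rest)
s_xx_rest = sum((xi - rest_mean) ** 2 for xi in x_rest)
print(f"S_xx without the x = 19 point: {s_xx_rest}")
```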
25. What conclusions can you make after you have investigated all four data sets?
Data for questions 20 to 22 (cricket chirps):
Chirps in 1 min.    882    1188   1104   864    1200   1032   960    900
Temperature (°F)    69.7   93.3   84.3   76.3   88.6   82.6   71.6   79.6
Data for question 23 (Set 3 is listed with the question):
Set 1   x   10     8      13     9      11     14     6      4      12     7      5
        y   8.04   6.95   7.58   8.81   8.33   9.66   7.24   4.26   10.84  4.82   5.68
Set 2   x   10     8      13     9      11     14     6      4      12     7      5
        y   9.14   8.14   8.74   8.77   9.26   8.10   6.13   3.10   9.13   7.26   4.74
Set 4   x   8      8      8      8      8      8      8      19     8      8      8
        y   6.58   5.76   7.71   8.84   8.47   7.04   5.25   12.5   5.56   7.91   6.89
An instructor wants to show students that there is a linear correlation between the number of hours they watch
TV on a certain weekend and their scores on a test taken the following Monday. The number of television
viewing hours and the test scores for 12 randomly selected students are shown in the table. At α = 0.01,
is there enough evidence to conclude that there is a significant linear correlation between the data?
26. $H_0$:              n =              Critical Values:
    $H_1$:              r =              p-value =
                        $r^2$ =          $\alpha$ =
27. Test Statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} =$
28. Decision:
Conclusion:
………………………………………………………………………………………………………………………………………………………….
29. Calculate the correlation coefficient r, letting Row 1 represent the x-values and Row 2 represent the
y-values. Then calculate the correlation coefficient r, letting Row 2 represent the x-values and Row 1
represent the y-values.
What effect does switching the explanatory and response variables have on the correlation coefficient?
…………………………………………………………………………………………………………………………………………………………..
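The effect of swapping the two rows can be checked numerically. The television data table is not reproduced in this text, so the sketch below uses the cricket data from earlier in the worksheet as a stand-in; that substitution is an assumption made purely for illustration.

```python
import math

# Stand-in data (cricket chirps and temperatures from this worksheet).
row1 = [882, 1188, 1104, 864, 1200, 1032, 960, 900]
row2 = [69.7, 93.3, 84.3, 76.3, 88.6, 82.6, 71.6, 79.6]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_yy = sum((yi - y_bar) ** 2 for yi in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r_xy = correlation(row1, row2)  # Row 1 as x, Row 2 as y
r_yx = correlation(row2, row1)  # roles switched
print(f"r (Row 1 as x) = {r_xy:.4f}, r (Row 2 as x) = {r_yx:.4f}")
```

The formula treats x and y symmetrically, so the two values agree; the regression line itself, by contrast, generally changes when the roles are switched.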
30. A bivariate data set has a coefficient of determination of 0.7592. Which of the following is possible for
the correlation coefficient?
A. 0.759
B. 0.576
C. 0.871
D. 0.987
…………………………………………………………………………………………………………………………………………………………..
31. The correlation coefficient for a bivariate data set is 0.8460. What percentage of the variability in
the y-variable is attributed or explained by the x-variable?
A. 15.4% B. 28.43% C. 71.57% D. 84.6%
…………………………………………………………………………………………………………………………………………………………..
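Questions 30 and 31 both rest on the relationship between r and $r^2$, which can be checked with two lines of arithmetic. A minimal sketch, with illustrative variable names:

```python
import math

# Question 30: from r^2 back to r. The square root gives only the magnitude;
# the sign of r matches the sign of the slope, so r may be + or - this value.
r_squared = 0.7592
r_magnitude = math.sqrt(r_squared)
print(f"|r| = {r_magnitude:.3f}")

# Question 31: from r to the percentage of variability in y explained by x.
r = 0.8460
percent_explained = r ** 2 * 100
print(f"{percent_explained:.2f}% of the variability is explained")
```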
32. What is the coefficient of determination for the following bivariate data set?
A. 0.943 B. 0.975 C. 0.951 D. 0.987

The post LINEAR REGRESSION first appeared on COMPLIANT PAPERS.