Consider the following data for 15 subjects with two predictors. The dependent variable, MARK, is the total score for a subject on an examination. The first predictor, COMP, is the score for the subject on a so-called compulsory paper. The other predictor, CERTIF, is the score for the subject on a previous exam.
Student MARK COMP CERTIF
1 476 111 68
2 457 92 46
3 540 90 50
4 551 107 59
5 575 98 50
6 698 150 66
7 545 118 54
8 574 110 51
9 645 117 59
10 556 94 97
11 634 130 57
12 637 118 51
13 390 91 44
14 562 118 61
15 560 109 66
a. Run a stepwise regression on the dataset.
b. Does CERTIF add anything to predicting MARK, above and beyond that of COMP?
c. Write out the prediction equation.
d. A statistician wishes to know the sample size needed in a multiple regression study. She has four predictors and can tolerate at most a .10 drop-off in predictive power. But she wants this to be the case with .95 probability. From previous related research, the estimated squared population multiple correlation is .62. How many subjects are needed?
e. What statistical assumptions must be met to use multiple regression analysis and how do you evaluate the assumptions?
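For parts a through c, one possible starting point is to fit the two nested models by ordinary least squares and compare them. The sketch below is not a full stepwise routine: it fits MARK on COMP alone and then on COMP plus CERTIF, and forms the usual F statistic for the change in R squared. The helper names `solve` and `ols` are illustrative, not from the exercise, and the data are transcribed from the table above.

```python
# Data transcribed from the exercise table: (MARK, COMP, CERTIF)
DATA = [
    (476, 111, 68), (457, 92, 46), (540, 90, 50), (551, 107, 59),
    (575, 98, 50), (698, 150, 66), (545, 118, 54), (574, 110, 51),
    (645, 117, 59), (556, 94, 97), (634, 130, 57), (637, 118, 51),
    (390, 91, 44), (562, 118, 61), (560, 109, 66),
]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, columns):
    """Least-squares fit of y on an intercept plus the given predictors.

    Returns (coefficients, R squared); coefficients[0] is the intercept.
    """
    n = len(y)
    x = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

mark = [float(r[0]) for r in DATA]
comp = [float(r[1]) for r in DATA]
certif = [float(r[2]) for r in DATA]

beta_reduced, r2_reduced = ols(mark, [comp])
beta_full, r2_full = ols(mark, [comp, certif])

# F for the increment in R squared when CERTIF joins a model containing COMP
n, k_full, n_added = len(mark), 2, 1
f_change = ((r2_full - r2_reduced) / n_added) / ((1.0 - r2_full) / (n - k_full - 1))

print("COMP only:     coefficients", beta_reduced, "R^2", r2_reduced)
print("COMP + CERTIF: coefficients", beta_full, "R^2", r2_full)
print("F change for CERTIF, df = (1, %d): %.3f" % (n - k_full - 1, f_change))
```

The F for the change would then be compared with a tabled critical value on 1 and 12 degrees of freedom. A statistics package would report the same quantities along with p values.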
In this example, three tasks were employed to ascertain differences between good and poor undergraduate writers on recall and manipulation of information: an ordered letters task, an iconic memory task, and a letter reordering task. In the following table are means and standard deviations for the percentage of correct letters recalled on the three dependent variables. There were 15 participants in each group.
                     Good writers      Poor writers
Task                 M       SD        M       SD
Ordered letters      57.79   12.96     49.71   21.79
Iconic memory        49.78   14.59     45.63   13.09
Letter reordering    71.00    4.80     63.18    7.03
Consider this results section:
The data were analyzed via a multivariate analysis of covariance using the background variables (English usage ACT subtest, composite ACT, and grade point average) as covariates, writing ability as the independent variable, and task scores (correct recall in the ordered letters task, correct recall in the iconic memory task, and correct recall in the letter reordering task) as the dependent variables. The global test was significant, F(3, 23) = 5.43, p < .001. To control for experiment-wise type I error rate at .05, each of the three univariate analyses was conducted at a per comparison rate of .017. No significant difference was observed between groups on the ordered letters task, univariate F(1, 25) = 1.92, p > .10. Similarly, no significant difference was observed between groups on the iconic memory task, univariate F < 1. However, good writers obtained significantly higher scores on the letter reordering task than the poor writers, univariate F(1, 25) = 15.02, p < .001.
a. From what was said here, can we be confident that covariance is appropriate here?
b. The “global” multivariate test referred to is not identified as to whether it is Wilks’ Λ, Roy’s largest root, and so on. Would it make a difference as to which multivariate test was employed in this case?
c. The results mention controlling the experiment-wise error rate at .05 by conducting each test at the .017 level of significance. Which post hoc procedure is being used here?
d. Is there a sufficient number of participants for us to have confidence in the reliability of the adjusted means?
e. What is the main reason for using covariance analysis in a randomized study?
f. What statistical assumptions must be met to use Analysis of Covariance?
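When reading the results section, it can help to check that the reported numbers are internally consistent. The sketch below assumes the standard degrees-of-freedom formulas for a two-group design with covariates (univariate error df = N − groups − covariates; exact multivariate F error df = N − groups − covariates − p + 1), and the variable names are illustrative.

```python
# Design facts taken from the exercise text
n_total = 15 * 2          # 15 participants in each of two groups
n_groups = 2
n_covariates = 3          # English usage ACT, composite ACT, GPA
n_dependent = 3           # three recall tasks
experimentwise_alpha = 0.05

# Splitting the experiment-wise rate evenly over the three univariate tests
per_comparison_alpha = experimentwise_alpha / n_dependent

# Error degrees of freedom implied by the design (assumed standard formulas)
univariate_error_df = n_total - n_groups - n_covariates
multivariate_error_df = n_total - n_groups - n_covariates - n_dependent + 1

print("per-comparison alpha:", round(per_comparison_alpha, 3))
print("univariate F df: (1, %d)" % univariate_error_df)
print("multivariate F df: (%d, %d)" % (n_dependent, multivariate_error_df))
```

These values agree with the .017 rate and with the F(1, 25) and F(3, 23) reported in the results section.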
An investigator has a 50-item scale and wishes to compare two groups of participants on the item scores. He has heard about MANOVA, and realizes that the items will be correlated. Therefore, he decides to do a two-group MANOVA with each item serving as a dependent variable. The scale is administered to 45 participants, and the investigator attempts to conduct the analysis. However, the computer software aborts the analysis. Why? What might the investigator consider doing before running the analysis?
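One way to explore this question numerically is to look at the rank of the within-group deviation scores. With N participants in two groups, the pooled within-group SSCP matrix has rank at most N − 2, so with 45 participants and 50 items it cannot be inverted. The sketch below uses simulated scores (not real data), a hypothetical 23/22 split of the participants, and an illustrative `matrix_rank` helper.

```python
import random

random.seed(0)
N_ITEMS = 50
GROUP_SIZES = (23, 22)  # hypothetical split of the 45 participants

def centered_group(n_rows, n_cols):
    """Simulate scores for one group and subtract each item's group mean."""
    rows = [[random.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    means = [sum(r[j] for r in rows) / n_rows for j in range(n_cols)]
    return [[r[j] - means[j] for j in range(n_cols)] for r in rows]

def matrix_rank(rows, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [r[:] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

# Stack the within-group deviation scores D for both groups (45 x 50).
# The pooled within SSCP matrix is W = D'D, and rank(W) = rank(D).
deviations = []
for size in GROUP_SIZES:
    deviations.extend(centered_group(size, N_ITEMS))

rank = matrix_rank(deviations)
print("rank of W:", rank, "of", N_ITEMS, "needed for W to be invertible")
```

Any split of the 45 participants into two groups gives the same upper bound on the rank, which is why reducing the number of dependent variables before the analysis is worth considering.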
Suppose you come across a journal article where the investigators have a three-way design and five correlated dependent variables. They report the results in five tables, having done a univariate analysis on each of the five variables. They find four significant results at the .05 level. Would you be impressed with these results? Why or why not? Would you have more confidence if the significant results had been hypothesized a priori? What else could they have done that would have given you more confidence in their significant results?
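To get a feel for how many tests are involved, the count and the resulting overall error rate can be worked out directly. The sketch below assumes every effect in the three-way design was tested on every dependent variable at .05. Because the dependent variables are correlated, the independence formula is only a rough guide.

```python
alpha = 0.05
n_dependent = 5
# A three-way design has 3 main effects, 3 two-way interactions,
# and 1 three-way interaction.
effects_per_variable = 3 + 3 + 1
n_tests = n_dependent * effects_per_variable

# Upper bound on the overall type I error rate (capped at 1)
bonferroni_bound = min(1.0, n_tests * alpha)

# Overall rate if the tests were independent (an approximation here)
independent_rate = 1.0 - (1.0 - alpha) ** n_tests

print("number of tests:", n_tests)
print("Bonferroni upper bound:", bonferroni_bound)
print("rate under independence: %.3f" % independent_rate)
```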
a. Consider the following data for a two-group, two-dependent-variable problem:
Group 1     Group 2
Y1   Y2     Y1   Y2
1    9      4    8
2    3      5    6
3    4      6    7
b. Compute W, the pooled within-SSCP matrix.
c. Find the pooled within-covariance matrix, and indicate what each of the elements in the matrix represents.
d. Find Hotelling’s T².
e. What is the multivariate null hypothesis in symbolic form?
f. Test the null hypothesis at the .05 level. What is your decision?
g. An investigator has an estimate of D² = .61 from a previous study that used the same four dependent variables on a similar group of participants. How many subjects per group are needed to have power = .70 at α = .10?
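The hand calculations in parts b through f can be checked with a short script. The sketch below assumes the first Y1/Y2 pair of columns in the data table belongs to group 1 and the second pair to group 2; the helper names are illustrative. It computes W, the pooled covariance matrix S = W / (n1 + n2 − 2), Hotelling's T², and the exact F transformation.

```python
# Assumption: first Y1/Y2 pair is group 1, second pair is group 2
group1 = [(1, 9), (2, 3), (3, 4)]
group2 = [(4, 8), (5, 6), (6, 7)]

def mean_vector(rows):
    """Mean of each of the two dependent variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def sscp(rows):
    """2 x 2 sums of squares and cross-products about the group means."""
    m = mean_vector(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) for j in range(2)]
            for i in range(2)]

n1, n2, p = len(group1), len(group2), 2
w1, w2 = sscp(group1), sscp(group2)

# Pooled within-group SSCP matrix and pooled covariance matrix
W = [[w1[i][j] + w2[i][j] for j in range(2)] for i in range(2)]
S = [[W[i][j] / (n1 + n2 - 2) for j in range(2)] for i in range(2)]

# Inverse of the 2 x 2 matrix S
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det, -S[0][1] / det],
         [-S[1][0] / det, S[0][0] / det]]

# T^2 = [n1 n2 / (n1 + n2)] * d' S^-1 d, with d the mean difference vector
m1, m2 = mean_vector(group1), mean_vector(group2)
d = [m1[0] - m2[0], m1[1] - m2[1]]
quad = sum(d[i] * S_inv[i][j] * d[j] for i in range(2) for j in range(2))
T2 = (n1 * n2) / (n1 + n2) * quad

# Exact F with p and (n1 + n2 - p - 1) degrees of freedom
F = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * T2

print("W =", W)
print("S =", S)
print("T^2 = %.4f" % T2)
print("F(%d, %d) = %.4f" % (p, n1 + n2 - p - 1, F))
```

The F value would be compared with the tabled critical value at the .05 level on 2 and 3 degrees of freedom.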