Our website is a unique platform where students can share their papers in a matter of giving an example of the work to be done. If you find papers
matching your topic, you may use them only as an example of work. This is 100% legal. You may not submit downloaded papers as your own, that is cheating. Also you
should remember, that this work was alredy submitted once by a student who originally wrote it.
The paper "Understanding and Exploring Assumptions" describes that the null hypothesis in the Levene test for homogeneity is that there is homogeneity of variance. As indicated by the table, we are unable to reject the null hypothesis for all variables except for numeracy…
Download full paperFile format: .doc, available for editing
Extract of sample "Understanding and Exploring Assumptions"
Activity 4 Why do we care whether the assumptions required for statistical tests are met? It is very important that the assumptions for statistical tests are met because the appropriate interpretation of the statistics would depend on whether the applicable assumptions are met. For instance, for inferential statistics, it is usually important that the sample is a random sample so that the sample characteristics would have valid information on population characteristics. Further, some statistics that are used to compare samples require that we explicitly identify what assumption would be likely applicable, or whether we should assume that variance of the two samples are equal or unequal. The assumption of normality or non-normality of distribution will also inform us the extent to which we can use a normal distribution or a distribution like the chi-square distribution to make inferences from data.
2. Open the data set that you corrected in activity #3 for DownloadFestival.sav. You will use the following variables: Day1, Day2, and Day3 (hygiene variable for all three days). Create a simple histogram for each variable. Choose to display the normal curve (under Element Properties) and title your charts. Copy these plots into your Activity #4 Word document.
Histogram of Variables Day1, Day2, and Day3
3. Now create probability-probability (p-p) plots for each variable. This output will give you additional information. Read over the Case Processing Summary. Notice that there is missing data for Days 2 and Day 3? Copy only the Normal p-p Plots into your Activity #4 Word document (you do not need to copy the beginning output nor the Detrended Normal p-p Plots).
4. Examining the histograms and p-p plots describe the dataset, with particular attention toward the assumption of normality. For each day, do you think the responses are reasonably normally distributed? (Just give your impression of the data.) Why or why not?
Based on the p-p plot for each variable, using “normal” as the test distribution, variables “day 1”, “day 2”, and “day 3” are most likely normally distributed. They are most likely normally distributed or that the assumption of normal distribution is justified because according to the SPSS help facility, “if the selected variable matches the test distribution, the points cluster around a straight line.” In the SPSS output I have produced for item 3, the test distribution I selected is the “normal distribution”.
However, I must point out that based on the SPSS outputs I have produced for item 2, while variable “day 1” appears to be normally distributed, variables “day2” and “day2” are positively skewed distribution.
5. Using the same dataset, and the Frequency command, calculate the standard descriptive measures (mean, median, mode, standard deviation, variance and range) as well as kurtosis and skew for all three hygiene variables. Paste your output into your Activity #4 Word document (you do not need to paste the Frequency Table). What does the output tell you? You will need to comment on: sample size, measures of central tendency and dispersion and well as kurtosis and skewness. You will need to either calculate z scores for skewness and kurtosis or use those given in the book to provide a complete answer. Bottom line: is the assumption of normality met for these three variables? Does this match your visual observations from question #2?
Hygiene (Day 1 of Download Festival)
Hygiene (Day 2 of Download Festival)
Hygiene (Day 3 of Download Festival)
N
Valid
810
264
123
Missing
0
546
687
Mean
1.7934
.9609
.9765
Median
1.7900
.7900
.7600
Mode
2.00
.23
.44(a)
Std. Deviation
.94449
.72078
.71028
Variance
.892
.520
.504
Skewness
8.865
1.095
1.033
Std. Error of Skewness
.086
.150
.218
Kurtosis
170.450
.822
.732
Std. Error of Kurtosis
.172
.299
.433
Range
20.00
3.44
3.39
a Multiple modes exist. The smallest value is shown
All samples are large samples, implying that we can use the assumption of normality for all distributions (Walpole et al., 2007, p. 182-185, 245-247). The relative closeness of the values of mean, median, and mode for “day1” and “day3” suggest a symmetric or normal distribution. Based on the measures of the skew, “day1” is more positively skewed compared to “day2” and “day3”. The variance values among the three variables suggest that they are relatively homogenous because the values are close to zero. Among the three variables, “day1” has the greatest excess kurtosis based on Gujarati (1995, p. 143). The assumptions of normality are not basically met for all three variables because their skew and kurtosis values are generally twice their standard errors (except for variable “day3” in which kurtosis is less than twice the standard error). Nevertheless, the conventional statistical theory says we can validly assume that they are normally distributed with regard to the computation of the mean (Walpole et al., 2007, p. 182-185, 245-247). The figures for skew and kurtosis provided by the table do not match my interpretation of the graphics I obtained in item 2.
6. Using the dataset SPSSExam.sav, and the Frequency command, calculate: the standard descriptive statistics (mean, median, mode, standard deviation, variance and range) plus skew and kurtosis, and histograms with the normal curve on the following variables: Computer, Exam, Lecture, and Numeracy for the entire dataset. Complete the same analysis using University as a grouping variable. Paste your output into your Activity #4 Word document (you do not need to paste the Frequency Table). What do the results tell you with regard to whether the data is normally distributed?
Percentage on SPSS exam
Computer literacy
Numeracy
Percentage of lectures attended
N
Valid
100
100
100
100
Missing
0
0
0
0
Mean
58.10
50.71
4.85
59.765
Median
60.00
51.50
4.00
62.000
Mode
72(a)
54
4
48.5(a)
Variance
454.354
68.228
7.321
470.230
Range
84
46
13
92.0
a Multiple modes exist. The smallest value is shown
Grouping Variable: University
Percentage on SPSS exam
Computer literacy
Numeracy
Computer literacy
University
Duncetown University
Mean
40
50
4
50
Mode
34
48
4
48
Median
38
49
4
49
Maximum
66
67
9
67
Minimum
15
35
1
35
Standard Deviation
13
8
2
8
Variance
158
65
4
65
Sussex University
Mean
76
51
6
51
Mode
72
54
5
54
Median
75
54
5
54
Maximum
99
73
14
73
Minimum
56
27
1
27
Standard Deviation
10
9
3
9
Variance
104
72
9
72
7. Using the dataset SPSSExam.sav, determine whether the scores on computer literacy and percentage of lectures attended (with University as a grouping variable) meet the assumption of homogeneity of variance (use Levene test). You must remember to unclick the split file option used above before doing this test. What does the output tell you? (Be as specific as possible.)
Test of Homogeneity of Variances
Levene Statistic
df1
df2
Sig.
Percentage on SPSS exam
2.584
1
98
.111
Computer literacy
.064
1
98
.801
Percentage of lectures attended
1.731
1
98
.191
Numeracy
7.368
1
98
.008
The null hypothesis in the Levene test for homogeneity is that there is homogeneity of variance. As indicated by the table immediately above, we are unable to reject the null hypothesis for all variables except for numeracy.
8. Describe the assumptions of normality and homogeneity of variance. When these assumptions are violated, what are your options? Are there cases in which the assumptions may technically be violated, yet have no impact on your intended analyses? Explain.
When the null hypothesis of homogeneity of variance can be rejected then the option is to test hypotheses based on the separate variance t-test (Kinnear and Grav, 2008, p. 200).
Bibliography
Kinnear, P. & Gray, C. (2008). SPSS 15 made simple. Hove and New York: Taylor and Francis Group.
SPSS, Inc. (2005). SPSS Version 15. A statistical software with tutorial. Surrey, UK: SPSS UK Limited.
Walpole, R., Myers, R., Myers, S., and Ye, Keying (2007). Probability & Statistics for Engineers & Scientists. 8th ed. New Jersey: Pearson Educational International.
Read
More
Share:
sponsored ads
Save Your Time for More Important Things
Let us write or edit the assignment on your topic
"Understanding and Exploring Assumptions"
with a personal 20% discount.