Retrieved from https://studentshare.org/statistics/1396483-statistics-and-econometrics-assignment
The respective means of these variables are 82.38, 80.77 and 44.66, and the values they take vary considerably, so variations in attendance could plausibly cause variations in marks. Other variables that may affect performance in the course have to be accounted for to ensure a proper evaluation, so “ability”, “age” and “hrss” (i.e., study hours) are also explored. All of these variables show strong variability and are therefore potential candidates as controls.
(For details, see table 1 in appendix). Beyond the individual descriptive statistics, a table of scatter plots is explored to get some idea of the interrelationships and potential causal links. In each plot “smarks” is the y variable, and the x variables are “ability”, “age”, “hrss”, “alevelsa” and “attl”, together with the squared forms of ability and attl. From the plots (figure 2 in appendix), both ability and its square seem to be positively correlated with marks.
The variables “age” and “alevelsa” show no apparent association with marks. For attendance, our primary variable of interest, marks above the mean cluster at the higher values of attl, suggesting that a higher lecture attendance rate is associated with better performance on the course on average. Some clustering also appears at higher values of the squared lecture attendance rate. The last graph in the table shows no apparent correlation between smarks and hrss.
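The visual impressions from the scatter plots can be cross-checked with simple pairwise correlations. The sketch below is illustrative only: the original dataset is not available here, so it generates synthetic stand-in data (the sample size, coefficients and noise levels are assumptions, not values from the report) and reuses only the variable names “smarks”, “attl” and “ability” from the text. The helper `pearson` is a hypothetical name introduced for this example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic stand-in data: NOT the data used in the report.
random.seed(0)
n = 200
attl = [random.uniform(20, 100) for _ in range(n)]   # lecture attendance rate
ability = [random.gauss(60, 10) for _ in range(n)]   # ability score
smarks = [30 + 0.15 * a + 0.3 * b + random.gauss(0, 8)
          for a, b in zip(attl, ability)]            # marks

# Levels and squared forms, mirroring the x variables in the scatter plots.
regressors = {
    "attl": attl,
    "attl^2": [a ** 2 for a in attl],
    "ability": ability,
    "ability^2": [b ** 2 for b in ability],
}

correlations = {name: pearson(values, smarks)
                for name, values in regressors.items()}

for name, r in correlations.items():
    print(f"corr(smarks, {name}) = {r:.3f}")
```

A correlation near zero for a variable would be consistent with the "no associative pattern" reading of its scatter plot, although a correlation coefficient only captures linear association.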
The interrelationships between these variables are important for the regression specifications, since high correlations among independent variables may lead to multicollinearity. A scatterplot matrix is therefore presented as figure 2 in the appendix. Taken together, the summary statistics and the scatter plots suggest a strong possibility that class attendance influences performance, along with other factors such as ability. Further, since there is some evidence of a positive correlation between class performance, as measured by “smarks”, and the squares of “ability” and of attendance, represented by “attl”, the possibility of nonlinear dependence cannot be ignored.

2. Basic OLS estimation

a) From the simple regression of smarks on an intercept and the variable “attl”, we find that attendance has a significant positive impact on performance.
The coefficient on attendance is close to 0.15 with a t-statistic of 4.33 > 1.96, the approximate two-sided 5% critical value under the null hypothesis that the coefficient is equal to zero; the coefficient is therefore statistically significantly different from zero. Additionally, the intercept takes a value of 52.91, implying that the conditional mean of “smarks” is 52.91 for students with a zero lecture attendance rate. This value is also significant at the 5% level (t-statistic 19.06 > 1.96).
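The significance checks above can be reproduced as simple arithmetic. The sketch below takes the reported estimates and t-statistics, backs out the implied standard errors (t = estimate / standard error under a zero null), and builds approximate 95% confidence intervals. Since the attendance coefficient is only reported as "close to 0.15", the derived figures are approximate; the function and variable names are hypothetical and introduced only for this illustration.

```python
CRITICAL_5PCT = 1.96  # large-sample two-sided 5% critical value used in the text


def implied_std_error(estimate, t_value):
    """Standard error implied by t = estimate / std_error (null value of zero)."""
    return estimate / t_value


# (estimate, t-statistic) as reported in the text; 0.15 is a rounded value.
reported = {
    "attl": (0.15, 4.33),
    "intercept": (52.91, 19.06),
}

summary = {}
for name, (estimate, t_value) in reported.items():
    se = implied_std_error(estimate, t_value)
    summary[name] = {
        "std_error": se,
        "ci_lower": estimate - CRITICAL_5PCT * se,
        "ci_upper": estimate + CRITICAL_5PCT * se,
        # Reject H0: coefficient = 0 when |t| exceeds the critical value.
        "reject_null": abs(t_value) > CRITICAL_5PCT,
    }

for name, row in summary.items():
    print(f"{name}: se ~ {row['std_error']:.4f}, "
          f"95% CI ~ [{row['ci_lower']:.3f}, {row['ci_upper']:.3f}], "
          f"reject H0: {row['reject_null']}")
```

A confidence interval that excludes zero carries the same information as |t| > 1.96, which is why both the slope and the intercept are judged significant at the 5% level.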
However, the adjusted R-squared value is only 0.06, implying that only about 6% of the variation in performance can be explained by variations in lecture attendance rates. Therefore, the model fit is poor.

b) Inclusion of ability and hours studied (hrss) leads to the impact of attendance rate falling to approximately 0.13 from 0.15, but the ...