Statistical Methods in Math - Case Study Example

Summary
The paper "Statistical Methods in Math" reports that the critical value at the 5% level of significance is 1.39. Hence the null hypothesis that there is no break in the AR(1) model at 1984-1985 cannot be rejected. A single model may be used for the whole data set from 1974 to 2009…

Extract of sample "Statistical Methods in Math"

Assignment Part 1

(a) A regression is run with yrsed as the response variable on dist, bytest, female, black, hispanic, incomehi, dadcoll, mumcoll and cue80. From the estimates in Table 1, the fitted regression equation is

yrsed = 9.005 - 0.016 dist + 0.085 bytest + 0.134 female + 0.281 black + 0.537 hispanic + 0.230 incomehi + 0.704 dadcoll + 0.461 mumcoll + 0.010 cue80

Table 1 shows the estimated regression coefficients with their standard errors and significance levels, as well as the measures of fit.

Table 1
Predictor    Estimate    s.e.(estimate)    Significance
Intercept    9.005       0.387             ***
dist         -0.0160     0.025
bytest       0.085       0.006             ***
female       0.134       0.101
black        0.281       0.150             *
hispanic     0.537       0.145             ***
incomehi     0.230       0.121             *
dadcoll      0.704       0.149             ***
mumcoll      0.461       0.161             ***
cue80        0.010       0.018
Measures of fit: R2 = 25%, Adjusted R2 = 24.4%, Std err of regression = 1.58

(b) A simple regression of yrsed is run with only dist as the predictor. From the estimates in Table 2, the fitted regression equation is

yrsed = 14.028 - 0.086 dist

The regression equation indicates that completed years of education are negatively associated with distance from a 4-year college. If the distance increases, completed years of education decrease. This suggests that accessibility to a 4-year college plays an important role in furthering education. Table 2 below shows the estimates and measures of fit for this regression.

Table 2
Predictor    Estimate    s.e.(estimate)    Significance
Intercept    14.028      0.074             ***
dist         -0.086      0.028             ***
Measures of fit: R2 = 0.9%, Adjusted R2 = 0.8%, Std err of regression = 1.809

The variable dist is significant at the 1% level here, while it was not significant in the full model (see Table 1), and its coefficient changes from -0.016 to -0.086. Note also that this regression explains only 0.9% of the total variability in the data, whereas the previous model explained 25%. The standard error of the regression also increases, from 1.58 to 1.809. All of these point to a significant omitted variable bias.

(c) Both dadcoll and mumcoll are significant at the 1% level in predicting yrsed (Table 1), and both are positively associated with the response.
Both of these are dummy variables, with 1 indicating a college graduate and 0 otherwise. If all the other variables in the regression model remain unchanged, a father who is a college graduate is expected to increase the child's completed years of education by 0.704 years. Likewise, if all the other variables remain unchanged, a mother who is a college graduate is expected to increase the child's completed years of education by 0.461 years.

(d) From Table 1, if dist decreases by 1 unit, all other variables remaining unchanged, completed years of education increase by 0.016 years. Note that distance is measured in tens of miles, so 20 miles is equivalent to 2 units. Hence, on average, the increase in completed years is 2*0.016 = 0.032 years. The claim that completed education increases by approximately 0.15 years if the distance to the nearest college is decreased by 20 miles therefore does not seem tenable.

(e) Here two models are to be compared, where Model 2, the simpler model, is nested within Model 1, the full model with all the predictors. The test statistic for the comparison is defined as

F = [(RSS2 - RSS1)/(df2 - df1)] / (RSS1/df1)

where RSSi denotes the residual sum of squares of model i, i = 1, 2, and dfi is the corresponding degrees of freedom. The values of the statistics are as follows:

RSS2 = 2586.62
RSS1 = 2471.523
df2 = 993
df1 = 990

The value of the F-statistic is F = 38.366 / 2.496 = 15.37. F follows an F distribution with 3 and 990 df. At the 5% level of significance the critical value of this F distribution is 2.61. The observed value of F is much larger, and hence it is significant. Model 1 was the full model and Model 2 a simpler model. The hypothesis was that Model 2 may be used in place of Model 1. This hypothesis is rejected, and the conclusion is that, taken as a group, the variables dadcoll, mumcoll and cue80 may not be eliminated from the model.
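The arithmetic of the nested-model F-test in part (e) can be checked with a short script. This is a minimal sketch that uses only the residual sums of squares and degrees of freedom quoted above; the function name `nested_f_statistic` is a hypothetical helper, and the critical value 2.61 is taken from the text rather than computed.

```python
def nested_f_statistic(rss_restricted, rss_full, df_restricted, df_full):
    """F statistic comparing a restricted model nested within a full model."""
    # Numerator: increase in RSS per restriction dropped from the full model.
    numerator = (rss_restricted - rss_full) / (df_restricted - df_full)
    # Denominator: residual mean square of the full model.
    denominator = rss_full / df_full
    return numerator / denominator


# Values quoted in part (e): Model 2 is the restricted model, Model 1 the full model.
f_value = nested_f_statistic(2586.62, 2471.523, 993, 990)

# 5% critical value of F(3, 990), as quoted in the text (not computed here).
critical_value = 2.61
reject_simpler_model = f_value > critical_value

print(f"F = {f_value:.2f}, reject simpler model: {reject_simpler_model}")
```

The same helper applies to any pair of nested regressions once their residual sums of squares and residual degrees of freedom are known.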
(f) From Table 1, the regression coefficients for both black and hispanic are significant, although black is significant only at the 10% level (its p-value is 6%). Both coefficients are positive. Hence one may conclude that being Hispanic is expected to increase completed years of education by 0.54 years. Similarly, being black is expected to increase completed years of education by 0.28 years. Nevertheless, two issues need to be addressed here. First, significance is usually considered at 5% or less. As a special case, for model building, significance at 10% may also be considered, but this needs to be pointed out clearly. The second issue is more complicated. The contention is that blacks and Hispanics complete more years of college than whites. However, it is not stated that only three ethnicities are considered. The comparison may be among whites, blacks, Hispanics, Asians, or other native or mixed ethnicities. Unless such information is given, no definite conclusion may be drawn regarding the contention that blacks and Hispanics complete more years of college than whites.

Assignment Part 2

A regression is run with yrsed as the response and female, incomehi and bytest as the predictor variables. The regression equation is [equation not reproduced in this extract].

(a) To determine whether the non-inclusion of mumcoll introduces an omitted variable bias, the residual sums of squares of two models, one with mumcoll (Model 1) and the other without mumcoll (Model 2), are compared, and the resulting F-statistic is checked for significance.

RSS2 = 2623.37
RSS1 = 2566.95
df2 = 996
df1 = 995

The observed value of F is

F = [(RSS2 - RSS1)/(df2 - df1)] / (RSS1/df1) = 56.42 / 2.580 = 21.87

The F-statistic follows an F distribution with 1 and 995 degrees of freedom. The above value is significant at the 5% level of significance. Hence it may be concluded that there is indeed an omitted variable bias if mumcoll is not included in the model.
(b) The Farrar-Glauber test for multicollinearity uses a chi-square statistic, based on the natural logarithm of the determinant |X'X|, to test for the presence of multicollinearity in the data:

Χ2 = -[T - 1 - (1/6)(2p + 5)] log|X'X|

where X is the (T x p) design matrix of the regression. Here T = 1000 and p = 3. The statistic follows a chi-square distribution with (1/2)(p)(p - 1) = 3 degrees of freedom. The observed value is Χ2 = -[999 - (1/6)(6 + 5)]*(-0.05) = 49.85, whereas at the 5% level of significance the critical value is 7.81. Hence it may be concluded that there is multicollinearity among the three variables female, incomehi and bytest.

(c) Next it needs to be determined which variable is responsible for the multicollinearity. Regressing incomehi on female and bytest gives a multiple R2 of 0.033. Regressing bytest on female and incomehi gives R2 = 0.036. In the first instance the value of the test statistic is

F1 = [0.033/(1 - 0.033)][(1000 - 3)/(3 - 1)] = 17.011

and the value of the second statistic is

F2 = [0.036/(1 - 0.036)][(1000 - 3)/(3 - 1)] = 18.616

At (997, 2) df and the 5% level of significance, the critical value is 19.49. Hence neither of these two variables seems to be responsible for the multicollinearity.

(d) The White test for heteroskedasticity tests whether the error variances are constant for all observations. To do that, the squared residuals are regressed on the predictors, the squared predictors and the cross-products of the predictors. Here two predictors, female and incomehi, are binary, so these variables and their squares are identical. That leaves 7 distinct regressors: the three predictors, the square of bytest and the three cross-products. The multiple R2 of this auxiliary regression is 0.039. Hence the observed value of the test statistic is 1000*0.039 = 39. White's test statistic follows a chi-square distribution with 7 df. At the 5% level of significance the critical value is 14.07. Hence the null hypothesis of homoskedasticity is rejected.
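The Farrar-Glauber and White statistics in parts (b) to (d) can be recomputed from the summary figures quoted in the text. The sketch below assumes only those figures (T = 1000, p = 3, log|X'X| = -0.05, the two auxiliary R2 values, and the White auxiliary R2 of 0.039); the function names are hypothetical helpers, and the critical values are standard 5% table values rather than computed quantities.

```python
def farrar_glauber_chi2(n_obs, n_predictors, log_det):
    """Chi-square statistic: -[T - 1 - (2p + 5)/6] * log|X'X|."""
    scale = n_obs - 1 - (2 * n_predictors + 5) / 6
    return -scale * log_det


def farrar_glauber_f(r_squared, n_obs, n_predictors):
    """F statistic for one predictor regressed on the remaining predictors."""
    return (r_squared / (1 - r_squared)) * ((n_obs - n_predictors) / (n_predictors - 1))


def white_statistic(n_obs, aux_r_squared):
    """White's statistic: n times R2 of the auxiliary regression."""
    return n_obs * aux_r_squared


# Part (b): overall multicollinearity test, values as quoted in the text.
chi2_value = farrar_glauber_chi2(1000, 3, -0.05)
multicollinearity_detected = chi2_value > 7.81  # 5% critical value, chi-square with 3 df

# Part (c): auxiliary F statistics for incomehi and bytest.
f_incomehi = farrar_glauber_f(0.033, 1000, 3)
f_bytest = farrar_glauber_f(0.036, 1000, 3)

# Part (d): White's test for heteroskedasticity.
white_value = white_statistic(1000, 0.039)
heteroskedasticity_detected = white_value > 14.07  # 5% critical value, chi-square with 7 df

print(f"chi2 = {chi2_value:.2f}, F1 = {f_incomehi:.3f}, F2 = {f_bytest:.3f}, White = {white_value:.1f}")
```

The script only reproduces the arithmetic; running the tests on raw data would additionally require the design matrix and the residuals, which are not given here.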
Assignment Part 3

(a) Table 1: Quarterly GDP Growth Rate
Mean                          0.7705 %
Standard Deviation            0.8795 %
Autocorrelation of order 1    34.3%
Autocorrelation of order 2    27.4%
Autocorrelation of order 3    9.2%
Autocorrelation of order 4    8.4%

Autocorrelations of the quarterly GDP growth rate are unit free.

(b) The first order autoregressive model is fitted to the quarterly GDP growth rate. The regression equation is [equation not reproduced in this extract]. The regression model indicates that there is a significant negative dependence of the quarterly GDP growth rate on that of the previous quarter. The 95% confidence interval for the population AR(1) parameter is (-0.5739, -0.3231).

(c) The second order autoregressive model is fitted to the quarterly GDP growth rate. The regression equation is [equation not reproduced in this extract]. The second order autoregressive model indicates that the quarterly GDP growth rate has a significant dependence on that of the previous two quarters. Both dependences are negative. That means that, other things remaining constant, if the GDP growth rate of the previous quarter increases, the GDP growth rate of this quarter will decrease. Similarly, if the GDP growth rate two quarters earlier increases, the GDP growth rate of this quarter will decrease. The AR(2) model is preferred to the AR(1) model because the MSE of the former is less than the MSE of the latter. This indicates that the additional parameter, relating to the GDP growth rate two quarters earlier, helps explain more of the variability of the GDP growth rate.

(d) AR(1) model for subsample 1: 1974 - 1984 [equation not reproduced in this extract]
AR(1) model for subsample 2: 1985 - 2009 [equation not reproduced in this extract]

To test for a break in the AR(1) model, an F-test comparing the restricted and the unrestricted model is used. The observed value of the F-statistic is

F = {[155.968 - (71.1427 + 31.8707)] / 57} / 0.8 = 1.16

This follows an F-distribution with 57 and 195 df. The critical value at the 5% level of significance is 1.39. Hence the null hypothesis that there is no break in the AR(1) model at 1984-1985 cannot be rejected. A single model may be used for the whole data set from 1974 to 2009.
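The break-test arithmetic in part (d) can be verified in the same way. This is a sketch under stated assumptions: it takes the pooled and subsample residual sums of squares, the numerator degrees of freedom (57) and the denominator value (0.8) exactly as quoted in the text, without re-deriving them from data; `break_f_statistic` is a hypothetical helper name.

```python
def break_f_statistic(rss_pooled, rss_sub1, rss_sub2, df_numerator, denominator):
    """F statistic comparing a pooled (restricted) fit with two subsample fits."""
    # Reduction in RSS obtained by fitting the two subsamples separately.
    rss_reduction = rss_pooled - (rss_sub1 + rss_sub2)
    return (rss_reduction / df_numerator) / denominator


# Values quoted in part (d); the denominator 0.8 is used as given in the text.
f_break = break_f_statistic(155.968, 71.1427, 31.8707, 57, 0.8)

# 5% critical value of F(57, 195), as quoted in the text (not computed here).
critical_value = 1.39
break_detected = f_break > critical_value

print(f"F = {f_break:.2f}, break detected: {break_detected}")
```

Because the statistic falls below the quoted critical value, the script agrees with the text's decision not to reject the null hypothesis of no break.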