
Statistical Analysis - Simple Linear Regression Model - Assignment Example

Summary
The paper "Statistical Analysis - Simple Linear Regression Model" is a good example of an assignment on statistics. The model is $Y = \beta_0 + \beta_1 x + e$, where $\beta_0$ is the y-intercept and $\beta_1$ is the slope of the population regression line. In the first case, the regression parameters are estimated by least squares…

Extract of sample "Statistical Analysis - Simple Linear Regression Model"

Statistical Analysis Assignment
By Student Name
Course code + name
Professor name
University name
City, State
Date of submission

Simple Linear Regression Model

The simple linear regression model is $Y = \beta_0 + \beta_1 x + e$, where $\beta_0$ is the y-intercept and $\beta_1$ is the slope of the population regression line. The regression parameters are estimated by least squares. Given a sample of n pairs of observations $(x_1, y_1), \dots, (x_n, y_n)$, the line of best fit is found by estimating the parameters $\beta_0$, $\beta_1$ and $\sigma$ with the least-squares method (Christensen, Johnson, & Turner, 2011). The fitted line is

$$\hat{y} = b_0 + b_1 x,$$

where the slope estimate is

$$b_1 = \frac{\sum (x_i - \bar{x})\, y_i}{\sum (x_i - \bar{x})^2} = r\,\frac{s_y}{s_x},$$

which estimates $\beta_1$, and $b_0 = \bar{y} - b_1 \bar{x}$ is the estimator of $\beta_0$.

The residual is then computed as the difference between the observed value and the fitted value:

$$e_i = y_i - (b_0 + b_1 x_i) = y_i - \hat{y}_i.$$

The variance $\sigma^2$ is estimated from the residuals $e_i$ by

$$s^2 = \frac{\sum e_i^2}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2},$$

where s is the regression standard error.
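The least-squares estimates described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the assignment: the function name `fit_simple_linear_regression` and the five-point data set are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt


def fit_simple_linear_regression(x, y):
    """Return (b0, b1, residuals, s) for the least-squares line y = b0 + b1*x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of x about its mean (denominator of b1).
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # b1 = sum((x_i - x_mean) * y_i) / sum((x_i - x_mean)^2)
    b1 = sum((xi - x_mean) * yi for xi, yi in zip(x, y)) / sxx
    # b0 = y_mean - b1 * x_mean
    b0 = y_mean - b1 * x_mean
    # Residual e_i = observed y_i minus fitted value b0 + b1 * x_i.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Regression standard error s, with n - 2 degrees of freedom.
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return b0, b1, residuals, s


# Hypothetical toy data, not taken from the assignment.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1, residuals, s = fit_simple_linear_regression(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, s = {s:.4f}")
```

For this toy sample the slope works out to 0.6 and the intercept to 2.2, and the residuals sum to zero, as they always do for a least-squares line with an intercept.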
The regression standard error s has $n - 2$ degrees of freedom (Christensen, Johnson, & Turner, 2011).

From the data given, X is taken to be the units of men's shoes and Y the units of women's shoes. The deviations of each value from the means are:

X       X - 2,479.33    Y       Y - 1536
507     -1,972.33       504     -1,032
616     -1,863.33       526     -1,010
772     -1,707.33       564     -972
1044    -1,435.33       811     -725
1170    -1,309.33       1079    -457
1560    -919.33         1487    -49
1642    -837.33         1509    -27
1970    -509.33         1462    -74
2466    -13.33          1888    352
3231    751.67          1489    -47
2613    133.67          1372    -164
2401    -78.33          1738    202
2926    446.67          1232    -304
2477    -2.33           1832    296
1755    -724.33         1738    202
2035    -444.33         1584    48
1713    -766.33         1361    -175
2073    -406.33         1528    -8
2543    63.67           1552    16
2401    -78.33          1738    202
2926    446.67          1536    0
2477    -2.33           1255    -281
1755    -724.33         1741    205
2035    -444.33         1584    48
1713    -766.33         1361    -175
2073    -406.33         1528    -8
2543    63.67           1552    16
531     -1,948.33       1439    -97
3000    520.67          1591    55
2892    412.67          1899    363
2800    320.67          1944    408
2979    499.67          2103    567

Determining STATA

$\bar{x} = 2{,}479.33$, $\bar{y} = 1536$, $X_1 = 2896$, $Y_1 = 1412$.

The slope is given by

$$b_1 = \frac{\sum (x_i - \bar{x})\, y_i}{\sum (x_i - \bar{x})^2}.$$

Substituting the figures into the formula: $(2896 - 2479.33)^2 = 416.67^2 = 173{,}613.88$ and $173{,}613.88 - 416.67 = 173{,}197$, so that $b_1 = 173{,}197$.

$$s^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2} = \frac{(1412 - 1536)^2}{36 - 2} = \frac{15{,}376}{34} = 452.24$$

$b_0 = \bar{y} - b_1 \bar{x}$ is the estimator of $\beta_0$: $b_0 = 1536 - 452.24 \times 2.479 = 415$.

CIs for Regression parameters

$$b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2}{\sum (x_i - \bar{x})^2}\right)$$

(Christensen, Johnson, & Turner, 2011). Since $\sigma$ is not given, s is used to estimate it. This leads to t confidence intervals for $\beta_0$ and $\beta_1$.

Regression Inference: Conditions for Inference

1. The sample is a simple random sample (SRS) drawn from the population.
2. The relationship in the population is linear. This is checked by assessing the linearity of a scatter diagram of the sampled data.
3. The standard deviation of the response about the population regression line is the same for all values of the explanatory variable. This is checked by plotting the residuals and judging whether their spread around the least-squares line stays uniform as x varies.
4. The response varies normally about the population regression line. This is checked with a normal quantile plot of the residuals.

Determining the Confidence Intervals

The level $(1 - \alpha)$ confidence interval for $\beta_0$ is

$$b_0 - t^{*}\, SE(b_0) \le \beta_0 \le b_0 + t^{*}\, SE(b_0)$$

(Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009), where $t^{*}$ is the upper $\alpha / 2$ critical value of the $t_{n-2}$ distribution. The standard error of the slope is

$$SE(b_1) = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}.$$

With the figures used here, $436 / (2896 - 1536)^2 = 0.00024$.

Hypothesis test on regression parameters

To test the hypothesis $H_0: \beta_1 = a$ for a hypothesized slope value a, the test statistic is

$$T = \frac{b_1 - a}{SE(b_1)},$$

and the p-value is determined from the $t_{n-2}$ distribution. With $a = 0$, the test helps determine whether there is a relationship between the values of x and y.
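The confidence intervals and the slope test above can be illustrated with a short Python sketch. It is written under stated assumptions: the function name `regression_inference`, the toy data, and the hypothesized slope of zero are all hypothetical. The critical value is passed in rather than computed, because the Python standard library has no t distribution; 3.182 is the tabulated upper 2.5% point of the t distribution with 3 degrees of freedom. The standard error of the intercept uses the textbook formula $SE(b_0) = s\sqrt{1/n + \bar{x}^2 / \sum (x_i - \bar{x})^2}$, which the text does not spell out.

```python
from math import sqrt


def regression_inference(x, y, t_crit, slope_null=0.0):
    """Confidence intervals for b0 and b1 and the t statistic for H0: beta1 = slope_null.

    t_crit is the upper alpha/2 critical value of the t distribution with
    n - 2 degrees of freedom, looked up by the caller from a t table.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    b0 = y_mean - b1 * x_mean
    # Residual sum of squares and regression standard error (n - 2 df).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))
    # Standard errors of the slope and the intercept.
    se_b1 = s / sqrt(sxx)
    se_b0 = s * sqrt(1.0 / n + x_mean ** 2 / sxx)
    return {
        "b0": b0,
        "b1": b1,
        "se_b0": se_b0,
        "se_b1": se_b1,
        "ci_b0": (b0 - t_crit * se_b0, b0 + t_crit * se_b0),
        "ci_b1": (b1 - t_crit * se_b1, b1 + t_crit * se_b1),
        # T = (b1 - hypothesized slope) / SE(b1), compared against t with n - 2 df.
        "t_stat": (b1 - slope_null) / se_b1,
    }


# Hypothetical toy data: n = 5, so n - 2 = 3 degrees of freedom.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
result = regression_inference(x, y, t_crit=3.182)
print(result["ci_b1"], result["t_stat"])
```

In this toy sample the t statistic (about 2.12) is smaller than the critical value 3.182, and the 95% interval for the slope contains zero, so the two views of the same test agree.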
The STATA regression analysis

The ANOVA Equations

The variation in y is accounted for by SS(regression) and SS(residual):

$$SS_{\text{total}} = SS_{\text{regression}} + SS_{\text{residual}}, \qquad \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

(Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009). The degrees of freedom break down in the same way:

$$n - 1 = 1 + (n - 2).$$

The residual mean square is then

$$MS_{\text{residual}} = \frac{SS_{\text{residual}}}{df_{\text{residual}}} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2} = s^2$$

(Christensen, Johnson, & Turner, 2011), and the coefficient of determination is

$$R^2 = \frac{SS_{\text{regression}}}{SS_{\text{total}}} = r^2$$

(Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009). Here, $2430 / 5632 = 0.4314$.

Determination of the coefficient of variation

The coefficient of variation is given by $CV = (S / \bar{X}) \times 100\%$ (Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009). The average price is 1432 for men's shoes and 1639 for women's shoes, with standard deviations of 4 and 6 respectively, so the coefficients of variation are:

$(4 / 1432) \times 100\% = 0.28\%$
$(6 / 1639) \times 100\% = 0.37\%$

Month   units_shoes_men   units_shoes_women   units_boots_men   units_boots_women
1       507               504                 787               595
2       616               526                 986               593
3       772               564                 1009              548
4       1044              811                 968               610
5       1170              1079                926               514
6       1560              1487                863               457
7       1642              1509                1092              508
8       1970              1462                1137              588
9       2466              1888                1141              672
10      3231              1489                996               611
11      2613              1372                448               664
12      2401              1738                454               632
1       2926              1536                446               661
2       2477              1255                470               674
3       1755              1741                507               661
4       2035              1584                461               733
5       1713              1361                562               765
6       2073              1528                462               884
7       2543              1552                313               1058
8       2531              1439                438               797
9       3000              1591                520               687
10      2892              1899                521               765
11      2800              1944                509               1093
12      2979              2103                527               783
1       4546              2220                623               916
2       4998              3289                544               1089
3       6013              2926                475               863
4       7297              2843                494               865
5       8537              3077                524               830
6       7496              3415                482               904
7       11188             3798                561               929
8       13064             4336                561               792
9       9413              4024                638               801
10      9888              4625                769               850
11      11589             4909                605               909
12      11935             6008                698               1167
1       14844             7484                972               1390
2       11397             8862                777               1602
3       10435             11537               870               1724
4       12113             15857               1043              1525
5       15301             15334               960               1279
6       10337             12985               921               1537
7       11468             14191               1104              1717
8       11202             19645               1183              1752
9       12915             25107               1225              1503
10      14619             30250               1183              1766
11      19488             30760               1339              1754
12      18451             26452               1264              2487

From the above data, the regression coefficients a and b can be determined as follows. Let the men's shoe units be x / 10,000 and the women's shoe units be y / 10,000.

$S_X = \sum x_i = 1546.25$
$S_Y = \sum y_i = 100$
$S_{XX} = \sum x_i^2 = 0.4568$
$S_{XY} = \sum x_i y_i = 1356$
$S_{YY} = \sum y_i^2 = 23.6$

$$b = \frac{n S_{XY} - S_X S_Y}{n S_{XX} - S_X^2} = \frac{-912}{814.2} = -1.12$$

$a = \frac{1}{n} S_Y - b\,\frac{1}{n} S_X$ = 423.35 + 440.28 = 1.04

The negative slope indicates a negative correlation between men's and women's shoes. It can be inferred that supply follows whichever line earns the company more profit: when the profits from men's shoes are high, the supply of those shoes is high, and vice versa.

Determining the a and b values for boots

Let the men's boot units be y / 10,000 and the women's boot units be x / 10,000. Then the values of a and b follow from:

$S_X = \sum x_i = 2.35$
$S_Y = \sum y_i = 11$
$S_{XX} = \sum x_i^2 = 0.3324$
$S_{XY} = \sum x_i y_i = 1.256$
$S_{YY} = \sum y_i^2 = 2.26$

$$b = \frac{n S_{XY} - S_X S_Y}{n S_{XX} - S_X^2} = \frac{6.123}{3.25} = 1.884$$

$a = \frac{1}{n} S_Y - b\,\frac{1}{n} S_X$ = 1.092 + 1.075 = 1.1842

Interpretation: There is a positive correlation between men's and women's boots. It can be inferred that demand for these boots is similar among both genders.

The same regression can then be applied to the profit figures to establish the relationship between the sales of boots and the sales of shoes for men and women.
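The summation formulas for a and b, the ANOVA decomposition, and the coefficient of variation can be checked with a small Python sketch. The function names and the five-point data set are hypothetical and only illustrate the mechanics; the coefficient of variation call uses the mean of 1432 and standard deviation of 4 quoted in the text.

```python
def coefficients_from_sums(x, y):
    """Slope b and intercept a from the sums S_X, S_Y, S_XX and S_XY."""
    n = len(x)
    s_x = sum(x)
    s_y = sum(y)
    s_xx = sum(xi * xi for xi in x)
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    # b = (n*S_XY - S_X*S_Y) / (n*S_XX - S_X^2)
    b = (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)
    # a = (1/n)*S_Y - b*(1/n)*S_X
    a = s_y / n - b * s_x / n
    return a, b


def anova_decomposition(x, y, a, b):
    """Return (SS_total, SS_regression, SS_residual, R^2) for the line y = a + b*x."""
    y_mean = sum(y) / len(y)
    fitted = [a + b * xi for xi in x]
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    ss_regression = sum((fi - y_mean) ** 2 for fi in fitted)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_total, ss_regression, ss_residual, ss_regression / ss_total


def coefficient_of_variation(std_dev, mean):
    """CV = (S / mean) * 100, expressed in percent."""
    return std_dev / mean * 100.0


# Hypothetical toy data, not taken from the assignment.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = coefficients_from_sums(x, y)
ss_total, ss_regression, ss_residual, r_squared = anova_decomposition(x, y, a, b)
cv_men = coefficient_of_variation(4, 1432)
print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r_squared:.3f}, CV = {cv_men:.2f}%")
```

For the toy data the sums are $S_X = 15$, $S_Y = 20$, $S_{XX} = 55$ and $S_{XY} = 66$, so $b = 30 / 50 = 0.6$ and $a = 2.2$. The total sum of squares of 6 splits into 3.6 for the regression and 2.4 for the residuals, giving $R^2 = 0.6$.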
profit_shoes_men   profit_shoes_women   profit_boots_men   profit_boots_women   profit_total
19995              14964                44955              29963                109877
20969              14002                41633              27068                103672
18451              15686                47504              27966                109607
20227              16834                49775              30904                117740
20481              22160                48557              31891                123089
23587              25822                57093              35687                142189
26609              31486                50061              33835                141991
25336              32922                53382              31759                143399
26091              31678                49617              34788                142174
28615              37366                49751              41924                157656
34836              35993                54818              39323                164970
33119              44500                58632              38088                174339
30503              51613                63508              38974                184598
30545              53334                70482              33915                188276
34377              60041                63799              35270                193487
38102              69605                67775              41524                217006
38790              73414                64623              39755                216582
39412              81366                58589              41400                220767
44565              84489                70948              40494                240496
43953              71017                69854              34187                219011
36258              78082                70351              37226                221917
34488              74991                69574              39088                218141
38695              81468                73700              39550                233413
36674              83313                78179              41634                239800
46470              89611                81847              34375                252303
55381              92840                88217              39464                275902
57077              105614               97913              45078                305682
75348              111113               96703              44086                327250
79161              110419               103441             35812                328833
72465              114462               134325             37945                359197
71405              124177               119840             34678                350100
69667              133971               121139             36682                361459
76015              145309               114084             41665                377073
76657              169917               86705              44419                377698
76761              216237               94352              47118                434468
78262              204811               83776              55517                422366
84948              232845               91160              59420                468373
77444              245577               87908              73490                484419
86026              292020               93839              86645                558530
90259              357343               96703              84303                628608
84308              415097               103344             76879                679628
84888              465811               115278             80723                746700
97675              485313               139347             83625                805960
122871             494634               142124             78114                837743
111948             540289               160742             83924                896903
128722             571556               152216             92881                945375
117628             643643               150018             91380                1002669
134146             614784               144561             107147               1000638

From the above data, the regression coefficients a and b can be determined as follows. Let the profit from men's shoes be x / 10,000 and the profit from women's shoes be y / 10,000.
$S_X = \sum x_i = 198$
$S_Y = \sum y_i = 112$
$S_{XX} = \sum x_i^2 = 0.5768$
$S_{XY} = \sum x_i y_i = 1467$
$S_{YY} = \sum y_i^2 = 2356$

$$b = \frac{n S_{XY} - S_X S_Y}{n S_{XX} - S_X^2} = \frac{3876}{56.9} = 68.12$$

$a = \frac{1}{n} S_Y - b\,\frac{1}{n} S_X$ = 27 + 102.5 = 129.5

Determining the a and b values for boots

Let the men's boots be y / 10,000 and the women's boots be x / 10,000. Then the values of a and b follow from:

$S_X = \sum x_i =$ [value missing]
$S_Y = \sum y_i = 11$
$S_{XX} = \sum x_i^2 =$ [value missing]
$S_{XY} = \sum x_i y_i = 1.7645$
$S_{YY} = \sum y_i^2 = 3.9476$ [1.5344]

$b = \frac{n S_{XY} - S_X S_Y}{n S_{XX} - S_X^2}$ = 9.56 / 4.85 = 1.4767

$a = \frac{1}{n} S_Y - b\,\frac{1}{n} S_X$ = 65 + 1.23 = 1.88

Interpretation Analysis: There is a positive correlation between the two values. This can be attributed to the increased turnover of products for both men and women; that is, shoes and boots for both genders are increasingly sought after in the market.

References List

Barnett, V. 1999. Comparative statistical inference. Chichester: Wiley. http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=17849
Christensen, L. B., Johnson, B., & Turner, L. A. 2011. Research methods, design, and analysis. Allyn & Bacon.
Krzanowski, W. J. 2000. Principles of multivariate analysis. Oxford: Oxford University Press.
Oakes, M. W. 1990. Statistical inference. Chestnut Hill, MA: Epidemiology Resources Inc.
Silvey, S. D. 1975. Statistical inference. London: Chapman and Hall.
Trochim, W. M., & Donnelly, J. P. 2008. Research methods knowledge base. Mason, OH: Atomic Dog/Cengage Learning.
Vergura, S., Acciani, G., Amoruso, V., Patrono, G. E., & Vacca, F. 2009. Descriptive and inferential statistics for supervising and monitoring the operation of PV plants. Industrial Electronics, IEEE Transactions on.
Weiss, N. A., & Weiss, C. A. 2012. Introductory statistics. Pearson Education.
