Statistical Analysis Assignment
By Student Name
Course code + name
Professor name
University name
City, State
Date of submission
Simple Linear Regression Model
The simple linear regression model is $Y = \beta_0 + \beta_1 x + \varepsilon$, where $\beta_0$ is the y-intercept, $\beta_1$ is the slope of the population regression line, and $\varepsilon$ is the random error term.
The regression parameters are estimated by the method of least squares.
Given a sample of n pairs of observations $(x_1, y_1), \ldots, (x_n, y_n)$, the line of best fit is obtained by estimating the parameters $\beta_0$, $\beta_1$ and $\sigma$ by the least-squares method (Christensen, Johnson, & Turner, 2011).
$\hat{y} = b_0 + b_1 x$
where $b_1 = \sum (x_i - \bar{x}) y_i / \sum (x_i - \bar{x})^2 = r s_y / s_x$ is the estimator of $\beta_1$, and
$b_0 = \bar{y} - b_1 \bar{x}$ is the estimator of $\beta_0$.
The residual is then the difference between the observed value and the fitted value:
$e_i = y_i - (b_0 + b_1 x_i) = y_i - \hat{y}_i$
Thus $\sigma^2$ can be estimated from the residuals $e_i$:
$s^2 = \sum (y_i - \hat{y}_i)^2 / (n - 2)$
Here s is the regression standard error, which has n - 2 degrees of freedom (Christensen, Johnson, & Turner, 2011).
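The estimators above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the assignment: the function name `fit_simple_regression` is hypothetical, and the sample values are the first six months of men's and women's shoe units from the data table.

```python
from math import sqrt

def fit_simple_regression(xs, ys):
    """Return least-squares estimates (b0, b1) and the regression standard error s."""
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # b1 = sum((x_i - x_mean) * y_i) / sum((x_i - x_mean)^2)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * y for x, y in zip(xs, ys))
    b1 = sxy / sxx
    # b0 = y_mean - b1 * x_mean
    b0 = y_mean - b1 * x_mean
    # Residuals e_i = y_i - (b0 + b1 * x_i); s^2 uses n - 2 degrees of freedom
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    return b0, b1, sqrt(s2)

# Illustrative subset only: first six months of shoe units (men = x, women = y)
xs = [507, 616, 772, 1044, 1170, 1560]
ys = [504, 526, 564, 811, 1079, 1487]
b0, b1, s = fit_simple_regression(xs, ys)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}, s = {s:.2f}")
```

Passing all rows of the data table instead of the six-month subset would give the estimates for the full sample.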
From the data given, X is taken to be the monthly units of men's shoes and Y the monthly units of women's shoes. The table lists each value with its deviation from the mean.
X | X - 2,479.33 | Y | Y - 1536
507 | -1,972.33 | 504 | -1,032
616 | -1,863.33 | 526 | -1,010
772 | -1,707.33 | 564 | -972
1044 | -1,435.33 | 811 | -725
1170 | -1,309.33 | 1079 | -457
1560 | -919.33 | 1487 | -49
1642 | -837.33 | 1509 | -27
1970 | -509.33 | 1462 | -74
2466 | -13.33 | 1888 | 352
3231 | 751.67 | 1489 | -47
2613 | 133.67 | 1372 | -164
2401 | -78.33 | 1738 | 202
2926 | 446.67 | 1232 | -304
2477 | -2.33 | 1832 | 296
1755 | -724.33 | 1738 | 202
2035 | -444.33 | 1584 | 48
1713 | -766.33 | 1361 | -175
2073 | -406.33 | 1528 | -8
2543 | 63.67 | 1552 | 16
2401 | -78.33 | 1738 | 202
2926 | 446.67 | 1536 | 0
2477 | -2.33 | 1255 | -281
1755 | -724.33 | 1741 | 205
2035 | -444.33 | 1584 | 48
1713 | -766.33 | 1361 | -175
2073 | -406.33 | 1528 | -8
2543 | 63.67 | 1552 | 16
531 | -1,948.33 | 1439 | -97
3000 | 520.67 | 1591 | 55
2892 | 412.67 | 1899 | 363
2800 | 320.67 | 1944 | 408
2979 | 499.67 | 2103 | 567
Determining STATA
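$\bar{x}$ = 2,479.33
$\bar{y}$ = 1536
$x_1$ = 2896
$y_1$ = 1412
$b_1 = \sum (x_i - \bar{x}) y_i / \sum (x_i - \bar{x})^2$
Substituting the observation $(x_1, y_1)$ into the deviation terms of the formula:
$x_1 - \bar{x}$ = 2896 - 2479.33 = 416.67
$(x_1 - \bar{x})^2$ = 416.67² = 173,613.89
The slope $b_1$ is the ratio of the two sums taken over all n observations, so these figures are a single term of each sum.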
$s^2 = \sum (y_i - \hat{y}_i)^2 / (n - 2)$
= (1412 - 1536)² / (36 - 2)
= 15,376 / 34
= 452.24
$b_0 = \bar{y} - b_1 \bar{x}$ is the estimator of $\beta_0$.
= 1536 - 452.24 × 2.479
= 415
CIs for Regression parameters
$b_1 \sim N(\beta_1, \sigma^2 / \sum (x_i - \bar{x})^2)$ (Christensen, Johnson, & Turner, 2011)
Since $\sigma$ is not given, s is used to estimate it. This leads to t-based confidence intervals for $\beta_0$ and $\beta_1$.
Regression Inference: Conditions for Inference
The sample is a simple random sample (SRS) drawn from the population.
The population relationship is linear; this is checked by assessing the linearity of a scatter diagram of the sampled data.
The standard deviation of the response about the population regression line is the same for all values of the explanatory variable.
This is checked by plotting the residuals and assessing whether their spread around the least-squares line is uniform as the value of x varies.
The response varies normally about the population regression line. This condition is checked with a normal quantile plot of the residuals.
Determining the Confidence Intervals
The level (1 - α) confidence interval for $\beta_0$ is determined as below.
$b_0 - t^* SE_{b_0}$ to $b_0 + t^* SE_{b_0}$ (Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009)
Here $t^*$ is the upper α/2 critical value of the $t_{n-2}$ distribution, and the standard error of the slope is $SE_{b_1} = s / \sqrt{\sum (x_i - \bar{x})^2}$.
436 / (2896 – 1536)2 = 0.00024
Hypothesis test on regression parameters
To test the hypothesis $H_0: \beta_1 = \alpha$, where α here denotes the hypothesized value of the slope, the test statistic is $T = (b_1 - \alpha) / SE_{b_1}$. The p-value is then determined from the $t_{n-2}$ distribution.
With α = 0, this test helps determine whether there is a linear relationship between the values of x and y.
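The slope test and its confidence interval can be sketched as follows. This is an illustrative outline under stated assumptions: the function name `slope_inference` is hypothetical, the critical value `t_crit` is supplied by the reader from a t table with n - 2 degrees of freedom rather than computed, and the sample is the first six months of shoe units from the data table.

```python
from math import sqrt

def slope_inference(xs, ys, hypothesized=0.0, t_crit=None):
    """Return (b1, SE of b1, t statistic, confidence interval or None)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    b0 = y_mean - b1 * x_mean
    # Regression standard error with n - 2 degrees of freedom
    s = sqrt(sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    se_b1 = s / sqrt(sxx)
    # T = (b1 - hypothesized value) / SE(b1), compared against t with n - 2 df
    t_stat = (b1 - hypothesized) / se_b1
    ci = None
    if t_crit is not None:
        ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, se_b1, t_stat, ci

# Illustrative subset: six observations, so n - 2 = 4 degrees of freedom.
# 2.776 is the two-sided 95% critical value of t with 4 degrees of freedom.
xs = [507, 616, 772, 1044, 1170, 1560]
ys = [504, 526, 564, 811, 1079, 1487]
b1, se_b1, t_stat, ci = slope_inference(xs, ys, hypothesized=0.0, t_crit=2.776)
print(f"b1 = {b1:.4f}, SE = {se_b1:.4f}, T = {t_stat:.2f}")
print(f"95% CI for the slope: ({ci[0]:.4f}, {ci[1]:.4f})")
```

A large absolute value of T relative to the critical value, or equivalently an interval that excludes the hypothesized slope, is evidence against the null hypothesis.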
The STATA regression analysis
The ANOVA Equations
The variation in y is accounted for by SS (regression) and SS (residual):
$SS_{total} = SS_{regression} + SS_{residual}$, that is, $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$ (Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009).
The degrees of freedom break down in the same way: n - 1 for SS total = 1 for SS regression + (n - 2) for SS residual.
Then, the mean square is determined as follows:
MS (residual) = SS (residual) / df (residual) = $\sum (y_i - \hat{y}_i)^2 / (n - 2) = s^2$ (Christensen, Johnson, & Turner, 2011)
$R^2$ = SS (regression) / SS (total) = $r^2$ (Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009).
2430 / 5632 = 0.4315
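The sum-of-squares decomposition can be checked numerically with a short sketch. The function name `anova_decomposition` and the dictionary keys are hypothetical, and the sample is again the first six months of shoe units from the data table, used only for illustration.

```python
def anova_decomposition(xs, ys):
    """Split the total variation in y into regression and residual parts."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    b0 = y_mean - b1 * x_mean
    fitted = [b0 + b1 * x for x in xs]
    ss_total = sum((y - y_mean) ** 2 for y in ys)
    ss_regression = sum((f - y_mean) ** 2 for f in fitted)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return {
        "ss_total": ss_total,                   # n - 1 degrees of freedom
        "ss_regression": ss_regression,         # 1 degree of freedom
        "ss_residual": ss_residual,             # n - 2 degrees of freedom
        "ms_residual": ss_residual / (n - 2),   # equals s^2
        "r_squared": ss_regression / ss_total,
    }

# Illustrative subset of the shoe data (men = x, women = y)
xs = [507, 616, 772, 1044, 1170, 1560]
ys = [504, 526, 564, 811, 1079, 1487]
table = anova_decomposition(xs, ys)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

Up to floating-point rounding, `ss_regression + ss_residual` equals `ss_total`, which is the identity stated above.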
Determining the coefficient of variation:
This is determined by the formula $CV = (s / \bar{x}) \times 100\%$ (Vergura, Acciani, Amoruso, Patrono, & Vacca, 2009).
The average price of men's shoes is 1432 and that of women's shoes is 1639, with standard deviations of 4 and 6 respectively. The coefficient of variation is therefore:
(4 / 1432) × 100% = 0.28%
(6 / 1639) × 100% = 0.37%
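The same calculation can be written as a small helper. The function name `coefficient_of_variation` is hypothetical; the inputs are the means and standard deviations quoted in the text.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage: (s / mean) * 100."""
    if mean == 0:
        raise ValueError("the mean must be non-zero")
    return std_dev / mean * 100.0

# Figures quoted in the text: men's shoes (s = 4, mean = 1432),
# women's shoes (s = 6, mean = 1639)
cv_men = coefficient_of_variation(4, 1432)
cv_women = coefficient_of_variation(6, 1639)
print(f"CV men's shoes: {cv_men:.2f}%")
print(f"CV women's shoes: {cv_women:.2f}%")
```

Because the CV is unit-free, it allows the relative spread of the two price series to be compared directly even though their means differ.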
Month | units_shoes_men | units_shoes_women | units_boots_men | units_boots_women
1 | 507 | 504 | 787 | 595
2 | 616 | 526 | 986 | 593
3 | 772 | 564 | 1009 | 548
4 | 1044 | 811 | 968 | 610
5 | 1170 | 1079 | 926 | 514
6 | 1560 | 1487 | 863 | 457
7 | 1642 | 1509 | 1092 | 508
8 | 1970 | 1462 | 1137 | 588
9 | 2466 | 1888 | 1141 | 672
10 | 3231 | 1489 | 996 | 611
11 | 2613 | 1372 | 448 | 664
12 | 2401 | 1738 | 454 | 632
1 | 2926 | 1536 | 446 | 661
2 | 2477 | 1255 | 470 | 674
3 | 1755 | 1741 | 507 | 661
4 | 2035 | 1584 | 461 | 733
5 | 1713 | 1361 | 562 | 765
6 | 2073 | 1528 | 462 | 884
7 | 2543 | 1552 | 313 | 1058
8 | 2531 | 1439 | 438 | 797
9 | 3000 | 1591 | 520 | 687
10 | 2892 | 1899 | 521 | 765
11 | 2800 | 1944 | 509 | 1093
12 | 2979 | 2103 | 527 | 783
1 | 4546 | 2220 | 623 | 916
2 | 4998 | 3289 | 544 | 1089
3 | 6013 | 2926 | 475 | 863
4 | 7297 | 2843 | 494 | 865
5 | 8537 | 3077 | 524 | 830
6 | 7496 | 3415 | 482 | 904
7 | 11188 | 3798 | 561 | 929
8 | 13064 | 4336 | 561 | 792
9 | 9413 | 4024 | 638 | 801
10 | 9888 | 4625 | 769 | 850
11 | 11589 | 4909 | 605 | 909
12 | 11935 | 6008 | 698 | 1167
1 | 14844 | 7484 | 972 | 1390
2 | 11397 | 8862 | 777 | 1602
3 | 10435 | 11537 | 870 | 1724
4 | 12113 | 15857 | 1043 | 1525
5 | 15301 | 15334 | 960 | 1279
6 | 10337 | 12985 | 921 | 1537
7 | 11468 | 14191 | 1104 | 1717
8 | 11202 | 19645 | 1183 | 1752
9 | 12915 | 25107 | 1225 | 1503
10 | 14619 | 30250 | 1183 | 1766
11 | 19488 | 30760 | 1339 | 1754
12 | 18451 | 26452 | 1264 | 2487
From the above data, the values of a and b can be determined as follows.
Let the shoe units for men be x / 10,000 and those for women be y / 10,000.
$S_X = \sum x_i$ = 1546.25
$S_Y = \sum y_i$ = 100
$S_{XX} = \sum x_i^2$ = 0.4568
$S_{XY} = \sum x_i y_i$ = 1356
$S_{YY} = \sum y_i^2$ = 23.6
$b = (n S_{XY} - S_X S_Y) / (n S_{XX} - S_X^2)$ = -912 / 814.2 = -1.12
$a = (1/n) S_Y - b (1/n) S_X$ = 423.35 + 440.28 = 1.04
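The sums-based formulas for a and b can be sketched as below. The function name `line_from_sums` is hypothetical, and the four observations are the first four months of shoe units from the data table, scaled by 10,000 as described in the text. This is an illustration of the formulas, not a reproduction of the figures quoted above.

```python
def line_from_sums(xs, ys):
    """Return intercept a and slope b computed from the raw sums."""
    n = len(xs)
    s_x = sum(xs)
    s_y = sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * s_xx - s_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    # b = (n * S_XY - S_X * S_Y) / (n * S_XX - S_X^2)
    b = (n * s_xy - s_x * s_y) / denominator
    # a = (1/n) * S_Y - b * (1/n) * S_X
    a = s_y / n - b * s_x / n
    return a, b

# Illustrative subset: first four months of shoe units, scaled by 10,000
men = [507, 616, 772, 1044]
women = [504, 526, 564, 811]
xs = [m / 10_000 for m in men]
ys = [w / 10_000 for w in women]
a, b = line_from_sums(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Scaling both variables by the same constant leaves the slope b unchanged and scales the intercept a by that constant, so the sign of b is the same on either scale.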
There is a negative correlation between men's and women's shoe units. It can be inferred that supply is based on what will earn the company more profit: when profits from men's shoes are high, more of those shoes are supplied, and vice versa.
Determining the a and b values for boots
Let the men's boots be y / 10,000 and the women's boots be x / 10,000.
Then, the values of a and b are:
$S_X = \sum x_i$ = 2.35
$S_Y = \sum y_i$ = 11
$S_{XX} = \sum x_i^2$ = 0.3324
$S_{XY} = \sum x_i y_i$ = 1.256
$S_{YY} = \sum y_i^2$ = 2.26
$b = (n S_{XY} - S_X S_Y) / (n S_{XX} - S_X^2)$ = 6.123 / 3.25 = 1.884
$a = (1/n) S_Y - b (1/n) S_X$ = 1.092 + 1.075 = 1.1842
Interpretation:
There is a positive correlation between men's and women's boots. It can be inferred that demand for these boots is comparable among both genders.
Next, the relationship between the sales of boots and shoes for men and women is examined by applying the same regression approach to their respective profits.
profit men shoes | profit men shoes | profit men shoes | profit men shoes | profit total
19995 | 14964 | 44955 | 29963 | 109877
20969 | 14002 | 41633 | 27068 | 103672
18451 | 15686 | 47504 | 27966 | 109607
20227 | 16834 | 49775 | 30904 | 117740
20481 | 22160 | 48557 | 31891 | 123089
23587 | 25822 | 57093 | 35687 | 142189
26609 | 31486 | 50061 | 33835 | 141991
25336 | 32922 | 53382 | 31759 | 143399
26091 | 31678 | 49617 | 34788 | 142174
28615 | 37366 | 49751 | 41924 | 157656
34836 | 35993 | 54818 | 39323 | 164970
33119 | 44500 | 58632 | 38088 | 174339
30503 | 51613 | 63508 | 38974 | 184598
30545 | 53334 | 70482 | 33915 | 188276
34377 | 60041 | 63799 | 35270 | 193487
38102 | 69605 | 67775 | 41524 | 217006
38790 | 73414 | 64623 | 39755 | 216582
39412 | 81366 | 58589 | 41400 | 220767
44565 | 84489 | 70948 | 40494 | 240496
43953 | 71017 | 69854 | 34187 | 219011
36258 | 78082 | 70351 | 37226 | 221917
34488 | 74991 | 69574 | 39088 | 218141
38695 | 81468 | 73700 | 39550 | 233413
36674 | 83313 | 78179 | 41634 | 239800
46470 | 89611 | 81847 | 34375 | 252303
55381 | 92840 | 88217 | 39464 | 275902
57077 | 105614 | 97913 | 45078 | 305682
75348 | 111113 | 96703 | 44086 | 327250
79161 | 110419 | 103441 | 35812 | 328833
72465 | 114462 | 134325 | 37945 | 359197
71405 | 124177 | 119840 | 34678 | 350100
69667 | 133971 | 121139 | 36682 | 361459
76015 | 145309 | 114084 | 41665 | 377073
76657 | 169917 | 86705 | 44419 | 377698
76761 | 216237 | 94352 | 47118 | 434468
78262 | 204811 | 83776 | 55517 | 422366
84948 | 232845 | 91160 | 59420 | 468373
77444 | 245577 | 87908 | 73490 | 484419
86026 | 292020 | 93839 | 86645 | 558530
90259 | 357343 | 96703 | 84303 | 628608
84308 | 415097 | 103344 | 76879 | 679628
84888 | 465811 | 115278 | 80723 | 746700
97675 | 485313 | 139347 | 83625 | 805960
122871 | 494634 | 142124 | 78114 | 837743
111948 | 540289 | 160742 | 83924 | 896903
128722 | 571556 | 152216 | 92881 | 945375
117628 | 643643 | 150018 | 91380 | 1002669
134146 | 614784 | 144561 | 107147 | 1000638
From the above data, the values of a and b can be determined as follows.
Let the shoe units for men be x / 10,000 and those for women be y / 10,000.
$S_X = \sum x_i$ = 198
$S_Y = \sum y_i$ = 112
$S_{XX} = \sum x_i^2$ = 0.5768
$S_{XY} = \sum x_i y_i$ = 1467
$S_{YY} = \sum y_i^2$ = 2356
$b = (n S_{XY} - S_X S_Y) / (n S_{XX} - S_X^2)$ = 3876 / 56.9 ≈ 68.12
$a = (1/n) S_Y - b (1/n) S_X$ = 27 + 102.5 = 129.5
Determining the a and b values for boots
Let the men boots be y / 10000 and that of women be x / 10000
Then, the values of a and b are;
S X = ∑ x i = S Y = ∑ y i = 11 S X = ∑ x I 2 = S X Y = ∑ x i y i = 1.7645 S Y 1.5344 = ∑ y 1 2 = 3.9476
$b = (n S_{XY} - S_X S_Y) / (n S_{XX} - S_X^2)$ = 9.56 / 4.85 = 1.4767
$a = (1/n) S_Y - b (1/n) S_X$ = 0.65 + 1.23 = 1.88
Interpretation:
There is a positive correlation between the two values, which can be attributed to increased turnover for both men's and women's products. That is, shoes and boots for both genders are increasingly sought after in the market.
References List
Barnett, V. 1999. Comparative statistical inference. Chichester, Wiley. http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=17849
Christensen, L. B., Johnson, B., & Turner, L. A. 2011. Research methods, design, and analysis. Allyn & Bacon.
Krzanowski, W. J. 2000. Principles of multivariate analysis. Oxford, Oxford University Press.
Oakes, M. W. 1990. Statistical inference. Chestnut Hill, MA, Epidemiology Resources Inc.
Silvey, S. D. 1975. Statistical inference. London, Chapman and Hall.
Trochim, W. M., & Donnelly, J. P. 2008. Research methods knowledge base. Mason, OH, Atomic Dog/Cengage Learning.
Vergura, S., Acciani, G., Amoruso, V., Patrono, G. E., & Vacca, F. 2009. Descriptive and inferential statistics for supervising and monitoring the operation of PV plants. Industrial Electronics, IEEE Transactions on.
Weiss, N. A., & Weiss, C. A. 2012. Introductory statistics. Pearson Education.