Bivariate Data Analysis of Sprint Cup Drivers - Essay Example

Summary
The essay "Bivariate Data Analysis of Sprint Cup Drivers" focuses on the critical analysis of the major issues in the Bivariate data analysis of the average time (pit stop time) and the number of wins for the top 30 Sprint Cup Drivers (the Nextel Cup Series) for 2008 (whole season)…

Extract of sample "Bivariate Data Analysis of Sprint Cup Drivers"

Assignment (Teresa Phares)
Math215

Table of Contents
Introduction
Data Collection
Statistical Analysis
  Independent and Dependent Variables
  Scatter Plot of Average Pit Stop Time and Number of Wins
  The Coefficient of Correlation
  The Equation of the Regression Line
  The Scatter Plot and Regression Trend Line
  Prediction Using Regression Model
  Residuals (Largest)
Conclusion
References
Appendix

Introduction

Bivariate data analysis examines the relationship between two variables. Visual displays, correlation analysis and regression analysis are used for analyzing bivariate data. A visual display such as a scatter plot gives a visual indication of the strength of the relationship or association between the two variables. The correlation coefficient measures the degree of linearity in the relationship between two variables. A linear regression fits the paired data of two variables to provide a model for prediction. In this paper, the average pit stop time and the number of wins for the top 30 Sprint Cup drivers (the Nextel Cup Series) for the whole 2008 season are used for bivariate correlation and regression analysis.

Data Collection

For this paper, the average pit stop time and the number of wins for the top 30 Sprint Cup drivers (the Nextel Cup Series) for the whole 2008 season were collected (Race Results, 2008; Sprint Cup Drivers, 2008). Table 1 (Driver's Number of Win(s) and the Pit Stop Time) in the Appendix shows the data for the top 30 drivers for the Nextel Cup Series, 2008.

Statistical Analysis

Independent and Dependent Variables

The bivariate correlation and regression analysis determines whether the average pit stop time is related to the number of wins. Therefore, the average time is taken as the independent variable (x) and the number of wins as the dependent variable (y).

Scatter Plot of Average Pit Stop Time and Number of Wins

Figure 1: Scatter Plot of Average Pit Stop Time vs. Number of Wins

Figure 1 shows the scatter plot of average pit stop time and number of wins. From Figure 1, it can be seen that as the average pit stop time increases, the number of wins decreases. Therefore, a negative relationship exists between average time and number of wins.

The Coefficient of Correlation

The correlation coefficient, r, is given by

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$$

where

$$SS_{xy} = \sum (x - \bar{x})(y - \bar{y}), \qquad SS_{xx} = \sum (x - \bar{x})^2, \qquad SS_{yy} = \sum (y - \bar{y})^2$$

and $\bar{x}$ and $\bar{y}$ are the sample means (Doane & Seward, 2007). Putting the values of $SS_{xy}$, $SS_{xx}$ and $SS_{yy}$ from Table 2 (Sum of Squares) in the Appendix, the correlation coefficient is

$$r = \frac{-31.020}{\sqrt{(17.208)(184.800)}} = -0.550$$

The sample correlation coefficient, r = -0.550, indicates a negative relationship between average pit stop time and number of wins for the top 30 Sprint Cup drivers. The correlation is significant at the α = 0.01 level of significance. For a two-tailed test at α = 0.01 with 28 degrees of freedom, the critical value of r is ±0.463. The correlation coefficient r = -0.550 is less than the left-tail critical value of -0.463; therefore, the null hypothesis of no correlation is rejected, and the data provide sufficient evidence of correlation between average pit stop time and number of wins (Table 3: Correlation Matrix).

The Equation of the Regression Line

The regression equation is given by

$$\hat{y} = b_0 + b_1 x$$

where the slope is $b_1 = SS_{xy} / SS_{xx}$ and the intercept is $b_0 = \bar{y} - b_1 \bar{x}$ (Doane & Seward, 2007). Putting the values of $SS_{xy}$, $SS_{xx}$, $\bar{x}$ and $\bar{y}$ from Table 2 (Sum of Squares) in the Appendix, the slope and intercept of the linear regression equation are

$$b_1 = \frac{-31.020}{17.208} = -1.803$$

$$b_0 = 1.200 - (-1.80265)(14.120) = 26.653$$

Therefore, the regression equation is

$$\hat{y} = 26.653 - 1.803x$$

or

Number of Wins = 26.653 - 1.803*(Average Time)

The slope of -1.803 suggests that each additional second of average pit stop time decreases the number of wins of a Sprint Cup driver by approximately 1.8. The predicted number of wins at zero average pit stop time is approximately 26.7. However, the intercept is not meaningful because the average pit stop time can never be zero.

The regression is significant at the α = 0.01 level of significance.
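The figures quoted in this section can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original paper, that recomputes r, the slope and the intercept from the totals in Table 2. The variable names are illustrative, and the t statistic formula t = r*sqrt((n - 2)/(1 - r^2)) is the standard test for zero correlation, assumed here as an alternative to the critical-value comparison used in the text.

```python
import math

# Totals copied from Table 2 (Sum of Squares) in the Appendix.
n = 30
x_bar, y_bar = 14.120, 1.200
ss_xy, ss_xx, ss_yy = -31.020, 17.208, 184.800

# Sample correlation coefficient: r = SSxy / sqrt(SSxx * SSyy).
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# Least-squares slope and intercept of the regression line.
slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

# Assumed standard t test for H0: no correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

print(f"r = {r:.3f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"t = {t_stat:.3f} on {n - 2} degrees of freedom")
```

The t statistic obtained this way should agree with the t Stat reported for Average Time in Table 4, since in simple regression the test of the slope and the test of the correlation are equivalent.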
The high F statistic (12.148) for the overall regression indicates that the regression is significant at the α = 0.01 level of significance. This is confirmed by the p-value (0.002). The p-values for the slope and intercept are 0.002 and 0.001, respectively; therefore, both the slope and the intercept are significant at α = 0.01 (Table 4: Regression Summary Output).

The coefficient of determination is $R^2 = 0.303$. Therefore, the average pit stop time explains 30.3 percent of the variation in the number of wins for Sprint Cup drivers, while the remaining 69.7 percent of the variation is not explained by average pit stop time.

The Scatter Plot and Regression Trend Line

Figure 2: Scatter Plot of Average Time vs. Number of Wins and Linear Trend Line

Figure 2 shows the scatter plot of average time vs. number of wins with the linear trend line. From Figure 2, it can be seen that the trend line approximately fits the data points.

Prediction Using Regression Model

Using the regression model, the predicted number of wins for average pit stop times of 12, 13, 14 and 15 seconds is:

For average pit stop time = 12 seconds: Number of Wins = 26.653 - 1.803*(12) = 5.017 ≈ 5
For average pit stop time = 13 seconds: Number of Wins = 26.653 - 1.803*(13) = 3.214 ≈ 3
For average pit stop time = 14 seconds: Number of Wins = 26.653 - 1.803*(14) = 1.411 ≈ 1
For average pit stop time = 15 seconds: Number of Wins = 26.653 - 1.803*(15) = -0.392 ≈ 0

From the above analysis, it can be seen that for average pit stop times of 12 seconds (less than 13) and 13 seconds, the predicted number of wins is 5 and 3, respectively, which is consistent with the scatter plot (Figure 2). For an average pit stop time of 14 seconds, the predicted number of wins is 1, which is true for some drivers. For an average pit stop time of 15 seconds, the predicted number of wins is zero, which is also consistent with the scatter plot (Figure 2). The standard error of the regression equation is ±2.145.
Therefore, the regression model can be used for reliably predicting the number of wins for Sprint Cup drivers from average pit stop time to within an error of about ±2 wins.

Residuals (Largest)

Table 5 in the Appendix shows the residuals of number of wins under the regression model. The points with the largest residuals are (13.1, 9), (13.2, 9) and (13.0, 7), with residuals of approximately 6, 6 and 4, respectively. The unexplained variation in number of wins is the sum of squared residuals (the error sum of squares). These large residuals contribute more to the sum of squared residuals than the other residuals do, which increases the unexplained variation in number of wins.

Conclusion

In conclusion, the number of wins and the average pit stop time for Sprint Cup drivers are related. Further, the number of wins over a whole season can be approximately predicted from a Sprint Cup driver's average pit stop time.

References

Doane, D. P., & Seward, L. E. (2007). Applied Statistics in Business and Economics. New York: McGraw-Hill/Irwin.
Race Results. Retrieved December 4, 2008, from http://www.nascar.com/races/cup/2008/rr_index.html
Sprint Cup Drivers. Retrieved December 4, 2008, from http://store.nascar.com/sm-nextel-cup-drivers--ci-1736852_cp-2056638.html

Appendix
Table 1: Driver's Number of Win(s) and the Pit Stop Time

Driver                Wins in 2008   Pit Stop Time (s)
Greg Biffle           2              13.2
Clint Bowyer          1              13.5
Jeff Burton           2              13.1
Kurt Busch            0              14.0
Kyle Busch            9              13.1
Dale Earnhardt, Jr.   1              13.6
Carl Edwards          9              13.2
Bill Elliott          0              13.9
Jeff Gordon           0              13.5
Denny Hamlin          1              14.5
Kevin Harvick         0              14.8
Dale Jarrett          0              13.9
Jimmie Johnson        7              13.0
Kasey Kahne           2              13.8
Matt Kenseth          0              14.9
Travis Kvapil         0              15.2
Bobby Labonte         0              13.8
Mark Martin           0              15.1
Jeremy Mayfield       0              13.7
Jamie McMurray        0              15.2
Casey Mears           0              15.6
Paul Menard           0              14.9
Juan Pablo Montoya    0              14.1
Joe Nemechek          0              15.1
Ryan Newman           1              13.9
Kyle Petty            0              14.8
David Ragan           0              15.1
David Reutimann       0              13.2
Tony Stewart          1              13.9
Martin Truex, Jr.     0              14.0

Table 2: Sum of Squares

Average Time (x)   Number of Wins (y)   x - x̄     y - ȳ     (x - x̄)(y - ȳ)   (x - x̄)²   (y - ȳ)²
13.2               2                    -0.920    0.800     -0.736           0.846      0.640
13.5               1                    -0.620    -0.200    0.124            0.384      0.040
13.1               2                    -1.020    0.800     -0.816           1.040      0.640
14.0               0                    -0.120    -1.200    0.144            0.014      1.440
13.1               9                    -1.020    7.800     -7.956           1.040      60.840
13.6               1                    -0.520    -0.200    0.104            0.270      0.040
13.2               9                    -0.920    7.800     -7.176           0.846      60.840
13.9               0                    -0.220    -1.200    0.264            0.048      1.440
13.5               0                    -0.620    -1.200    0.744            0.384      1.440
14.5               1                    0.380     -0.200    -0.076           0.144      0.040
14.8               0                    0.680     -1.200    -0.816           0.462      1.440
13.9               0                    -0.220    -1.200    0.264            0.048      1.440
13.0               7                    -1.120    5.800     -6.496           1.254      33.640
13.8               2                    -0.320    0.800     -0.256           0.102      0.640
14.9               0                    0.780     -1.200    -0.936           0.608      1.440
15.2               0                    1.080     -1.200    -1.296           1.166      1.440
13.8               0                    -0.320    -1.200    0.384            0.102      1.440
15.1               0                    0.980     -1.200    -1.176           0.960      1.440
13.7               0                    -0.420    -1.200    0.504            0.176      1.440
15.2               0                    1.080     -1.200    -1.296           1.166      1.440
15.6               0                    1.480     -1.200    -1.776           2.190      1.440
14.9               0                    0.780     -1.200    -0.936           0.608      1.440
14.1               0                    -0.020    -1.200    0.024            0.000      1.440
15.1               0                    0.980     -1.200    -1.176           0.960      1.440
13.9               1                    -0.220    -0.200    0.044            0.048      0.040
14.8               0                    0.680     -1.200    -0.816           0.462      1.440
15.1               0                    0.980     -1.200    -1.176           0.960      1.440
13.2               0                    -0.920    -1.200    1.104            0.846      1.440
13.9               1                    -0.220    -0.200    0.044            0.048      0.040
14.0               0                    -0.120    -1.200    0.144            0.014      1.440

x̄ = 14.120, ȳ = 1.200, SSxy = -31.020, SSxx = 17.208, SSyy = 184.800

Table 3: Correlation Matrix

                 Average Time   Number of Wins
Average Time     1.000
Number of Wins   -.550          1.000

Sample size: 30
Critical value .05 (two-tail): ± .361
Critical value .01 (two-tail): ± .463

Table 4: Regression Summary Output

Regression Statistics
Multiple R          0.550
R Square            0.303
Adjusted R Square   0.278
Standard Error      2.145
Observations        30

ANOVA
             df   SS        MS       F        Significance F
Regression   1    55.918    55.918   12.148   0.002
Residual     28   128.882   4.603
Total        29   184.8

               Coefficients   Standard Error   t Stat   P-value   Lower 95%
Intercept      26.653         7.313            3.645    0.001     11.673
Average Time   -1.803         0.517            -3.485   0.002     -2.862

Table 5: Residuals Output

Number of Wins (y)   Predicted (ŷ)   Residual (y - ŷ)
2                    3               -1
1                    2               -1
2                    3               -1
0                    1               -1
9                    3               6
1                    2               -1
9                    3               6
0                    2               -2
0                    2               -2
1                    1               0
0                    0               0
0                    2               -2
7                    3               4
2                    2               0
0                    0               0
0                    -1              1
0                    2               -2
0                    -1              1
0                    2               -2
0                    -1              1
0                    -1              1
0                    0               0
0                    1               -1
0                    -1              1
1                    2               -1
0                    0               0
0                    -1              1
0                    3               -3
1                    2               -1
0                    1               -1
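The appendix tables can be reproduced directly from the Table 1 data. The following is a minimal Python sketch under stated assumptions, not the author's original spreadsheet: the (time, wins) pairs are transcribed from Table 1 in table order, the variable names are illustrative, and the fit uses the same sums-of-squares formulas as the text. It recomputes the regression coefficients, R Square, the standard error and the F statistic from Table 4, and lists the three points with the largest residuals.

```python
import math

# (average pit stop time in seconds, wins in 2008), transcribed from Table 1.
data = [
    (13.2, 2), (13.5, 1), (13.1, 2), (14.0, 0), (13.1, 9), (13.6, 1),
    (13.2, 9), (13.9, 0), (13.5, 0), (14.5, 1), (14.8, 0), (13.9, 0),
    (13.0, 7), (13.8, 2), (14.9, 0), (15.2, 0), (13.8, 0), (15.1, 0),
    (13.7, 0), (15.2, 0), (15.6, 0), (14.9, 0), (14.1, 0), (15.1, 0),
    (13.9, 1), (14.8, 0), (15.1, 0), (13.2, 0), (13.9, 1), (14.0, 0),
]
n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n

# Sums of squares, as in Table 2.
ss_xx = sum((x - x_bar) ** 2 for x, _ in data)
ss_yy = sum((y - y_bar) ** 2 for _, y in data)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in data)

# Least-squares fit.
slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

# Residuals and the summary statistics reported in Table 4.
residuals = [y - (intercept + slope * x) for x, y in data]
sse = sum(e ** 2 for e in residuals)        # error (residual) sum of squares
r_squared = 1 - sse / ss_yy                 # coefficient of determination
std_error = math.sqrt(sse / (n - 2))        # standard error of the regression
f_stat = (ss_yy - sse) / (sse / (n - 2))    # regression MS / residual MS

# Three observations with the largest absolute residuals.
largest = sorted(zip(data, residuals), key=lambda p: abs(p[1]), reverse=True)[:3]

# Predictions for the pit stop times discussed in the text.
predictions = {t: intercept + slope * t for t in (12, 13, 14, 15)}

print(f"wins = {intercept:.3f} + ({slope:.3f}) * time")
print(f"R^2 = {r_squared:.3f}, SE = {std_error:.3f}, F = {f_stat:.3f}")
for (x, y), e in largest:
    print(f"point ({x}, {y}) has residual {e:.2f}")
for t, wins in predictions.items():
    print(f"time {t} s -> predicted wins {wins:.2f}")
```

Because this sketch carries the unrounded coefficients through the calculation, its predictions can differ in the second or third decimal place from the values in the text, which were computed with the rounded coefficients 26.653 and -1.803. Table 5 reports predictions and residuals rounded to whole numbers.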