Mathematical Statistics and Data Analysis - Math Problem Example

Summary
This math problem, "Mathematical Statistics and Data Analysis", discusses the determination of a confidence interval. The formula uses the mean, the standard deviation and the t statistic value; the standard deviation is a measure of dispersion around the mean…

Extract of sample "Mathematical Statistics and Data Analysis"

Question 1:

A. Mean and standard deviation: Because the sample was randomly selected, the sample mean is an unbiased estimator of the population mean, so we estimate the population mean with the sample mean. The SPSS output for the n = 258 exam results is:

Statistic         Total
Mean              57.1783
Std. Deviation    17.97748
Sum               14752.00

With n = 258, the estimated population mean is 57.1783 and the standard deviation is 17.97748.

B. Confidence interval: Following Stuart (1998), the confidence interval for the mean is calculated as follows:

$$P\left\{ \bar{X} - T\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + T\,\frac{S}{\sqrt{n}} \right\} = 90\%$$

where $\bar{X}$ is the sample mean, $S$ is the standard deviation, $n$ is the sample size, $\mu$ is the population mean and $T$ is the t statistic for a two-sided 90% interval. With 257 degrees of freedom $T \approx 1.65$, and the standard error is $S/\sqrt{n} = 17.97748/\sqrt{258} \approx 1.1192$. Substituting into the formula:

$$P\{ 57.1783 - (1.65)(1.1192) \le \mu \le 57.1783 + (1.65)(1.1192) \} = 90\%$$

$$P\{ 55.33 \le \mu \le 59.03 \} = 90\%$$

This means that we are 90% confident that the population mean of the exam results lies between 55.33 and 59.03.

C. Justification of the formula: The formula combines the mean, the standard deviation and the t statistic. The standard deviation measures the dispersion of the observations around the mean; dividing it by $\sqrt{n}$ gives the standard error of the sample mean, and multiplying the standard error by the t statistic gives the margin of error at the chosen level of confidence. The interval above therefore states, with 90% confidence, that the population mean lies between 55.33 and 59.03.

D. Sample means for Australian and non-Australian residents: The SPSS output for the sample means of the different countries is:

Report (Exam)
Country    Mean       N      Std. Deviation
1          60.0254    118    15.80461
6          56.2174    92     19.80167
7          52.4524    42     19.18231
8          49.0000    6      11.71324
Total      57.1783    258    17.97748

Means:
Country    Mean
1          60.0254
6          56.2174
7          52.4524
8          49.0000
Total      57.1783

From this output, the mean for country 1 is higher than the mean for any other country.

E.
Standard deviation: The standard deviations by country are summarized below:

Country    Std. Deviation
1          15.80461
6          19.80167
7          19.18231
8          11.71324
Total      17.97748

F. Difference in means: We test whether the mean exam result of Australian residents is higher than that of non-Australian residents, assuming that country 1 represents Australia.

Null hypothesis: H0: a = b
Alternative hypothesis: Ha: a > b

where a is the mean exam result for Australia and b is the mean exam result for the other country. The SPSS output for each pair of countries follows.

Country 1 and 6:

Group Statistics (Exam)
Country    N      Mean       Std. Deviation    Std. Error Mean
1          118    60.0254    15.80461          1.45493
6          92     56.2174    19.80167          2.06447

Independent Samples Test (Exam)
                                     Equal variances    Equal variances
                                     assumed            not assumed
Levene's Test: F                     4.642
Levene's Test: Sig.                  .033
t                                    2.517              2.296
df                                   158                61.939
Sig. (2-tailed)                      .013               .025
Mean Difference                      7.57304            7.57304
Std. Error Difference                3.00901            3.29815
95% CI of the Difference: Lower      1.62998            .98000
95% CI of the Difference: Upper      13.51611           14.16608

This test table has df = 158 and a mean difference of 7.57304, which match the country 1 and 7 comparison (118 + 42 - 2 = 158 and 60.0254 - 52.4524 = 7.5730) rather than countries 1 and 6, whose difference in means is 60.0254 - 56.2174 = 3.8080.

Country 1 and 7:

Group Statistics (Exam)
Country    N      Mean       Std. Deviation    Std. Error Mean
1          118    60.0254    15.80461          1.45493
7          42     52.4524    19.18231          2.95989

Independent Samples Test (Exam)
                                     Equal variances    Equal variances
                                     assumed            not assumed
Levene's Test: F                     4.642
Levene's Test: Sig.                  .033
t                                    2.517              2.296
df                                   158                61.939
Sig. (2-tailed)                      .013               .025
Mean Difference                      7.57304            7.57304
Std. Error Difference                3.00901            3.29815
95% CI of the Difference: Lower      1.62998            .98000
95% CI of the Difference: Upper      13.51611           14.16608

Country 1 and 8:

Group Statistics (Exam)
Country    N      Mean       Std. Deviation    Std. Error Mean
1          118    60.0254    15.80461          1.45493
8          6      49.0000    11.71324          4.78191

Independent Samples Test (Exam)
                                     Equal variances    Equal variances
                                     assumed            not assumed
Levene's Test: F                     .765
Levene's Test: Sig.                  .383
t                                    1.683              2.206
df                                   122                5.966
Sig. (2-tailed)                      .095               .070
Mean Difference                      11.02542           11.02542
Std. Error Difference                6.55283            4.99835
95% CI of the Difference: Lower      -1.94657           -1.22182
95% CI of the Difference: Upper      23.99741           23.27266

Because Ha is one-sided and every observed difference is in the hypothesized direction, the one-tailed significance is half the two-tailed value reported by SPSS. For countries 1 and 7 this is about .013 (equal variances not assumed, since Levene's test is significant at .033), and for countries 1 and 8 it is about .048 (equal variances assumed, since Levene's Sig. is .383). At the 5% level we therefore reject the null hypothesis H0: a = b and accept the alternative hypothesis Ha: a > b for these two comparisons, and conclude that the performance of Australian residents is higher than that of residents of countries 7 and 8. The country 1 mean is also higher than the country 6 mean, but the test table reported for that pair repeats the country 1 and 7 figures, so it does not by itself establish a significant difference between countries 1 and 6.

Question 2:

A. Male students broken down by faculty and country (counts):

Country    Faculty 1    Faculty 2    Faculty 3
1          28           12           19
6          40           0            8
7          4            8            13
8          0            1            1

B. Female students broken down by faculty and country (counts):

Country    Faculty 1    Faculty 2    Faculty 3
1          29           26           4
6          44           0            0
7          6            11           0
8          4            0            0

C. Proportions (faculty codes: 1 = Business, 2 = Sciences, 3 = Engineering and Surveying):

Male students in the group: total in group = 258, total male = 134, proportion of male = 134/258 = 51.938%.

Male students from Australia enrolled in the business faculty: males from Australia = 59, males from Australia enrolled in business = 28, proportion = 28/59 = 47.4576%.

Female students in the science faculty: females = 124, females in the science faculty = 37, proportion = 37/124 = 29.84%.

D.
Confidence interval: We assume that the sample considered in this study was randomly selected and therefore represents the entire population, and we use it to estimate the proportion of overseas students. The counts and proportions by country are summarized below:

Country                        Male    Female    Total       Proportion
1                              59      59        118         0.457364
6                              48      44        92          0.356589
7                              25      17        42          0.162791
8                              2       4         6           0.023256
Total                                            258         1
Mean                                             64.5        0.25
Standard deviation                               50.15642    0.194405
Proportion of overseas                                       0.542636
Mean proportion of overseas                                  0.180879

Overseas students are those from countries 6, 7 and 8, so the sample proportion of overseas students is p = (92 + 42 + 6)/258 = 140/258 = 0.542636. The confidence interval for a proportion is calculated as follows:

$$P\left\{ p - Z\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + Z\sqrt{\frac{p(1-p)}{n}} \right\} = 95\%$$

where $p$ is the sample proportion of overseas students, $\pi$ is the population proportion, $n$ is the sample size and $Z = 1.96$ is the standard normal value for a two-sided 95% interval. The standard error is $\sqrt{0.542636 \times 0.457364 / 258} \approx 0.03102$. Substituting into the formula:

$$P\{ 0.542636 - (1.96)(0.03102) \le \pi \le 0.542636 + (1.96)(0.03102) \} = 95\%$$

$$P\{ 0.4818 \le \pi \le 0.6034 \} = 95\%$$

This means that we are 95% confident that the proportion of overseas students lies between 0.4818 and 0.6034.

Question 3:

A. Lurking variable: A lurking variable is a variable that is not included in a study but has an important effect on the relationship being studied. For example, a study that forecasts the sales level of a firm may not consider other factors that affect consumer expenditure; each omitted factor of this kind is a lurking variable.

B. Central limit theorem and law of large numbers: According to Stuart (1998), the central limit theorem states that as the number of independent random variables being averaged increases, the distribution of their mean approaches a normal distribution. The law of large numbers describes the behaviour of the sample mean: if we sample a population and increase the sample size, the sample mean becomes stable and close to the expected value.
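The two statements above can be illustrated with a small simulation. The sketch below is an illustration only and is not part of the original SPSS analysis: it assumes an exponential population with expected value 1 (a deliberately skewed choice), and the function names `sample_mean`, `simulate_sample_means` and `share_within_one_standard_error` are hypothetical. As the sample size grows, the sample means should cluster more tightly around 1 (law of large numbers), and for the larger sample sizes roughly 68% of them should fall within one standard error of 1, as a normal distribution would predict (central limit theorem).

```python
import random
import statistics


def sample_mean(rng, n):
    """Mean of n draws from an exponential population with expected value 1."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))


def simulate_sample_means(n, repetitions=2000, seed=0):
    """Draw `repetitions` independent samples of size n and return their means."""
    rng = random.Random(seed)
    return [sample_mean(rng, n) for _ in range(repetitions)]


def share_within_one_standard_error(means, n):
    """Share of sample means within one standard error of the true mean 1."""
    # The assumed population has standard deviation 1, so the standard error is 1/sqrt(n).
    standard_error = 1.0 / n ** 0.5
    return sum(abs(m - 1.0) <= standard_error for m in means) / len(means)


if __name__ == "__main__":
    for n in (5, 50, 500):
        means = simulate_sample_means(n)
        print(
            f"n={n:>3}  mean of sample means={statistics.fmean(means):.3f}  "
            f"spread={statistics.stdev(means):.3f}  "
            f"within one SE={share_within_one_standard_error(means, n):.3f}"
        )
```

The fixed seed only makes the run repeatable; any other population with a finite variance could be substituted for the exponential one.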
Therefore the two differ in that the central limit theorem describes the shape of the distribution, which approaches the normal distribution as the number of random variables increases indefinitely, whereas the law of large numbers describes the stability of the mean as the sample size increases.

C. Sample mean and population mean in random samples: The sample mean and the population mean generally differ. With a small sample we expect the sample mean to deviate more from the population mean, whereas with a large sample the sample mean is close to the population mean. The size of this deviation is measured by the standard error, which is the standard deviation divided by the square root of the sample size. The standard deviation describes how the observations vary around the mean; because of this variation, different random samples give different sample means, and this is why the sample mean differs from the population mean.

D. Sample mean and population mean in voluntary response samples: In a voluntary response sample the respondents choose for themselves whether to take part, and this human behaviour is unpredictable and often related to the variable being measured. The sample is therefore not representative of the population, and the sample mean will differ from the population mean; a larger sample does not remove this difference.

E. Parameter and statistic: A statistic is the result of applying a function to a data set, for example a sample mean. A parameter is a quantity that defines a characteristic of a population or of a function. For example, when we estimate Y = a + bx, a and b are parameters because they define the characteristics of the function.

F. Central limit theorem in statistics: The central limit theorem is important in statistics because it describes the distribution of the mean of random variables as their number increases indefinitely. With a large sample, methods based on the normal distribution can therefore be applied to the sample mean, and the larger the sample used in a study, the more accurate the estimates will be.

G.
Sampling distribution of the mean: The sampling distribution of the mean is the distribution of the sample means of samples of the same size taken from the same population. Its form can be worked out in advance, so the behaviour of a sample mean, such as its expected value and its spread, can be predicted before sampling.

H. Telephone survey problems: Telephone interviews are less expensive and less time consuming than face-to-face interviews, but they raise several validity problems. The first is sampling: the sample produced may not be a representative probability sample, because some people do not have phones in their homes and many individuals have phones that are not listed in directories. The second is the quality of the information obtained: respondents may refuse to provide information truthfully, which undermines the validity of the study. In a face-to-face interview it is easier to tell when a respondent is lying, whereas in a telephone interview untruthful information is harder to detect.

Question 4:

The union official claims that 40% of trucks carry a heavy load. The main road department found that 17 out of 60 trucks carry a heavy load, a sample proportion of 17/60 = 28.33%.

We set up a binomial probability with the following hypotheses:

Null hypothesis: H0: a = b
Alternative hypothesis: Ha: a ≠ b

where a is the union official's probability and b is the probability found by the main road department; we test whether the two probabilities are equal. According to Durrett (1996), the binomial probability function is:

$$\Pr(K = k) = \binom{n}{k} P^{k} (1 - P)^{n-k}$$

where $n$ is the number of trials, $k$ is the number of successes and $P$ is the probability of success. Substituting into the formula:

$$\Pr(K = 17) = \binom{60}{17} (0.2833)^{17} (1 - 0.2833)^{60-17} = 0.113674$$

The binomial probability shows that the probability of exactly 17 successes in 60 trials, when the probability of success is 0.2833, is 0.113674.
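The substitution above can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library; the helper names `binomial_pmf` and `lower_tail` are illustrative and do not come from the source. Besides reproducing the probability of exactly 17 successes, it evaluates the probability of 17 or fewer heavily loaded trucks when the union official's 40% is taken as the success probability. That second quantity is an added assumption about how the claim could be examined, not a step taken in the original working.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_tail(k, n, p):
    """Probability of k or fewer successes in n trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


n_trucks, heavy = 60, 17      # sample reported by the main road department
sample_p = heavy / n_trucks   # 17/60, about 0.2833
claimed_p = 0.4               # union official's claim

# Probability of exactly 17 successes at the sample proportion (the text reports 0.113674).
pmf_at_sample = binomial_pmf(heavy, n_trucks, sample_p)

# Assumed extra check: chance of 17 or fewer heavy trucks if the 40% claim were true.
tail_under_claim = lower_tail(heavy, n_trucks, claimed_p)

if __name__ == "__main__":
    print(f"P(K = 17 | p = {sample_p:.4f}) = {pmf_at_sample:.6f}")
    print(f"P(K <= 17 | p = {claimed_p}) = {tail_under_claim:.6f}")
```

A small lower-tail probability under the claimed 40% would count as evidence against the claim; how small is small enough depends on the significance level chosen.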
The union official's claimed probability is 0.4, while the sample proportion is 0.2833. Under H0 the standard error of the sample proportion is $\sqrt{0.4 \times 0.6 / 60} \approx 0.0632$, so the observed proportion lies about 1.84 standard errors below the claimed value. This is short of the 1.96 required by a two-sided test at the 5% level, so on this normal approximation we do not reject the null hypothesis at that level, although the sample gives some indication that fewer than 40% of trucks carry a heavy load.

Question 5: Twins

A. Parametric test: We test whether there is a difference in test scores between foster twins and adopted twins.

Null hypothesis: H0: a = b
Alternative hypothesis: Ha: a ≠ b

The SPSS output for the paired-samples t test of the two means is:

Paired Samples Test (Pair 1: adopted - foster)
Paired Differences: Mean                       4.875
Paired Differences: Std. Deviation             8.509
Paired Differences: Std. Error Mean            3.009
95% CI of the Difference: Lower                -2.239
95% CI of the Difference: Upper                11.989
t                                              1.620
df                                             7
Sig. (2-tailed)                                .149

The two-tailed significance (.149) is greater than .05 and the 95% confidence interval of the difference includes zero, so we do not reject the null hypothesis; the test does not show a significant difference between foster twins and adopted twins.

B. Non-parametric test: We apply the Wilcoxon signed-rank test:

Ranks (foster - adopted)
                    N       Mean Rank    Sum of Ranks
Negative Ranks      7(a)    4.21         29.50
Positive Ranks      1(b)    6.50         6.50
Ties                0(c)
Total               8
a. foster < adopted
b. foster > adopted
c. foster = adopted

Test Statistics(b) (foster - adopted)
Z                          -1.620(a)
Asymp. Sig. (2-tailed)     .105
a. Based on positive ranks
b. Wilcoxon Signed Ranks Test

In seven of the eight pairs the foster WISC score is lower than the adopted score, but the asymptotic significance (.105) is greater than .05, so we again do not reject the null hypothesis H0: a = b.

C. Difference between the two tests: The parametric test uses the t distribution to test the hypothesis that the two means are equal, and it assumes that the paired differences are approximately normally distributed. The non-parametric test works with the ranks of the differences and with the number of positive and negative ranks in the data, so it does not rely on that assumption. The parametric method is therefore based on the t table, while the non-parametric method not only tests the hypothesis but also provides the ranks of the two variables.

References:

Durrett, R. (1996) Probability: Theory and Examples, London, Sage Publishers
Rice, J. (1995) Mathematical Statistics and Data Analysis, New York, McGraw Hill Press
Stuart, D. (1998) Statistics: An Introduction, New Jersey, Prentice Hall Publishers
Data Analysis Assignment Example | Topics and Well Written Essays - 2140 words. https://studentshare.org/mathematics/2043495-data-analysis-assignment