
Quantitative and Statistical Analysis in Statistical Package for the Social Sciences - Assignment Example

Summary
In the paper "Quantitative and Statistical Analysis in Statistical Package for the Social Sciences", analysis results indicate a very small percentage (1%) of the population reporting to be in very bad health. Almost three-quarters of the population (72%) self-reported to be in good or very good health…

Extract of sample "Quantitative and Statistical Analysis in Statistical Package for the Social Sciences"

ASSESSMENT: 2015-2016

Q1: (a) Make a statistical assessment of each of these three variables independently. [5 marks]

i) Subjective health status (health)

Subjective health status was measured as a self-reported categorical variable on a Likert scale ranging from 1 to 5. Results from this survey question are therefore likely to be subjective. The survey received 2417 valid responses from the sample population, as shown below:

Subjective general health    Frequency    Valid Percent
Very good                    740          31%
Good                         985          41%
Fair                         509          21%
Bad                          151          6%
Very bad                     33           1%
Total                        2417         100%

Table 1: Frequency table for subjective health status

The results indicate that a very small percentage (1%) of the population reported being in very bad health. Almost three quarters of the population (72%) self-reported being in good or very good health. On a scale of 1-5, with 1 being ‘Very good’ and 5 being ‘Very bad’, respondents gave an average rating of 2.07 for their health status, which translates to an average score of ‘Good’. The pie chart below indicates the proportion of respondents in each response category.

Figure 1: Pie chart on subjective health status

A histogram was constructed to visualize the distribution of weighted responses on self-reported health status. As seen below, the distribution is right skewed, with most responses portraying the population to be in good health. Generally, self-report studies suffer from validity problems; in this case study, respondents may over-report their health status to appear healthier than they actually are.

Fig 2: Histogram showing the skewness of reported general health

ii) Age of respondent (agea)

Results from the survey indicate that the youngest respondent was aged 15 years. The minimum age restriction is possibly an inclusion criterion set in the study protocol.
The oldest respondent was aged 98, as shown in the summary statistics in the table below:

Descriptive Statistics
                                 N       Minimum    Maximum    Mean     Std. Deviation
Age of respondent, calculated    2415    15         98         47.34    18.693

Table 2: Summary statistics for respondent age

The above results indicate a mean respondent age of 47 years with a standard deviation of 19. These results were calculated and are more valid than the self-report data on health status mentioned earlier in this paper. The age distribution is almost normal, albeit with a slight right skewness, as shown by the normal curve superimposed onto the histogram below:

Fig 3: Histogram showing the age distribution of study participants

Most respondents are in the age bracket of 20-70. The near-normal age distribution helps limit age-related bias in responses, particularly for questions whose responses vary by age, such as health status.

iii) How happy are you? (happy)

Like health status, happiness was measured through a self-reported question, and it should be interpreted with similar caution. Generally, results from the survey indicate that a large proportion of respondents self-reported being happy. Results are based on a Likert scale ranging from 0 to 10, with 0 being the lowest (‘Extremely unhappy’) and 10 the highest (‘Extremely happy’). Responses from 0 to 4 were rated as ‘unhappy’. The frequencies for the various response categories are shown below:

How happy are you?    Frequency    Percent
0 - 4 unhappy         168          7%
5                     205          9%
6                     157          7%
7                     430          18%
8                     761          32%
9                     416          17%
extremely happy       282          12%
Total                 2419         100%

Table 3: Summary statistics for respondent happiness

Using the scale, more than three quarters of the sampled population self-reported a ‘happiness score’ of 7 and above.
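The ‘more than three quarters’ figure can be checked directly from the frequencies in Table 3. The snippet below is a minimal sketch of that arithmetic; the dictionary keys and variable names are illustrative labels chosen here, not SPSS variable names, and the top category (‘extremely happy’) is assumed to correspond to a score of 10.

```python
# Frequencies copied from Table 3 (How happy are you?).
# Keys are illustrative labels; "10" stands for the 'extremely happy' row.
happy_freq = {
    "0-4": 168,
    "5": 205,
    "6": 157,
    "7": 430,
    "8": 761,
    "9": 416,
    "10": 282,
}

total = sum(happy_freq.values())
seven_plus = sum(happy_freq[k] for k in ("7", "8", "9", "10"))
share_seven_plus = seven_plus / total

print(f"{seven_plus} of {total} respondents ({share_seven_plus:.1%}) scored 7 or above")
```

The frequencies sum to the reported total of 2419, and the share scoring 7 or above comes out at roughly 78%, consistent with the statement above.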
The distribution of responses is hence expected to be left skewed, which is confirmed in the histogram below:

Fig 4: Histogram showing the distribution of self-reports on happiness

(b) Consider whether and in what way there is an association between any combination of two of the three variables. [10 marks]

For this question, it was hypothesized that there is an association between health and age. Generally, biological studies indicate that, controlling for other variables, health declines with age due to the wear and tear of body tissues. To investigate whether this hypothesis holds in this study, two indicator variables for age and health will be used: subjective health status (health) and age of respondent (agea).

Preliminary analysis

Preliminary investigations involve comparing the mean respondent age at each level of reported health status. The question is whether a pattern exists between the health status people report and their age. The mean age of respondents, differentiated by health status, is shown below:

Subjective general health    Mean, age of respondent, calculated    Frequency
Very good                    41.85                                  739
Good                         47.06                                  980
Fair                         52.29                                  509
Bad                          57.1                                   150
Very bad                     57.92                                  33
Total                        47.34                                  2410

Table 4: Mean respondent age differentiated by health status

Table 4 above indicates a possible correlation between age and health status. Better health outcomes are reported by respondents with a lower mean age, and as reported health outcomes worsen, the mean age increases correspondingly. This relation is visualized in the graph below:

Fig 5: A graph showing a possible correlation between age and reported health outcomes

From the above figure, it is observed that as age increases, health outcomes steadily become worse. At a mean age of around 57 years, the gradient of the curve changes sharply: mean age changes only slightly while reported health outcomes move from ‘bad’ to ‘very bad’.
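The pattern in Table 4 can be made concrete by looking at how much the mean age changes from one health category to the next. The sketch below uses only the group means reported in Table 4; the variable names are illustrative and it is not a substitute for a formal test of association.

```python
# Mean respondent age per health category, copied from Table 4.
mean_age = [
    ("Very good", 41.85),
    ("Good", 47.06),
    ("Fair", 52.29),
    ("Bad", 57.10),
    ("Very bad", 57.92),
]

# Change in mean age between each pair of adjacent health categories.
steps = [
    (worse, round(age_worse - age_better, 2))
    for (better, age_better), (worse, age_worse) in zip(mean_age, mean_age[1:])
]

for label, diff in steps:
    print(f"-> {label}: {diff:+.2f} years")
```

The mean age rises by roughly five years per category up to ‘Bad’, then by less than one year between ‘Bad’ and ‘Very bad’. Note that the ‘Very bad’ group contains only 33 respondents, so its mean is the least precisely estimated of the five.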
Confirmatory analysis / tests of association

Whilst a possible correlation has been hypothesized between age and health outcome, the result can only be confirmed using an appropriate hypothesis test. For this test, we create two categorical variables and subject them to a Chi-square test of association. Since health is already a categorical variable, no transformation is necessary. Age of respondents, however, is recoded into categories and stored in a new variable, age_cat (label: Age category), as shown below:

Age category    Value    Frequency
15-25           1        387
26-35           2        341
36-45           3        392
46-55           4        450
56-65           5        389
66-75           6        288
76-85           7        133
86-98           8        35
Total                    2415

Having created age categories, a Chi-square test of association can be used to confirm the hypothesized association between age and reported health. The test was run in SPSS and the output is shown below:

Chi-Square Tests
                      Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square    174.127    28    4.8E-23
N of Valid Cases      2410

The Chi-square test on the crosstab gives a p-value of approximately 0 (4.8E-23); hence we reject the null hypothesis of no association and conclude that an association exists between age and subjective general health.

Assumptions

It is assumed that the observations are independent of each other, i.e. observations made on any variable are not correlated in any way.

(c) Draw a plausible causal diagram for the relationship between the three variables, justifying why you have drawn it in the way you do. [3 marks]

Age is an independent variable, i.e. its values are not affected by changes in the other variables. Consequently, it has to be free of influence from either of the other two variables of interest here. Health, as the analysis in (b) suggests, declines with age. Happiness, in turn, is caused partly by good health. Hence, happiness is a direct result of general health and an indirect result of age.

(d) Analyse the causal diagram and explain what conclusions you can come to about the relationship between the three variables.
[13 marks]

The earlier statistical analyses show an association between age and general health, which is interpreted here as causal, i.e. an increase in age causes deterioration in health. Age is the antecedent variable, i.e. age is causally antecedent to general happiness. It is also known that a positive health outcome is linked to happiness. This linkage is based on existing theory and knowledge. Happiness and health have long been linked together, giving rise to the popular phrase ‘laughter is the best medicine’.

To sum up, age has a direct effect on health, and health has a direct effect on happiness. Therefore age has an indirect effect on happiness but is also the antecedent in this chain. If we intend to investigate the effect of age on happiness, then it is prudent that we first control for the effect of general health in our model (an intervening variable in this chain). This is important since the model does not imply a one-to-one matching of the effects from one stage to another, i.e. external influences may also lead to improved health outcomes and hence happiness. Controlling for the effects of general health in the model can be achieved through processes such as matching. Any effect of age on happiness that remains after controlling for health would be a direct one.

(e) What are the possible limitations of your model? [2 marks]

The model does not factor in the influence of latent external factors that are not included in the model. These are hard to detect and measure using regular techniques. Factor analysis methods can be used to identify them and account for their influence.

QUESTION 2

(a) How many minutes per day, on average, do women engage in paid work? [2 marks]

132 minutes

(b) What is the standard deviation of the sampling distribution for the average number of minutes women engage in paid work?
[2 marks]

SE = SD/√(sample size), hence SD = SE × √(sample size).

Standard deviation (SD) is a measure of dispersion, or spread, while standard error (SE) is a measure of how close the sample mean is likely to be to the population mean. Consequently, as sample size increases, SE tends to zero as the sample estimate gets closer to the population estimate. The sample SD, on the other hand, gets closer to the population SD. The standard deviation of the sampling distribution of the mean is the standard error itself. Hence the answer is 4.12 minutes.

(c) State whether the sampling distribution for the average number of minutes men engage in paid work is more spread out or less spread out than the sampling distribution for the average number of minutes women engage in paid work [2 marks]. Briefly explain how you know this [2 marks].

The sampling distribution for the average number of minutes men engage in paid work is more spread out. The SD is a measure of spread: a higher SD implies the data are more spread out, and vice versa. Hence, since men have a higher SD (estimated from the SE) than women, the sampling distribution is more spread out.

(d) Suppose that we double the number of men in our sample. Describe two ways in which this increase in sample size changes the sampling distribution for the average number of minutes men engage in paid work [4 marks].

As mentioned earlier, an increase in sample size leads to a reduction in the standard error, while the sample mean shifts towards the population mean. Consequently, when we double the number of men in our sample, the standard error falls by a factor of 1/√2, i.e. to about 71% of its original value, assuming the SD is unchanged.

(e) a. Give one example of a possible confounding variable and explain why it might be a confounding variable [3 marks].

A possible confounding variable is the age of the respondent. Older women are likely to spend more time in paid work and less time in childcare than younger women.
The latter are more likely to split their time between childcare and paid work, as they have younger children who require their attention. Older women, however, are likely to have finished with childcare.

(e) b. Now give one example of a possible intervening variable and explain why it might be an intervening variable [3 marks].

An example of an intervening variable would be the number of children. The number of children would affect the number of minutes women spend on childcare and also the time spent on paid work.

(e) c. Now give one example of a possible instrumental variable and explain why it might be an instrumental variable [3 marks].

An instrumental variable would be the nature of work, as it has a direct effect on how many hours women spend on paid work.

(f) (g)

- One way of controlling for causal relationships is through stratification. To effectively investigate the relationship between minutes spent on childcare and paid work, it would make sense to stratify the population by the confounding variable, i.e. age. For instance, analyses should be run separately for different age groups, and if the relationship is not found to be similar across the groups, the results should be presented by age category.
- A second way to improve the investigation of the relationship would be to first check whether a relationship actually exists between the two variables and the intervening variable, number of children.

QUESTION 3