Major Disputable Biostatistics Questions Assignment Example | Topics and Well Written Essays

This examination has 11 pages (inclusive of front cover).
This examination has 5 questions.
PART A – 4 questions (80%)
PART B – 1 question (20%)
TOTAL marks for exam = 50
WEIGHTING = 45% of final grade

Materials supplied:
EXAMINATION PAPER – PART A and B
DATASET FOR PART B

Any hand-held calculator is allowed. Any statistical programme may be used.

Special Instructions:
This is a TAKE HOME EXAMINATION. Course notes, reference material, and computers may be used.
PART A – Each question is worth the number of marks indicated in brackets. Attempt all questions.
PART B – Attempt all questions. You will be required to open and use the dataset which has been posted on Blackboard.
Your solutions should be typed and you should clearly show any intermediate steps so partial marks may be assigned. Please ensure that a title page has been attached.
Good Luck!

PART A

Question 1 [10 marks]
Adapted from Dawson B and Trapp RG, "Basic and Clinical Biostatistics", McGraw-Hill 2001.

a) Tables are a very useful way to display relationships. When two measures are of interest, the purpose of the study generally determines which measure is viewed within the context of the other. Suppose investigators are interested in the relationship between levels of patient medication compliance and their type of USA health insurance coverage. If they wish to show the percentage of patients with different types of insurance coverage within each of the three levels of patient compliance, which of the two tables (1a, 1b) should investigators use in their paper to show this relationship? Briefly explain your choice of table.
Table 1a Percentages Based on Level of Compliance (Column %)

                          Level of Compliance with Medication
Insurance Coverage        Low     Medium    High
Medicaid                  30      20        15
Medicare                  20      25        30
Medicaid and Medicare      5       5         5
Other Insurance           10      30        40
No Insurance              35      20        10

Table 1b Percentages Based on Insurance Coverage (Row %)

                          Level of Compliance with Medication
Insurance Coverage        Low     Medium    High
Medicaid                  45      30        25
Medicare                  25      35        40
Medicaid and Medicare     33      33        33
Other Insurance           15      35        50
No Insurance              55      30        15

[2 marks]

Solution
The investigators should use Table 1a, since it gives the percentage of patients with each type of insurance coverage within each of the three levels of patient compliance. One way to check this is that the percentages within each level of compliance (each column) add up to 100% if there are no missing entries.

b) For the following frequency distribution of scores:
i) What is the mode?
The mode is 32 because it is the score that appears most often.
ii) What is the total number of scores?
iii) What is the frequency of the score 28?
The frequency of score 28 is 8.
iv) What percentage of observations has a score less than 31?
The percentage of observations with score less than 31 is
v) What is the median?
The median score is 30.
vi) What is the mean?
The mean is 29.405.
vii) What is the standard deviation?
The standard deviation is 5.833.
viii) What is the range?
The range is 6.
[8 marks]

Question 2 [12 marks]

a) A large, population-based study estimated that 65% of all Australian children regularly eat fruit. In a random sample of 70 Australian children, what is the probability that at least 50 regularly eat fruit? Show your working (raw output of Stata commands is not required).
[2 marks]
Using the normal approximation to the binomial, the mean is 70 × 0.65 = 45.5 and the standard deviation is sqrt(70 × 0.65 × 0.35) = 3.99.
The test statistic is given by
z = (50 − 45.5) / 3.99 ≈ 1.13
The cumulative probability corresponding to this test statistic is 0.8697, so the probability that at least 50 children regularly eat fruit is the upper tail, 1 − 0.8697 = 0.1303.

b) In the same random sample of 70 children, what is the probability that between 40 and 45 children regularly eat fruit? Show your working (raw output of Stata commands is not required).
[3 marks]
Corresponding to 40 children:
z = (40 − 70 × 0.65) / sqrt(70 × 0.65 × 0.35) = −5.5 / 3.99 = −1.38
Corresponding to 45 children:
z = (45 − 70 × 0.65) / sqrt(70 × 0.65 × 0.35) = −0.5 / 3.99 = −0.13
The probability that between 40 and 45 children regularly eat fruit is the area between these two z values, 0.3645.

c) A national aptitude exam is administered annually to sixth graders. The test has a mean score of 78 and a standard deviation of 13. If Suzie's z-score is 1.5, what was her score on the test? Show your working.
[1 mark]
z = (x − 78) / 13 = 1.5
Solving for x gives x = 78 + 1.5 × 13 = 97.5. Therefore Suzie's score is 97.5.

d) Consider the boxplot below.
[Boxplot; horizontal axis marked from 2 to 18 in steps of 2]
Which of the following statements are true?
I. The distribution is skewed right.
II. The interquartile range is about 8.
III. The median is about 10.
For your answer, choose one statement, from options (A) – (E) below:
(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III
The answer is B (II only).
[1 mark]

e) The cumulative frequency plot below shows diastolic blood pressure (in mm Hg) for a random sample of 18 year old women. What is the interquartile range?
The interquartile range is 6.
[1 mark]

f) A researcher undertakes an experiment to test a hypothesis. If she doubles her sample size, which of the following will increase?
I. The power of the hypothesis test
II. The estimated effect size
III. The probability of making a Type 2 error.
For your answer, choose one statement, from options (A) – (E) below:
(A) I only
(B) II only
(C) III only
(D) All of the above
(E) None of the above
[1 mark]
If the sample size doubles, the power of the hypothesis test increases. The estimated effect size does not change systematically with sample size, and the probability of a Type 2 error decreases as power increases. The answer is A.

g) A pharmaceutical company claims that severe adverse drug reactions affect no more than 7% of individuals taking a particular drug. An independent study in a random sample of 300 individuals taking the drug finds that 35 have experienced a severe adverse drug reaction. Should the 7% claim be rejected? Assume that the significance level is 0.05. Show working for each step you undertake.
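The working that part g) asks for can be sketched in a few lines of Python. This is only an illustrative check using the normal approximation to the binomial with a one-sided alternative; the function name and its arguments are hypothetical, and the exam itself expects hand working or Stata.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_z_test(events, n, p0):
    """Return (z, p_value) for H0: p <= p0 against H1: p > p0."""
    p_hat = events / n                  # observed proportion
    se0 = sqrt(p0 * (1 - p0) / n)       # standard error under the null
    z = (p_hat - p0) / se0
    p_value = 1 - NormalDist().cdf(z)   # upper-tail area
    return z, p_value


# Figures from part g): 35 reactions among 300 individuals, claimed limit 7%.
z, p_value = one_sided_proportion_z_test(events=35, n=300, p0=0.07)
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

A p-value below the 0.05 significance level would lead to rejecting the 7% claim.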
[3 marks]

Solution
Let P be the true population proportion of individuals with a severe adverse drug reaction, and let p = 35/300 = 0.1167 be the observed sample proportion.
Null hypothesis: P ≤ 0.07
Alternative hypothesis: P > 0.07
Using the normal approximation to the binomial, we have
z = (0.1167 − 0.07) / sqrt(0.07 × 0.93 / 300) ≈ 3.17
The p-value corresponding to this test statistic is 0.0008. Since this p-value is less than our alpha level of 0.05, we reject the null hypothesis and conclude that the true proportion of individuals affected by a severe adverse drug reaction is greater than 7%, so the company's claim should be rejected.

Question 3 [8 marks]

Use the Abstract and Table 3 shown below to answer the following questions. Note that Table 3 below is only part of the table shown in the paper, but it will be used to answer the majority of the questions.

Benson JT, Lucente V and McClellan E. Vaginal versus abdominal reconstructive surgery for the treatment of pelvic support defects: A prospective randomized study with long-term outcome evaluation. Am J Obstet Gynecol 1996;175:1418-22.

OBJECTIVES: Our purpose was to determine whether a vaginal or abdominal approach is more effective in correcting uterovaginal prolapse.

STUDY DESIGN: Eighty-eight women with cervical prolapse to or beyond the hymen or with vaginal vault inversion >50% of its length and anterior vaginal wall descent to or beyond the hymen were randomized to a vaginal versus abdominal surgical approach. Forty-eight women underwent a vaginal approach with bilateral sacrospinous vault suspension and paravaginal repair, and 40 women underwent an abdominal approach with colposacral suspension and paravaginal repair. Ancillary procedures were performed as indicated. Detailed pelvic examination was performed postoperatively by the nonsurgeon coauthor yearly up to 5 years. The women were examined while standing during maximum strain. Surgery was classified as optimally effective if the woman remained asymptomatic, the vaginal apex was supported above the levator plate, and no protrusion of any vaginal tissue beyond the hymen occurred.
Surgical effectiveness was considered unsatisfactory if the woman was symptomatic, the apex descended >50% of its length, or the vaginal wall protruded beyond the hymen.

RESULTS: Eighty women (vaginal 42, abdominal 38) were available for evaluation at 1 to 5.5 years (mean 2.5 years). The groups were similar in age, weight, parity, and estrogen status, and 56% had undergone prior pelvic surgery. There was no significant difference between the groups in morbidity, complications, hemoglobin change, dyspareunia, pain, or hospital stay. The vaginal group had longer catheter use, more urinary tract infections, more incontinence, decreased operative time, and lower hospital charge. Surgical effectiveness was optimal in 29% of the vaginal group and 58% of the abdominal group and was unsatisfactory leading to re-operation in 33% of the vaginal group and 16% of the abdominal group. The re-operations included procedures for recurrent incontinence in 12% of the vaginal and 2% of the abdominal groups. The relative risk of optimal effectiveness by the abdominal route is 2.03 (95% confidence interval 1.22 to 9.83), and the relative risk of unsatisfactory outcome using the vaginal route is 2.11 (95% confidence interval 0.90 to 4.94).

CONCLUSIONS: Reconstructive pelvic surgery for correction of significant pelvic support defects was more effective with an abdominal approach.
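To see how relative risks like those quoted in the abstract arise, the following Python sketch computes a risk ratio with a log-scale (Katz) confidence interval. The event counts are not given in the abstract; the ones used here (14 of 42 and 6 of 38 unsatisfactory, 22 of 38 and 12 of 42 optimal) are assumptions back-calculated from the reported percentages, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk(a, n1, b, n2, level=0.95):
    """Risk ratio (a/n1)/(b/n2) with a log-scale (Katz) confidence interval."""
    rr = (a / n1) / (b / n2)
    se_log = sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)  # SE of log(RR)
    z = NormalDist().inv_cdf(0.5 + level / 2)        # about 1.96 for 95%
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)


# Assumed counts, inferred from the abstract's percentages (not reported there):
# unsatisfactory outcome: 14 of 42 vaginal (33%) vs 6 of 38 abdominal (16%)
rr_unsat, lo_unsat, hi_unsat = relative_risk(14, 42, 6, 38)
# optimal outcome: 22 of 38 abdominal (58%) vs 12 of 42 vaginal (29%)
rr_opt, lo_opt, hi_opt = relative_risk(22, 38, 12, 42)

print(f"Unsatisfactory (vaginal vs abdominal): RR = {rr_unsat:.2f}, "
      f"95% CI {lo_unsat:.2f} to {hi_unsat:.2f}")
print(f"Optimal (abdominal vs vaginal): RR = {rr_opt:.2f}")
```

Under these assumed counts the point estimates round to the abstract's 2.03 and 2.11, and the interval for the unsatisfactory outcome rounds to 0.90 to 4.94. The same method does not reproduce the abstract's interval for optimal effectiveness, so that interval may rest on a different calculation.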
Table 3 Means and standard errors on variables from the study on reconstructive surgery for pelvic defects

Variable               Group        N     Mean        Standard Deviation   Standard Error
Age                    Vaginal      48    63.5625     9.3055               1.3431
                       Abdominal    40    66.1500     9.6571               1.5269
Haemoglobin change     Vaginal      48    2.5792      0.9496               0.1371
                       Abdominal    40    2.9525      0.9766               0.1544
Operating room time    Vaginal      48    195.6250    38.3198              5.5310
                       Abdominal    40    214.7750    46.8065              7.4008
Time to recurrence     Vaginal      30    11.5667     11.4762              2.0953
                       Abdominal    15    17.8667     11.8072              3.0486

a) Conduct an appropriate hypothesis test using the available data in Table 3 above to determine whether the results from Table 3 agree with the results statement "The groups were similar in age,…" in the abstract above. Assume that age is normally distributed in the two groups and show full working.
[3 marks]

Solution
Null: Mean age is the same in both groups
Alternative: Mean age is not the same in both groups
Using an independent samples t test with pooled variance:
Pooled standard deviation = sqrt[(47 × 9.3055² + 39 × 9.6571²) / 86] = 9.467
t = (63.5625 − 66.1500) / (9.467 × sqrt(1/48 + 1/40)) = −2.5875 / 2.027 = −1.28
The critical value of a two-sided t test with 86 degrees of freedom and alpha = 0.05 is approximately 1.99. Since |t| = 1.28 is smaller than the critical value, we fail to reject the null hypothesis and conclude that there is no significant difference in age between the two groups. This is in agreement with what was stated in the paper.

b) Calculate 95% confidence intervals for the haemoglobin change in the vaginal surgery group and the abdominal reconstructive surgery group using the data in Table 3.
[2 marks]
Each interval is calculated as mean ± t × standard error, using the t distribution with n − 1 degrees of freedom.
The 95% confidence interval for haemoglobin change in the vaginal surgery group is given as
2.5792 ± 2.012 × 0.1371 = 2.5792 ± 0.2758, that is, 2.303 to 2.855
The 95% confidence interval for haemoglobin change in the abdominal reconstructive surgery group is given as
2.9525 ± 2.023 × 0.1544 = 2.9525 ± 0.3123, that is, 2.640 to 3.265

c) Comparing the two 95% confidence intervals from part b), what might this suggest?
[1 mark]
Neither confidence interval includes zero, which means that the haemoglobin change in both groups is significantly different from zero. The two intervals also overlap, which is consistent with the abstract's statement that haemoglobin change did not differ significantly between the groups.

d) Benson et al. found the 95% confidence interval of the difference in mean operating times in the two groups was statistically significant.
According to the results in the abstract above, which group showed a decrease in operating time? Calculate and report the 95% confidence interval for this group using the data in Table 3.
[2 marks]
The vaginal group had decreased operative time.
The 95% confidence interval of the mean operating time for the vaginal group is as follows:
195.6250 ± 2.012 × 5.5310 = 195.6250 ± 11.13, that is, 184.50 to 206.75

Question 4 [10 marks]

a) Recall from Question 3 above, Benson et al. (1996) compared the operating times for 40 women who had an abdominal procedure with the operating times for 48 women who had a vaginal procedure to correct uterovaginal prolapse. Suppose the investigators determined, prior to beginning their study, that they wished to demonstrate a mean difference of 25 mins or more between the two procedures (two-sided test). Assume they were willing to accept a type I error rate of 5% and wanted to be able to detect a true difference with 85% probability. Based on their clinical experience, they estimated the standard deviation of operating times as 45 min. What would be the required sample size if the researchers decided to factor in a 20% loss to follow up of participants and aimed to recruit equal numbers for each procedure?
--------------------------------------------------------
Perform the calculation manually and show your working (that is, do not use PS or the sampsi command in Stata. NB Stata's display function may be used for mathematical calculations).

Solution
Using the following formula for calculating the sample size per group:
n = 2 × σ² × (z(1−α/2) + z(1−β))² / δ²
Type II error = 1 − 0.85 = 0.15, so z(1−β) = 1.036; z(1−α/2) = 1.96
Required change is δ = 25, with σ = 45
n = 2 × 45² × (1.96 + 1.036)² / 25² = 58.164
We solve for n and obtain a sample size of approximately 58 per group.

Briefly summarise your results
If we account for 20% loss to follow up, then we need to add 20% of 58, which is approximately 11.6. So the required sample size is approximately 70 per group.

Notes:
i) You may find it helpful to use the PS program to check your answer, but marks will not be awarded for this since some students do not have access to PS.
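The sample size arithmetic for part a) can be checked with a short Python sketch. It assumes the usual normal-approximation formula for comparing two means with equal group sizes, n = 2σ²(z(1−α/2) + z(1−β))²/δ², which is consistent with the stated result of 58.164; the function name is hypothetical. Because it uses unrounded z values, the result differs from the hand calculation in the second decimal place.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.85):
    """Per-group sample size for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
    z_beta = NormalDist().inv_cdf(power)           # about 1.036 for 85% power
    return 2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2


n = n_per_group(delta=25, sigma=45)
# Follow the worked solution: round to 58, then add 20% for loss to follow up.
n_inflated = ceil(round(n) * 1.20)

print(f"n per group before inflation: {n:.2f}")
print(f"n per group allowing for 20% loss: {n_inflated}")
```

Some texts instead divide by (1 − 0.20) to allow for loss to follow up, which gives a slightly larger number; the sketch follows the "add 20%" approach used in the solution above.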
[2 marks]

b) Using the sample size result in part a) (allowing for 20% loss to follow-up), what detectable difference could they have achieved if they considered a power of 95%?
--------------------------------------------------------
Perform the calculation manually and show your working (that is, do not use PS or the sampsi command in Stata. NB Stata's display function may be used for mathematical calculations).

Solution
Using the following formula for calculating the detectable difference:
δ = σ × (z(1−α/2) + z(1−β)) × sqrt(2 / n)
Type II error = 1 − 0.95 = 0.05
Required power is 95%, so z(1−β) = 1.645; z(1−α/2) = 1.96
δ = 45 × (1.96 + 1.645) × sqrt(2 / 70) = 27.42

Briefly summarise your results.
A detectable difference of 27.42 minutes would have been achieved if a power of 95% was used.

Notes:
i) You may find it helpful to use the PS program to check your answer, but marks will not be awarded for this.
ii) The following formulae may be helpful for answering this question. This is derived from the formula in section 9.1 on page 9 of Module S4 after rearranging for δ:
δ² = 2 × σ² × (z(1−α/2) + z(1−β))² / n
which is equivalent to the following after taking the square root of both sides:
δ = σ × (z(1−α/2) + z(1−β)) × sqrt(2 / n)
[2 marks]

c) A sample size calculation is performed to determine the number of participants required to detect a difference of 9 mm Hg between the mean systolic blood pressure (SBP) values of two independent treatment groups. Assuming a within group standard deviation of 30 mm Hg, significance level of 0.05 and 80% power, a sample size of 175 participants per group was estimated, assuming equal participants in each group. How many individuals per group would be required to provide the same power, if one of the treatment groups had 5 times the number of participants as the other? (i.e. there is a sample size ratio of 1:5 between the groups).
--------------------------------------------------------
Perform the calculation manually and show your working (that is, do not use PS or the sampsi command in Stata.
NB Stata's display function may be used for mathematical calculations)

Solution
With n = 175 per group under equal allocation and an allocation ratio of k = 5 (group 2 has 5 times as many participants as group 1):
n1 = n × (1 + k) / (2 × k) = 175 × 6 / 10 = 105
n2 = k × n1 = 5 × 105 = 525

Briefly summarise your results
105 patients for group 1 and 525 patients for group 2 (630 in total) will be required to provide the same power.

Notes:
i) You may find it helpful to use the PS program to check your answer, but marks will not be awarded for this.
[2 marks]

d) A small, initial study provides modest evidence that a particular genetic variant might influence one's risk of developing osteoporosis. In this study, the proportion of osteoporosis cases carrying at least one copy of the variant was found to be 0.36, while the proportion of controls carrying a copy was 0.31. However, the study was underpowered and the difference was not quite significant (p=0.08). In a larger, follow-up study, how many cases and controls would be required to detect the same difference with power of 90%, at a significance level of 0.01, if the ratio of cases to controls is 1:2 (that is, there are two controls for every case)?
Perform the calculation manually and show your working (that is, do not use PS or the sampsi command in Stata. NB Stata's display function may be used for mathematical calculations)

Solution
With z(1−α/2) = 2.576 for a significance level of 0.01 and z(1−β) = 1.282 for 90% power, the sample size per group under equal allocation is
n = (2.576 + 1.282)² × (0.36 × 0.64 + 0.31 × 0.69) / (0.36 − 0.31)² = 14.884 × 0.4443 / 0.0025 ≈ 2645
With two controls for every case (k = 2):
cases = n × (1 + k) / (2 × k) = 2645 × 3 / 4 ≈ 1984
controls = 2 × 1984 = 3968

Briefly summarise your results
The study would require approximately 1,984 cases and 3,968 controls to detect the same difference with a power of 90%.

Notes:
i) You may find it helpful to use the PS program to check your answer, but marks will not be awarded for this.
[4 marks]

PART B

Question 1 [10 marks]

Using the dataset located on Blackboard, answer the following questions.
The dataset is called: biosA_exam_cancer.dta

Some information about the dataset is described below:

desc

Contains data from D:\biosA_exam_cancer.dta
  obs:            48                          Patient Survival in Drug Trial
 vars:             4                          4 Jun 2008 20:47
 size:           576 (99.9% of memory free)
--------------------------------------------------------------------------
              storage  display    value
variable name   type   format     label      variable label
--------------------------------------------------------------------------
studytime       int    %8.0g                 Months to death or end of exp.
died            int    %8.0g                 1 if patient died
drug            int    %8.0g                 Drug type (1=placebo)
age             int    %8.0g                 Patient's age at start of exp.
--------------------------------------------------------------------------
Sorted by:

a) Describe the distributions of studytime for subjects in the 2 drug treatments and the placebo group. Show all necessary descriptive statistics and graphs.
[2 marks]
For the placebo group, the mean study time was 9 months, whereas it was 14.93 months for treatment arm 2 and 25.36 months for treatment arm 3. This shows that individuals on treatment arm 3 had the highest mean number of months to death or end of study.

-> Drug = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  Study time |        20           9    6.448174          1         23

-> Drug = 2
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  Study time |        14    14.92857    7.600246          6         32

-> Drug = 3
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  Study time |        14    25.35714    9.580486          6         39

The following histograms show the distribution of study time for the three groups.
There is more variability in the study times for treatment arm 3 than for arms 1 and 2. For arms 1 and 2, study times are concentrated at lower values, which is in line with what is observed from the descriptive statistics above.

[Figure: Histograms showing the distribution of study time by treatment group]

b) Determine the age of the oldest and youngest participants in all 3 drug groups at the start of the study. Show your output.
[1 mark]
The tables below show the oldest (maximum) and youngest (minimum) age of the participants at the start of the study in the three drug groups. The results show little difference in the age range of the participants across the groups at the start of the study.

-> drug = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |        20       56.05    5.558067         49         67

-> drug = 2
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |        14    56.92857    6.787594         47         67

-> drug = 3
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |        14    54.57143    4.636217         48         62

c) What was the overall percentage of participants who were known to have died? Show your output.
[1 mark]
Overall, 64.58% of participants died.

       1 if |
    patient |
       died |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         17       35.42       35.42
          1 |         31       64.58      100.00
------------+-----------------------------------
      Total |         48      100.00

d) Using an appropriate test (stating the hypotheses), determine whether the mean (or median) study times were different for the two drug treatment groups (2 and 3). Show all steps performed.
[3 marks]
Null: The mean study time is the same in treatment groups 2 and 3
Alternative: The mean study times in treatment groups 2 and 3 differ
Analysis of variance was used to test this hypothesis. With only two groups, this is equivalent to an independent samples t test with pooled variance.
Note: Only data from treatment groups 2 and 3 were used to construct the ANOVA.
From the results, the p-value is 0.0037, which is less than the alpha level of 0.05. This means we reject the null hypothesis and conclude that the mean study times in treatment groups 2 and 3 differ (Kutner, et al., 2005).

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      761.285714      1   761.285714     10.18     0.0037
 Within groups      1944.14286     26   74.7747253
------------------------------------------------------------------------
    Total           2705.42857     27   100.201058

e) Using an appropriate test, determine whether the mean (or median) age differed between individuals in the placebo group and the combined group of individuals in the two drug groups. Show all steps performed.
[3 marks]
Null: The mean age is the same in the placebo group and the combined treatment groups 2 and 3
Alternative: The mean age in the placebo group and the combined treatment groups 2 and 3 differ
The ANOVA gives a p-value of 0.8586, which is much larger than the alpha level of 0.05. We therefore fail to reject the null hypothesis: there is no evidence that the mean age differs between the placebo group and the combined treatment groups 2 and 3.

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups            1.05      1         1.05      0.03     0.8586
 Within groups          1504.2     46         32.7
------------------------------------------------------------------------
    Total              1505.25     47   32.0265957

Reference
Kutner, M. H., Nachtsheim, C. J., Neter, J. & Li, W., 2005. Applied Linear Statistical Models. New York: McGraw-Hill.

END OF EXAMINATION
