
Biostatistics - Variable Data Types - Essay Example




Subject: Health Sciences & Medicine
Type: Essay
Level: Undergraduate
Pages: 4 (1000 words)
Author: pfefferalexane

Extract of sample "Biostatistics - Variable Data Types"

BIOSTATISTICS DENT70001: MID-TERM ASSIGNMENT

Question 1

a) By saying “The P value greater than 0.05 was considered to be significant” (p. 771), the author meant that the observed evidence was enough to decide that there was no relationship between periodontal disease during pregnancy and preeclampsia. In statistics, P values less than 0.05 (or whatever level of significance is being used) are usually considered statistically significant (Lawrence, Klimberg, & Lawrence 2009). The author goes on to indicate, “mean of clinical attachment loss was not significantly different between two groups (P = 0.16),” (Yaghini et al. 2012, p. 772). Here, the P value is greater than 0.05, and yet the authors describe the difference as not significant. The statement therefore means that the evidence presented was not strong enough to conclude that the means were different.

b) Variable data types

i. Number of pregnancies is a discrete (discontinuous) variable. Pregnancies are counted in whole numbers: a subject is either pregnant or not, so the count takes only distinct values.

ii. Clinical attachment loss (CAL) measured in [whole] millimetres (mm) is a continuous variable because it is a mean of the distances from the cementoenamel junction to the center of the pocket. The distances can take on any numerical value and are therefore continuous.

iii. Gingival bleeding index (GBI) is a continuous variable because it can take any numerical value: it is obtained from percentages of the mean bleeding areas.

iv. Mean GBI is a continuous variable because it is a measure of central tendency and can take any numerical value.

v. Plaque index (PI) is a continuous variable because it consists of the mean PI values, which can take on any numerical value.

vi. Mean PI is a continuous variable because the average of the PIs can take on any value.

Question 2

Assuming that maternal age for the population of women with preeclampsia is normally distributed and the true population mean and standard deviation are known to be 28.5 and 4.5 years respectively:

a) The mean maternal age of women with preeclampsia in the sample is a true reflection of the population. This is because the sample mean age and standard deviation are equal to the population mean age and standard deviation. This is consistent with the population being normally distributed, which is one of the assumptions required for accurate statistics (Lomax 2007).

b) The most likely maternal age of a woman with preeclampsia drawn at random is 28.5 years. This is because in a normal distribution the mean, mode, and median have the same value (Lomax 2007). If the mean of the population is 28.5 years, the mode and median are also 28.5 years. Therefore, the single most likely age of a woman drawn at random from the population is 28.5 years, the mode of the population.

c) It is expected that about 95% of the population lie between the ages of 19.5 and 37.5 years. The empirical rule states that in a normal distribution almost all values lie within 3 standard deviations of the mean (Grafarend 2006). About 68% lie within one standard deviation, 95% within two standard deviations, and 99.7% within three standard deviations. It follows that about 95% of the population lie between 28.5 - (4.5 × 2) and 28.5 + (4.5 × 2). This gives a range of 19.5 years to 37.5 years.

d) A histogram sketch of the distribution (figure not included in this extract).

Question 3

A second sample of mothers is drawn at random from each of the control and case populations.

a) A graphical summary comparing the case and control groups (figure not included in this extract).

b) The sample mean and standard deviation of each group.

Group      Mean     Standard deviation
Control    0.223    0.28808
Case       0.509    0.553894

c) Interpretation of the descriptive statistics. The mean CAL in the cases is higher than the mean CAL in the controls.
The standard deviations are large relative to the means, implying that the values are widely spread and many lie far from the means.

d) The mean CAL values are normally distributed in the two groups. Most of the data points are close to the mean, whereas few data points are far from the mean. Therefore, there is a high likelihood that the data follow a normal distribution.

e) The difference in the mean CAL between case and control groups with the corresponding 95% confidence interval. Taking group 1 as the cases and group 2 as the controls, the confidence interval is given by the formula

$$(M_1 - M_2) \pm t_{\alpha/2,\,df} \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Here $n_1 = n_2 = 20$, and $M_1 - M_2$ is the difference in the means between the case and control groups, which is $0.509 - 0.223 = 0.286$. The degrees of freedom are $(n_1 - 1) + (n_2 - 1) = (20 - 1) + (20 - 1) = 38$. The value $t_{\alpha/2,\,df}$ is obtained by finding the value of t at $0.05/2$ and 38 degrees of freedom, which is 2.0244.

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{(0.553894)^2}{20} + \frac{(0.28808)^2}{20}} = \sqrt{0.019489432} = 0.139604556$$

Upper limit $= 0.286 + (2.0244 \times 0.139604556) = 0.286 + 0.2826 = 0.5686$

Lower limit $= 0.286 - (2.0244 \times 0.139604556) = 0.286 - 0.2826 = 0.0034$

The confidence interval is $0.0034 \le \mu_{case} - \mu_{control} \le 0.5686$.

f) T-test to investigate the statistical significance of the difference in mean CAL between case and control groups. The null hypothesis is $H_0: \mu_1 = \mu_2$, whereas the alternative hypothesis is $H_1: \mu_1 \ne \mu_2$. The test statistic is

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

with $n_1 + n_2 - 2$ degrees of freedom, where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Rejection criteria: we reject $H_0$ if $t_{cal} \ge t_{\alpha/2,\,n_1+n_2-2}$ or if $t_{cal} \le -t_{\alpha/2,\,n_1+n_2-2}$, where $t_{0.025,\,38} = 2.0244$.

$$s_p^2 = \frac{19(0.553894)^2 + 19(0.28808)^2}{20 + 20 - 2} = 0.194894324, \qquad s_p = \sqrt{0.194894324} \approx 0.44147$$

$$t_{cal} = \frac{0.509 - 0.223}{0.44147 \times \sqrt{\frac{1}{20} + \frac{1}{20}}} = \frac{0.286}{0.139605} \approx 2.0486$$

From the calculations, $t_{cal}$ is greater than $t_{0.025,\,38}$. Therefore, we reject the null hypothesis and conclude that the difference in mean CAL between the control and the case groups is statistically significant.
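
The interval in part (e) and the test statistic in part (f) can be re-derived from the summary statistics alone. The following is a minimal sketch, not the author's method (the essay works by hand and with MS Excel). It assumes group 1 is the case group and group 2 the control group, takes the critical value 2.0244 from the text rather than computing it, and uses hypothetical helper names. Because Python's standard library has no Student t distribution, the two-sided P value is approximated by integrating the t density with Simpson's rule. Everything is computed at full precision, so the last decimal place may differ from hand-rounded figures.

```python
import math

# Summary statistics from Question 3(b); group 1 = case, group 2 = control (assumed labelling).
M_CASE, S_CASE, N_CASE = 0.509, 0.553894, 20
M_CTRL, S_CTRL, N_CTRL = 0.223, 0.28808, 20
T_CRIT = 2.0244  # t(0.025, 38) as quoted in the text, not computed here


def confidence_interval(m1, s1, n1, m2, s2, n2, t_crit):
    """Interval for m1 - m2 using the standard error sqrt(s1^2/n1 + s2^2/n2)."""
    diff = m1 - m2
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return diff - t_crit * se, diff + t_crit * se


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P value: 1 - 2 * integral of the density over [0, |t|] (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1 - 2 * (total * h / 3)


lower, upper = confidence_interval(M_CASE, S_CASE, N_CASE, M_CTRL, S_CTRL, N_CTRL, T_CRIT)
t_cal, df = pooled_t(M_CASE, S_CASE, N_CASE, M_CTRL, S_CTRL, N_CTRL)
p_value = two_sided_p(t_cal, df)

print(f"95% CI for case - control: {lower:.4f} to {upper:.4f}")
print(f"t = {t_cal:.4f} on {df} df, two-sided P = {p_value:.4f}")
```

With equal group sizes the pooled standard error and the unpooled one used for the interval coincide, which is why the interval and the t-test lead to the same decision here.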
Performing the same t-test in MS Excel gives a P value of 0.047439, which is less than 0.05, further evidence that the difference between the mean CAL values is statistically significant.

g) The confidence interval and the t-test are consistent with each other. The 95% confidence interval for the difference in mean CAL computed in part (e) excludes zero, which agrees with the t-test's finding that the difference between the case and control groups is statistically significant at the 5% level. Differences in mean CAL outside this interval are not supported by the data at the 95% confidence level.

Question 4

The findings in Question 3 show a statistically significant difference in mean CAL between the control and case groups, with a 95% confidence interval that excludes zero, and yet Yaghini et al. report no statistically significant difference in the mean CAL values (P = 0.16). However, the P value obtained here (0.047439) is only marginally below 0.05, so the evidence for a difference is weak. According to Lawrence, Klimberg, and Lawrence (2009), the smaller the P value, the stronger the evidence supporting a statistical decision. Because the P value obtained is so close to 0.05, a contradicting result in the final report by Yaghini et al. is not surprising.

The central limit theorem implies that the larger the sample, the more closely the sample mean reflects the population mean (Walker & Maddan 2012). The analysis in Question 3 uses smaller samples (20 per group) than the final samples used by Yaghini et al. (25 and 26). The larger samples may have reduced the standard errors of the sampling distribution, which would explain the difference between the two sets of results. With the larger samples, the P value rose and the observed difference in means was no longer statistically significant.

References

Grafarend, E. 2006, Linear and non-linear models: fixed effects, random effects, and mixed effects, Walter de Gruyter, Stuttgart, Germany.

Lawrence, K. D., Klimberg, R. K., & Lawrence, S. M. 2009, Fundamentals of forecasting using Excel, Industrial Press Inc., London.

Lomax, R. G. 2007, An introduction to statistical concepts, 2nd edn, Lawrence Erlbaum Associates Inc., New Jersey.

Walker, T. J., & Maddan, S. 2012, Statistics in criminology and criminal justice, 4th edn, Jones & Bartlett Publishers, Burlington, MA.

Yaghini, J., Mostajeran, F., Afshari, E., & Naghsh, N. 2012, “Is periodontal disease related to preeclampsia?” Dental Research Journal, vol. 9, no. 6, pp. 770-773.

Cite this document


(“Biostatistics Essay Example | Topics and Well Written Essays - 1000 words”, n.d.)
Biostatistics Essay Example | Topics and Well Written Essays - 1000 words. Retrieved from https://studentshare.org/health-sciences-medicine/1620089-biostatistics

