Unreliable test Essay Example | Topics and Well Written Essays

Answer each of the following questions; each response should be about two pages. Use one source per question if possible.

1. Can an unreliable test be valid? Why or why not?

Reliability is an estimate of the consistency of a measurement: a reliable test produces the same results each time it is administered under the same conditions. Validity is a measure of accuracy: a valid test measures what it is intended to measure and yields correct results. Validity is the more important of the two, because if the results are not correct there is no reason to use them; they will not support an appropriate conclusion.

There are cases where unreliable tests can be valid. For example, in the first Gallup Poll, random digit dialing was used to determine the public's opinion of political candidates. However, in the 1930s, when this survey was conducted, most people who owned phones were upper-class Caucasian families or men. Their opinions on the upcoming election and on the candidates were valid, but the sampling itself was unreliable: it was not a representative sample of the country, so its results were not necessarily consistent with the views of the country as a whole. Only a small, elite portion of the population could be reached (Golay 1996).

2. What are the four methods of establishing validity? How are these methods similar? How are they different?

Kirk (1985) defines the four methods of establishing validity as case studies, focus groups, surveys, and blind experimental studies. Case studies are non-experimental, descriptive studies in which an outside observer keeps an in-depth record. These records can include observations, accounts of individuals' experiences, biological data, medical records, family history, interviews, and the results of psychological tests.
The objective of a focus group is to interview six to ten people at the same time to obtain their opinions and perspectives when evaluating, for example, a new household product, the media, or a food. The goal is to start an open discussion from a few questions in search of a particular answer. Surveys can have many objectives, from determining employee satisfaction to finding out individuals' opinions about the government. Surveys can be conducted anonymously, without consequence to the person taking them, or they can be used to gather information about a specific individual. Finally, there are blind experimental studies, in which participants are not told the objective of the study until after it is completed. An experiment has an independent variable, which the researcher manipulates, and a dependent variable, which is measured to see whether it changes in response. The benefit of experiments is that they help researchers determine cause-and-effect relationships by manipulating one variable.

The four methods are similar in that they can be used in many types of situations and with many different types of people, and each can offer participants anonymity. They differ in the time each takes to conduct, the kind of information obtained, and the people who conduct them.

3. What are the four methods of establishing reliability? How are these methods similar? How are they different?

Rudner and Shafer (2001) define four methods of testing reliability: test-retest reliability, inter-rater reliability, parallel-forms reliability, and internal consistency reliability. In test-retest reliability, a test is administered twice, at two different times, to measure whether the results change over time.
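As a rough numeric illustration of the test-retest idea, the sketch below correlates scores from two administrations of the same test; a correlation close to 1 suggests the results are consistent over time. The scores and the names pearson_r, first_administration, and second_administration are hypothetical and do not come from Rudner and Shafer (2001).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each administration")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five test takers at two different times.
first_administration = [70, 82, 90, 65, 77]
second_administration = [72, 80, 93, 63, 78]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

The same function could be applied to two raters' scores or to two parallel forms, since each of those comparisons also pairs two sets of scores for the same subjects.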
In inter-rater reliability, two different people score the same test; if their scores are consistent, the test is reliable. In parallel-forms reliability, two tests built from the same content are compared with each other. The tests are administered at the same time to the same subjects. Finally, in internal-consistency reliability, test items that may be worded differently but measure the same information are compared against each other.

The four methods are similar in that each requires at least two measurements to determine reliability, and all four can be used in the same situation. They differ in their limitations: test-retest is subject to time constraints that the other methods do not face; in inter-rater reliability the judgments of two different people can vary drastically; in parallel-forms reliability the two measures may not be exactly equivalent; and internal consistency depends on the wording of the questions.

4. What would be considered evidence of validity? Explain.

Evidence of validity is adherence to the measures set out when an experiment is first designed. A valid test measures what it is intended to measure, i.e., it attempts to answer the question posed in its hypothesis, and the study itself supports the conclusion drawn from the results (Cronbach and Meehl 1955). Furthermore, evidence of validity includes an appropriate sampling of the population to support the study's conclusion, and the absence of bias behind that conclusion.
Other factors include that the same conclusions can be reached in other populations, that the test was conducted under appropriately controlled conditions, that it is non-discriminatory, and that it is consistent with other studies in the same field. No specific piece of evidence determines validity, but the guidelines above improve the chances that an experiment, a study, or a test is valid.

5. What is a construct? How can you establish a measure for a construct's validity?

A construct is a concept that describes a number of characteristics, attitudes, and values that are unobservable and often abstract. Constructs are generalized concepts invented by people in a society or culture and widely accepted within it, such as social status (Cronbach and Meehl 1955). To establish a measure of construct validity, one must focus on the social construct variables that are known to affect a study, and then evaluate the expectations about why they have an effect. A proper measure examines the correlations underlying generalizations from numerous studies. For example, in Campbell and Fiske's multitrait-multimethod matrix (1959), six major considerations are made to determine construct validity, including whether participants come from a similar location, where the same social constructs are held, and whether those correlate with observable traits. The best way to test what cannot be seen is to compare results with what can be seen. For example, a person polled on their preferred presidential candidate can be influenced by an underlying belief that men should always hold positions of power. This belief may not be seen or heard, but it can be correlated with where a person lives, how they vote, and the people they surround themselves with.

6. What are some examples of possible test bias? How can test bias negatively affect test results?

Test bias is a difference in test scores that is attributable to demographic variables such as gender, ethnicity, race, or age. Other factors can include the structure of the school system, the home life of the test taker, and the test taker's place of origin. Many standardized tests given to students to measure progress rely on the school system to prepare students appropriately. Test bias can negatively affect test results because there are extraneous circumstances that cannot be controlled (Berk 1982). For example, a student may do poorly on a reading comprehension or grammar test because English is their second language, which many tests do not accommodate. Another possibility is that students take an age-appropriate test, such as high school students taking the PSAT, but come from an underdeveloped neighborhood where the school system cannot employ enough teachers to help students study, so their reading, writing, and math skills fall below grade-level expectations. These situations can produce poor results but are not accounted for as test bias. Finally, a more obvious test bias arises when young boys are given a test about a female's experience during the menstrual cycle; their results are skewed because of their gender.

7. Are the differences of an IQ test score based on cultural differences an example of slope or intercept bias? Explain your reasoning.

Differences in IQ test scores based on cultural differences are an example of intercept bias. Predicting how an individual will perform based on their cultural background can significantly inflate the results of the test (Berk 1982). For example, children from some backgrounds, such as Indian and Chinese children, are stereotyped as highly intelligent.
If that is what is expected of them, and the children are aware of this expectation, then they are more likely to score high. However, if someone from a low-income community is expected to do poorly on the test, their IQ score will reflect that prediction.

References

Berk, Ronald A., ed. (1982). Handbook of methods for determining test bias. Baltimore: Johns Hopkins University Press.
Campbell, D.T., & Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin.
Cronbach, Lee J., & Meehl, Paul E. (1955). Construct validity in psychological tests. Psychological Bulletin.
Golay, Michael. (1996). Where America stands. Boston: John Wiley & Sons.
Kirk, Jerome. (1985). Reliability and validity in qualitative research. Sage Publications, Inc.
Rudner, L.M., & Shafer, W.D. (2001). Reliability. ERIC Digest. College Park, MD: ERIC Clearinghouse on Assessment and Evaluation.
