
Quality Control, Microarrays Data Calibration and the Quantification of Differential Expression - Lab Report Example

Summary
The paper "Quality Control, Microarrays Data Calibration and the Quantification of Differential Expression" is about the data quality of microarrays, using data from two groups, control and treatment, supplied in files. The results were compared through normalization and random array as well as outlier removal…

Extract of sample "Quality Control, Microarrays Data Calibration and the Quantification of Differential Expression"

Quality Control – Microarrays

Table of Contents: Abstract, Introduction, Background, Method, Results (Scatter plot, Normalization), Discussion, Conclusion

Abstract

In microarray analysis, data quality is important for obtaining reliable results. This report examines the data quality of microarrays using data from two groups, control and treatment, supplied in files. The results were compared through normalization and random array as well as outlier removal. The results show that, once outliers were removed, the genes of the two groups were differentially expressed. Dataset control and the removal of outliers from the microarrays improved the measured differences in gene expression between the two groups.

Introduction

In gene profiling, microarrays are used to determine who is likely to use a certain type of drug. In microarray data testing, two approaches are used to analyze different types of genes. One approach is numerical and statistical: the data are normalized using software such as R. Normalization manipulates the datasets so that they are uniform, using approaches that include box plots and lines of best fit on scatter diagrams. It can be used to control the type I errors that are likely to be associated with the data. Microarray data quality control aims to determine whether the data are suitable for providing reliable results. This analysis establishes what percentage of the data falls within the range needed to determine the results. The main aim of the experiment was to determine whether genes are expressed differently in the two groups, control and treatment. Average values and variances are used for this purpose. This paper reviews the data quality approach to managing quality control in microarrays.
Microarray data quality was evaluated before the data were processed to produce the results that would be relied upon for decisions about the effectiveness of the treatment. Microarray data quality control is a critical area in which problems of implication, outliers, and unnecessary conditions need to be understood and eliminated. The experiment was repeated five times, producing different results that then had to be normalized. This paper analyzes and summarizes the various nuances of this issue and provides a perspective based on the results of the analysis. Studies appear to suggest that a quality control system is in play, given that outliers occurred even in the control dataset. Finally, modalities for measuring differences in gene expression have been assessed, along with strategies for increasing quality.

Background

Academic literature teems with articles discussing microarray data quality control as it relates to differences in gene expression among patients in the control and treatment groups of drug testing. However, the phenomenon of microarray data quality control has not been adequately explored and assessed. R packages have been used in the past to assess and improve the data collected from an experiment, with the intention of deciding whether the outcome is correct or the experiment needs to be repeated. Medical scientists thus use them to determine whether the experimental data collected are fit for their intended purpose. They also help scientists choose, from among several experiments, the one that provides proper data for use. Poor quality in microarrays arises from differences in RNA quality and from a wrong hybridization step during testing. A wrong hybridization step makes the entire experiment wrong and produces large individual outlier arrays, while differences in RNA quality affect only the data of a specific experiment.
In any experiment, the control group receives care from the scientists working on the experiment, while the treatment group is given the drug. The study revolves around visits conducted by the two groups of scientists and their follow-up measures, mostly observation, testing and interpretation. The control group is expected to show differences in gene expression compared with the treatment group.

Method

The five control and five treatment Affymetrix CEL files were normalized using different algorithms in R. During normalization, variance stabilization was carried out by transforming the data extracted from the files with the log2(x) function in the software. This works by identifying a log-likelihood function and maximizing it for the gene values. The maximum likelihood method is used in both linear and non-linear models and gives reliable results. The data were then analyzed using the median function and examined using box plots, density plots and pairwise plots, which also represent the genes that are differentially expressed. To obtain well-normalized data, Sweave was used to look at the impact of each gene on the results. In the process, some genes are discarded as irrelevant. The high-quality data that remain after poor-quality microarrays are eliminated provide the measurements of gene expression. Our computations relied on R to determine ANOVA for gene expression in the control and treatment groups. ANOVA was employed because it takes into account the differences in genes between the two groups, and P-values were determined. RMA was used to obtain normalized and corrected data from the CEL files. A log-linear mixed model was fitted to the normalised data, with all conditions and interactions considered in terms of their effects. The R package function for Prediction Analysis of Microarrays helped identify the genes that were responsive to treatment, because it ranks genes by z-values or t-values.
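As a rough illustration of two of the steps named in the Method, the sketch below applies a log2 transform to raw intensities and computes a one-way ANOVA F statistic for a single gene across the control and treatment groups. It is written in plain Python rather than the R used in the report, the function names and the intensity values are made up for the example, and it stops at the F statistic: turning it into a P-value needs an F-distribution routine that is not shown here.

```python
import math
from statistics import mean


def log2_transform(intensities):
    """Return log2 of each raw intensity (all values must be positive)."""
    return [math.log2(x) for x in intensities]


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)                                   # number of groups
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical raw intensities for one gene on five control and five treatment arrays
control_raw = [64.0, 128.0, 128.0, 256.0, 128.0]
treatment_raw = [256.0, 512.0, 512.0, 1024.0, 512.0]

control = log2_transform(control_raw)
treatment = log2_transform(treatment_raw)
f_statistic = one_way_anova_f([control, treatment])
print(f"F statistic for this gene: {f_statistic:.2f}")
```

With only two groups, the F statistic equals the square of the two-sample t statistic, so a large value points to a gene whose mean expression differs between control and treatment.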
Results

The dataset used in this experiment consisted of CEL files. The control files were ctrl1T2.CEL, ctrl2T1.CEL, ctrl2T3.CEL, ctrl3T1.CEL and ctrl3T3.CEL, and the treatment files were treat4T3.CEL, treat4T2.CEL, treat3T1.CEL, treat2T1.CEL and treat1T2.CEL.

Scatter plot

To begin with, R code was used to create a line of best fit for each of the selected microarrays on a scatter diagram.

Figure 1: Graph showing lines of best fit of expression levels for control probes and treatment

The graph above shows the lines of best fit for the control and treatment data as obtained. Scatter plots were also drawn to assist in creating the best fit and to show the distribution of the data. The structure of the graph shows clearly that the best fit line is appropriate, as no major outliers can be noted. The graph is plotted from the raw data provided in the CEL files. The coefficients that were determined define a linear equation, showing that the relationship is linear. The graph indicates that, under experiment, both control and treatment variables give a linear best fit. Sometimes the upper and lower extreme values are treated as outliers and are excluded from the dataset.

Normalization

For normalization, the box plot (box-and-whisker diagram) was used as a convenient way of graphically depicting groups of numerical data through their five-number summaries. A box plot is a very good way to observe outliers. If any value lies more than 1.5 times the length of the box (the interquartile range) beyond the edge of the box, we have evidence of outliers. In the case analyzed here, the box plot helps show whether the data have many outliers and are therefore unreliable. Unreliability is usually caused by systematic bias and anomalies in the data source. In R, the hist() and boxplot() functions were used to make these plots from the dataset provided. The following figure emerged from the datasets selected.
Figure 2: Box plots and histograms for log2-transformed intensities

From the graph above it can be noted that outliers are present in control3 and treatment3, where values extend more than 1.5 times the box length beyond the box. The histogram complements the box plot by giving the reader a broad visual impression of how each dataset is distributed. It is usually used to learn the shape of a dataset, including whether the data are evenly distributed. Depending on the values presented, a histogram may be highly or moderately skewed to the left or right. A symmetrical shape is also possible, although a histogram is never perfectly symmetrical. If the histogram is skewed to the left, or negatively skewed, the tail extends further to the left. Histograms of multiple datasets are sometimes drawn together to compare the spread of the datasets in relation to each other. A drawback of this approach is that datasets that are not in the same range are very hard to compare. From the results above it can be noted that all datasets are normally distributed except control3 and treatment3, which are skewed to the left.

Discussion

This section discusses the work, giving one or more observations for each type of analysis. Proper microarray data quality control helps produce results that are reliable in terms of t-values, gene set enrichment, p-values and clustering. This is achieved through data normalization and the removal of outliers from the datasets. Some analyses serve only to back up or further support what is observed in other analyses.
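The two graphical checks described in the Results, the 1.5 times box length rule for outliers and the direction of skew in a histogram, can also be expressed numerically. The sketch below is a minimal Python illustration, not the R code behind the report: the function names and sample values are hypothetical, the rule is read in its conventional form (values more than 1.5 interquartile ranges beyond the quartiles), and the quartile convention used here may differ slightly from the hinges that R's boxplot() draws.

```python
from statistics import mean, pstdev, quantiles


def box_plot_outliers(values):
    """Return the values lying more than 1.5 box lengths (IQR) beyond the box."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                      # length of the box
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


def skewness(values):
    """Return the moment-based sample skewness (negative means a longer left tail)."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical log2 intensities for one array, with one unusually low value
log2_intensities = [6.8, 7.0, 7.1, 7.2, 7.3, 7.5, 2.0]

print("Outliers:", box_plot_outliers(log2_intensities))
print("Skewed to the left:", skewness(log2_intensities) < 0)
```

A numerical check of this kind would supplement the plots rather than replace them, since the plots also show the overall shape of each dataset.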
The scatter plots and lines of best fit make the data easier to interpret. The box-and-whisker plots showed the spread of the datasets. From Figures 1 and 3, the spread of the datasets is much lower except in ctrl3T3.CEL and treat3T1.CEL. ctrl3T3.CEL and treat3T1.CEL also have outliers, because values extend more than 1.5 times the box length beyond the box. The histograms show that all files except ctrl3T3.CEL and treat3T1.CEL look like a normal distribution until the highest frequency is reached, after which a break appears in the graph. Normalization and outlier removal proved efficient in producing quality data that can be relied upon. Outlier microarray data in the two sets proved to be of low quality, as they affected the results produced. Their presence in the analysis could have distorted the results by adding noise, yielding information that is unreliable biologically and statistically. In the experiment, outliers were detected using the R software for normalization.

Conclusion

Quality control for microarray datasets involves eliminating outliers after testing for randomness. This enables scientists to do a better job of detecting abnormalities in the datasets and provides insight into correcting an experiment that is out of control. The aim of any scientist collecting microarray data is to identify genes that respond slowly or quickly to treatment. Of the datasets provided, only two had outliers: control3 and treatment3, in files ctrl3T3.CEL and treat3T1.CEL respectively. Lastly, the results confirm the suggestion put forward at the beginning of the experiment. Many of the tests performed added value to the findings, and some of the hypotheses were rejected while others were accepted. The confidence interval for the data was found to be consistent with the findings of the scatter plot.
This shows that a graphical method is sometimes the best way to figure out a pattern.