DNA Sequencing and Microarray Technology

Biology Topic: Biotechnology Quiz 21/11

DNA Sequencing

Several methods have been developed for DNA sequencing, notably the Maxam-Gilbert method and the Sanger method, which were developed independently and published in 1977. The Maxam-Gilbert method (Maxam and Gilbert 560-561) determines a DNA sequence by radioactively labeling one end of the DNA and then partially cleaving it with base-specific chemical reagents. This partial cleavage produces a set of nested radioactive fragments. Polyacrylamide gel electrophoresis (PAGE) resolves the single-stranded fragments by size, which is determined by the point of breakage. An autoradiograph of the gel is then generated, from which the bases are read and the sequence derived.

The Sanger method also uses a gel to resolve the fragments but does not involve chemical digestion of DNA. Instead, modified nucleotides, the dideoxynucleotide triphosphates (ddNTPs), are used as chain terminators. The ddNTPs lack the 3'-OH group required for chain elongation by DNA polymerase. The Sanger method is applied in three variants. In the dye-primer variant, DNA polymerase extends a labeled primer until a ddNTP is incorporated, causing chain termination. In the dye-nucleotide variant, labeled dNTPs are used in place of a labeled primer, and more label per fragment gives better sensitivity. The dye-terminator variant improves on the other two by using ddNTPs labeled with a specific dye for each base. Unlike the dye-primer and dye-nucleotide variants, it makes it possible to run the four termination reactions in a single lane. Gel or capillary electrophoresis is used to resolve the fragments.

Genomic sequencing strategies

The Sanger methods sequence best over reads of about 30-350 nucleotides, so genomic sequencing strategies have been developed to sequence longer stretches of DNA, such as the gene of interest in this plant.
In shotgun sequencing strategies, DNA, often of large size, is shredded into smaller fragments that can then be sequenced individually. The DNA is shredded with restriction enzymes or mechanically by shearing. The sequences of these fragments are then reassembled into their original order based on their overlaps. The alignment is usually done by a computer program to yield the complete sequence.

In whole-genome shotgun sequencing, the DNA is obtained without prior knowledge of a physical map and is sheared indiscriminately into fragments of 100 kb, which are then cloned into plasmids and transformed. The DNA inserts recovered from the plasmids are sequenced individually and then assembled into a long contiguous sequence. The strategy is limited by gaps that arise during assembly because of repeats in the sequence.

Another strategy is primer walking, which addresses the gaps left by whole-genome shotgun assembly. Clones carrying inserts that cover both sides of a gap are identified and the DNA is sequenced normally. The resulting sequence is used to design a new primer downstream of the previous primer position. These steps are repeated until the complete sequence of the insert is elucidated.

Pairwise-end sequencing (double-barrel shotgun) is another strategy for genome sequencing, in which both ends of each DNA fragment are sequenced, as opposed to one end in whole-genome shotgun. It reduces gaps, thereby minimizing the assembly errors that are common in whole-genome shotgun sequencing. However, it poses a huge computational challenge during assembly.

In the hierarchical shotgun sequencing strategy, DNA is shredded into fragments of about 150 kb and inserted into BACs. The inserts are placed on a physical map and organized by known location into a "Golden Tiling Path". The inserts are then fragmented further and cloned into plasmids, from which they are recovered and sequenced according to the Golden Tiling Path.
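The reassembly of shotgun fragments from their overlaps can be illustrated with a small greedy merge. This is only a sketch under simplifying assumptions (error-free reads, no repeats, a single strand); the function names `overlap` and `greedy_assemble` and the toy reads are hypothetical, and real assemblers use far more elaborate methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    start = 0
    while True:
        # Look for the first min_len letters of b somewhere in a.
        start = a.find(b[:min_len], start)
        if start == -1:
            return 0
        # A true overlap means the rest of a, from here on, starts b.
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    olen = overlap(a, b, min_len)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            # No overlaps left: what remains are contigs separated by gaps.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    toy_reads = ["CAGGCTA", "ACGTTGC", "TTGCAGG"]  # made-up fragments
    print(greedy_assemble(toy_reads))
```

When no overlap of at least `min_len` remains, the function returns more than one contig, which corresponds to the assembly gaps mentioned above. Repeats would mislead this greedy choice, which is one reason they cause assembly problems.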
Hierarchical shotgun sequencing is applied to long pieces of DNA such as a whole genome or chromosome, and it is also used with next-generation sequencing.

Next-generation sequencing (NGS) comprises revolutionary strategies for DNA sequencing that are more accurate and cost effective than the existing capillary-based cycle sequencing methods. NGS instruments implement various forms of cyclic-array sequencing, which circumvent several of the challenges of conventional sequencing strategies. One such technology is sequencing by synthesis on clonally clustered amplicons (amplification products) generated by emulsion or bridge PCR (Mardis 387-389). In this scenario, next-generation sequencing would be my preferred method for sequencing the genome of the plant, since the methods are accurate, fast and cost effective. NGS methods such as pyrosequencing (a sequencing-by-synthesis method) avoid the tedious cycles of the conventional methods of DNA sequencing.

Sequencing Expressed Sequences

The eukaryotic genome is composed of exons, which are the coding DNA, interspersed with non-coding portions, the introns. Sequencing the entire genome, though important for gaining insight into an organism, is an expensive undertaking. Sequencing expressed sequences is cost effective since one sequences only the expressed genes of interest. The method avoids redundancy because only the exons, which are the coding regions, are sequenced. The introns are not sequenced because the process involves reverse transcription of mRNA to obtain the cDNA of interest. A disadvantage of the method is that one cannot sequence the complete gene: in eukaryotes the introns, which may contain signals important for the expression of the gene, are spliced out of the mature processed mRNA.
Another disadvantage is that the expression of a trait, for instance the color change in this plant, may result from the influence of other genes or from the combined expression pattern of a family or group of genes, some upregulated and others downregulated. Therefore one may not obtain full information about a trait observed in an organism by sequencing only the expressed genes. EST stands for Expressed Sequence Tag.

Sequence received from third party service provider for DNA sequencing

ENTRY novel
TITLE novel 832 bases
SEQUENCE
5 10 15 20 25 30
1 C T G C T C T C T C T C T T C C G G C G C C G A T C A T G G
31 C G G A C A A G G A G G C A A A G A A G G T G C C A T C T G
61 T C C C G G A A A G C C T C C T G A A G A G G C G A C A G G
91 C T T A T G C A G C C G C A A A A G C C A A A C G T T T G A
121 A G A G G C T G T T G G C T C A A A A A A A G T T T C G T A
151 A A G C G C A A A G G A A A A T C A T C T A T G A A A G G G
181 C C A A A G C T T A C C A T A A G G A G T A C A G A C A C A
211 T G T A C A G A C A G G A G A T C C G C A T G G C C A G G A
241 T G G C C A G G A A A G C C G G T A A T T A C T A T G T T C
271 C A G C T G A A C C G A A G C T T G C A T T T G T C A T C A
301 G G A T A A G A G G T A T C A A T G G T G T G A G C C C C A
331 A G G T C C G G A A G G T G T T A C A G C T T C T T C G C C
361 T G C G T C A G A T T T T T A A T G G C A C C T T T G T G A
391 A G C T C A A C A A A G C T T C T A T C A A C A T G C T G C
421 G G A T T G T T G A G C C C T A T A T T G C A T G G G G T T
451 A T C C C A A T C T G A A G T C T G T G C A T G A T T T G A
481 T C T A C A A G C G T G G T T A T G G C A A G A T C A A C A
511 A G A A G C G C A T T G C T C T C A C T G A C A A C T C C C
541 T G A T T C G G A A G C G C C T T G G A A A A C T T G G C A
571 T C A T C T G C A T G G A A G A T G T G A T C C A T G A A A
601 T T T A T A C T G T T G G C A A G A A C T T C A A A G T T G
631 T G A A C A A C T T C C T T T G G C C C T T C A A G T T A T
661 C C T C T C C T C G G G G T G G A A T G A A G A A G A A A A
691 C G A T C C A C T T T G T G G A G G G T G G A G A T G C T G
721 G T A A C A G A G A A G A T C A G A T A A A C A G A C T C A
751 T C A G G A G A A T G A A C T A A T G A T T C A G A T G C C
781 C A G C T G C A G T T T T T C A A A G T C T G G T C T G T T
811 A A T A A A A T C T T C T T G C A C A A A A
///

Sequence conversion

Readseq, which is available at http://www-bimas.cit.nih.gov/molbio/readseq/, is a tool for converting sequence data from one format to another, for instance from GenBank format to EMBL format. It can also combine numerous sequence data files into one.

Readseq fasta file format output

>nameless_1 832 bp (sequence x)
CTGCTCTCTCTCTTCCGGCGCCGATCATGGCGGACAAGGAGGCAAAGAAGGTGCCATCTGTCCCGGAAAGCCTCCTGAAGAGGCGACAGGCTTATGCAGCCGCAAAAGCCAAACGTTTGAAGAGGCTGTTGGCTCAAAAAAAGTTTCGTAAAGCGCAAAGGAAAATCATCTATGAAAGGGCCAAAGCTTACCATAAGGAGTACAGACACATGTACAGACAGGAGATCCGCATGGCCAGGATGGCCAGGAAAGCCGGTAATTACTATGTTCCAGCTGAACCGAAGCTTGCATTTGTCATCAGGATAAGAGGTATCAATGGTGTGAGCCCCAAGGTCCGGAAGGTGTTACAGCTTCTTCGCCTGCGTCAGATTTTTAATGGCACCTTTGTGAAGCTCAACAAAGCTTCTATCAACATGCTGCGGATTGTTGAGCCCTATATTGCATGGGGTTATCCCAATCTGAAGTCTGTGCATGATTTGATCTACAAGCGTGGTTATGGCAAGATCAACAAGAAGCGCATTGCTCTCACTGACAACTCCCTGATTCGGAAGCGCCTTGGAAAACTTGGCATCATCTGCATGGAAGATGTGATCCATGAAATTTATACTGTTGGCAAGAACTTCAAAGTTGTGAACAACTTCCTTTGGCCCTTCAAGTTATCCTCTCCTCGGGGTGGAATGAAGAAGAAAACGATCCACTTTGTGGAGGGTGGAGATGCTGGTAACAGAGAAGATCAGATAAACAGACTCATCAGGAGAATGAACTAATGATTCAGATGCCCAGCTGCAGTTTTTCAAAGTCTGGTCTGTTAATAAAATCTTCTTGCACAAAA

BLAST

The BLAST (Basic Local Alignment Search Tool) algorithm, developed by Altschul and others (3389-3402), is a sequence similarity search tool. The NCBI BLAST server at http://www.ncbi.nlm.nih.gov/ is the most widely used for sequence database searches. The algorithm increases search speed by first looking for common words (k-tuples) shared by the query sequence and each sequence present in the database.
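The word (k-tuple) lookup that speeds up the search can be sketched as an exact-match index over the query. This is a minimal illustration and not the BLAST implementation: the names `index_words` and `seed_hits` are hypothetical, the subject string is made up, and the extension and scoring steps that follow seeding in BLAST are omitted.

```python
from collections import defaultdict


def index_words(seq: str, k: int) -> dict:
    """Map every k-letter word in `seq` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def seed_hits(query: str, subject: str, k: int = 11) -> list:
    """Return (query_pos, subject_pos) pairs that share an exact k-letter word."""
    index = index_words(query, k)
    hits = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    query = "CTGCTCTCTCTCTTCCGGCGCCGATCATGG"  # first 30 bases of the 832-base query
    subject = "TTTTCCGGCGCCGATCAAAA"          # made-up database sequence
    for q_pos, s_pos in seed_hits(query, subject):
        print(q_pos, s_pos, query[q_pos:q_pos + 11])
```

Hits that fall on the same diagonal (constant `query_pos - subject_pos`) belong to one matching region, which is where an alignment program would then try to extend.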
In proteins, the significance of word matches is evaluated using log-odds scores from the BLOSUM62 amino acid substitution matrix. By default the word length is 3 for proteins and 11 for nucleotides (Baxevanis and Ouellette 301-310). There are various BLAST programs available at NCBI.

Table 1: Blast programs
Program | Query | Database | Type of alignment
BLASTP | Protein | Protein | Gapped
BLASTN | Nucleic Acid | Nucleic Acid | Gapped
BLASTX | Translated Nucleic Acid | Protein | Each Frame Gapped
TBLASTN | Protein | Translated Nucleic Acid | Each Frame Gapped
TBLASTX | Translated Nucleic Acid | Translated Nucleic Acid | Ungapped

High-scoring Segment Pair (HSP)

A high-scoring segment pair is a pair of subsegments, one from each of two sequences, that are highly similar. The sequences may be either nucleotide or protein. HSPs are generated by alignment programs such as BLAST. Redundant HSPs may arise in the output of some BLAST searches because of homologues, repeated fragments within a genome, or pseudogenes (Zhang 1391).

The E-value is a measure of the statistical significance of the returned alignment. It is important in determining whether an alignment is pertinent or not. The E-value is calculated from the formula

$E = k\,m\,N\,e^{-\lambda s}$

where:
k = constant
m = length of the query sequence
N = total length of all sequences in the database
$\lambda$ = constant to normalize the raw score of the HSP
s = HSP score

For proteins, an E-value of $E < 10^{-3}$ is considered significant; an HSP with a larger E-value may be a false positive. For DNA, an E-value of $E < 10^{-3}$ is likewise considered significant, and an HSP with a larger E-value may be a false positive.

Difference and advantages of running BLAST using the blastx vs. blastn algorithm

Blastx searches a protein database using a nucleotide query translated in all six reading frames, whereas blastn searches a nucleotide database using a nucleotide query.
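The six-frame translation of a nucleotide query can be sketched as follows. This is an illustration rather than the blastx code itself: it assumes the standard genetic code, the function names are hypothetical, and frames are labeled +1 to +3 on the given strand and -1 to -3 on its reverse complement.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Complement each base and reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; '*' marks a stop, 'X' an unknown codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frames(dna: str) -> dict:
    """Return the translations of all six reading frames, keyed '+1'..'-3'."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # First 50 bases of the 832-base query sequence.
    query_start = "CTGCTCTCTCTCTTCCGGCGCCGATCATGGCGGACAAGGAGGCAAAGAAG"
    for name, protein in six_frames(query_start).items():
        print(name, protein)
```

For the first 50 bases of the query, the +3 frame reads ALSLPAPIMADKEAKK. Each of the six translations would then be compared against the protein database.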
Blastx is more informative than blastn because it searches homologs for annotation and also attempts to find the Open Reading Frame (ORF) in the query. Therefore it is an important blast tool in sequence annotation and phylogeny. On the other hand Blastn can also be important in finding non-coding regions of DNA such as promoters. Number of sequences returned from the blastx with significant E-value: 61 blast hits Likely gene product, gene name: 60S Ribosomal protein Relevant translation frame: Frame = +3 Molecular Clock Molecular clock is a technique in molecular evolution that uses fossil constraints and rates of molecular change to deduce the time in geologic history when two species/other taxa diverged. It is used in estimation of the time of occurrence of events called speciation or radiation. The molecular data used for such calculations is usually nucleotide sequences for DNA or amino acid sequences for proteins. It is sometimes called a gene clock or evolutionary clock Top scoring sequence for 8 species from significant BLAST hits returned >nameless (sequence x) ALSLPAPIMADKEAKKVPSVPESLLKRRQAYAAAKAKRLKRLLAQKKFRKAQRKIIYERAKAYHKEYRHMYRQEIRMARMARKAGNYYVPAEPKLAFVIRIRGINGVSPKVRKVLQLLRLRQIFNGTFVKLNKASINMLRIVEPYIAWGYPNLKSVHDLIYKRGYGKINKKRIALTDNSLIRKRLGKLGIICMEDVIHEIYTVGKNFKVVNNFLWPFKLSSPRGGMKKKTIHFVEGGDAGNREDQINRLIRRMNFRCPAAVFQSLVCNLLAQX >Vitis vinifera MGGEEVKGVVVPESVLKKRKRSEEWALAKKQELECTKKKNATNRTLIYVRAKQYAKEYDEQQKELIQLKREAKLKGGFYVSPEAKLLFIIRIRGINAMHPKTRKILQLLRLRQIFNGVFLKVNKATMNMLHRVEPYVTYGYPNLKSVRELIYKRGYGKLNKQRTALTDNSIIEQALGKFGIICIEDLIHEIMTVGPHFKEANNFLWPFKLKAPLGGLKKKRNHYVEGGDAGNREDYINELIRRMN >Populus trichocarpa MGEEVMVAVPESVLKKRKREEEWALAKKQELAATKKKNAENRKIIFKRAKQYSKEYEEQGKQLVQLKREARLKGGFYVDPEAKLLFIIRIRGINAMHPKTRSILQLLRLRQIFNGVFLKVNKATVNMLRRVEPYVTYGYPNLKSVRELIYKRGFGKLNQQRIPLTDNSIIDQGLGKHGIICVEDLIHEIITVGPHFKEANNFLWPFQLKAPLGGLKKKRNHYVEGGDAGNRENYINELIRRMN > Ricinus communis 
MGSEEVKIVPESVLKKTKRNEEWALAKKQELEASKKIAKESRKLIFNRAKQYAKEYDQQQKEVIQLKREAKLKGGFYVNPEAKLLFIIRIRGINAIDPKTRKILQLLRLRQIFNGVFLKVNKATVNMLHRVEPYVTYGYPNLKSVKELIYKRGYGKVNQQRIALTDNSIVEQVLGKHGIICMEDLIHEILTVGPHFKEANNFLWPFQLKAPLEGLKKKRNHYVEGGDAGNREDYINELIRRMN >Arabidopsis thaliana MTEAESKTVVPESVLKKRKREEEWALAKKQELEAAKKQNAEKRKLIFNRAKQYSKEYQEKERELIQLKREAKLKGGFYVDPEAKLLFIIRIRGINAIDPKTKKILQLLRLRQIFNGVFLKVNKATINMLRRVEPYVTYGYPNLKSVKELIYKRGFGKLNHQRTALTDNSIVDQGLGKHGIICVEDLIHEIMTVGPHFKEANNFLWPFQLKAPLGGMKKKRNHYVEGGDAGNRENFINELVRRMN >Physcomitrella patens subsp. patens MSAEQAPAVVPETLLKKRKRDEQWAAAKSSQLAAAKSRNAKNRTLIFKRAQQYQEEYQRQEKELISLKREAKLKGGFYVEPEPKLMFVVRIRGINDMHPKVRKIMQLLRLRQIFNGVFMKVNKATVNMLRRVEPYVTYGYPNLKTVRELIYKRGYGKLNKSRTALTDNSIIEEALGKYGIICIEDLIHEIYTVGPHFKEANNFLWPFKLSAPLGGLTKKRNHYVEHGDAGNREAKLNNLVRQMN >Oryza sativa Japonica Group MASEAAKVVVPESVLRKRKREEVWAAASKEKAVAEKKKSIESRKLIFSRAKQYAEEYEAQEKELVQLKREARMKGGFYVSPEEKLLFVVRIRGINAMHPKTRKILQLLRLRQIFNGVFLKVNKATINMLRRVEPYVAYGYPNLKSVRELIYKRGYGKLNKQRIPLTNNKVIEEGLGKHDIICIEDLVHEIMTVGPHFKEANNFLWPFKLKAPLGGLKKKRNHYVEGGDAGNRENYINELIRRMN >Zea mays MSAAEAKAAAVPESVLRKRKREEQWAADKKEKALADRKKALESRKIIFARAKQYAEEYHAQEKELVQLKREARLKGGFYVSPEAKLLFVVRIRGINAMHPKTRKILQLLRLRQIFNGVFLKVNKATINMLRRVEPYVAYGYPNLKSVRELIYKRGYGKLNKQRIPLSNNSVIEEGLGKHNIICIEDLVHEIMTVGPHFKEANNFLWPFKLKAPLGGLKKKRNHYVEGGDAGNRENYINELIKRMN >Sorghum bicolor MSSEVAKVAVPESVLRKQKREEQWAVEKKEKALAEKKKSIKSRKLIFTRAKQYAKEYDAQEKELVQLKREARLKGGFYVSPEVKLLFVVRIRGINAMHPKIRKILQLLRLRQIFNGVFLKVNKATINMLRRVEPYAAYGYPNLKSVRELIYKMGYGKLNKQRIPLSNNQVIEAGLGKHNIICIEDLVHEIMIVGPHFKEANNFLWPFKLKAPLGGLKKKRNHYVEGGDAGNRENYINELIKRMN Multiple sequence alignment Figure 1: Multiple sequence alignment Multiple sequence alignment tool used ClustalW Neighbor joining and UPGMA methods Unweighted Pair Group Method With Arithmetic Mean (UPGMA) is clustering algorithm that joins tree branches based on the greatest similarity among pairs and joined. 
It generates an accurate topology with true branch lengths only if divergence follows a molecular clock. In the Neighbor Joining method, a fully resolved tree is derived from a fully unresolved star-shaped tree by progressively inserting a branch between the closest pair of neighbors and the rest of the tree. This consolidates the closest pair of neighbors into a single node, forming a new, smaller star tree, and the process is repeated until the tree is fully resolved.

Tree building
Species name: Physcomitrella patens
Species name: Physcomitrella patens
Distance tool: DNADIST
Tree tool: MEGA phylogenetic software package (Tamura et al 2011)
Draw tool: MEGA phylogenetic software package (Tamura et al 2011)
Draw tool used: MEGA

Non-distance based tree construction methods available
Maximum Likelihood
Maximum Parsimony

Maximum Likelihood
Species name: Physcomitrella patens
Yes, the tree generated by the character-based Maximum Likelihood method corroborates the distance-based Neighbor Joining tree, thereby supporting the clades represented.

Sequence submission
Sequences are submitted to NCBI's GenBank at http://www.ncbi.nlm.nih.gov/Genbank/. Large submissions are made through Sequin, a tool available at http://www.ncbi.nlm.nih.gov/Sequin/index.html.
>genomic TATGATCATACGTATTAATCACTAGTGGTAACTAGTTATTAAGCTGGTGTTGCTTTTGCACCATCTCACGCCATGGCCTTAAACATGGTCCTTCTCTGGGAACTCACCATGGCGGCTCTCTTCTTCTTTATTAACTACCTGTACAGTACATTTTAGGCACATTGGTACACTCTTTTGATTGGAAATTGCCAAAAGATGTTGAGCTCAACAATTGGTGCTATTCCCGTCCTAGGAGCCATGCCCCATGCCGCCTTAGCCAAAATGGCCAAACAATACGGCC CCGTCATGTACTTAAAAATGGGCACTTGTAACATGGTTGTGGCCTCCACCCCAGACGCAGCTCGAGCCTTCCTCAAAACCCTTGATCTAAATTTCTCAAACCGTCCCCCAAATGCAGGTGCAACTCATTTAGCCTATAATCGGAGGAGAGATTTCTCTCTCTGAGCTAGCTGACGAGACCGATCCAGCATCTACATCACGGCACGACTTTTGCTAGGTGGTAAGGCTCTAGAGGATTGGGCTCATGTCCGAGTATCCGAGCTAGGGCACATGCTTCGAGCCATGTGCGAGGCTAGCCGAAAAGGCGAACCTGTGGTGGTGCCAGAAATGTTGACCTATGCCATGGCAAATATGATAGGCCAAATTATACTTAGTCGTCGTGTGTTTGTTACTAAAGGTTCAGAGTCTAATGAGTTCAAGGACATGGTAGTGGAGCTCATGACAAGCGCAGGATTATTCAATGTTGGCGATTATATACCGTCGGTTGCTTGGATGGATTTGCAAGGAATTGAACGTGGGATGAAGAGATTACATCGAAGGTTTGATGTGTTACTGACAAAAATGATGGAGGAGCATATTGCAACGGCTCATGAACGCAAGGGAAAGCCGGATTTTCTGGACGTTCTCATGGCTAACCAAGAAAATTTGGATGGTGAGAAGCTTAGCTTTACCAACATCAAGGCTCTTCTTTTGGTATGTCTCACTCTTCCGTCTCATTTTAATTACAAATTACGTACCTAGCTAGATTCTGTGCTTATACATATAAATTGCACATGATCTACGTACCATGCATGCACGGTTTTAATATATATATATATGTGCGATATTGTATGCGAAGAGATGGAGTCCGAATCAATAATACATCAAAATGTTCCAAGAAGAAAATATATTATTAGATCGTTACTTACATC CGCTTTTGTTTTATTTTGGAAGAACTTATTCACAGCCGGCACTGACACTTCCTCGAGCATAATCGAATGGTCACTTGCTGAGATGTTGAAAAACCCTCGCATCCTTAAACAAGCTCAGGATGAGATGGATCAAGTCATTGGTCGAAATCGGCGACTAGAGGAATCTGACATACCAAAACTTCCCTACCTGCAAGCCATATGCAAAGAAACATTCCGAAAACACCCATCTACACCTCTATTACGGCATAGAGTCAGACTTTTTAGAGGAGATCCCATATAT TACTATATCCCCAAAGGCACTAGACTGAGTGTCAACATCTGGGCAATAGGGAGGGATCCTGATGTATGGGACAATCCACTAGACTTTACTCCTGAGAGATTTTTTAGTGAGAAATATGCGAAAATCAACCCTCAAGGAAATGATTTTGAGCTAATTCCATTTGGAGCTGGAAGAAGAATTTGTGCTGGGACTAGAATGGGGATTGTGCTATCTTACTCGCTGTTTGATAAGAAAACTGTCCACCCGTCAACTGCCTCCAGGCCCCCGAGGGTGGCCGATC TGGATGAGGTCTTTGGTCTTGCATTGCAGAAGGCAGTTCCCCTTTCCGCCATGGTTACTCCACGCCTGGAACCTAACGCATATCTTGCTTAA Gene prediction Likely gene of interest from the genomic sequence (protein sequence) >gene_1|GeneMark.hmm|423_aa 
MLSSTIGAIPVLGAMPHAALAKMAKQYGPVMYLKMGTCNMVVASTPDAARAFLKTLDLNFSNRPPNAGGKALEDWAHVRVSELGHMLRAMCEASRKGEPVVVPEMLTYAMANMIGQIILSRRVFVTKGSESNEFKDMVVELMTSAGLFNVGDYIPSVAWMDLQGIERGMKRLHRRFDVLLTKMMEEHIATAHERKGKPDFLDVLMANQENLDGEKLSFTNIKALLLNLFTAGTDTSSSIIEWSLAEMLKNPRILKQAQDEMDQVIGRNRRLEESDIPKLPYLQAICKETFRKHPSTPLLRHRVRLFRGDPIYYYIPKGTRLSVNIWAIGRDPDVWDNPLDFTPERFFSEKYAKINPQGNDFELIPFGAGRRICAGTRMGIVLSYSLFDKKTVHPSTASRPPRVADLDEVFGLALQKAVPLSAM Tool used: Eukaryotic GeneMark.hmm Predicted gene coding or non-coding Tool used:ORF Finder (Open Reading Frame Finder) Figure 4: output of ORF indicating a considerable long (+1) frame of 1269 nucleotides Open reading frame which is defined by a start and a stop codon is quite long indicating a coding region of the genomic sequence predicted protein most similar to flavoid 3 5-hydroxylase Species name: Populus trichocarpa Tool used:FAsta Protein structure determination Homology-Modeling Method Figure 5: 3-D Structure of the predicted protein Tool used: EsyPred Lambert et al (2002) homologous template used to predict the protein structure 1OG2 chain ‘A’ used as template >Mouse (mmu) candidate 1 GCAGCCCUCUGUUAGUUUUGCAUAGUUGCACUACAAGAAGAAUGUAGUUGUGCAAAUCUAUGCAAAACUGAUGGUGGCCUGC >Mouse (mmu) candidate 2 AUCCAGGUCUGGGGCAUGAACCUGGCAUACAAUGUAGAUUUCUGUGUUUGUUAGGCAACAGCUACAUUGUCUGCUGGGUUUCAGGCUACCUGGAA >Mouse (mmu) candidate 3 AGGGUGGCAGGGCCACAACCAGCGCAGACUGGCGCGCCCCAGGGAUCUCUGGGUGAGUAUCUCUUUGAGCGCCUCACUCUCAAGCACAACUAGGAGGCCUCUGCCUUCC >Mouse (mmu) candidate 4 AUAUAGUGCUUGGUUCCUAGUAGGUGCUCAGUAAGUGUUUGUGACAUAAUUCGUUUAUUGAGCACCUCCUAUCAAUCAAGCACUGUGCUAGGCUCUGG >Mouse (mmu) candidate 5 GCCGUGGCCAUCUUACUGGGCAGCAUUGGAUAGUGUCUGAUCUCUAAUACUGCCUGGUAAUGAUGACGGC >Mouse (mmu) candidate 1 GCAGCCCUCUGUUAGUUUUGCAUAGUUGCACUACAAGAAGAAUGUAGUUGUGCAAAUCUAUGCAAAACUGAUGGUGGCCUGC >gi|262205831|ref|NR_029786.1| Mus musculus microRNA 19a (Mir19a), microRNA GCAGCCCTCTGTTAGTTTTGCATAGTTGCACTACAAGAAGAATGTAGTTGTGCAAATCTATGCAAAACTGATGGTGGCCTGC >Mouse (mmu) candidate 2 
AUCCAGGUCUGGGGCAUGAACCUGGCAUACAAUGUAGAUUUCUGUGUUUGUUAGGCAACAGCUACAUUGUCUGCUGGGUUUCAGGCUACCUGGAA >gi|262205931|ref|NR_029806.1| Mus musculus microRNA 221 (Mir221), microRNA ATCCAGGTCTGGGGCATGAACCTGGCATACAATGTAGATTTCTGTGTTTGTTAGGCAACAGCTACATTGTCTGCTGGGTTTCAGGCTACCTGGAA >Mouse (mmu) candidate 3 AGGGUGGCAGGGCCACAACCAGCGCAGACUGGCGCGCCCCAGGGAUCUCUGGGUGAGUAUCUCUUUGAGCGCCUCACUCUCAAGCACAACUAGGAGGCCUCUGCCUUCC >gi|262205388|ref|NR_030465.1| Mus musculus microRNA 692-1 (Mir692-1), microRNA AGGGTGGCAGGGCCACAACCAGCGCAGACTGGCGCGCCCCAGGGATCTCTGGGTGAGTATCTCTTTGAGCGCCTCACTCTCAAGCACAACTAGGAGGCCTCTGCCTTCC >Mouse (mmu) candidate 4 AUAUAGUGCUUGGUUCCUAGUAGGUGCUCAGUAAGUGUUUGUGACAUAAUUCGUUUAUUGAGCACCUCCUAUCAAUCAAGCACUGUGCUAGGCUCUGG >gi|262205707|ref|NR_029759.1| Mus musculus microRNA 325 (Mir325), microRNA ATATAGTGCTTGGTTCCTAGTAGGTGCTCAGTAAGTGTTTGTGACATAATTCGTTTATTGAGCACCTCCTATCAATCAAGCACTGTGCTAGGCTCTGG >Mouse (mmu) candidate 5 GCCGUGGCCAUCUUACUGGGCAGCAUUGGAUAGUGUCUGAUCUCUAAUACUGCCUGGUAAUGAUGACGGC >gi|262206115|ref|NR_029587.1| Mus musculus microRNA 200b (Mir200b), microRNA GCCGTGGCCATCTTACTGGGCAGCATTGGATAGTGTCTGATCTCTAATACTGCCTGGTAATGATGACGGC Sequence with a conserved binding sequence present in the 3’UTR for AQP2? 
>Mouse (mmu) candidate 2
AUCCAGGUCUGGGGCAUGAACCUGGCAUACAAUGUAGAUUUCUGUGUUUGUUAGGCAACAGCUACAUUGUCUGCUGGGUUUCAGGCUACCUGGAA
>gi|262205931|ref|NR_029806.1| Mus musculus microRNA 221 (Mir221), microRNA
ATCCAGGTCTGGGGCATGAACCTGGCATACAATGTAGATTTCTGTGTTTGTTAGGCAACAGCTACATTGT
CTGCTGGGTTTCAGGCTACCTGGAA
CCUGGCA
Search tool used: TargetScan, Lewis (15-16)

The part of the "Stem Loop" sequence which actually binds the UTR for AQP2 (underlined below)
>gi|262205931|ref|NR_029806.1| Mus musculus microRNA 221 (Mir221), microRNA
ATCCAGGTCTGGGGCATGAACCTGGCATACAATGTAGATTTCTGTGTTTGTTAGGCAACAGCTACATTGT
CTGCTGGGTTTCAGGCTACCTGGAA
CCUGGCA

Figure 6: Structure of the mature RNA showing the "stem-loop" structure

Microarray Technology

Restriction Of Microarray Technology

A microarray is a technique that enables the detection of many genes at the same time. Russo et al (6498) describe a microarray as an ordered collection of microspots. The technique is based on DNA hybridization between labeled free targets and an array of numerous DNA probes immobilized on a matrix. The targets are cDNAs, which are derived from reverse transcription of mRNA and then labeled. Microarray technology is limited by the challenge of obtaining mRNA of good quality and in sufficient amounts to serve as the template for cDNA synthesis.

Given the plant species you have is novel, what would you need to do in order to leverage and use this technology? Would you not use? Or, what could you do to use?

In this novel plant, my interest in employing microarray analysis is to measure the differential expression associated with color. I would select a plant expressing the green color and extract all the RNA, which represents the expressed genes, and do the same for a plant expressing the purple color. From the RNA pool I would prepare probes complementary to the expressed sequence tags of the plant and immobilize them on a microarray chip.
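For the green versus purple comparison, the difference in expression at each spot is commonly summarised as a log2 ratio of the two intensities. The sketch below assumes hypothetical background-corrected intensities, made-up gene names, a pseudocount of 1 and an illustrative two-fold cutoff; a real analysis would also need normalization, replicates and a statistical test.

```python
import math


def log2_ratios(green_state: dict, purple_state: dict, pseudocount: float = 1.0) -> dict:
    """log2(purple / green) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((purple_state[gene] + pseudocount)
                        / (green_state[gene] + pseudocount))
        for gene in green_state
        if gene in purple_state
    }


def differentially_expressed(ratios: dict, fold: float = 2.0) -> dict:
    """Keep genes whose expression changed by at least `fold` in either direction."""
    cutoff = math.log2(fold)
    return {gene: r for gene, r in ratios.items() if abs(r) >= cutoff}


if __name__ == "__main__":
    # Hypothetical spot intensities, not measured data.
    green = {"geneA": 99.0, "geneB": 399.0, "geneC": 199.0}
    purple = {"geneA": 399.0, "geneB": 99.0, "geneC": 199.0}
    ratios = log2_ratios(green, purple)
    print(differentially_expressed(ratios))
```

A positive ratio indicates higher expression in the purple state and a negative ratio higher expression in the green state. Working on the log scale treats a two-fold increase and a two-fold decrease symmetrically.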
If provided microarrays capable of measuring gene expression in the plant species you have, what are the experiment and analysis steps involved, given samples of plant collected in the green state and samples collected in the purple state, to determine what genes have differential expression significantly associated with change in color?

Microarray analysis employs a hybridization strategy much like that of Northern blotting. The method involves pairing between labeled cDNA and a probe complementary to the gene of interest, with the probes fixed on spots in the microarray. The first step is to isolate and purify the mRNAs from each of the plants separately. After isolation, the mRNAs are reverse transcribed into cDNA and the resulting cDNA is labeled with a fluorescent dye, for instance a red dye for the plant in the green state and a green dye for the plant in the purple state. Different spots in the microarray represent different genes, though some genes may be represented by multiple spots. The labeled cDNAs are incubated with a microarray slide, resulting in hybridization to the complementary probes spotted on the array. A laser is used to excite the dyes and a scanner records an image. This image is quantified to obtain a measure of fluorescence intensity for each pixel. Pixel values are further processed to obtain the abundance of mRNA for each spot. Since two dyes, red and green, have been used in this experiment, they fluoresce at different wavelengths, making it possible to obtain separate images for the red and the green dye.

Figure 7: Microarray slide showing how different dyes fluoresce at different wavelengths

Works Cited

Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman. "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25 (1997) 3389-3402.

Baxevanis, D. and Ouellette, F. "Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins." Second Edition, John Wiley & Sons, 2001.

Lambert C, Leonard N, De Bolle X, and Depiereux E. "ESyPred3D: Prediction of Proteins 3D structures." Bioinformatics 18.9 (2002) 1250-1256.

Lewis P, Burge B, and Bartel P. "Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets." Cell 120 (2005) 15-16.

Mardis E.R. "Next-Generation DNA Sequencing Methods." Annu. Rev. Human Genet. 9 (2008) 387-402.

Maxam, Allan, and Walter Gilbert. "A new method for sequencing DNA." Proc. Natl. Acad. Sci. USA 74.2 (1977) 560-564.

Russo Giuseppe, Zegar Charles, and Giordano Antonio. "Advantages and limitations of microarray technology in human cancer." Oncogene 22 (2003) 6497-6507.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, and Kumar S. "MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods." Molecular Biology and Evolution (2011).

Zhang, Hongyu. "Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm." Bioinformatics 19.11 (2003) 1391-1396.
