DNA sequencing

DNA sequencing is the determination of the nucleotide sequence in a DNA molecule. It has revolutionized the biological sciences and ushered in the era of genomics. Since 1995, DNA sequencing has been used to analyze the genomes of more than 1,000 different organisms (as of 2010). Together with other DNA analytical methods, it is used, among other things, to study genetic diseases. In addition, DNA sequencing is a key analytical method, particularly in the context of DNA cloning (molecular cloning), and is today an indispensable part of any molecular biology or genetic engineering laboratory.

Problem

Reading the nucleotide sequence of DNA remained an unsolved problem for decades, until suitable biochemical and biotechnological methods were developed in the mid-1970s. Today, even sequencing whole genomes has become comparatively fast and simple.

However, the challenges of genome sequencing are not limited to directly reading the nucleotide sequence. Because of technical limitations, each sequencing reaction reads only short DNA fragments (reads) of, depending on the method, up to 1,000 base pairs. Once a sequence has been obtained, the next primer is designed from the end of the previous read. This procedure is called primer walking or, when applied to whole chromosomes, chromosome walking; it was first used in 1979.
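The idea of primer walking can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function names, the read length, and the primer length are hypothetical, reads are error-free, and every primer is assumed to occur exactly once in the template.

```python
def sequence_read(template, primer, read_len):
    """Simulate one sequencing reaction: return up to read_len bases
    directly downstream of the site where the primer anneals."""
    if template.count(primer) != 1:
        # Assumption of this sketch: the primer site must be unique.
        raise ValueError("primer must match the template exactly once")
    begin = template.find(primer) + len(primer)
    return template[begin:begin + read_len]


def primer_walk(template, first_primer, read_len=8, primer_len=4):
    """Walk along the template: each new primer is taken from the end
    of the sequence assembled so far."""
    assembled = first_primer
    primer = first_primer
    while True:
        read = sequence_read(template, primer, read_len)
        if not read:          # nothing left downstream of the primer
            break
        assembled += read
        primer = assembled[-primer_len:]
    return assembled
```

In a real project the template is of course unknown; here it only plays the role of the molecule in the test tube, and the walk recovers it one read at a time.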

A larger sequencing project, such as the Human Genome Project, in which several billion base pairs were sequenced, therefore requires an approach called shotgun sequencing. Long DNA segments are first broken down into smaller units, which are then sequenced; the sequence information of the individual short sections is subsequently reassembled into a complete overall sequence using bioinformatic methods. To obtain biologically relevant information from the raw sequence data (for example, information on the genes present and their control elements), sequencing is followed by DNA sequence analysis. Without it, sequence information has no scientific value.
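The assembly step of shotgun sequencing can be sketched with a greedy overlap merge. This is a toy illustration, not the algorithm used in any particular genome project: the function names and the minimum overlap are assumptions, reads are taken to be error-free and from the same strand, and repeats in the genome would mislead this simple strategy.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b,
    or 0 if that overlap is shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the result stays in several contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for n, r in enumerate(reads) if n not in (best_i, best_j)]
        reads.append(merged)
    return reads
```

For example, the three reads "ATGGCGTA", "CGTACGTT" and "ACGTTAGC" are merged into a single contig. If no sufficient overlap remains, the function returns several contigs instead of one sequence.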

Sequencing methods

Today there are several methods for reading the sequence information from a DNA molecule. For a long time, mostly refinements of Frederick Sanger's method were in use. Modern methods make faster sequencing possible through highly parallel operation. Sequencing methods developed after the Sanger method are often referred to as next-generation sequencing.

Classical methods

Method of Maxam and Gilbert

The method of Allan Maxam and Walter Gilbert from 1977 is based on base-specific chemical cleavage of the DNA with suitable reagents and subsequent separation of the fragments by gel electrophoresis. The DNA is first labeled at the 5' or 3' end, either with radioactive phosphate or non-radioactively (biotin, fluorescein). In four separate reactions, particular bases are then partially modified and cleaved from the sugar-phosphate backbone of the DNA; for example, the base guanine (G) is methylated by the reagent dimethyl sulfate and removed by alkali treatment with piperidine. The DNA strand is then cleaved completely at the sites that have lost their base. Each reaction yields fragments of different lengths whose 3' end was always cleaved at a specific base. Gel electrophoresis separates the fragments by length and resolves length differences of a single base. By comparing the four reactions on the gel, the sequence of the DNA can be read. The method enabled its inventors to determine the sequence of an operon of a bacterial genome. It is hardly used today, because it requires hazardous reagents and is harder to automate than Sanger's dideoxy method, which was developed at the same time.

Dideoxy method of Sanger

The dideoxy method of Sanger is also called chain-termination synthesis. It is an enzymatic method, developed by Sanger and Coulson in 1975 and presented in 1977 together with the first complete sequencing of a genome (bacteriophage φX174). For his work on DNA sequencing, Sanger was awarded the 1980 Nobel Prize in Chemistry together with Walter Gilbert and Paul Berg.

Starting from a short stretch of known sequence (the primer), a DNA polymerase extends one of the two complementary DNA strands. First, the DNA double helix is denatured by heating, so that single strands are available for the subsequent steps. In four otherwise identical reactions (each containing the four nucleotides, radioactively labeled), one of the four bases is added partly as a dideoxynucleoside triphosphate (ddNTP), i.e. one reaction each with ddATP, ddCTP, ddGTP, or ddTTP. These chain-terminating ddNTPs have no 3' hydroxyl group: once one is incorporated into the newly synthesized strand, the DNA polymerase can no longer extend the DNA, because the OH group on the 3' carbon atom needed to link to the phosphate group of the next nucleotide is missing. The result is DNA fragments of different lengths that, in each reaction, always end with the same ddNTP (only A, only C, only G, or only T). After the sequencing reaction, the labeled termination products from each reaction are separated by length using polyacrylamide gel electrophoresis. After the radioactive gel has been exposed to photographic film, the sequence can be read by comparing the four lanes. The sequence complementary to the one read is the sequence of the single-stranded DNA template. Today, a variant of the polymerase chain reaction (PCR) is used as the sequencing reaction. Unlike in PCR, only one primer is used, so the DNA is amplified only linearly.
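How the four lanes of a gel encode the sequence can be shown with an idealized model. The sketch below assumes that every possible termination product is present and perfectly resolved, and it counts fragment lengths from the end of the primer; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sanger_lanes(template):
    """For each ddNTP reaction, list the lengths of the terminated
    fragments of the strand synthesized on the given template."""
    new_strand = reverse_complement(template)
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)  # a ddNTP of this base stops synthesis here
    return lanes


def read_gel(lanes):
    """Read the gel from the shortest to the longest fragment."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))
```

Reading the lanes gives the newly synthesized strand; applying reverse_complement to that result gives the sequence of the template, as described above.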

A non-radioactive method for DNA sequencing, in which the DNA molecules are transferred onto a carrier during electrophoretic separation, was developed by Prof. Pohl and his group in the early 1980s. The "Direct Blotting Electrophoresis System GATC 1500" was marketed by the company GATC Biotech in Constance. This DNA sequencer was used, for example, in the European Genome Project to sequence chromosome II of the yeast Saccharomyces cerevisiae.

Since the early 1990s, dideoxynucleoside triphosphates labeled with fluorescent dyes have mainly been used. Each of the four ddNTPs is coupled to a different dye. This modification makes it possible to add all four ddNTPs to a single reaction vessel; splitting into separate reactions and handling radioisotopes are no longer necessary. The resulting chain-termination products are separated by capillary electrophoresis and excited to fluoresce by a laser. The ddNTPs at the end of each DNA fragment fluoresce in different colors and can thus be identified by a detector. The electropherogram (the sequence of color signals arriving at the detector) directly gives the base sequence of the sequenced DNA strand.
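Turning an electropherogram into a base sequence can be sketched as follows. The data layout is an assumption made for illustration: each peak is a pair of arrival time and a dictionary of intensities in the four dye channels, here keyed directly by the base the dye stands for. Real base-calling software additionally has to find the peaks and correct for overlap between the dye spectra, which this sketch leaves out.

```python
def call_bases(peaks):
    """Order the peaks by arrival time at the detector and, for each
    peak, call the base whose dye channel shows the strongest signal."""
    ordered = sorted(peaks, key=lambda peak: peak[0])
    return "".join(max(signal, key=signal.get) for _, signal in ordered)


# Hypothetical example data: (arrival time, intensity per dye channel)
example_peaks = [
    (2.0, {"A": 5, "C": 90, "G": 3, "T": 2}),
    (1.0, {"A": 80, "C": 4, "G": 6, "T": 1}),
    (3.0, {"A": 2, "C": 7, "G": 75, "T": 9}),
]
```

Shorter fragments reach the detector first, so sorting by arrival time corresponds to reading the gel from the shortest to the longest fragment.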

Modern approaches

With the growing importance of DNA sequencing in research and diagnostics, methods have been developed that allow higher throughput. With them it is possible to sequence a complete human genome in about 8 days. These methods are referred to as second-generation sequencing. Several companies have developed methods with different advantages and disadvantages. There are others in addition to those listed here.

Pyrosequencing

Like Sanger sequencing, pyrosequencing uses a DNA polymerase to synthesize the complementary DNA strand, although the type of DNA polymerase may differ. The DNA mixture is ligated to a DNA adapter and coupled to beads through a complementary adapter sequence. The DNA-loaded beads are placed on a plate with pores the size of a single bead, and under each pore a light guide leads to a detector. The DNA polymerase is, in a sense, observed "in action" as it successively attaches individual nucleotides to the newly synthesized DNA strand. The successful incorporation of a nucleotide is converted into a flash of light by an elaborate enzyme system involving a luciferase, and the flash is registered by the detector. The DNA to be sequenced serves as the template and is present in single-stranded form. Starting from a primer, the strand is extended nucleotide by nucleotide by adding one of the four kinds of deoxynucleoside triphosphate (dNTP) at a time. When the matching (complementary) nucleotide is added, a signal is obtained; if a dNTP that does not fit at this position was added, there is no flash of light. The dNTP present is then destroyed and a different kind is added. This continues until a reaction occurs; a reaction appears at the latest after the fourth addition, since by then all kinds of dNTP have been tried.

When the DNA polymerase incorporates a complementary nucleotide, pyrophosphate (PPi) is released. ATP sulfurylase converts the pyrophosphate into adenosine triphosphate (ATP). The ATP drives the luciferase reaction, in which luciferin is converted to oxyluciferin. This in turn produces a detectable light signal whose strength is proportional to the amount of ATP consumed.
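The cycle of adding one kind of dNTP after another can be modeled as a list of signals, one per addition. The sketch below is idealized: it assumes a fixed dispensing order (A, C, G, T) and a signal exactly equal to the number of nucleotides incorporated in that step, so that a run of identical bases gives a proportionally stronger flash. The function names are hypothetical.

```python
def pyrogram(new_strand, order="ACGT"):
    """Return (dispensed nucleotide, signal) pairs for the strand that
    the polymerase synthesizes. A signal of 0 means no flash of light."""
    if set(new_strand) - set(order):
        raise ValueError("strand contains a base that is never dispensed")
    flows, pos, step = [], 0, 0
    while pos < len(new_strand):
        nucleotide = order[step % len(order)]
        incorporated = 0
        # All consecutive matching positions are filled in one addition.
        while pos < len(new_strand) and new_strand[pos] == nucleotide:
            incorporated += 1
            pos += 1
        flows.append((nucleotide, incorporated))
        step += 1
    return flows


def decode(flows):
    """Rebuild the synthesized strand from the recorded signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)
```

In practice the signal is only approximately proportional to the run length, so long runs of the same base are harder to read than this model suggests.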

Pyrosequencing is used, for example, to determine the frequency of particular gene mutations (SNPs, single nucleotide polymorphisms), for instance in the study of genetic diseases. It is easy to automate and is suitable for highly parallel analysis of DNA samples.

Sequencing by hybridization

In this method, short DNA fragments (oligonucleotides) are fixed in rows and columns on a glass substrate (DNA chip or microarray). The fragments of the DNA to be sequenced are labeled with dyes, and the fragment mixture is applied to the oligonucleotide matrix so that complementary fixed and free DNA segments can hybridize with each other. After unbound fragments have been washed away, the hybridization pattern can be read from the color labels and their intensity. Since the sequences of the fixed oligonucleotides and their regions of overlap are known, the overall sequence of the unknown DNA can ultimately be inferred from the color pattern.
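Inferring the overall sequence from overlapping oligonucleotides can be sketched as follows. The example assumes an ideal experiment: the set of probes that hybridized is exactly the set of all subsequences of length k of the target, and each overlap of length k-1 occurs only once, so the chain of overlaps is unambiguous. The function names are hypothetical.

```python
def spectrum(seq, k):
    """All subsequences of length k: the probes that would light up."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers, k):
    """Chain the probes by their overlaps of length k-1."""
    kmers = set(kmers)
    suffixes = {m[1:] for m in kmers}
    # The first probe is the one whose prefix is no other probe's suffix.
    starts = [m for m in kmers if m[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting probe")
    seq = starts[0]
    remaining = kmers - {seq}
    while remaining:
        candidates = [m for m in remaining if m[:-1] == seq[-(k - 1):]]
        if len(candidates) != 1:
            raise ValueError("ambiguous overlap; longer probes would be needed")
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq
```

Repeated segments in the target break the assumption of unique overlaps; the sketch then raises an error instead of guessing.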

Ion semiconductor DNA Sequencing System

This method from Ion Torrent uses semiconductor technology to perform direct, non-optical genome sequencing by means of integrated circuits. The sequencing data are obtained directly on the semiconductor chip by detecting ions produced by template-dependent DNA polymerases. The chip carries ion-sensitive field-effect transistor sensors arranged in a grid of 1.2 million wells, in which the polymerase reaction takes place. This grid allows independent sequencing reactions to be detected in parallel and simultaneously. Complementary metal-oxide-semiconductor (CMOS) technology is used, which makes a cost-effective reaction with a high density of measuring points possible.

Sequencing with bridge synthesis

In sequencing with bridge synthesis from Solexa/Illumina (sequencing by synthesis, SBS), the DNA to be sequenced is ligated to an adapter DNA sequence, denatured, ligated to a carrier plate, and amplified in situ by bridge amplification. The DNA is then linearized enzymatically. In a Sanger-like PCR reaction with four differently colored fluorescent chain-terminating substrates, the incorporated nucleobase is determined.

Two-base sequencing

Sequencing with paired ends

A clearly identifiable signal is also obtained by generating short pieces of DNA from the beginning and the end of a DNA sequence (paired-end tag sequencing, PETS), provided that the genome has already been completely sequenced.

Third-generation sequencing

Third-generation sequencing measures, for the first time, the reaction of individual molecules, so that the PCR amplification that otherwise precedes sequencing is no longer needed. This avoids uneven amplification by thermostable DNA polymerases: polymerases bind some DNA sequences preferentially and replicate them more strongly (polymerase bias), so that some sequences can be overlooked. In addition, the genome of individual cells can be examined. The released signal is recorded in real time. Depending on the method, third-generation DNA sequencing records one of two different signals: released protons (as a variant of semiconductor sequencing) or fluorophores (with a fluorescence detector).
