ENCODE

ENCODE (an acronym for Encyclopedia of DNA Elements) is a research project initiated in September 2003 by the U.S. National Human Genome Research Institute (NHGRI). The project aims to identify and characterize all functional elements of the human genome and transcriptome. ENCODE is thus the follow-up to the Human Genome Project, whose goal was the sequencing of the human genome and which was completed in 2003. A further goal of ENCODE is the development and implementation of high-throughput methods for identifying functional elements.

The project is carried out by the ENCODE research consortium, which comprises about 30 working groups at various institutes. It is organized in three phases: a pilot phase, intended to analyze 1 % of the genome (30 megabases) and to evaluate the methods of investigation; a technology development phase, intended to increase data throughput and reduce the cost of data acquisition; and a production phase, in which the remaining 99 % of the genome is to be analyzed. All data generated in the course of the ENCODE project are deposited in a publicly accessible database. As a first result, ENCODE was able to confirm in 2007 that about half of all RNAs produced by a cell are neither mRNA, ribosomal RNA, nor other known RNA end products, but other RNAs for which it is not known whether they have a function and, if so, which.

The results of ENCODE are controversial: methodological errors have been alleged, and inconsistencies with evolutionary ideas have been pointed out.

Pilot phase

In the pilot phase, 30 megabases of DNA sequence, about 1 % of the human genome, were first selected and analyzed using established methods. Half of this sample was chosen deliberately and half at random. The transcriptome, that is, all transcribed sequences, was mapped, and the following were identified: transcription start sites, promoters, enhancers, repressors or silencers, exons, origins of replication, replication termination sites, transcription factor binding sites, methylation sites, DNase I hypersensitive regions, chromatin modifications, and regions that are highly conserved but so far have no known function. Genetic variation within these highly conserved regions was documented. The methods used include, among others, sequencing, microarrays, chromatin immunoprecipitation (ChIP), and quantitative PCR.

Technological development phase

The technological development of methods ran in parallel with the pilot phase. For this purpose, new laboratory techniques and computer programs were developed.

Production phase

In the final production phase, the remaining 99 % of the human genome sequence is to be mapped in the same way as the first percent and its functional elements identified, as cost-effectively and reliably as possible. Other functional elements that were not covered in the pilot phase, such as telomeres and centromeres, are then also to be characterized.

Results

ENCODE found that significantly more of the DNA in the human genome is active than previously assumed. Until then, only the roughly 2 % of the DNA attributed to protein-coding genes had been considered active; according to ENCODE, about 80 % of the DNA is active. Also according to ENCODE, 76 % of the genome is transcribed into RNA, and the genome contains 2.9 million regulatory elements. Different genome segments are active depending on the cell type. Not all scientists accept these results; some question them on methodological grounds.
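To give a sense of scale, the percentages above can be converted into approximate sequence lengths. The following is only an illustrative sketch: it assumes the genome size implied by the pilot phase (30 megabases corresponding to about 1 %, hence roughly 3,000 megabases), and the variable and function names are chosen here for illustration and are not part of any ENCODE tooling.

```python
# Assumption: genome size is derived from the pilot-phase figures
# (30 megabases is about 1 % of the genome), giving roughly 3000 Mb.
PILOT_MEGABASES = 30
GENOME_MEGABASES = PILOT_MEGABASES * 100


def percent_to_megabases(percent: int) -> int:
    """Convert a whole-number genome percentage to approximate megabases."""
    return percent * GENOME_MEGABASES // 100


# Percentages as reported in the Results section.
reported_percent = {
    "protein-coding": 2,
    "transcribed into RNA": 76,
    "active": 80,
}

approx_megabases = {
    label: percent_to_megabases(pct) for label, pct in reported_percent.items()
}

for label, megabases in approx_megabases.items():
    print(f"{label}: about {megabases} megabases")
```

Under this rounded assumption, the difference between the 2 % and 80 % figures corresponds to well over two thousand megabases of sequence, which illustrates why the claim attracted scrutiny.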

Costs

Initially, eight working groups were funded with a total of 12 million U.S. dollars in the pilot phase. Other groups have since been added, dealing, for example, with the coordination of databases. In total, the pilot project cost 55 million U.S. dollars, and the production phase, completed in 2012, a further 130 million.
