Genome-wide association study

A genome-wide association study (GWAS) is an epidemiological study of genetic variation in the human genome, designed to associate a particular phenotype (usually a disease) with particular haplotypes (or alleles).

The ultimate goal of a GWAS is therefore to identify the alleles (particular variants of a gene) that occur together with a trait (or a disease). Even so, the genes themselves are not examined directly, mainly for economic reasons; well-defined markers (SNPs, single nucleotide polymorphisms) are examined instead.

Overview

Carrying out a GWAS requires two groups of subjects: a comparison group (i.e. "normal", or in most cases healthy) and a group that has the phenotype of interest (i.e. that shows the disease or some other particular trait). DNA samples are taken from both groups and tested individually for variation at marker positions (today, defined SNPs are used). The analysis looks for differences in variation between the two groups: an accumulation of a specific marker in the group with the phenotype of interest constitutes an association. Most of the marker loci used, the SNPs, are not located in protein-coding regions but lie either in non-coding regions between two genes (i.e. in regulatory regions) or in introns.
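The comparison of marker frequencies between the two groups can be sketched for a single SNP as follows. This is a minimal illustration, not the procedure of any particular study: it assumes a simple allelic Pearson chi-square test on a 2x2 table of allele counts, the function name `allelic_association` is hypothetical, and the counts in the usage lines are invented. A real GWAS repeats such a test for every marker and has to correct for the resulting multiple comparisons.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Assumes all four counts are positive. Returns (chi2, p_value, odds_ratio).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were equally common in both groups.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio


# Invented example: 1000 cases and 1000 controls, i.e. 2000 alleles per group.
chi2, p_value, odds_ratio = allelic_association(600, 1400, 500, 1500)
print(f"chi2={chi2:.2f}, p={p_value:.2g}, odds ratio={odds_ratio:.2f}")
```

A small p-value here only indicates that the marker allele is more common in one group than in the other; it says nothing about why.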

A GWAS says nothing, however, about how the allele found is actually connected with the phenotype. It yields a mere association (specifically, an association with the polymorphism, not even directly with a gene-encoding allele), which is for the time being a purely correlative relationship. The causal connection can be investigated by molecular-biological and biochemical methods only after such "candidate genes" have been identified.

GWAS have gained importance in recent years because of the fall in the cost of DNA sequencing. The lower costs also allow a growing number of interested members of the public to have a marker analysis of their own genome carried out privately through specialized providers (e.g. 23andMe). The focus there is on clarifying individual risk (genetic disposition or predisposition) for already known allele-disease associations, but the growing number of records covering a wide variety of phenotypes can subsequently be used for GWAS research (provided the DNA donors consent).

Background

The diploid human genome comprises more than six billion base pairs. Although the differences between individual humans are extremely small compared with other organisms (humans differ from one another in only about 0.1% of all base pairs; for comparison, Drosophila melanogaster: 4%), the enormous number of base pairs still yields a sizeable set of about six million polymorphisms. The vast majority of these polymorphisms are single nucleotide polymorphisms (SNPs); about 3.3 million SNPs are found in Europeans.
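The estimate of about six million polymorphisms follows directly from the two figures given above, as this short calculation shows. It is only a back-of-the-envelope sketch: the constant and function names are hypothetical, and the genome size is rounded down to six billion base pairs.

```python
DIPLOID_GENOME_BASE_PAIRS = 6_000_000_000  # rounded figure from the text
HUMAN_DIFFERENCE_FRACTION = 0.001          # about 0.1% of all base pairs
DROSOPHILA_DIFFERENCE_FRACTION = 0.04      # about 4%, quoted for comparison


def expected_polymorphic_sites(genome_size, difference_fraction):
    """Rough number of positions at which two individuals differ."""
    return round(genome_size * difference_fraction)


human_sites = expected_polymorphic_sites(
    DIPLOID_GENOME_BASE_PAIRS, HUMAN_DIFFERENCE_FRACTION
)
print(human_sites)  # 6000000

# Per base pair, the Drosophila figure is about 40 times the human one.
print(round(DROSOPHILA_DIFFERENCE_FRACTION / HUMAN_DIFFERENCE_FRACTION))  # 40
```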

What is actually of interest are only differing alleles (in protein-coding and regulatory regions), i.e. differences in regions that directly affect gene function (e.g. the function of the encoded protein or its rate of expression). Sequencing all such regions is still too laborious and too expensive today, and such a high resolution is probably not even necessary. The HapMap project mapped and collected variants of one million SNPs in a first phase and is now working, in a second phase, on a haplotype map of 3.1 million SNPs. In principle, enough markers have been identified to provide, for each gene of interest, one (or more) markers that are passed on together with the gene during recombination. Today GWAS are virtually always SNP-based; in more specific studies (i.e. those that are not genome-wide but focus on particular DNA segments or genes), other polymorphisms or complete sequence analyses can be used where appropriate.

What makes GWAS particularly appealing is their "freedom from hypotheses": there is no preselection of genes that might cause the disease or phenotype (no a priori knowledge is incorporated); the whole genome is simply examined. The analysis is thus open-ended, and unexpected new genes can be associated with phenotypes.

Limits and dangers

GWAS have several methodological limitations. The biggest is that only associations between a phenotype and common haplotypes can be found; all rare variants remain undetected. It should also be emphasized that GWAS yield only correlative results. An identified gene occurs more frequently together with a phenotype: gene and disease are "somehow" connected. Causality must first be demonstrated in further studies. Moreover, even today it is not the genes themselves that are found, but only polymorphisms (SNPs), which in turn merely occur in correlation with the genes.

With the growing popularity of personalized medicine, more and more patient genomes are being sequenced (or genetic tests are carried out in which only parts of the genome are sequenced). The continuing fall in the price of DNA sequencing, driven by increasingly efficient technologies, has greatly favored this trend. Providers that market directly to private customers have also entered the market; today people have their DNA sequenced even without any disease, simply out of curiosity.

The growing availability of human genomes inevitably raises social questions and has social consequences: how health insurers should deal with this highly specific information, how patients should deal with correlative results about the probability of a disease, and how private a personal sequence should be. Online SNP databases such as openSNP already exist.
