Haplotype

As haplotype (from the Greek haplús or haplóos "simple" and typos " image, pattern " ), an abbreviation of " haploid genotype ", a variant of a nucleotide sequence on the same chromosome is called the genome of a living being. A specific haplotype may be individual-, population- or species-specific.

The case being compared alleles, as the International HapMap Project, be individual combinations of SNPs that can be used as genetic markers.

History

The term was introduced by R. Ceppellini 1967. It was originally used to describe the genetic composition of the MHC, a complex of genes encoding proteins important for the immune system.

Demarcation to genotype

If a diploid organism with respect to two genes A and B, the genotype AABB, so can the haplotypes AB | from or Ab | aB are based. In the former case, a chromosome having the alleles A and B, the other A and B. In the latter case, a chromosome having the alleles A and B, the other A and B.

Determination of haplotypes

Two cases can be distinguished (in the following, the term " allele " refers to the different nucleotides A, C, G and T, but, for example, the number of repetition of a particular microsatellite allele ) define:

  • 2 polyploid species
  • 2.1 If a maternal and paternal homologous chromosomes in the DNA nucleotide differ in an individual, so these SNPs in sequencing the corresponding chromosomes of the individual visible (there is always a mixture of homologous chromosomes sequenced ). Such a SNP is called in the corresponding individual heterozygous SNP.
  • 2.2 If in an individual, a maternal and paternal homologous chromosomes in a considered gene locus is identical, in sequencing the DNA of the individual SNPs are not visible. Only when at least a second individual in the same locus to another allele is found, can be spoken at the corresponding nucleotide position of a SNP. Such an SNP is called the first individual homozygous SNP but a heterozygous SNP, in another individual display.
  • 2.3 Diving in a SNP two different alleles ( relative to the total population under consideration ), this SNP is called " biallelic ". If there are three different alleles, this SNP is " triallelisch " and called " tetraallelisch " with four alleles. A tetraallelischer SNP contains the maximum number of different alleles, because SNPs can only be formed from the four nucleotides A, C, G and T.
  • 2.4 diploid species may have tetraallelische SNPs in principle, even though only a maximum of two alleles are possible for an individual.

A SNP is now determined in a polyploid population ( the same species ), so can the haplotypes ( of length 1 ) read as in point 1 directly from the sequencing. Even at two SNPs, it is problematic: When sequencing the assignment of individual alleles to their original chromosomes is lost. Different combinations of alleles in SNP 1 and SNP 2 are now possible, and thus also different haplotypes. The number of possible haplotypes grows exponentially with the number of SNPs.

Various methods have been developed to determine the haplotypes in polyploid species.

  • I) Experimental:
  • Ii.1 ) Based on a parsimony criterion ( parsimony based, see also Ockham 's Razor ). This method attempts to minimize the number of haplotypes needed to account for the SNP of a given population. There are various approaches based on satellite or linear programming to solve this problem effectively. Other features: If applied under the assumption that the considered locus is little or no recombination takes place. A solution found is always optimal in the sense of thrift criterion. Not practical for large-scale analyzes.
  • II.2 ) maximum likelihood (using Expectation-Maximization algorithm or Monte Carlo simulation. these methods try to find the set of haplotypes ( and the corresponding breakdown among individuals ), so that the information given by a objective function calculated probability of the observed data is maximized. Other features: Applicable even at recombination. Solutions are usually less than optimal, since the algorithm terminates in a local optimum and has to make simplifications, so that a solution is at all predictable. Practical for large-scale analyzes albeit suboptimal solution is sufficient.

Swell

375093
de