STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a bioinformatics database that provides a comprehensive overview of direct (physical) and indirect (functional) relationships and interactions between proteins. It is operated jointly by the European Molecular Biology Laboratory (EMBL) and the University of Zurich.

The freely accessible database is updated regularly and draws on experimental data, other databases, the literature, and computationally predicted interactions. The current version 8 (January 2009) covers about 2.5 million proteins from 630 species.


Protein-protein interactions are not limited to direct physical binding. Indirect relationships also play a role, such as occurrence in the same pathway, mutual regulation, or joint occurrence in larger protein complexes.

STRING integrates these data to give a quick overview of a protein and its interactions with other proteins. Interactions are also transferred automatically from the organism in which they were first described to orthologous protein pairs in other organisms.
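The transfer of interactions to orthologous protein pairs can be illustrated with a minimal sketch. The function name `transfer_interactions`, the dictionary layout and the protein names are hypothetical and chosen only for illustration; this is not STRING's actual procedure, which is not described here.

```python
def transfer_interactions(interactions, orthologs):
    """Map interacting pairs from a source organism to a target organism.

    interactions: iterable of (protein_a, protein_b) pairs in the source organism.
    orthologs: dict mapping a source protein to a list of its orthologs
               in the target organism (hypothetical input format).
    """
    transferred = set()
    for a, b in interactions:
        for ortholog_a in orthologs.get(a, []):
            for ortholog_b in orthologs.get(b, []):
                if ortholog_a != ortholog_b:
                    # Store each pair in sorted order so (x, y) and (y, x) count once.
                    transferred.add(tuple(sorted((ortholog_a, ortholog_b))))
    return transferred


# Toy data with invented protein names.
source_pairs = [("yeastA", "yeastB"), ("yeastB", "yeastC")]
ortholog_map = {"yeastA": ["humanA"], "yeastB": ["humanB1", "humanB2"]}
result = transfer_interactions(source_pairs, ortholog_map)
```

In this toy case the pair yeastA/yeastB yields two candidate pairs in the target organism, while yeastB/yeastC yields none because yeastC has no known ortholog.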


Many of the predicted protein-protein interactions are imported from other databases, but a large part of the results is generated de novo. The steadily growing number of completely sequenced genomes makes it possible to use the so-called "genomic context" for such predictions. This is based on the following points:

  • Conserved genomic neighborhood
  • Gene fusion events
  • Co-occurrence of genes across genomes

All three criteria rest on the assumption that the proteins in question were under common selection pressure during evolution and are therefore functionally associated. That is, genes/proteins that have a similar function or occur in the same pathway are assumed to have been retained and regulated together, so that they share the same "phylogenetic profile". Such genes have been found to lie close together on the genome, to take part in gene fusion events, and so on. The inference is also used in reverse, for example to predict the function of unknown proteins: neighborhood on the genome, gene fusion, and joint occurrence in the same sequenced genomes across different species suggest a similar function of the proteins. As with all interactions displayed, a confidence score is shown here as well; in this case KEGG (Kyoto Encyclopedia of Genes and Genomes) serves as the reference.
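The idea of a shared phylogenetic profile can be made concrete with a small sketch. It assumes that each gene is represented by a presence/absence vector over the same ordered list of genomes and uses the Jaccard index as the similarity measure. Both the measure and the gene names are assumptions for illustration and are not taken from STRING.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles.

    Each profile is a list of 0/1 values, one entry per genome,
    with the genomes in the same order in both lists.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    either = sum(1 for x, y in zip(profile_a, profile_b) if x or y)
    # Two genes absent from every genome carry no signal.
    return both / either if either else 0.0


# Invented profiles over five genomes (1 = gene present, 0 = absent).
profiles = {
    "geneX": [1, 1, 0, 1, 0],
    "geneY": [1, 1, 0, 1, 0],
    "geneZ": [0, 1, 1, 0, 1],
}
sim_xy = profile_similarity(profiles["geneX"], profiles["geneY"])
sim_xz = profile_similarity(profiles["geneX"], profiles["geneZ"])
```

Here geneX and geneY occur in exactly the same genomes and would be candidates for a functional association, whereas geneX and geneZ share only one genome.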

For many proteins, numerous articles based on experiments already exist. STRING provides direct links to the respective sources, e.g. to PubMed, KEGG, MIPS (Munich Information Center for Protein Sequences) and BIND (Biomolecular Interaction Network Database).

Results and references

The results can be shown in different graphical representations. Each associated protein found is assigned a score. For functional relationships between proteins, KEGG is taken as the reference; this database is curated manually and lists the proteins that occur together in metabolic pathways. Protein interactions published in the literature indexed by PubMed are also included. Here, however, the only check is whether the proteins are mentioned together in an abstract.
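The check for joint mentions in abstracts can be sketched as a simple co-occurrence count. The tokenization, the function name `count_comentions` and the example sentences are assumptions made for illustration; real protein names, synonyms and abstracts require far more careful matching.

```python
import re
from collections import Counter
from itertools import combinations


def count_comentions(abstracts, protein_names):
    """Count, for each protein pair, the abstracts that mention both names."""
    names = {name.lower() for name in protein_names}
    counts = Counter()
    for text in abstracts:
        # Naive tokenization: lowercase alphanumeric runs only.
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        mentioned = sorted(names & tokens)
        for pair in combinations(mentioned, 2):
            counts[pair] += 1
    return counts


# Invented abstracts and protein names.
abstracts = [
    "ProtA binds ProtB in vitro.",
    "ProtB levels depend on ProtA and ProtC.",
    "ProtC is described in this study.",
]
comentions = count_comentions(abstracts, ["ProtA", "ProtB", "ProtC"])
```

In this toy corpus ProtA and ProtB are mentioned together in two abstracts, and every other pair in one.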

From the individual scores, a "combined score" is computed for each pair of proteins, which includes all partial results. This value is often higher than the individual scores, because a higher confidence is assumed when several types of evidence suggest a relationship between two proteins.
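One way to see why a combined value can exceed every individual score is to treat each partial score as the probability that the association is real and to assume that the evidence types are independent. The sketch below follows that assumption; it is an illustration, not STRING's documented scoring formula, and the input values are invented.

```python
def combined_score(scores):
    """Combine partial scores under an independence assumption.

    Each score is read as the probability that one evidence type
    correctly indicates an association. The combined value is the
    probability that at least one of them does.
    """
    remaining = 1.0  # probability that every evidence type is wrong
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie between 0 and 1")
        remaining *= 1.0 - score
    return 1.0 - remaining


# Invented partial scores, e.g. neighborhood, co-occurrence, text mining.
partial = [0.5, 0.4, 0.3]
total = combined_score(partial)
```

With these inputs the combined value is 1 - 0.5 × 0.6 × 0.7 = 0.79, which is higher than the largest partial score of 0.5.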