Sequence database

In bioinformatics, sequence databases are used to store and manage collections of DNA, RNA or protein sequences with the aid of computers. A database may contain the sequences of a single organism, for example all proteins of the yeast Saccharomyces cerevisiae, or sequences from many organisms, for example the DNA sequences of all organisms whose genome has been sequenced. Such databases can be searched in different ways. The most common query is a search for DNA or protein sequences that are similar to an already known sequence. The BLAST program supports this kind of query.
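To make the idea of a similarity query concrete, the following is a minimal sketch that ranks the entries of a tiny in-memory database by how many short subsequences (k-mers) they share with a query. It is not BLAST and does not reproduce its algorithm or scoring. The function names, the entry names and the toy sequences are all hypothetical and chosen only for illustration.

```python
def kmers(seq, k=3):
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_similarity(query, database, k=3):
    """Rank database entries by Jaccard similarity of their k-mer sets.

    `database` is assumed to be a dict mapping an entry name to its sequence.
    """
    query_kmers = kmers(query, k)
    scored = []
    for name, seq in database.items():
        entry_kmers = kmers(seq, k)
        union = query_kmers | entry_kmers
        score = len(query_kmers & entry_kmers) / len(union) if union else 0.0
        scored.append((name, score))
    # Highest similarity first
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy database, not real sequence records
toy_db = {
    "entry_a": "ATGGCGTACGTTAGC",
    "entry_b": "ATGGCGTACGTTAGG",
    "entry_c": "TTTTTTTTTTTTTTT",
}
ranking = rank_by_similarity("ATGGCGTACGTTAGC", toy_db)
```

In this sketch an identical entry ranks first and an unrelated one last. Real search tools use far more sensitive alignment scoring and indexing to cope with databases of realistic size.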

The biggest problem with the huge sequence databases is that their entries come from many different sources, ranging from individual researchers to large genome sequencing centers. The quality of the sequences themselves and of the associated biological annotations therefore varies considerably. Redundancy is also very common, since many laboratories submit numerous sequences that are identical or nearly identical to entries already stored.
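The redundancy described above can be illustrated for the simplest case, exact duplicates. The sketch below groups submissions whose sequences are identical after normalizing case and whitespace, using a checksum as the grouping key. The record IDs and sequences are hypothetical, and nearly identical sequences are deliberately not handled here, since detecting them requires similarity clustering rather than a checksum.

```python
import hashlib


def collapse_identical(records):
    """Group record IDs whose normalized sequences are exactly identical.

    `records` is assumed to be an iterable of (record_id, sequence) pairs.
    """
    groups = {}
    for record_id, seq in records:
        # Normalize: drop whitespace and ignore letter case
        normalized = "".join(seq.split()).upper()
        digest = hashlib.sha256(normalized.encode("ascii")).hexdigest()
        groups.setdefault(digest, []).append(record_id)
    return list(groups.values())


# Hypothetical submissions from three laboratories
submissions = [
    ("lab1_001", "ATGGCGTACG"),
    ("lab2_417", "atggcgtacg"),   # same sequence, different letter case
    ("lab3_009", "ATGGCGTACC"),   # differs in the last base
]
clusters = collapse_identical(submissions)
```

Here the first two submissions fall into one group, while the third, which differs by a single base, stays separate even though it is nearly identical.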

Many annotations are also not based on laboratory experiments but on the results of sequence similarity searches against previously annotated sequences. Since a sequence that was annotated in this way and stored in the database may itself become the basis of future annotations, several other annotations can lie between a particular database entry and the information actually obtained from a laboratory experiment. This is known as the transitive annotation problem, that is, the transfer or handing-on of annotations from entry to entry. Biological annotations in the major sequence databases must therefore be viewed with some skepticism unless they are supported either by references to relevant, high-quality experimental data from scientific publications or by references to a manually curated sequence database (such as Swiss-Prot).
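The transitive annotation problem can be pictured as a chain of entries, each of which took its annotation from the previous one. The following minimal sketch counts how many transfers separate an entry from an experimentally supported annotation. The provenance mapping and the entry IDs are hypothetical, and real databases do not necessarily record annotation sources in this form.

```python
def steps_to_experiment(entry_id, source_of):
    """Count annotation transfers between an entry and experimental evidence.

    `source_of` is assumed to map an entry ID to the ID of the entry its
    annotation was transferred from, or to None if the annotation is backed
    directly by an experiment. Returns None if the chain never reaches an
    experiment (unknown source or circular transfer).
    """
    steps = 0
    seen = set()
    current = entry_id
    while current in source_of and current not in seen:
        seen.add(current)
        parent = source_of[current]
        if parent is None:
            return steps          # reached experimental evidence
        steps += 1
        current = parent
    return None                   # dangling reference or cycle


# Hypothetical provenance records
provenance = {
    "P1": None,   # experimentally characterized
    "P2": "P1",   # annotated by similarity to P1
    "P3": "P2",   # annotated by similarity to P2
    "P4": "P9",   # source entry is not in the mapping
    "P5": "P6",   # P5 and P6 were annotated from each other
    "P6": "P5",
}
distance = steps_to_experiment("P3", provenance)
```

The larger the returned count, the further an annotation is removed from laboratory evidence. A result of None marks annotations that cannot be traced back to any experiment at all.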

Examples
