Protein design, also called protein engineering or rational protein design, is the targeted adjustment of the properties of proteins by site-directed mutagenesis of DNA. Alongside directed evolution, which relies on randomly generated variants, it is one of the two strategies of protein engineering.


The goals of protein design are changes in:

  • Expression levels
  • Binding properties (such as substrate affinity, substrate specificity, affinity for other binding partners)
  • Catalytic properties (e.g., turnover rate, substrate saturation)
  • Toxic properties
  • Immunological properties (e.g., repetitive epitopes, MHC binding, consensus sequences of different strains, masking by glycosylation)
  • Localization in a cell compartment
  • Solubility, in the case of inclusion bodies
  • Biological half-life (increased by reducing proteolysis), thermal stability, and resistance to denaturation

Processes and effects

The targeted modification of recombinant proteins can lead to a loss or a gain of function. In addition to the targeted modification of the protein itself, most designs also modify DNA segments outside the protein-coding sequence to increase gene expression in the context of a vector. Gene expression can be increased by choosing a promoter, enhancer, and terminator suited to the host species. Furthermore, recognition of the mRNA by the ribosome can be improved with a Shine-Dalgarno sequence (in bacteria) or a Kozak sequence (in eukaryotes), and premature degradation of the mRNA can be reduced with a polyadenylation signal at the 3' end and by avoiding AUUUA sequences.
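
Avoiding AUUUA sequences presupposes that they can be located first. The following is a minimal sketch of such a scan; the function name `find_auuua_sites` and the example sequence are hypothetical and chosen only for illustration.

```python
import re

def find_auuua_sites(mrna: str) -> list[int]:
    """Return 0-based start positions of every AUUUA motif, overlaps included."""
    rna = mrna.upper().replace("T", "U")  # accept DNA or RNA spelling
    # A lookahead matches without consuming characters, so overlapping
    # motifs such as AUUUAUUUA are reported twice.
    return [m.start() for m in re.finditer(r"(?=AUUUA)", rna)]

# Made-up 3' untranslated region used only as test input.
utr = "GCCAUUUAUUUAGGC"
print(find_auuua_sites(utr))  # [3, 7]
```

Each reported position marks a site that a design could then change, for example with a silent mutation if the motif lies inside the coding sequence.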

Point mutations

Codon optimization can increase the expression rate: of the codons available for the 20 amino acids, only those used most frequently in the host species are employed. Sites for post-translational modifications, such as glycosylation, phosphorylation, methylation, acetylation, sulfation, myristoylation, palmitoylation, farnesylation, GPI anchor, and geranylgeranylation sites, can be introduced into the protein or removed from it by targeted point mutations in the DNA.
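
Codon optimization can be sketched as a back-translation that picks the most frequent codon for each residue. In the example below, the codon assignments follow the standard genetic code, but the table `TOY_USAGE`, its frequencies, and the function name `optimize_codons` are hypothetical placeholders, not measured data for any species.

```python
# Hypothetical usage table: amino acid -> {codon: relative frequency}.
# The frequencies are made-up placeholders for illustration only.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.75, "AAG": 0.25},
    "L": {"CTG": 0.50, "CTC": 0.35, "TTA": 0.15},
    "E": {"GAA": 0.70, "GAG": 0.30},
}

def optimize_codons(protein, usage):
    """Back-translate a protein using the most frequent codon per residue."""
    best = {aa: max(codons, key=codons.get) for aa, codons in usage.items()}
    # A residue missing from the table raises KeyError instead of being guessed.
    return "".join(best[aa] for aa in protein)

print(optimize_codons("MKLE", TOY_USAGE))  # ATGAAACTGGAA
```

In practice the table would be replaced by codon usage data for the chosen expression host.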

Competitive inhibitors can be generated by altering a catalytic site, a substrate binding site, or a binding site for other molecules required for activation (e.g., cofactors, transient protein-protein interactions, or protein complexes).

The biological half-life of a protein can be extended by altering peptidase cleavage sites, PEST sequences, and certain N-terminal amino acids in accordance with the N-end rule.

By changing the primary structure, point mutations can affect the secondary, tertiary, and quaternary structure, for example through disulfide-forming cysteines. α-Helices can be modified with rotationally flexible (glycine), helix-forming (e.g., alanine), and helix-breaking (proline) amino acids. Unusual amino acids can be introduced by using an expanded genetic code.
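
At the DNA level, a targeted point mutation amounts to exchanging one codon in the coding sequence. The sketch below assumes a coding sequence given as a plain string in the correct reading frame; the function name `mutate_codon` and the example sequence are hypothetical.

```python
def mutate_codon(cds, residue_number, new_codon):
    """Replace the codon of a 1-based residue number with new_codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if len(new_codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    start = (residue_number - 1) * 3
    if not 0 <= start < len(cds):
        raise ValueError("residue number outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]

# Met-Ala-Ala-Ala-stop; residue 3 is changed from alanine (GCT)
# to the helix-breaking proline (CCG).
wild_type = "ATGGCTGCTGCTTAA"
print(mutate_codon(wild_type, 3, "CCG"))  # ATGGCTCCGGCTTAA
```

Because exactly three nucleotides are exchanged for three, the reading frame and the length of the protein stay unchanged.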

Insertions and deletions

Novel protein domains and their associated functions can be added to a gene by in-frame insertions of DNA sequences (consisting of a multiple of three nucleotides). The resulting hybrid proteins are referred to as fusion proteins.

Short in-frame DNA sequences are sometimes added after the start codon or before the stop codon of the gene for purification and detection. The peptides they encode are referred to as protein tags.
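
The in-frame requirement for a tag placed after the start codon can be sketched as follows. The polyhistidine tag (six CAT codons) serves as the example; the function names `split_codons` and `add_n_terminal_tag` and the short coding sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def split_codons(seq):
    """Split a sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def add_n_terminal_tag(cds, tag):
    """Insert tag directly after the start codon without shifting the frame."""
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if len(tag) % 3 != 0:
        raise ValueError("tag length must be a multiple of three")
    if STOP_CODONS & set(split_codons(tag)):
        raise ValueError("tag contains an in-frame stop codon")
    return cds[:3] + tag + cds[3:]

his_tag = "CAT" * 6  # six histidine codons
print(add_n_terminal_tag("ATGGCTAAATAA", his_tag))
# ATGCATCATCATCATCATCATGCTAAATAA
```

A tag before the stop codon would be inserted the same way at the other end of the sequence.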

Other common insertions are flexible linkers between two protein domains of a fusion protein, as well as inteins or protease recognition sequences that allow part of the protein to be cleaved off in vitro or in vivo.

Transient insertions can be generated by inserting inteins or by using the Cre-lox system.

Properties can be removed by in-frame deletions of a multiple of three nucleotides. This can sometimes bring other properties of the protein to the fore, for example when regulatory domains are removed. The localization in a cell compartment can be changed by adding or removing signal sequences. By adding or removing a transmembrane domain, soluble proteins and membrane proteins can be converted into one another. Inserting coding sequences for cell-penetrating peptides can increase the uptake of a protein into cells.
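
An in-frame deletion can be sketched by addressing whole codons, which guarantees that a multiple of three nucleotides is removed. The function name `delete_codons` and the example sequence are hypothetical, and the sketch assumes that the start and stop codons should stay in place.

```python
def delete_codons(cds, first, last):
    """Remove codons first..last (0-based, inclusive) from a coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    n_codons = len(cds) // 3
    # Assumption of this sketch: codon 0 (start) and the final codon (stop)
    # are never part of the deleted region.
    if not 0 < first <= last < n_codons - 1:
        raise ValueError("deletion must leave the start and stop codons in place")
    return cds[:first * 3] + cds[(last + 1) * 3:]

cds = "ATGGCTAAAGGTTAA"  # ATG GCT AAA GGT TAA
print(delete_codons(cds, 1, 2))  # ATGGGTTAA
```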


In the early 21st century, the development of protein design was accelerated by the use of computer-based molecular modeling. Examples of this development include stereoselective catalysis, the detection of ions, and antiviral properties.

Using computer-aided methods, a new, artificial protein fold (Top7) was created in 2003, and sensors for unnatural molecules were developed. The cofactor specificity of xylose reductase from Candida boidinii was changed from NADPH to NADH.

However, probably not all protein structures are accessible through protein design, since some configurations and conformations cannot form for steric reasons. The software used also limits which modifications are possible.


  • IPRO modifies proteins to increase the affinity for a substrate or cofactor. It repeatedly applies random changes to the protein backbone around specified positions, identifies the lowest-energy combination of rotamers, and determines the lowest-energy configuration for a targeted modification. The iterative approach allows IPRO to accumulate multiple mutations that together optimize substrate specificity or cofactor binding.
  • EGAD: A Genetic Algorithm for protein Design. A free software package for protein design and for predicting the effects of mutations on protein folding and affinity. EGAD can also consider several structures in parallel when designing binding sites or fixed conformations. In addition, mobile ligands with or without rotatable bonds can be modeled. EGAD can also run on multiple processors.
  • RosettaDesign. A software package that is free for academic use. RosettaDesign is available via a web server.
  • SHARPEN is an open-source library for protein design and structure prediction. SHARPEN offers various combinatorial optimization methods (e.g., Monte Carlo, simulated annealing, FASTER) and scores proteins using the Rosetta all-atom force field or molecular mechanics force fields (OPLS-AA). SHARPEN also supports computation on multiple processors.
  • WHAT IF is software for modeling, protein design, validation, and visualization of proteins.
  • CheShift is software for the validation of protein structures.
  • Abalone is software for modeling and visualization.
  • ProtDes is software for protein design based on the CHARMM molecular mechanics package.
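
Several of the packages above search for low-energy sequences or rotamer combinations with stochastic methods such as Monte Carlo and simulated annealing. The following is a minimal sketch of simulated annealing over a short sequence. It is not the implementation of any listed package: the residue alphabet, the scoring function `toy_energy`, and all parameter values are hypothetical and have no physical meaning.

```python
import math
import random

ALPHABET = "AGLP"  # toy residue alphabet, chosen only for illustration

def toy_energy(seq):
    """Hypothetical score (lower is better); not a physical force field."""
    per_residue = {"A": -1.0, "L": -0.5, "G": 0.5, "P": 2.0}
    energy = 0.0
    for i, aa in enumerate(seq):
        energy += per_residue[aa]
        if i > 0 and seq[i - 1] == aa:
            energy += 0.75  # made-up penalty for repeating a residue
    return energy

def anneal(length=8, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing with single-position moves and Metropolis acceptance."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    current = [rng.choice(ALPHABET) for _ in range(length)]
    e_current = toy_energy(current)
    best, e_best = current[:], e_current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = current[:]
        candidate[rng.randrange(length)] = rng.choice(ALPHABET)
        delta = toy_energy(candidate) - e_current
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = candidate, e_current + delta
            if e_current < e_best:
                best, e_best = current[:], e_current
    return "".join(best), toy_energy(best)

sequence, energy = anneal()
print(sequence, energy)
```

At high temperature the search accepts many unfavorable changes and explores broadly. As the temperature falls it increasingly keeps only improvements, which is why the best sequence seen so far is tracked separately.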