Simplified molecular-input line-entry system

The Simplified Molecular Input Line Entry Specification (SMILES) is a chemical structure code in which the structure of arbitrary molecules is represented, in greatly simplified form, as an ASCII character string. Several molecular editors can import SMILES strings and generate 2-dimensional and 3-dimensional models from them.

The original SMILES specification was developed by Arthur Weininger and David Weininger in the late 1980s. In the following years it was developed further and modified, in particular by Daylight Chemical Information Systems, Inc. In 2007, an open standard called OpenSMILES was developed by Blue Obelisk, a chemistry-oriented open-source community.

Because the SMILES language is controlled by the company Daylight and has some problems with stereochemistry and tautomerism, IUPAC has developed its own linear molecular representation, InChI, which is freely available.


Examples

Conventions

Atoms

An atom is represented by its element symbol enclosed in square brackets (e.g. [Au] for gold). An isotope can be specified by prepending the mass number to the element symbol (e.g. [2H] for deuterium or [235U] for fissile uranium); if no mass number is given, the natural isotopic mixture is assumed.

Ions, i.e. electrically charged atoms, are described in SMILES notation by specifying the charge inside the square brackets (e.g. [Cl-] for the chloride ion or [Cu+2] for the copper(II) ion).

Hydrogen bonded directly to the atom can also be specified in bracket notation: the element symbol is followed by H and then the number of bonded hydrogen atoms (for a single hydrogen atom the number is optional). Simple molecules such as hydrogen chloride ([ClH]) or methane ([CH4]) can be described in this way.
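The bracket rules above (isotope, element symbol, hydrogen count, charge) can be sketched as a small parser. This is a minimal illustration using only Python's standard library, not a full SMILES reader: the function name `parse_bracket_atom` and the returned dictionary keys are hypothetical, and features such as chirality or atom classes are deliberately ignored.

```python
import re

# Simplified pattern for one bracket atom, in the order described above:
# optional mass number, element symbol, optional H count, optional charge.
# Chirality marks and atom classes are NOT handled in this sketch.
BRACKET_ATOM = re.compile(
    r"\[(?P<isotope>\d+)?"
    r"(?P<symbol>[A-Z][a-z]?)"
    r"(?:H(?P<hcount>\d*))?"
    r"(?P<charge>[+-]\d*)?\]"
)

def parse_bracket_atom(text):
    """Split a bracket atom such as '[CH4]' into its parts (hypothetical helper)."""
    match = BRACKET_ATOM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a bracket atom this sketch understands: {text!r}")

    isotope = match.group("isotope")
    hcount = match.group("hcount")   # None: no H given; "": a single H
    charge = match.group("charge")   # None: neutral; "+" or "-": magnitude 1

    if charge is None:
        charge_value = 0
    else:
        sign = 1 if charge[0] == "+" else -1
        charge_value = sign * int(charge[1:] or 1)

    return {
        "isotope": int(isotope) if isotope else None,
        "symbol": match.group("symbol"),
        "hcount": 0 if hcount is None else int(hcount or 1),
        "charge": charge_value,
    }

for example in ["[Au]", "[2H]", "[235U]", "[Cl-]", "[Cu+2]", "[ClH]", "[CH4]"]:
    print(example, parse_bracket_atom(example))
```

The regular expression relies on element symbols being one uppercase letter optionally followed by one lowercase letter, which is why [ClH] is read as chlorine with one hydrogen and [CH4] as carbon with four.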

To simplify the notation, the brackets can be omitted for elements of the so-called "organic subset". If the brackets are omitted, the free valences of the atom are filled with hydrogen atoms up to the lowest standard valence of that element. For example, O is sufficient as input for water and C for methane.
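The hydrogen-filling rule can be sketched as follows. The valence table used here is an assumption taken from the commonly cited OpenSMILES organic subset (B, C, N, O, P, S and the halogens), and `implicit_hydrogens` is a hypothetical helper name; aromatic atoms are not covered.

```python
# Standard valences of the organic subset. These values are an assumption
# (as commonly listed for OpenSMILES), not reproduced from this article.
STANDARD_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}

def implicit_hydrogens(symbol, bond_order_sum):
    """Hydrogens added to an unbracketed atom with the given explicit bonds."""
    for valence in STANDARD_VALENCES[symbol]:
        # Pick the lowest standard valence that accommodates the explicit bonds.
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    # Explicit bonds already exceed every standard valence: add nothing.
    return 0

print(implicit_hydrogens("O", 0))  # lone O, i.e. water
print(implicit_hydrogens("C", 0))  # lone C, i.e. methane
print(implicit_hydrogens("C", 1))  # each carbon in CC (ethane)
```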

Bonds

To indicate that two atoms are connected by a chemical bond, one of the bond symbols is placed between the atoms.

* OpenSMILES only. Bonds in aromatic systems can be symbolized by a colon instead of by alternating double and single bonds.

To simplify the notation further, the symbols for single bonds and aromatic bonds may be omitted.
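As an illustration of omitted single-bond symbols, the sketch below reads an unbranched chain of organic-subset atoms and reports each bond with its order. It assumes the usual SMILES bond symbols "-" (single), "=" (double) and "#" (triple); the function name `chain_bonds` is hypothetical, and brackets, branches, rings and aromatic bonds are outside its scope.

```python
import re

# Assumed bond symbols: "-" single, "=" double, "#" triple.
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]")
VALID = re.compile(r"(?:Cl|Br|[BCNOPSFI]|[-=#])*")

def chain_bonds(smiles):
    """Return (atoms, bonds) for an unbranched chain; bonds are (i, j, order)."""
    if not VALID.fullmatch(smiles):
        raise ValueError(f"unsupported characters in {smiles!r}")
    atoms, bonds = [], []
    pending_order = 1  # an omitted bond symbol means a single bond
    for token in TOKEN.findall(smiles):
        if token in BOND_ORDERS:
            pending_order = BOND_ORDERS[token]
        else:
            if atoms:
                bonds.append((len(atoms) - 1, len(atoms), pending_order))
            atoms.append(token)
            pending_order = 1
    return atoms, bonds

print(chain_bonds("CC"))     # ethane, single bond omitted
print(chain_bonds("C-C"))    # the same molecule with the bond written out
print(chain_bonds("O=C=O"))  # carbon dioxide
print(chain_bonds("C#N"))    # hydrogen cyanide
```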

Branches

Atoms with three or more bonds are the starting points of branches. In this case, the side chain is placed in parentheses directly after the corresponding atom, before the remaining bonds follow. The parentheses, and thus the branches, can be nested to any depth.
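Because branches can nest arbitrarily, a stack is a natural way to read them: an opening parenthesis remembers the atom at which the branch starts, and a closing parenthesis returns to it. The following is a minimal sketch under that assumption; `branch_bonds` is a hypothetical name, only unbracketed organic-subset atoms are recognized, and bond symbols and any other characters are simply skipped, so only connectivity is reported.

```python
import re

TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[()]")

def branch_bonds(smiles):
    """Return (atoms, bonds) where bonds are pairs of atom indices."""
    atoms, bonds = [], []
    stack = []       # atoms at which a branch was opened
    previous = None  # atom the next atom will be bonded to
    for token in TOKEN.findall(smiles):
        if token == "(":
            stack.append(previous)
        elif token == ")":
            previous = stack.pop()  # continue from the branching atom
        else:
            atoms.append(token)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index))
            previous = index
    return atoms, bonds

print(branch_bonds("CC(C)C"))      # isobutane: atom 1 carries three carbons
print(branch_bonds("CC(=O)O"))     # acetic acid (bond orders ignored here)
print(branch_bonds("CC(C(C)C)C"))  # nested branch
```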

Examples:

Separate structures

For structures that are not covalently connected, such as the ions of an ionic compound, a dot (.) is placed between the separate parts. For example, sodium bicarbonate (Na+ HCO3-) = [Na+].O=C([O-])O.
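Since the dot only separates components, a string can be split on it and each part examined on its own. The sketch below does this for sodium bicarbonate and sums the charges written in bracket atoms; `component_charge` is a hypothetical helper, and only charges of the form "+", "-", "+2", "-2" and so on are recognized.

```python
import re

# Finds the sign and optional magnitude of a charge inside one bracket atom.
CHARGE = re.compile(r"\[[^\]]*?([+-])(\d*)\]")

def component_charge(component):
    """Sum of the charges written in bracket atoms of one component."""
    total = 0
    for sign, digits in CHARGE.findall(component):
        total += (1 if sign == "+" else -1) * int(digits or 1)
    return total

smiles = "[Na+].O=C([O-])O"  # sodium bicarbonate
components = smiles.split(".")
for part in components:
    print(part, component_charge(part))
print("net charge:", sum(component_charge(p) for p in components))
```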

Cyclic structures

One of the biggest problems for such a language is representing cyclic structures. In SMILES this is done by writing an index (a digit) after an atom that is to be joined to another atom further along the string; the same index is written after that other atom, and the two are then connected. In aromatic rings the ring-forming atoms are written in lowercase.
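The ring-closure rule can be sketched by remembering, for each index, the atom at which it was first written and adding a bond when the same index appears again. This is a simplified illustration: `ring_bonds` is a hypothetical name, only single-digit indices and unbranched, unbracketed atoms are handled, and neighbouring atoms are always assumed to be bonded.

```python
import re

# Uppercase organic-subset atoms, lowercase aromatic atoms, ring digits.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]|\d")

def ring_bonds(smiles):
    """Return (atoms, bonds) for a chain with ring-closure digits."""
    atoms, bonds = [], []
    open_rings = {}  # ring index -> atom where the index was first written
    for token in TOKEN.findall(smiles):
        if token.isdigit():
            if not atoms:
                raise ValueError("ring index before any atom")
            current = len(atoms) - 1
            if token in open_rings:
                # Second occurrence: connect the two atoms and free the index.
                bonds.append((open_rings.pop(token), current))
            else:
                open_rings[token] = current
        else:
            atoms.append(token)
            if len(atoms) > 1:
                bonds.append((len(atoms) - 2, len(atoms) - 1))
    if open_rings:
        raise ValueError(f"unclosed ring indices: {sorted(open_rings)}")
    return atoms, bonds

print(ring_bonds("C1CCCCC1"))  # cyclohexane
print(ring_bonds("c1ccccc1"))  # benzene, aromatic atoms in lowercase
```

In both examples the final bond joins atom 0 and atom 5, closing the six-membered ring.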

Examples:

Reactions

Reactions are represented in SMILES using two right angle brackets (>>). Example: Na+ HCO3- + HCl → Na+ Cl- + H2CO3 = [Na+].O=C([O-])O.[ClH]>>[Na+].[Cl-].O=C(O)O.

If another substance is involved in a reaction, it is written between the two angle brackets. Example: Na+ HCO3- + HCl → Na+ Cl- + H2CO3 = [Na+].O=C([O-])O>[ClH]>[Na+].[Cl-].O=C(O)O.
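Both forms can be read the same way, because ">>" is simply the case in which nothing is written between the two angle brackets. The following sketch splits a reaction string into its three parts; `parse_reaction` is a hypothetical helper, and hydrogen chloride is written in the bracket form [ClH] used in the Atoms section.

```python
def parse_reaction(reaction):
    """Split a reaction SMILES into (reactants, agents, products) lists."""
    parts = reaction.split(">")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two '>' in {reaction!r}")
    # An empty part (as in '>>') yields an empty list of molecules.
    reactants, agents, products = (p.split(".") if p else [] for p in parts)
    return reactants, agents, products

print(parse_reaction("[Na+].O=C([O-])O.[ClH]>>[Na+].[Cl-].O=C(O)O"))
print(parse_reaction("[Na+].O=C([O-])O>[ClH]>[Na+].[Cl-].O=C(O)O"))
```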

Extension

SMARTS is an extension of SMILES that makes it possible to search for molecular substructures. To this end, SMILES was extended so that wildcards and specific bond types (e.g. aromatic) can be specified. As a rule, any valid SMILES expression can also be used as a SMARTS expression; the converse does not hold. SMARTS is mainly used for search applications in chemical databases.
