What is SMILES?
The Simplified Molecular Input Line Entry System, or SMILES, is a notation that uses simple text characters to represent chemical structures. Developed by David Weininger in the 1980s, it provides a compact and human-readable way to describe molecules. While modern chemical drawing software is now common, SMILES remains a cornerstone in cheminformatics for storing and exchanging molecular information.
Examples
Ethanol: CCO
Benzene: c1ccccc1
Atoms: C, N, O, Cl, Br, etc.
Single bonds are implicit: CC = C–C
Double = and triple # bonds: C=O (carbonyl), C#N (nitrile)
Branching uses parentheses: CC(O)C = isopropanol
Rings use numbers to label ring closure points: c1ccccc1 = benzene (lowercase c indicates aromatic carbon).
To visualize these, we can use the Python library RDKit. The function Chem.MolFromSmiles() converts a SMILES string into a molecule object that can then be displayed.
RDKit example
The following codes can be used to generate a butane (C4H10) structure (Figure 1).
Code:
butane = Chem.MolFromSmiles("CCCC")
butane
Figure 1: Structure of Butane.
2. Pentane C5H12 (Figure 2).
Code:
pentane = Chem.MolFromSmiles("CCCCC")
pentane
Figure 2: Structure of Pentane.
3. Butanol C4H10OH (Figure 3).
Code:
butanol = Chem.MolFromSmiles("CCCCO")
butanol
Figure 3: Structure of Butanol.