Chemical biology is a scientific discipline spanning the fields of chemistry and biology. The discipline involves the application of chemical techniques, analysis, and often small molecules produced through synthetic chemistry, to the study and manipulation of biological systems. In contrast to biochemistry, which involves the study of the chemistry of biomolecules and regulation of biochemical pathways within and between cells, chemical biology deals with chemistry applied to biology (synthesis of biomolecules, simulation of biological systems etc.).
Some forms of chemical biology attempt to answer biological questions by studying biological systems at the chemical level. In contrast to research using biochemistry, genetics, or molecular biology, where mutagenesis can provide a new version of the organism, cell, or biomolecule of interest, chemical biology probes systems in vitro and in vivo with small molecules that have been designed for a specific purpose or identified on the basis of biochemical or cell-based screening (see chemical genetics).
Chemical biology is one of several interdisciplinary sciences that tend to differ from older, reductionist fields and that aim instead at a holistic scientific description. Chemical biology has scientific, historical and philosophical roots in medicinal chemistry, supramolecular chemistry, bioorganic chemistry, pharmacology, genetics, biochemistry, and metabolic engineering.
Chemical biologists work to improve proteomics through the development of enrichment strategies, chemical affinity tags, and new probes. Samples for proteomics often contain many peptide sequences, and the sequence of interest may be highly represented or of low abundance, which creates a barrier to its detection. Chemical biology methods can reduce sample complexity by selective enrichment using affinity chromatography. This involves targeting a peptide with a distinguishing feature such as a biotin label or a post-translational modification. Methods have been developed that use antibodies, lectins to capture glycoproteins, immobilized metal ions to capture phosphorylated peptides, and enzyme substrates to capture select enzymes.
To investigate enzymatic activity as opposed to total protein, activity-based reagents have been developed to label the enzymatically active form of proteins (see Activity-based proteomics). For example, serine hydrolase- and cysteine protease-inhibitors have been converted to suicide inhibitors. This strategy enhances the ability to selectively analyze low abundance constituents through direct targeting. Enzyme activity can also be monitored through converted substrate. Identification of enzyme substrates is a problem of significant difficulty in proteomics and is vital to the understanding of signal transduction pathways in cells. A method that has been developed uses "analog-sensitive" kinases to label substrates using an unnatural ATP analog, facilitating visualization and identification through a unique handle.
While DNA, RNA and proteins are all encoded at the genetic level, glycans (sugar polymers) are not encoded directly from the genome and fewer tools are available for their study. Glycobiology is therefore an area of active research for chemical biologists. For example, cells can be supplied with synthetic variants of natural sugars to probe their function. Carolyn Bertozzi's research group has developed methods for site-specifically reacting molecules at the surface of cells via synthetic sugars.
Chemical biologists use automated synthesis of diverse small-molecule libraries to perform high-throughput analysis of biological processes. Such experiments may lead to the discovery of small molecules with antibiotic or chemotherapeutic properties. These combinatorial chemistry approaches are identical to those employed in the discipline of pharmacology.
Many research programs are also focused on employing natural biomolecules to perform biological tasks or to support a new chemical method. In this regard, chemical biology researchers have shown that DNA can serve as a template for synthetic chemistry, self-assembling proteins can serve as a structural scaffold for new materials, and RNA can be evolved in vitro to produce new catalytic function. Additionally, heterobifunctional (two-sided) synthetic small molecules such as dimerizers or PROTACs bring two proteins together inside cells, which can synthetically induce important new biological functions such as targeted protein degradation.
Chemical synthesis of proteins is a valuable tool in chemical biology, as it allows for the introduction of non-natural amino acids as well as residue-specific incorporation of post-translational modifications such as phosphorylation, glycosylation, acetylation, and even ubiquitination. These capabilities are valuable for chemical biologists because non-natural amino acids can be used to probe and alter the functionality of proteins, while post-translational modifications are widely known to regulate the structure and activity of proteins. Although strictly biological techniques have been developed to achieve these ends, the chemical synthesis of peptides often has a lower technical and practical barrier to obtaining small amounts of the desired protein.
To make protein-sized polypeptide chains from the small peptide fragments made by synthesis, chemical biologists use native chemical ligation. Native chemical ligation involves the coupling of a C-terminal thioester and an N-terminal cysteine residue, ultimately resulting in formation of a "native" amide bond. Other strategies for ligating peptide fragments that use the acyl transfer chemistry first introduced with native chemical ligation include expressed protein ligation, sulfurization/desulfurization techniques, and the use of removable thiol auxiliaries. Expressed protein ligation allows for the biotechnological installation of a C-terminal thioester using inteins, so that a synthetic peptide bearing an N-terminal cysteine can be appended to the recombinantly produced fragment. Both sulfurization/desulfurization techniques and the use of removable thiol auxiliaries involve the installation of a synthetic thiol moiety to carry out the standard native chemical ligation chemistry, followed by removal of the auxiliary/thiol.
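The requirement that native chemical ligation places on its two fragments can be written down as a small bookkeeping sketch in Python. It is only a toy model of the rule stated above (a C-terminal thioester on one fragment, an N-terminal cysteine on the other) and does not model any chemistry; the `Peptide` class, the function name, and the example sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    """Hypothetical fragment record; sequence is written N- to C-terminus."""
    sequence: str                       # one-letter amino acid codes
    c_terminal_thioester: bool = False  # True if the C-terminus is a thioester


def native_chemical_ligation(n_fragment: Peptide, c_fragment: Peptide) -> Peptide:
    """Join two fragments if they satisfy the ligation requirements."""
    if not n_fragment.c_terminal_thioester:
        raise ValueError("the first fragment needs a C-terminal thioester")
    if not c_fragment.sequence.startswith("C"):
        raise ValueError("the second fragment needs an N-terminal cysteine")
    # The thioester is consumed and the junction becomes an ordinary amide bond,
    # so the product is simply the concatenated sequence. A thioester on the
    # second fragment (if any) is carried over for a further ligation step.
    return Peptide(n_fragment.sequence + c_fragment.sequence,
                   c_fragment.c_terminal_thioester)


# Illustrative fragments (made-up sequences)
product = native_chemical_ligation(Peptide("MKTAYG", True), Peptide("CAKQRL"))
print(product.sequence)
```

The cysteine at the junction remains in the product, which is why the desulfurization and auxiliary strategies mentioned above are used when no cysteine is wanted at the ligation site.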
A primary goal of protein engineering is the design of novel peptides or proteins with a desired structure and chemical activity. Because our knowledge of the relationship between primary sequence, structure, and function of proteins is limited, rational design of new proteins with engineered activities is extremely challenging. In directed evolution, repeated cycles of genetic diversification followed by a screening or selection process can be used to mimic natural selection in the laboratory and to design new proteins with a desired activity.
Several methods exist for creating large libraries of sequence variants. Among the most widely used are subjecting DNA to UV radiation or chemical mutagens, error-prone PCR, degenerate codons, or recombination. Once a large library of variants is created, selection or screening techniques are used to find mutants with a desired attribute. Common selection/screening techniques include FACS, mRNA display, phage display, and in vitro compartmentalization. Once useful variants are found, their DNA sequence is amplified and subjected to further rounds of diversification and selection.
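The cycle of diversification and selection described above can be illustrated with a small simulation. This is a minimal sketch, not a model of any real experiment: the fitness function (similarity to a made-up target sequence), the mutation rate, the library size, and all function names are assumptions chosen for illustration, and parents are carried into each new library as a simplification.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rate, rng):
    """Diversification: replace each residue with a random one with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
                   for aa in seq)


def fitness(seq, target):
    """Toy screen: count positions that match a hypothetical ideal sequence."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, target, rounds=30, library_size=200,
                       keep=10, rate=0.1, seed=0):
    """Run repeated rounds of diversification followed by selection."""
    rng = random.Random(seed)
    pool = [parent]
    history = []
    for _ in range(rounds):
        # Diversify: start from the surviving variants and add random mutants
        library = list(pool)
        while len(library) < library_size:
            library.append(mutate(rng.choice(pool), rate, rng))
        # Screen the library and keep only the best-scoring variants
        library.sort(key=lambda s: fitness(s, target), reverse=True)
        pool = library[:keep]
        history.append(fitness(pool[0], target))
    return pool[0], history


best, history = directed_evolution("AAAAAAAAAA", "MKTAYIAKQR")
print(best, history[-1])
```

Because the surviving variants are kept in each new library, the best score in `history` can never decrease from one round to the next. A real campaign has no known target sequence, and the screen is an experimental assay rather than a string comparison.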
The development of directed evolution methods was honored in 2018 with the awarding of the Nobel Prize in Chemistry to Frances Arnold for evolution of enzymes, and George Smith and Gregory Winter for phage display.
Successful labeling of a molecule of interest requires specific functionalization of that molecule to react chemospecifically with an optical probe. For a labeling experiment to be considered robust, that functionalization must minimally perturb the system. Unfortunately, these requirements are often hard to meet. Many of the reactions normally available to organic chemists in the laboratory are unavailable in living systems. Water- and redox-sensitive reactions would not proceed, reagents prone to nucleophilic attack would offer no chemospecificity, and any reactions with large kinetic barriers would not find enough energy in the relatively low-heat environment of a living cell. Thus, chemists have recently developed a panel of bioorthogonal reactions that proceed chemospecifically, despite the milieu of distracting reactive materials in vivo.
The coupling of a probe to a molecule of interest must occur within a reasonably short time frame; therefore, the kinetics of the coupling reaction should be highly favorable. Click chemistry is well suited to fill this niche, since click reactions are rapid, spontaneous, selective, and high-yielding. Unfortunately, the most famous "click reaction," a [3+2] cycloaddition between an azide and an acyclic alkyne, is copper-catalyzed, posing a serious problem for use in vivo due to copper's toxicity. To bypass the need for a catalyst, Carolyn R. Bertozzi's lab introduced inherent strain into the alkyne species by using a cyclic alkyne. In particular, cyclooctyne reacts with azido-molecules with distinctive vigor.
The most common method of installing bioorthogonal reactivity into a target biomolecule is through metabolic labeling. Cells are immersed in a medium where access to nutrients is limited to synthetically modified analogues of standard fuels such as sugars. As a consequence, these altered biomolecules are incorporated into the cells in the same manner as the unmodified metabolites. A probe is then incorporated into the system to image the fate of the altered biomolecules. Other methods of functionalization include enzymatically inserting azides into proteins, and synthesizing phospholipids conjugated to cyclooctynes.
Advances in modern sequencing technologies in the late 1990s allowed scientists to investigate the DNA of communities of organisms in their natural environments ("eDNA") without culturing individual species in the lab. This metagenomic approach enabled scientists to study a wide selection of organisms that had previously not been characterized, due in part to the lack of suitable growth conditions. Sources of eDNA include soils, ocean, subsurface, hot springs, hydrothermal vents, polar ice caps, hypersaline habitats, and extreme pH environments. Of the many applications of metagenomics, researchers such as Jo Handelsman, Jon Clardy, and Robert M. Goodman explored metagenomic approaches toward the discovery of biologically active molecules such as antibiotics.
Functional or homology screening strategies have been used to identify genes that produce small bioactive molecules. Functional metagenomic studies are designed to search for specific phenotypes that are associated with molecules with specific characteristics. Homology metagenomic studies, on the other hand, are designed to examine genes to identify conserved sequences that have previously been associated with the expression of biologically active molecules.
Functional metagenomic studies enable the discovery of novel genes that encode biologically active molecules. These assays include top agar overlay assays, in which antibiotics generate zones of growth inhibition against test microbes, and pH assays, which use a pH indicator on an agar plate to screen for pH changes caused by newly synthesized molecules. Substrate-induced gene expression screening (SIGEX), a method to screen for the expression of genes that are induced by chemical compounds, has also been used to search for genes with specific functions. Homology-based metagenomic studies have led to the rapid discovery of genes whose sequences are homologous to previously known genes responsible for the biosynthesis of biologically active molecules. As soon as the genes are sequenced, scientists can compare thousands of bacterial genomes simultaneously. The advantage over functional metagenomic assays is that homology metagenomic studies do not require a host organism system to express the metagenomes, so this method can potentially save the time spent on analyzing nonfunctional genomes. These studies have also led to the discovery of several novel proteins and small molecules. In addition, an in silico examination from the Global Ocean Metagenomic Survey found 20 new lantibiotic cyclases.
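The idea behind homology screening, flagging environmental sequences that resemble a known biosynthetic gene, can be sketched with a crude similarity score. The example below is a deliberately simplified stand-in: it scores shared k-mers rather than performing a real alignment, and the sequences, threshold, and function names are all made up for illustration. Practical pipelines rely on dedicated alignment or profile-search tools.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query, reference, k=4):
    """Fraction of the reference's k-mers that also occur in the query."""
    ref = kmers(reference, k)
    if not ref:
        return 0.0
    return len(kmers(query, k) & ref) / len(ref)


def homology_screen(reads, reference, k=4, threshold=0.5):
    """Return the names of reads whose score meets the (arbitrary) threshold."""
    return [name for name, seq in reads.items()
            if shared_kmer_fraction(seq, reference, k) >= threshold]


# Hypothetical reference gene fragment and environmental reads
reference = "ATGGCGTACGTTAGC"
reads = {
    "read_1": "CCATGGCGTACGTTAGCAA",  # contains the reference fragment
    "read_2": "TTTTTTTTTTTTTTTTTT",   # unrelated sequence
}
print(homology_screen(reads, reference))
```

No host organism or expression step appears anywhere in this workflow, which is the advantage over functional screening noted above; the cost is that only genes resembling something already known can be found.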
Posttranslational modification of proteins with phosphate groups by kinases is a key regulatory step throughout all biological systems. Phosphorylation events, either phosphorylation by protein kinases or dephosphorylation by phosphatases, result in protein activation or deactivation. These events have an impact on the regulation of physiological pathways, which makes the ability to dissect and study these pathways integral to understanding the details of cellular processes. There exist a number of challenges—namely the sheer size of the phosphoproteome, the fleeting nature of phosphorylation events and related physical limitations of classical biological and biochemical techniques—that have limited the advancement of knowledge in this area.
Through the use of small molecule modulators of protein kinases, chemical biologists have gained a better understanding of the effects of protein phosphorylation. For example, nonselective and selective kinase inhibitors, such as a class of pyridinylimidazole compounds, are potent inhibitors useful in the dissection of MAP kinase signaling pathways. These pyridinylimidazole compounds function by targeting the ATP binding pocket. Although this approach and related approaches with slight modifications have proven effective in a number of cases, these compounds lack adequate specificity for more general applications. Another class of compounds, mechanism-based inhibitors, combines knowledge of the kinase enzymology with previously utilized inhibition motifs. For example, a "bisubstrate analog" inhibits kinase action by binding both the conserved ATP binding pocket and a protein/peptide recognition site on the specific kinase. Research groups have also used ATP analogs as chemical probes to study kinases and identify their substrates.
The development of novel chemical means of incorporating phosphomimetic amino acids into proteins has provided important insight into the effects of phosphorylation events. Phosphorylation events have typically been studied by mutating an identified phosphorylation site (serine, threonine or tyrosine) to an amino acid, such as alanine, that cannot be phosphorylated. However, these techniques come with limitations, and chemical biologists have developed improved ways of investigating protein phosphorylation. By installing phospho-serine, phospho-threonine or analogous phosphonate mimics into native proteins, researchers are able to perform in vivo studies that investigate the effects of phosphorylation by extending the time a phosphorylation event persists while minimizing the often-unfavorable effects of mutations. Expressed protein ligation has proven to be a successful technique for synthetically producing proteins that contain phosphomimetic molecules at either terminus. In addition, researchers have used unnatural amino acid mutagenesis at targeted sites within a peptide sequence.
Advances in chemical biology have also improved upon classical techniques of imaging kinase action. For example, the development of peptide biosensors (peptides containing incorporated fluorophores) improved the temporal resolution of in vitro binding assays. One of the most useful techniques to study kinase action is Fluorescence Resonance Energy Transfer (FRET). To utilize FRET for phosphorylation studies, fluorescent proteins are coupled to both a phosphoamino acid binding domain and a peptide that can be phosphorylated. Upon phosphorylation or dephosphorylation of a substrate peptide, a conformational change occurs that results in a change in fluorescence. FRET has also been used in tandem with Fluorescence Lifetime Imaging Microscopy (FLIM) or fluorescently conjugated antibodies and flow cytometry to provide quantitative results with excellent temporal and spatial resolution.
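The reason a conformational change translates into a change in fluorescence is the steep distance dependence of energy transfer. The text above does not state it, but the standard Förster relation gives the transfer efficiency as E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the dye pair. The short Python sketch below evaluates it; the function name and the 5 nm Förster radius are illustrative assumptions, not values from any particular sensor.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Illustrative dye pair with an assumed Förster radius of 5 nm
R0 = 5.0
for r in (3.0, 4.0, 5.0, 6.0, 7.0):
    print(f"r = {r:.1f} nm  ->  E = {fret_efficiency(r, R0):.3f}")
```

Efficiency is exactly one half when the distance equals the Förster radius and falls off with the sixth power of distance, so a shift of about a nanometre near R0 produces a large change in signal.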
Chemical biologists often study the functions of biological macromolecules using fluorescence techniques. The advantage of fluorescence over other techniques lies in its high sensitivity, non-invasiveness, safe detection, and the ability to modulate the fluorescence signal. In recent years, the development of green fluorescent protein (GFP) by Roger Y. Tsien and others, along with hybrid systems and quantum dots, has enabled protein location and function to be assessed more precisely. Three main types of fluorophores are used: small organic dyes, green fluorescent proteins, and quantum dots. Small organic dyes are usually smaller than 1 kDa and have been modified to increase photostability and brightness and to reduce self-quenching. Quantum dots have very sharp emission peaks, high molar absorptivity, and high quantum yield. Neither organic dyes nor quantum dots can recognize the protein of interest without the aid of antibodies, so they must be used with immunolabeling. Fluorescent proteins are genetically encoded and can be fused to the protein of interest. Another genetic tagging technique is the tetracysteine biarsenical system, which requires modifying the targeted sequence to include four cysteines; this motif binds membrane-permeable biarsenical molecules, the green and red dyes "FlAsH" and "ReAsH", with picomolar affinity. Both fluorescent proteins and the biarsenical tetracysteine tag can be expressed in live cells, but both present major limitations in ectopic expression and might cause a loss of function.
Fluorescent techniques have been used to assess a number of protein dynamics, including protein tracking, conformational changes, protein–protein interactions, protein synthesis and turnover, and enzyme activity, among others. Three general approaches for measuring protein net redistribution and diffusion are single-particle tracking, correlation spectroscopy and photomarking methods. In single-particle tracking, the individual molecule must be both bright and sparse enough to be tracked from one video frame to the next. Correlation spectroscopy analyzes the intensity fluctuations resulting from migration of fluorescent objects into and out of a small volume at the focus of a laser. In photomarking, a fluorescent protein can be dequenched in a subcellular area with the use of intense local illumination, and the fate of the marked molecule can be imaged directly. Michalet and coworkers used biotin-quantum dots for single-particle tracking in HeLa cells. One of the best ways to detect conformational changes in proteins is to label the protein of interest with two fluorophores in close proximity. FRET will respond to internal conformational changes that result from reorientation of one fluorophore with respect to the other. One can also use fluorescence to visualize enzyme activity, typically by using a quenched activity-based probe (qABP). Covalent binding of a qABP to the active site of the targeted enzyme provides direct evidence as to whether the enzyme is responsible for the signal, upon release of the quencher and recovery of fluorescence.
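The analysis step in correlation spectroscopy mentioned above is, at its core, an autocorrelation of the recorded intensity trace. A common normalized form is G(τ) = ⟨δI(t) δI(t+τ)⟩ / ⟨I⟩², where δI is the deviation of the intensity from its mean. The Python sketch below computes this for a short list of intensity readings; it is an illustrative assumption about how such a trace might be processed (the function name and the example trace are made up), and it omits the model fitting that real analyses use to extract diffusion times and concentrations.

```python
def intensity_autocorrelation(trace, max_lag):
    """Normalized autocorrelation G(lag) of intensity fluctuations, for lag = 0..max_lag."""
    n = len(trace)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(trace) - 1")
    mean = sum(trace) / n
    if mean == 0:
        raise ValueError("mean intensity must be non-zero")
    fluct = [x - mean for x in trace]  # deviations from the mean intensity
    result = []
    for lag in range(max_lag + 1):
        pairs = n - lag  # number of overlapping samples at this lag
        total = sum(fluct[i] * fluct[i + lag] for i in range(pairs))
        result.append(total / pairs / mean ** 2)
    return result


# Made-up photon counts from a small observation volume
trace = [12, 15, 14, 9, 8, 11, 16, 17, 13, 10, 9, 12]
print(intensity_autocorrelation(trace, max_lag=4))
```

How quickly G decays with lag reflects how long fluorescent objects stay in the observation volume, which is what links the curve to diffusion.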