Aldehyde tag

Summary

An aldehyde tag is a short peptide tag that can be further modified to add fluorophores, glycans, PEG (polyethylene glycol) chains, or reactive groups for further synthesis.[1] A short, genetically encoded peptide with the consensus sequence LCxPxR is introduced into fusion proteins, and subsequent treatment with the formylglycine-generating enzyme (FGE) converts the cysteine of the tag to a reactive aldehyde group. This electrophilic group can be targeted by an array of aldehyde-specific reagents, such as aminooxy- or hydrazide-functionalized compounds.

Development

The aldehyde tag is an artificial peptide tag recognized by the formylglycine-generating enzyme (FGE). Formylglycine is a glycine with a formyl group (-CHO) at the α-carbon.[2] The sequence of the tag is based on the sulfatase motif, within which FGE site-specifically converts a cysteine to a formylglycine residue. The peptide tag was engineered after studies of FGE-recognized sequences in sulfatases from different organisms revealed high homology of the sulfatase motif across bacteria, archaea and eukaryotes.[3]

Aldehydes and ketones are used as chemical reporters because of their electrophilic properties, which enable reaction under mild conditions with a strongly nucleophilic coupling partner. Typically, hydrazide and aminooxy probes are used in bioconjugation; they form stabilized addition products with carbonyl groups that are favored under physiological reaction conditions. At neutral pH, the equilibrium of simple Schiff base formation lies far to the reactant side, so hydrazide and aminooxy derivatives are used to form the more stable hydrazones and oximes and thereby yield more product. The reaction is slow in live cells because its pH optimum of 4 to 6 cannot be reached there and a catalyst cannot be added owing to the associated toxicity. A typical second-order rate constant is 10⁻⁴ to 10⁻³ M⁻¹ s⁻¹.[4]
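To give a feel for what a rate constant in this range means in practice, the following minimal sketch estimates the half-life of the carbonyl reporter when the probe is present in large excess (pseudo-first-order conditions, t½ = ln 2 / (k·[probe])). The probe concentration of 1 mM and the function name are assumptions chosen for illustration; they are not taken from the cited work.

```python
import math

def pseudo_first_order_half_life(k2, probe_conc):
    """Half-life (s) of the carbonyl reporter with the probe in large excess.

    k2         -- second-order rate constant in M^-1 s^-1
    probe_conc -- probe concentration in M (assumed constant during the reaction)
    """
    if k2 <= 0 or probe_conc <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return math.log(2) / (k2 * probe_conc)

PROBE_CONC = 1e-3          # assumed probe concentration: 1 mM
SECONDS_PER_DAY = 86400

for k2 in (1e-4, 1e-3):    # range of rate constants quoted in the text
    t_half = pseudo_first_order_half_life(k2, PROBE_CONC)
    print(f"k2 = {k2:.0e} M^-1 s^-1 -> half-life = {t_half / SECONDS_PER_DAY:.1f} days")
```

Under these assumptions the half-life is on the order of days to weeks, which illustrates why the uncatalyzed reaction at neutral pH is regarded as slow.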

A carbonyl group can be introduced into proteins as a chemical reporter by various techniques, including stop codon suppression and aldehyde tagging.[3][5] The use of aldehydes and ketones is limited, however, by their restricted bioorthogonality in certain cellular environments. The limitations include:

  • Competition with endogenous aldehydes or ketones in metabolites and cofactors, resulting in low yields and impaired specificity.
  • Side reactions, such as oxidation or unwanted addition of endogenous nucleophiles.
  • Restricted set of probes that form sufficiently stable products.[6][7]

Aldehydes and ketones are therefore best used in compartments where such unwanted side reactions are less likely. For experiments with live cells, cell surfaces and the extracellular space are typical areas of application. Nevertheless, an advantage of carbonyl groups is the vast number of organic reactions that involve them as electrophiles, some of which can readily be converted into ligations for probing aldehydes. One such reaction, adapted for bioconjugation by Agarwal et al., is the Pictet-Spengler reaction. The reaction is known from natural product biosynthetic pathways[8] and has the major advantage of forming a new carbon-carbon bond, which provides greater long-term stability than carbon-heteroatom bonds formed with similar reaction kinetics.[9]

The modification of cysteine or, more rarely, serine[10] by FGE is an uncommon posttranslational modification that was discovered in the late 1990s.[11] A deficiency of FGE leads to an overall deficiency of functional sulfatases, because α-formylglycine formation is vital for the sulfatases to perform their function. Because FGE is essential for sulfatase activity, high specificity and a high conversion rate are required in its native setting, which makes this reaction attractive for applications in chemical and synthetic biology.[12]

The modified sulfatase motif peptide was first inserted into proteins of interest as an aldehyde tag in 2007.[13] More broadly, aldehydes and ketones have been used as chemical reporters in bioorthogonal applications such as the self-assembly of cell-lysing drugs,[14] the targeting of proteins[15][5] and glycans,[16] and the preparation of heterobifunctional fusion proteins.[17]

Genetically encoding the aldehyde tag

The formylglycine tag, or aldehyde tag, is a convenient tag of 6 or 13 amino acids that is fused to a protein of interest. The 6-mer represents the small core consensus sequence and the 13-mer the longer full motif. The experiments on the genetically encoded aldehyde tag by Carrico et al. showed high conversion efficiency with only the core consensus sequence present. Four proteins were produced recombinantly in E. coli, with a conversion efficiency of 86% for the full-length motif and >90% for the 6-mer, as determined by mass spectrometry.[3] The size of the sequence is comparable to that of the commonly used 6x His-tag,[18] and it can likewise be genetically encoded. The sequence is recognized in the ER solely on the basis of its primary sequence and is subsequently modified by FGE.[11] Notably, for recombinant protein expression in E. coli, coexpression of exogenous FGE aids full conversion,[3] although E. coli has endogenous FGE activity.[19] The workflow for introducing an aldehyde tag consists of three steps: (A) expression of the fusion protein carrying the peptide tag derived from the sulfatase motif, (B) enzymatic conversion of Cys to f(Gly), and (C) bioorthogonal probing with hydrazides or alkoxyamines (Fig. 1).

 
Figure 1: Formylglycine aldehyde tag, after Carrico et al.[3] (A) The aldehyde tag is genetically inserted into a protein of interest. In this example, the human growth hormone (hGH, PDB: 1HUW), one of the four initially examined proteins, is shown. The N-terminus of the protein is fused to the formylglycine aldehyde tag. (B) FGE recognizes the motif and converts the cysteine (Cys) residue into the formylglycine residue [f(Gly)]. The chemical reporter is thus formed in place by an enzymatic reaction. (C) The carbonyl group is probed, typically with hydrazide- or alkoxyamine-functionalized dyes or other compounds.

As seen in Fig. 1, the engineered aldehyde tag consists of six amino acids. To design it, a set of organisms from all domains of life was chosen and the sequence homology of their sulfatase motifs was determined. The sequence used is the best consensus of the sequences found in bacteria, archaea, worms and higher vertebrates.[3]
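Because the tag is defined purely by its primary sequence, a construct can be checked for the consensus with a simple pattern search. The sketch below is only an illustration: it encodes the consensus LCxPxR quoted above as a regular expression, and the example fusion sequence is hypothetical (a 6-mer that fits the consensus followed by arbitrary residues), not a sequence from the cited studies.

```python
import re

# Consensus from the text: L-C-x-P-x-R, where x stands for any residue.
# The lookahead lets overlapping occurrences be reported as well.
TAG_PATTERN = re.compile(r"(?=(LC[A-Z]P[A-Z]R))")

def find_aldehyde_tags(sequence):
    """Return (cysteine_position, motif) pairs; positions are 1-based."""
    seq = sequence.upper()
    # m.start() is the 0-based index of the leading L, so the Cys sits at +2 (1-based).
    return [(m.start() + 2, m.group(1)) for m in TAG_PATTERN.finditer(seq)]

# Hypothetical N-terminal fusion, for illustration only.
toy_fusion = "LCTPSR" + "MAFPTIPLSRLFDNA"
print(find_aldehyde_tags(toy_fusion))

# A Cys-to-Ala control no longer matches the consensus.
print(find_aldehyde_tags("LATPSR" + "MAFPTIPLSRLFDNA"))
```

The reported position is that of the cysteine, since this is the residue FGE would convert to formylglycine.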

FGE mechanism of cysteine-formylglycine conversion

The catalytic mechanism of FGE is well studied. A multistep redox reaction with a covalent enzyme-substrate intermediate has been proposed. The role of the cysteine residue in the conversion was studied by mutating the cysteine to alanine: no conversion was detected by mass spectrometry when the mutated peptide tag was used.[3] The mechanism shows the important role of the redox-active thiol group of cysteine in the formation of f(Gly), as seen in Fig. 2. The key step of the catalytic cycle is the monooxidation of the cysteine residue of the enzyme, which forms a reactive sulfenic acid intermediate. Subsequently, the hydroxyl group is transferred to the cysteine of the substrate, and hetero-analogous β-elimination of H2O yields a thioaldehyde. This compound is very reactive and easily hydrolyzed, releasing the aldehyde and a molecule of H2S.[20][21][12]

 
Figure 2: Conversion of Cys to f(Gly) by FGE, after Dierks et al.[12] Substrate I binds to FGE, and disulfide isomerization occurs (IIb). Cys341 of FGE is oxidized to a sulfenic acid (IIc). The hydroxyl group is transferred to the substrate, generating a substrate sulfenic acid (III). β-elimination of water leads to a thioaldehyde (IV), which is quickly hydrated to V; after elimination of H2S, the aldehyde VI is formed. The equilibrium lies far to the product side because of the high reactivity of the thioaldehyde IV compared to the aldehyde VI and its resulting tendency to form the hydrate V. CysSubs = substrate protein cysteine embedded in the sulfatase motif.
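Since conversion is assessed by mass spectrometry, it can be useful to know the mass difference expected between the unconverted and converted tag. The sketch below is not taken from the cited studies; it derives the shift from the net change in the side chain (CH2-SH becomes CHO, i.e. loss of S and two H, gain of one O) using standard monoisotopic atomic masses. The peptide mass in the example is hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 6 decimals).
MONOISOTOPIC = {"H": 1.007825, "O": 15.994915, "S": 31.972071}

def cys_to_fgly_shift():
    """Mass change (Da) when a Cys side chain (CH2-SH) becomes an aldehyde (CHO)."""
    # Net change in elemental composition: lose S and 2 H, gain O.
    return MONOISOTOPIC["O"] - MONOISOTOPIC["S"] - 2 * MONOISOTOPIC["H"]

def expected_fgly_mass(unconverted_mass):
    """Expected monoisotopic mass of the species after Cys-to-f(Gly) conversion."""
    return unconverted_mass + cys_to_fgly_shift()

# Hypothetical monoisotopic mass of a Cys-containing tagged peptide.
peptide_mass = 1500.0
print(f"mass shift: {cys_to_fgly_shift():.4f} Da")
print(f"converted species: {expected_fgly_mass(peptide_mass):.4f} Da")
```

The converted species is therefore expected roughly 18 Da below the cysteine-containing one, whereas a Cys-to-Ala control should show no such shift.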

Applications

The aldehyde tag has found increased application because of the introduction of bioorthogonal chemical reporters. Bioorthogonal agents contain functional groups for coupling, such as azides or cyclooctynes, that are not naturally found in the cell. Because they are foreign to the cell, they are largely inert and do not disrupt native metabolism.[3][7][9] Fig. 3 gives an overview of possible labeling methods for formylglycine. For example, it can be coupled to probes such as biotin or a peptide tag such as FLAG that are useful for purification and detection.[3][1] Furthermore, fluorophores can be conjugated directly for live cell imaging.[22] The conjugation of polyethylene glycol (PEG) chains to potential drug candidates increases their stability against proteases in body fluids and at the same time reduces renal clearance and immunogenicity.[3] The first application described here is the formation of protein-protein conjugates through bioorthogonal probes.[23] Because aldehydes occur in various metabolites, the aldehyde tag is, strictly speaking, not truly bioorthogonal and can give rise to cross-reactions during protein labeling.[16][23] Coupling it to bioorthogonal probes such as azides or cyclooctynes can overcome this obstacle.[23] The second application presented here is the coupling of glycan moieties to proteins, which can be used to introduce glycosylation patterns chemically.[24]

 
Figure 3: Possible conjugation probes for formylglycine

Forming protein-protein conjugates via Cu-free click chemistry

Hudak et al. explored the strategy of producing protein-protein conjugates with the help of the aldehyde tag.[23] Their aim was to connect full-length human IgG (hIgG) to the human growth hormone (hGH). Such protein-protein conjugates can be superior to monomeric proteins in terms of serum half-life in protein therapeutics and, additionally, have appealing dual binding properties.[24] To achieve protein fusion, the five-residue aldehyde tag (CxPxR) was incorporated into hIgG and hGH. In hIgG, the aldehyde tag was introduced at the C-termini of the two heavy chains, resulting in two possible conjugation sites. FGE then oxidizes the cysteine residue to formylglycine (fGly) during protein expression. For the subsequent conjugation steps, copper-free click chemistry was selected: a strain-promoted 1,3-dipolar cycloaddition of a cyclooctyne and an azide forms a covalent linkage (also termed the Cu-free azide-alkyne cycloaddition).[23] To this end, the aldehyde-bearing proteins react, with oxime formation, with different heterobifunctional linkers that carry an aminooxy residue on one end and either an azide or a cyclooctyne on the other. This results in the attachment of hIgG to a linker containing a cyclooctyne (here dibenzoazacyclooctyne, DIBAC) and of hGH to a linker bearing an azide function (Fig. 4A and B). The proteins hGH and hIgG were also treated with DIBAC-488 and azide Alexa Fluor 647 and analysed by SDS-PAGE and Western blot to validate oxime formation. Next, the DIBAC-hIgG and azide-hGH derivatives are joined by Cu-free click chemistry (Fig. 4C). The resulting fusion proteins were purified and analyzed by immunoblot (see Hudak et al. 2012).

 
Figure 4: Protein–protein conjugation of hIgG with hGH. A), B) Aldehyde-tagged proteins are treated with the aminooxy-azide/cyclooctyne bifunctional linkers for oxime formation. C) The DIBAC-hIgG and azide-hGH conjugates are joined by Cu-free click chemistry to protein trimers; X and Y are PEG-based linkers of different length.

The Western blots were first stained with Ponceau, then incubated with IgG antibodies against hGH, and subsequently treated with α-mIgG HRP and α-hIgG 647 for visualisation. In the Western blot of the hIgG-hGH conjugate (nonreducing conditions), two separate bands with different molecular weights are visible after immunodetection. These can be attributed to hIgG carrying one or two conjugated hGH molecules.

Chemical glycosylation of the IgG Fc fragment

In nature, protein glycosylation has been refined over the course of evolution through a complex interplay of enzymes and carbohydrates. Chemical glycosylation, however, remains challenging because glycans are generally difficult to synthesize,[25] and the synthesis of carbohydrate derivatives can be slow and tedious.[26] Nonetheless, technologies that structurally mimic protein glycosylation are of great interest, as some protein functions depend solely on the pattern of the attached glycan.[27] The Fc fragment of the IgG antibody, for example, is a homodimer with a highly conserved N-glycosylation site. The attached sugar moieties modulate binding to specific immunoreceptors, thereby modifying the function of the whole antibody.[28][29]

Smith et al. demonstrated the application of the aldehyde tag as a chemical conjugation site for glycans.[22] The aldehyde tag sequence was incorporated into the Fc construct and introduced into CHO (Chinese hamster ovary) cells. As controls, gene constructs were used in which the cysteine residue was mutated to an alanine. After expression, the Fc proteins were purified using a protein A/G agarose column. The conversion of cysteine to formylglycine in CHO cells was examined using aminooxy AlexaFluor 488 and subsequent SDS-PAGE. However, fluorescence scanning showed no fluorescent labeling, i.e. no formylglycine had been formed by endogenous FGE in CHO cells. The unconverted proteins were therefore treated in vitro with recombinant FGE from Mycobacterium tuberculosis, which successfully installed the aldehyde group at the glycosylation site of Fc (Fig. 5A).

Next, N-acetylglucosamine (GlcNAc) was introduced into the aldehyde-tagged proteins via oxime formation by treatment with aminooxy GlcNAc (AO-GlcNAc) (Fig. 5B). The conjugation was confirmed by liquid chromatography-electrospray ionisation-mass spectrometry (LC-ESI-MS) and by lectin blot with the GlcNAc-binding wheat germ agglutinin attached to AlexaFluor 647. Once GlcNAc had been introduced, the monomer was extended with a glycan structure containing GlcNAc, mannose (Man) and galactose (Gal) (Fig. 5C). A mutant endoglycosidase EndoS (EndoS-D233Q) was used because it is highly specific for IgG Fc N-linked GlcNAc residues and does not elongate Asn-GlcNAc sites on other proteins or on denatured IgGs. Product formation was again monitored by LC-ESI-MS and by lectin blot probing, this time with the sialic acid-binding Sambucus nigra agglutinin attached to fluorescein isothiocyanate.

In this way, a chemical glycosylation of the IgG Fc fragment was achieved that resembles the naturally occurring glycosylation pattern. Although the study discussed above focused on the IgG antibody, the application of the aldehyde tag for glycan conjugation could potentially be extended to other proteins.[22]

 
Figure 5: Chemical glycosylation of the IgG Fc fragment (chains A and B) at the glycosylation site. A) Formylglycine-generating enzyme (FGE) oxidizes the cysteine residue to an aldehyde function. B) Aminooxy-N-acetylglucosamine (AO-GlcNAc) is coupled to Fc via oxime formation. C) GlcNAc is further elongated by different carbohydrate moieties with the help of the endoglycosidase EndoS (EndoS-D233Q).

References

  1. ^ a b Wu, P., Shui, W., Carlson, B.L., Hu, N., Rabuka, D., Lee, J., and Bertozzi, C.R. (2009) Site-specific chemical modification of recombinant proteins produced in mammalian cells by using the genetically encoded aldehyde tag. Proc. Natl. Acad.
  2. ^ Holder, Patrick G.; Jones, Lesley C.; Drake, Penelope M.; Barfield, Robyn M.; Bañas, Stefanie; de Hart, Gregory W.; Baker, Jeanne; Rabuka, David (2015-06-19). "Reconstitution of Formylglycine-generating Enzyme with Copper(II) for Aldehyde Tag Conversion". Journal of Biological Chemistry. 290 (25): 15730–15745. doi:10.1074/jbc.M115.652669. ISSN 0021-9258. PMC 4505483.
  3. ^ a b c d e f g h i j Carrico, I. S., Carlson, B. L. and Bertozzi, C. R. (2007) Introducing genetically encoded aldehydes into proteins. Nat Chem Biol 3, 321–322.
  4. ^ Jencks, W. P. (1959) Studies on the Mechanism of Oxime and Semicarbazone Formation. J. Am. Chem. Soc. 81, 475–481.
  5. ^ a b Zhang, Z., Smith, B. A. C., Wang, L., Brock, A., Cho, C. and Schultz, P. G. (2003) A New Strategy for the Site-Specific Modification of Proteins in Vivo. Biochemistry 42, 6735–6746.
  6. ^ Lim, R. K. V. and Lin, Q. (2010) Bioorthogonal Chemistry: Recent Progress and Future Directions. Chem Commun (Camb) 46, 1589–1600.
  7. ^ a b Prescher, J. A. and Bertozzi, C. R. (2005) Chemistry in living systems. Nat Chem Biol 1, 13–21.
  8. ^ Stöckigt, J., Antonchick, A. P., Wu, F. and Waldmann, H. (2011) Die Pictet-Spengler-Reaktion in der Natur und der organischen Chemie. Angew. Chem. 123, 8692–8719.
  9. ^ a b Agarwal, P., van der Weijden, J., Sletten, E. M., Rabuka, D. and Bertozzi, C. R. (2013) A Pictet-Spengler ligation for protein chemical modification. Proc Natl Acad Sci U S A 110, 46–51.
  10. ^ Miech, C., Dierks, T., Selmer, T., Figura, K. von and Schmidt, B. (1998) Arylsulfatase from Klebsiella pneumoniae Carries a Formylglycine Generated from a Serine. J. Biol. Chem. 273, 4835–4837.
  11. ^ a b Dierks, T., Lecca, M. R., Schmidt, B. and von Figura, K. (1998) Conversion of cysteine to formylglycine in eukaryotic sulfatases occurs by a common mechanism in the endoplasmic reticulum. FEBS Letters 423, 61–65.
  12. ^ a b c Dierks, T., Dickmanns, A., Preusser-Kunze, A., Schmidt, B., Mariappan, M., von Figura, K., Ficner, R. and Rudolph, M. G. (2005) Molecular Basis for Multiple Sulfatase Deficiency and Mechanism for Formylglycine Generation of the Human Formylglycine-Generating Enzyme. Cell 121, 541–552.
  13. ^ Carrico, I. S., Carlson, B. L. and Bertozzi, C. R. (2007) Introducing genetically encoded aldehydes into proteins. Nat Chem Biol 3, 321–322.
  14. ^ Rideout, D. (1994) Self-assembling drugs: a new approach to biochemical modulation in cancer chemotherapy. Cancer Invest. 12, 189–202; discussion 268–269.
  15. ^ Chen, I., Howarth, M., Lin, W. and Ting, A. Y. (2005) Site-specific labeling of cell surface proteins with biophysical probes using biotin ligase. Nat Meth 2, 99–104.
  16. ^ a b Mahal, L. K., Yarema, K. J. and Bertozzi, C. R. (1997) Engineering Chemical Reactivity on Cell Surfaces Through Oligosaccharide Biosynthesis. Science 276, 1125–1128.
  17. ^ Hudak, J. E., Barfield, R. M., de Hart, G. W., Grob, P., Nogales, E., Bertozzi, C. R. and Rabuka, D. (2012) Synthesis of Heterobifunctional Protein Fusions Using Copper-Free Click Chemistry and the Aldehyde Tag. Angew. Chem. Int. Ed. 51, 4161–4165.
  18. ^ Hochuli, E., Bannwarth, W., Döbeli, H., Gentz, R. and Stüber, D. (1988) Genetic Approach to Facilitate Purification of Recombinant Proteins with a Novel Metal Chelate Adsorbent. Nat Biotech 6, 1321–1325.
  19. ^ Dierks, T., Miech, C., Hummerjohann, J., Schmidt, B., Kertesz, M. A. and Figura, K. von. (1998) Posttranslational Formation of Formylglycine in Prokaryotic Sulfatases by Modification of Either Cysteine or Serine. J. Biol. Chem. 273, 25560–25564.
  20. ^ Roeser, D., Preusser-Kunze, A., Schmidt, B., Gasow, K., Wittmann, J. G., Dierks, T., Figura, K. von and Rudolph, M. G. (2006) A general binding mechanism for all human sulfatases by the formylglycine-generating enzyme. PNAS 103, 81–86.
  21. ^ Sase, S., Kakimoto, R. and Goto, K. (2014) Synthesis of a Stable Selenoaldehyde by Self-Catalyzed Thermal Dehydration of a Primary-Alkyl-Substituted Selenenic Acid. Angew. Chem. n/a–n/a.
  22. ^ a b c Smith, E. L., Giddens, J. P., Iavarone, A. T., Godula, K., Wang, L. X. and Bertozzi, C. R. (2014) Chemoenzymatic Fc Glycosylation via Engineered Aldehyde Tags. Bioconjug. Chem. 25, 788–795.
  23. ^ a b c d e Hudak, J. E., Barfield, R. M., de Hart, G. W., Grob, P., Nogales, E., Bertozzi, C. R. and Rabuka, D. (2012) Synthesis of Heterobifunctional Protein Fusions Using Copper-Free Click Chemistry and the Aldehyde Tag. Angew. Chem. Int. Ed. 51, 4161–4165.
  24. ^ a b Beck, A., Wurch, T., Bailly, C. and Corvaia, N. (2010) Strategies and challenges for the next generation of therapeutic antibodies. Nat Rev Immunol 10, 345–352.
  25. ^ Yarema, K. J. and Bertozzi, C. R. (2001) Characterizing glycosylation pathways. Genome Biology 2, r4.1–r4.10.
  26. ^ Seeberger PH, Finney N, Rabuka D, Bertozzi CR.; Chemical and Enzymatic Synthesis of Glycans and Glycoconjugates; Essentials of Glycobiology. 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2009. Chapter 49.
  27. ^ Arnold, J. N., Wormald, M. R., Sim, R. B., Rudd, P. M. and Dwek, R. A. (2007) The impact of glycosylation on the biological function and structure of human immunoglobulins. Annu. Rev. Immunol. 25, 21–50.
  28. ^ Krapp, S., Mimura, Y., Jefferis, R., Huber, R. and Sondermann, P. (2003) Structural analysis of human IgG-Fc glycoforms reveals a correlation between glycosylation and structural integrity. J. Mol. Biol. 325, 979–989.
  29. ^ Kaneko, Y., Nimmerjahn, F. and Ravetch, J. V. (2006) Anti-inflammatory activity of Immunoglobulin G resulting from Fc sialylation. Science 313, 670–673.