TopFIND

Summary

TopFIND, the Termini oriented protein Function Inferred Database, is an integrated knowledgebase focused on protein termini, their formation by proteases, and the resulting functional implications. It contains information about the processing and processing state of proteins, and the functional implications thereof, derived from the research literature, contributions by the scientific community, and biological databases.[2]

TopFIND
Content
Description: TopFIND is the Termini oriented protein Function Inferred Database, a central resource of protein data integrated with knowledge on protein termini, proteolytic processing by proteases, terminal amino acid modifications and inferred functional implications, created by combining community contributions with the UniProt and MEROPS databases.
Data types captured: Protein annotation
Organisms: H. sapiens, M. musculus, A. thaliana, S. cerevisiae, E. coli
Contact
Research center: University of British Columbia (UBC), Canada
Laboratory: Christopher Overall
Authors: Philipp F. Lange
Primary citation: TopFIND 2.0--linking protein termini with proteolytic processing and modifications altering protein function[1]
Release date: 2011
Access
Data format: Custom comma-separated file, SQL, XML
Website: clipserve.clip.ubc.ca/topfind
Miscellaneous
License: Creative Commons Attribution-NoDerivs
Curation policy: Yes - manual and automatic. Rules for automatic annotation generated by database curators and computational algorithms.

Background

Among the most fundamental characteristics of a protein are its N- and C-termini, which define the start and end of the polypeptide chain. Although termini are genetically encoded, terminal isoforms are also often generated during translation. After translation, termini remain highly dynamic and are frequently trimmed by a large array of exopeptidases. Neo-termini can also be generated by endopeptidases through precise and limited proteolysis, termed processing. Processing is necessary for the maturation of many proteins, but it can also occur after maturation, often with dramatic functional consequences. Aberrant proteolysis can cause a wide range of diseases, such as arthritis[3] or cancer.[4] Hence, the proteolytic generation of pleiotropic stable forms of proteins, the universal susceptibility of proteins to proteolysis, and its irreversibility distinguish proteolysis from many highly studied posttranslational modifications. Proteases are tightly interconnected in the protease web,[5][6] and their aberrant activity in disease can lead to diagnostic fragment profiles with characteristic protein termini.[7] Following proteolysis, the newly formed protein termini can be further modified,[8] a process that affects protein function and stability.[9]
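As a minimal illustration of how a single endopeptidase cleavage creates neo-termini, the Python sketch below splits a sequence at a cleavage site. The function name `cleave`, the toy sequence and the 1-based position convention (cleavage directly after residue `p1`) are assumptions made for this example and are not taken from TopFIND.

```python
def cleave(sequence: str, p1: int) -> dict:
    """Split `sequence` directly after the 1-based residue position `p1`.

    Hypothetical helper for illustration only; not part of TopFIND.
    """
    if not 1 <= p1 < len(sequence):
        raise ValueError("p1 must leave at least one residue on each side")
    n_fragment = sequence[:p1]  # retains the original N-terminus
    c_fragment = sequence[p1:]  # retains the original C-terminus
    return {
        # The last residue of the N-terminal fragment is the new C-terminus.
        "neo_c_terminus": {"position": p1, "residue": n_fragment[-1]},
        # The first residue of the C-terminal fragment is the new N-terminus.
        "neo_n_terminus": {"position": p1 + 1, "residue": c_fragment[0]},
        "fragments": (n_fragment, c_fragment),
    }


# Toy 10-residue sequence, cleaved between residues 4 and 5.
result = cleave("MKTAYIAKQR", 4)
print(result["fragments"])
```

Each cleavage event therefore yields two new termini whose positions refer to the uncleaved precursor, which is what makes them comparable across experiments.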

Knowledgebase content

TopFIND is a resource for comprehensive coverage of protein N- and C-termini discovered by in silico, in vitro and in vivo methodologies. It builds on existing knowledge by integrating data from UniProt and MEROPS, and provides access to new data from community submissions and manual literature curation. It makes modifications of protein termini, such as acetylation and citrullination, easily accessible and searchable, and provides the means to identify and analyse the extent and distribution of terminal modifications across a protein. Since its inception, TopFIND has been expanded to further species.[1]

Data access

The data is presented with a strong emphasis on its relation to curated background information and to the underlying evidence that led to the observation of a terminus, its modification or a proteolytic cleavage. For each protein, the general protein information, domain structure, protein termini, terminus modifications, and proteolytic processing of and by other proteins are listed. All information is accompanied by metadata such as its original source, method of identification, confidence measurement or related publication. A positional cross-correlation evaluation matches termini and cleavage sites with protein features (such as amino acid variants) and domains to highlight potential effects and dependencies. A network view of all proteins, showing their functional dependency as protease, substrate or protease inhibitor together with protein interactions, allows network-wide effects to be evaluated. A filtering mechanism allows the presented data to be restricted by parameters such as the methodology used, in vivo relevance, confidence or data source (e.g. limited to a single laboratory or publication). This provides the means to assess physiologically relevant data and to deduce functional information and hypotheses relevant to the bench scientist. Analysis tools for the evaluation of proteolytic pathways in experimental data were added in a later release.[10]
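The filtering and positional matching described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Terminus` and `Feature` record types, their field names, the evidence labels and the example values are all hypothetical and do not reflect TopFIND's actual schema or export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terminus:
    position: int      # 1-based residue position of the terminus
    kind: str          # "N" or "C"
    method: str        # hypothetical evidence label, e.g. "in vivo"
    confidence: float  # hypothetical score between 0 and 1


@dataclass(frozen=True)
class Feature:
    name: str
    start: int  # first residue of the feature (1-based, inclusive)
    end: int    # last residue of the feature (inclusive)


def filter_termini(termini, method=None, min_confidence=0.0):
    """Keep termini matching the evidence method and confidence threshold."""
    return [
        t for t in termini
        if (method is None or t.method == method)
        and t.confidence >= min_confidence
    ]


def features_hit(terminus, features):
    """Return names of features whose residue range contains the terminus."""
    return [f.name for f in features if f.start <= terminus.position <= f.end]


# Invented example data for a single protein.
termini = [
    Terminus(1, "N", "in silico", 0.5),
    Terminus(45, "N", "in vivo", 0.9),
    Terminus(120, "C", "in vitro", 0.4),
]
features = [Feature("signal peptide", 1, 23), Feature("catalytic domain", 30, 150)]

selected = filter_termini(termini, method="in vivo", min_confidence=0.8)
for t in selected:
    print(t.position, t.kind, features_hit(t, features))
```

Filtering first and matching afterwards keeps the two concerns separate, so the same positional check can be reused for any evidence subset.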

References

  1. Lange, P. F.; Huesgen, P. F.; Overall, C. M. (2011). "TopFIND 2.0--linking protein termini with proteolytic processing and modifications altering protein function". Nucleic Acids Research. 40 (Database issue): D351–D361. doi:10.1093/nar/gkr1025. PMC 3244998. PMID 22102574.
  2. Lange, P. F.; Overall, C. M. (2011). "TopFIND, a knowledgebase linking protein termini with function". Nature Methods. 8 (9): 703–704. doi:10.1038/nmeth.1669. PMID 21822272. S2CID 7195106.
  3. Cox JH, Starr AE, Kappelhoff R, Yan R, Roberts CR, Overall CM (December 2010). "Matrix metalloproteinase 8 deficiency in mice exacerbates inflammatory arthritis through delayed neutrophil apoptosis and reduced caspase 11 expression". Arthritis & Rheumatism. 62 (12): 3645–3655. doi:10.1002/art.27757. PMID 21120997.
  4. Overall CM, Kleifeld O (March 2006). "Tumour microenvironment - opinion: validating matrix metalloproteinases as drug targets and anti-targets for cancer therapy". Nature Reviews Cancer. 6 (3): 227–239. doi:10.1038/nrc1821. PMID 16498445. S2CID 21114447.
  5. Fortelny, Nikolaus; Cox, Jennifer H.; Kappelhoff, Reinhild; Starr, Amanda E.; Lange, Philipp F.; Pavlidis, Paul; Overall, Christopher M. (2014). "Network analyses reveal pervasive functional regulation between proteases in the human protease web". PLOS Biology. 12 (5): e1001869. doi:10.1371/journal.pbio.1001869. PMC 4035269. PMID 24865846.
  6. Fortelny, Nikolaus; Butler, Georgina S.; Overall, Christopher M.; Pavlidis, Paul (2017). "Protease-Inhibitor Interaction Predictions: Lessons on the Complexity of Protein-Protein Interactions". Molecular & Cellular Proteomics. 16 (6): 1038–1051. doi:10.1074/mcp.M116.065706. PMC 5461536. PMID 28385878.
  7. Huesgen, Pitter F.; Lange, Philipp F.; Overall, Christopher M. (2014). "Ensembles of protein termini and specific proteolytic signatures as candidate biomarkers of disease". Proteomics: Clinical Applications. 8 (5–6): 338–350. doi:10.1002/prca.201300104. PMID 24497460. S2CID 24591183.
  8. Lange, Philipp F.; Overall, Christopher M. (2013). "Protein TAILS: when termini tell tales of proteolysis and function". Current Opinion in Chemical Biology. 17 (1): 73–82. doi:10.1016/j.cbpa.2012.11.025. PMID 23298954.
  9. Lange, Philipp F.; Huesgen, Pitter F.; Nguyen, Karen; Overall, Christopher M. (2014). "Annotating N termini for the human proteome project: N termini and Nalpha-acetylation status differentiate stable cleaved protein species from degradation remnants in the human erythrocyte proteome". Journal of Proteome Research. 13 (4): 2028–2044. doi:10.1021/pr401191w. PMC 3979129. PMID 24555563.
  10. Fortelny, Nikolaus; Yang, Sharon; Pavlidis, Paul; Lange, Philipp F.; Overall, Christopher M. (2015). "Proteome TopFIND 3.0 with TopFINDer and PathFINDer: database and analysis tools for the association of protein termini to pre- and post-translational events". Nucleic Acids Research. 43 (Database issue): D290–D297. doi:10.1093/nar/gku1012. PMC 4383881. PMID 25332401.

External links

  • TopFIND - main website and web interface
  • Host institution website
  • Research group of Philipp Lange - inventor & core developer
  • Research Group of Christopher Overall - home of TopFIND
  • MEROPS - the peptidase database
  • UniProt
  • Proteases at the U.S. National Library of Medicine Medical Subject Headings (MeSH)