Genome browser

Summary

The completion of the human genome sequence in the early 2000s was a turning point in genomics research.[1] Scientists have since conducted extensive research into the activity of individual genes and of the genome as a whole. The human genome contains around 3 billion nucleotide base pairs, and the sheer quantity of data requires accessible tools to explore and interpret it in order to investigate the genetic basis of disease, evolution, and biological processes.[2] The field of genomics has continued to grow, with new sequencing technologies and computational tools making it easier to study the genome.

The genome browser is an important tool for studying the genome. In bioinformatics, a genome browser is a graphical interface for displaying information from a biological database of genomic data.[2] Genome browsers enable users to visualize and browse entire genomes with annotated data, including gene prediction, gene structure, protein, expression, regulation, variation, and comparative analysis. The annotated data usually come from multiple diverse sources. Genome browsers differ from ordinary biological databases in that they display data graphically, with genome coordinates on one axis and annotations or space-filling graphics showing analyses of the genes, such as the frequency of the genes and their expression profiles.[1] The software allows users to navigate the genome, view numerous features, and analyze the relationships between genomic elements.

History

One of the first genome browsers, the Ensembl Genome Browser, was developed as part of the Human Genome Project by a group of researchers from the European Bioinformatics Institute (EBI). It was created to provide a complete resource for the human genome sequence, with a focus on gene annotation, and offers a user-friendly interface for exploring the genomes of humans and other organisms. Several more genome browsers have since been created, including the UCSC Genome Browser, developed in 2000 by Jim Kent and David Haussler, and the NCBI's Genome Data Viewer.[2][3]

Some genome browsers support multiple genomes, whereas others are specific to a particular species. Browsers may provide summaries of data from genomic databases and comparative assessments of genetic sequences across multiple species, and allow the data to be visualized in various ways to facilitate the assessment and interpretation of these complex data.[4][5]

Characteristics of a Browser

Genome Assembly and Annotation: Give access to the reference genome assembly, which serves as a framework for overlaying and analyzing other genomic data, and include gene annotations that provide information about gene locations, transcripts, and functional elements. No single browser is considered the "best" for genome annotation and assembly, as the choice depends on the needs of the user and the type of analysis being performed. The Integrative Genomics Viewer (IGV) is a popular browser for visualizing and annotating genomic data, including genomic variation, gene expression, and chromatin structure. It supports a wide range of file formats and provides advanced tools for data analysis.

Data Overlay and Integration: Allow users to overlay and integrate diverse genomic data types, such as DNA sequencing data, gene expression data, and epigenetic data, onto the reference genome. This enables researchers to study relationships between different genomic features and datasets. The most suitable browser for this task depends on the analysis and the preferences of the user; one popular option is the Integrative Genomics Viewer (IGV), which supports various file formats and data types, including genomic variation, gene expression, and chromatin structure data.

Visualization Tools: Offer tools for displaying genomic data in various formats, such as heatmaps, line plots, bar plots, and genomic tracks. These tools facilitate the exploration and interpretation of complex genomic data. The UCSC Genome Browser is a popular and comprehensive genome browser that offers a wide range of visualization tools for genomic data such as genetic variation, gene expression, and epigenetic modifications. It also provides access to numerous publicly available datasets for comparative genomics research.[6]

Zooming and Navigation: Provide zooming and navigation tools that allow users to explore genomic data at different scales, from the whole genome down to individual nucleotides. This makes it easier to navigate to and focus on specific genomic regions of interest. The UCSC Genome Browser is again well suited to navigation, and the NCBI browser shown in Figure 1 also has a logical navigation scheme and user interface.

Search and Retrieval: Include search and retrieval features that allow users to search for specific genes, genomic regions, or functional elements. This simplifies locating and retrieving relevant genomic data for analysis. The NCBI browser[7] is a valuable tool for genomics research because of its extensive database, user-friendly interface, and integration with other NCBI tools. It provides access to a large and diverse set of biological databases, including GenBank, and its advanced search options allow for more efficient searches.

Comparative Genomics: Some genome browsers include features for comparing and analyzing genomic data from different species or strains. This enables researchers to study evolutionary relationships, identify conserved regions, and compare gene orthologs. Ensembl[8] offers advanced comparative genomics tools, including the ability to compare gene structures, genome alignments, and synteny between different organisms.

Customization and Annotation: Allow users to customize the display of genomic data by adding their own annotations, tracks, or visualizations. This enables researchers to tailor the browser interface to their specific research needs and hypotheses.

Data Sharing and Collaboration: Contain features for data sharing and collaboration, such as the ability to share browser sessions, save customizations, or collaborate with other researchers in real time. This promotes collaboration and data sharing among researchers. GMOD (Generic Model Organism Database) is a collection of open-source tools for building and sharing genome databases. It provides a framework for integrating genomic data with other biological data types, such as proteomics and metabolomics, and allows data and analyses to be shared with collaborators.

Analysis Tools: Some browsers provide analysis tools, such as tools for identifying differentially expressed genes, predicting functional elements, or performing other computational analyses on the genomic data directly within the browser environment.
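The custom annotations described under Customization and Annotation are often supplied to a browser as plain-text track files. As a minimal sketch, the Python below formats user-defined features as lines in the widely used BED format, which stores a chromosome name, a zero-based start, an exclusive end, and optional name, score, and strand columns. The `Feature` type, the `to_bed_line` function, and the example features are hypothetical and serve only as illustration; a given browser may expect additional header lines or a different format.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A user-defined annotation in 1-based, inclusive coordinates."""
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str
    strand: str = "+"


def to_bed_line(feature: Feature, score: int = 0) -> str:
    """Format one feature as a six-column, tab-separated BED line.

    BED uses a zero-based start and an exclusive end, so only the
    start coordinate is shifted by one.
    """
    if feature.start < 1 or feature.end < feature.start:
        raise ValueError(f"invalid interval for {feature.name}")
    fields = [
        feature.chrom,
        str(feature.start - 1),
        str(feature.end),
        feature.name,
        str(score),
        feature.strand,
    ]
    return "\t".join(fields)


# Hypothetical example features, not real annotations.
features = [
    Feature("chr1", 1001, 2000, "candidate_enhancer"),
    Feature("chr1", 5001, 5600, "my_gene_model", "-"),
]
bed_text = "\n".join(to_bed_line(f) for f in features)
print(bed_text)
```

Keeping the coordinate conversion in one function is a deliberate choice here: mixing 1-based and zero-based conventions is a common source of off-by-one errors when custom data are overlaid on a reference assembly.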

 

Figure 1. The two images show the features and inputs of the NCBI genome browser, one of many such browsers. The right image displays a region of chromosome 1 (Chr1) of the human genome. The box at the bottom highlighted in red shows customizable options such as BLAST, track by accession, assembly details, history, and tracks/user data. These features can differ between genome browsers.

Features and Functionality

The genome browser displays the genome as a series of tracks or layers that can be toggled on or off according to the needs of the user. Each track represents a distinct type of genomic feature, such as genes, transcripts, regulatory regions, or sequence variations. The user can zoom in and out of a genomic region to view different levels of detail or additional information, and can navigate to specific regions using a search function or by clicking on a specific feature.
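The track model described above can be sketched in a few lines of Python. This is an illustrative data structure rather than the implementation of any particular browser: the class names `Feature`, `Track`, and `Viewport`, the function `visible_features`, and the example coordinates are all assumptions made for this sketch. Each track can be switched on or off, the viewport can be widened or narrowed around its centre, and only features overlapping the current window on visible tracks are reported.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass
class Track:
    name: str
    features: List[Feature] = field(default_factory=list)
    visible: bool = True  # toggled by the user


@dataclass
class Viewport:
    chrom_length: int
    start: int
    end: int

    def zoom(self, factor: float) -> None:
        """Rescale the window around its centre.

        A factor above 1 widens the window (zoom out); a factor
        below 1 narrows it (zoom in). The window is clamped to
        the chromosome.
        """
        centre = (self.start + self.end) // 2
        half = max(1, int((self.end - self.start + 1) * factor) // 2)
        self.start = max(1, centre - half)
        self.end = min(self.chrom_length, centre + half)


def visible_features(tracks: List[Track], view: Viewport) -> Dict[str, List[str]]:
    """Map each visible track to the features overlapping the window."""
    result: Dict[str, List[str]] = {}
    for track in tracks:
        if not track.visible:
            continue
        result[track.name] = [
            f.name for f in track.features
            if f.start <= view.end and f.end >= view.start
        ]
    return result


# Hypothetical example data.
tracks = [
    Track("genes", [Feature("geneA", 100, 500), Feature("geneB", 900, 1500)]),
    Track("variants", [Feature("var1", 450, 450)]),
]
view = Viewport(chrom_length=10_000, start=400, end=600)
print(visible_features(tracks, view))

tracks[1].visible = False  # hide the variants track
view.zoom(4)               # zoom out to a wider window
print(visible_features(tracks, view))
```

The linear scan over features is kept for clarity; a browser handling whole-genome data would normally use an indexed structure such as an interval tree instead.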

Aside from gene annotations, genome browsers can display a variety of different data types, such as:

DNA Sequence: This can be shown as a single linear track or as several tracks, with different colors signifying distinct features (for example, exons, introns, and repeats).

Variation Data: This includes information on single-nucleotide polymorphisms (SNPs), insertions/deletions (indels), and structural variants.

Transcriptomics: This contains information on gene expression levels, alternative splicing, and non-coding RNAs.

Proteomics: This includes information on protein expression levels, post-translational modifications, and protein-protein interactions.
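To illustrate the variant categories listed under Variation Data, the following Python sketch sorts a variant into one of them by comparing the lengths of its reference and alternate alleles. The function name `classify_variant` and the `sv_min_length` threshold are assumptions made for this example; a cutoff of 50 base pairs is a common convention for calling a length change a structural variant, but tools and databases differ, and real structural variants are often described in ways that plain allele strings do not capture.

```python
def classify_variant(ref: str, alt: str, sv_min_length: int = 50) -> str:
    """Classify a variant from its reference and alternate alleles.

    sv_min_length is an assumed cutoff (in base pairs) above which a
    length change is reported as a structural variant.
    """
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt:
        raise ValueError("both alleles must be non-empty")
    if ref == alt:
        raise ValueError("alleles are identical, so this is not a variant")

    length_change = abs(len(alt) - len(ref))
    if length_change >= sv_min_length:
        return "structural variant"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    # Same length, more than one base changed.
    return "multi-nucleotide substitution"


# Hypothetical example alleles.
for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("A", "A" * 60)]:
    print(ref, alt[:10], classify_variant(ref, alt))
```

A browser would typically use a classification like this to choose how a variant is drawn, for example a single tick for a SNP and a bar spanning the affected region for a deletion.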

Applications

Genome browsers are used in a variety of research fields, including bioinformatics, genetics, and clinical genomics. They allow researchers to investigate the genetic basis of disease, evolution, and other biological processes. Examples of their use include:

Evolutionary Biology: Genome browsers are used to study and compare the genomes of various organisms to identify similarities and differences in gene structure, regulatory elements, function, and repetitive sequences. This can provide insight into the evolutionary relationships between species and help identify the genetic alterations that underpin adaptation and speciation.

Clinical Genomics: Genome browsers are used to study the genetic basis of disease. By examining the genome of a patient, researchers can identify genetic mutations that may be responsible for the disease. By visualizing these mutations in the context of the genome, researchers can investigate their possible impact on gene expression and protein function.

References

  1. Wang Ziling; Zhang Lishu (2 July 2018). Essential Computing Skills for Biologists. World Scientific Publishing Company. pp. 20–29. ISBN 978-1-84816-926-5.
  2. Jun Wang; Lei Kong; Ge Gao; Jingchu Luo (March 2013). "A brief introduction to web-based genome browsers". Briefings in Bioinformatics. 14 (2): 131–143. doi:10.1093/bib/bbs029. PMID 22764121.
  3. Michael Speicher; Stylianos E. Antonarakis; Arno G. Motulsky, eds. (2010). "Databases and Genome Browsers". Vogel and Motulsky's Human Genetics: Problems and Approaches (4th ed.). Springer. pp. 905–920. ISBN 978-3-540-37653-8.
  4. Jonathan Pevsner (26 October 2015). Bioinformatics and Functional Genomics (3rd ed.). Wiley. pp. 50–52. ISBN 978-1-118-58178-0.
  5. Joel T. Dudley; Konrad J. Karczewski (3 January 2013). Exploring Personal Genomics. pp. 64–72. ISBN 978-0-19-964448-3.
  6. "UCSC Genome Browser Home". genome.ucsc.edu. Retrieved 2023-05-03.
  7. "National Center for Biotechnology Information". www.ncbi.nlm.nih.gov. Retrieved 2023-05-03.
  8. "Ensembl genome browser 109". useast.ensembl.org. Retrieved 2023-05-03.