National Center for Biotechnology Information (NCBI) - Bioinformatics
October 17, 2023
OBJECTIVE
- To understand how the National Center for Biotechnology Information (NCBI) operates.
- To explore the resources available at the National Center for Biotechnology Information.
INTRODUCTION
With the advancement of computer science and medicine over the past decades, researchers began gathering data and scientific knowledge with the help of computational tools. This made it easier to store information and to distribute it around the globe. The goal of modern molecular biology is to decipher nature’s cryptic but eloquent language of living cells. The grammar of life processes comes from a four-letter alphabet representing the chemical subunits of DNA, the most sophisticated representation of which is man. The science of molecular biology is focused on unravelling and using this “alphabet” to create new “words and phrases.” The massive amount of molecular data, as well as its cryptic and delicate patterns, has necessitated the use of electronic databases and analysis tools.
The National Center for Biotechnology Information (NCBI) was formed in 1988 as a division of the National Library of Medicine (NLM) at the National Institutes of Health (NIH) and is based in Bethesda, Maryland. It was founded and approved by the United States federal government.
The NCBI holds several databases pertinent to biotechnology and biomedicine, as well as bioinformatics tools and services. GenBank, a database for DNA sequences, and PubMed, a bibliographic database for biomedical literature, are among the major databases of NCBI. The NCBI Epigenomics database is one of the several datasets available. The Entrez search engine makes all these databases available online. David Lipman, one of the original creators of the BLAST sequence alignment tool and a well-known figure in bioinformatics, was the director of NCBI.
The objective of the National Center for Biotechnology Information (NCBI) is to create new information technologies to aid in the understanding of fundamental molecular and genetic mechanisms that affect health and illness. The NCBI has been tasked with developing automated systems for storing and analysing knowledge about molecular biology, biochemistry, and genetics; facilitating the use of such databases and software by the research and medical communities; coordinating national and international efforts to gather biotechnology information; and conducting research into advanced methods of storing and analysing knowledge.
To achieve this objective, the NCBI performs research on fundamental biomedical problems at the molecular level using mathematical and computational tools. It maintains connections with various NIH institutes, academia, industry, and other governmental agencies, and supports seminars, workshops, and lecture series to promote scientific communication. Through the NIH Intramural Research Program, it supports the training of postdoctoral fellows in basic and applied computational biology research. In addition, through the Scientific Visitors Program, the NCBI engages members of the international scientific community in informatics research and teaching.
Thus, in the first practical session, students will be introduced to the uses and functions of the National Center for Biotechnology Information (NCBI) and will explore the various databases it hosts.
OBJECTIVE
- Explore the information stored in the National Center for Biotechnology Information (NCBI).
- Learn how to refine searches using the NCBI system to retrieve a particular mRNA record
- Retrieve literature and protein records associated with the mRNA record
- Identify conserved domains within a protein
- Identify similar proteins
- Identify known mutations within the gene or protein
- Work with NCBI Gene database
RESULTS
PICTURE | DESCRIPTION |
CONTROLLED VOCABULARIES AND LIMITS | The screenshot shows the main page of the latest version of the National Center for Biotechnology Information (NCBI) website. |
| The screenshot shows the results returned for the term “cancer”, listed across 30 databases, after being redirected from the NCBI main page. |
| This is an example of the “gene” database page for the search “cancer colon”. |
| This is an example of the “protein” database page for the search “cancer colon”. |
| This is an example of the “MeSH” database page for the search “cancer colon”. |
| The screenshot shows the main page of the PubMed database. |
| This screenshot shows the PubChem interface after using the advanced search. |
| This is the PubChem interface after clicking “advanced search”, which redirects the user to a new page where they can search for a specific title. |
| This is an example of a nucleotide search using the advanced search interface. With this interface, users can locate specific database records and narrow down their search within NCBI, which contains hundreds of thousands of different records. |
| This interface shows the data available for the different articles returned after using the advanced search interface. |
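The advanced search pages described above combine search terms with field tags and Boolean operators. As a minimal sketch of how such a fielded query string might be assembled, assuming the usual Entrez `term[Field]` syntax, the following can serve as an illustration. The helper names `fielded` and `build_query` are hypothetical and are introduced only for this example; nothing here contacts NCBI.

```python
def fielded(term: str, field: str) -> str:
    """Return an Entrez-style fielded term such as "colon cancer"[Title]."""
    # Quote multi-word terms so that they are searched as a phrase.
    text = f'"{term}"' if " " in term else term
    return f"{text}[{field}]"


def build_query(*clauses: str, operator: str = "AND") -> str:
    """Join fielded clauses with one Boolean operator (AND, OR or NOT)."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {operator} ".join(clauses)


# Illustrative values only: restrict "colon cancer" to titles and to human records.
query = build_query(
    fielded("colon cancer", "Title"),
    fielded("Homo sapiens", "Organism"),
)
print(query)
```

The resulting string can be pasted into the search box of the relevant database, which is one way to reproduce an advanced search without clicking through the builder each time.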
LIMITING A NUCLEOTIDE SEARCH
| The drop-down menu to the left of the search box has been changed from all databases to nucleotide only, restricting the search to that NCBI database, with the term “colon cancer” entered in the search box. The right side of the page shows the search details and terms. The search retrieves thousands of different records related to the term entered in the search box. |
Figure 1 Figure 1.1 | Figure 1 shows the “limit” tab, which opens on a new webpage. Using the “limit” interface, users can search for the specific database records associated with their search term. For example, an unrestricted “colon cancer” search yields more than 100,000 records, but the “limit” tab reduces the number of records returned. The “limit” tab works much like the advanced search interface. Figure 1.1 shows how a user can refine the results by setting the necessary requirements in the “limit” tab. For instance, under the “molecule” drop-down menu, the user can select the molecule type relevant to their search, such as mRNA. |
| In this screenshot, the user has been redirected to a new page after clicking the search button in the “limit” tab. In this interface, the user obtains more specific results from the NCBI databases. |
Figure 1.1 Figure 1.2 | Figure 1.1 shows the advanced search interface after using the “limit” tab. With this interface, users can navigate the NCBI databases more precisely and narrow down their search even further. Figure 1.2 shows the results after applying the advanced search in figure 1.1, with the nucleotide record NM_000249.3 selected in the first row of the list, in its default display format. |
VIEWING AN INDIVIDUAL DATABASE RECORD Figure 3.1 Figure 3.2a Figure 3.2b Figure 3.3 | Figure 3.1 shows the default display, which provides the locus name, definition, references, and so forth. Users can change the display by clicking the “FASTA” or “GRAPHIC” button. In addition, users can click the “go to” button to jump directly to a different section of the record. Figure 3.2a shows the “FASTA” format and figure 3.2b shows the graphic format for the selected gene, with the exon structure displayed together with the protein and protein domain content. Figure 3.3 shows how users can save or download the file to their own internal or external storage for future reference. |
Figure 4.1a Figure 4.1b Figure 4.2 | Figure 4.1a shows the series of short menus on the right side of the webpage, which can be expanded or collapsed at the user’s convenience. These menus help the user move easily between pages related to their last or current search. Figure 4.1b shows the expanded short menu for “articles about the MLH1 gene”. Figure 4.2 shows the PubMed webpage that opens upon clicking any of the articles listed in that short menu. |
RETRIEVING RELATED PROTEIN RECORDS Figure 5.1a Figure 5.1b Figure 5.2 | This interface shows the protein record, which users can access using the search tool located beside the search bar, as shown in figure 5.1b. Figure 5.2 shows how users can retrieve similar protein sequences with BLAST through the “related sequences” link, found in the short menu on the right side of the webpage under the “related information” drop-down menu. |
IDENTIFY CONSERVED DOMAINS Figure 6.1 Figure 6.2 | Figure 6.1 shows the main page of the Conserved Domain Database (CDD). To identify conserved domains, the user opens the “analyse this sequence” short menu on the right side of the webpage and clicks the “identify conserved domains” link. Using this interface, users can examine a protein’s functions as well as its domain organization, as shown in figure 6.2. |
| This figure shows the “conserved domain database” webpage after the user has clicked the “identify conserved domains” link in the short menu. The page shows the HATPase and DNA mismatch repair domains. |
Figure 7.1 Figure 7.2 | The figure shows the new webpage that opens after the user clicks the blue or green box labelled mutl and Mlh1_C respectively, as shown in figure 7.2. |
NCBI GENE DATABASE | The figure shows the main page of the “gene” database after being redirected from the NCBI main page. The search for “colon cancer” returns more than 10,000 records as of September 2021. |
Figure 8.1 Figure 8.2 | Figure 8.1 shows the human MLH1 gene record after it was selected from the “gene” database results page. This part shows the summary drop-down menu containing the summary details for the selected gene record. To look up a more specific gene, the user can go to “The Human Genome Organization” web page by clicking the link next to the primary source under the summary drop-down menu, as shown in figure 8.2. |
| This figure shows the main page of “The Human Genome Organization”, or simply HUGO, after being redirected from the “gene” interface. Here the user can browse thousands of different records to find a more specific gene record. |
| The figure shows the “bibliography” section of the “gene” record, which contains many useful links associated with the information the user is looking for. |
| This figure shows an example of an article from PubMed after being redirected from the “gene” record page. |
| This figure shows that the user can select any of the options in the drop-down menu, each of which contains information related to the selected gene. |
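The workflow recorded in the table (limit a nucleotide search to mRNA, view the record in FASTA format, then follow links to related protein records) was carried out in the browser. As a rough programmatic counterpart, the sketch below builds the equivalent request URLs for NCBI's E-utilities web service. It is not the method used in this practical. It assumes the public E-utilities base URL and the `esearch`, `efetch` and `elink` endpoints; the `biomol_mrna[PROP]` filter and the accession NM_000249.3 are illustrative values, and the helper function names are hypothetical. The code only constructs URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities (assumed here, not taken from the report).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that searches one Entrez database for a term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(db: str, accession: str) -> str:
    """Build an EFetch URL that requests one record as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def elink_url(dbfrom: str, db: str, uid: str) -> str:
    """Build an ELink URL that asks for records in `db` linked to one record in `dbfrom`."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"


# Step 1: limit the "colon cancer" nucleotide search to mRNA records.
search = esearch_url("nuccore", "colon cancer AND biomol_mrna[PROP]")

# Step 2: request the selected mRNA record in FASTA format.
fasta = efetch_fasta_url("nuccore", "NM_000249.3")

# Step 3: ask for protein records linked to that nucleotide record.
proteins = elink_url("nuccore", "protein", "NM_000249.3")

for url in (search, fasta, proteins):
    print(url)
```

Each printed URL can be opened in a browser or passed to an HTTP client. Keeping URL construction separate from the network call makes the query parameters easy to inspect before any request is sent.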
CONCLUSIONS
In conclusion, all the objectives of this practical session have been achieved. The students can navigate the information stored in the National Center for Biotechnology Information (NCBI) and were introduced to the fundamental concepts of how information is stored there. It was also learned that NCBI provides an integrated search system across its databases, and that its Taxonomy resource includes organism names and classifications for every sequence in the International Nucleotide Sequence Database Collaboration’s nucleotide and protein sequence databases. The NCBI is integral to the advancement of modern medicine, and it is hoped that, as newer computer technologies are applied to the life sciences, humanity will be better able to treat and heal disease.