Rummagene brings to the surface a massive collection of published biomedical datasets that are currently buried and inaccessible

Rummagene is a biomedical research search engine providing access to hundreds of thousands of mammalian gene sets mined from the supporting materials of research publications listed on PubMed Central. Rummagene is freely and openly available at: https://rummagene.com.

Although basic biomedical research is advancing rapidly, the way results from research studies are communicated has remained almost the same. Research results are still reported in manuscripts that follow the structure of physically printed research papers, with sections that describe the background and motivation, methods, results, discussion, and conclusions. Published research manuscripts also contain citations to prior work, and static figures and tables that display the results, accompanied by figure and table legends. This approach to communicating results from biomedical research projects is effective. It enables the biomedical research community to build new layers of understanding upon the body of human knowledge. Every new research article is a building block, a "brick" in the house of human collective knowledge. Some bricks cost more to produce, and some are more valuable to the body of knowledge. Specifically, the publications that introduce groundbreaking discoveries, present a new idea or a theory that can explain many observations, or offer a path to an innovative technology are the most valuable.

At the same time, search engines such as PubMed, PubMed Central, Web of Science, and Google Scholar are critical in facilitating rapid identification of, and immediate access to, the growing corpus of published biomedical research manuscripts. These search engines enable the community of biomedical researchers to sift through the body of knowledge, ensuring that new work builds upon the most relevant existing knowledge. To make this possible, the databases behind the search engines use bots to index the text of published research papers. However, most research publications also include supplementary data tables, and those are currently skipped by the indexing bots. These tables are not easily indexable for search because the standards that journals require authors to follow when annotating, storing, and sharing these important additional datasets are non-uniform or nonexistent.

In the past 30 years, pharmacological, genetic, molecular, and cellular biomedical research has mainly focused on studying the function of single genes and proteins. However, as omics technologies increase in availability, accessibility, diversity, and accuracy, biomedical research is shifting toward studying gene and protein modules. These modules are often not well defined. They can represent different things, for example, pathways, macromolecular complexes, targets of transcription factors, differentially expressed genes, sets of genes identified as related to a phenotype such as a human disease or a knockout mouse phenotype, and more. These gene-set modules differ in size; small modules can be part of larger modules; and genes can be members of multiple modules. Despite this complexity, abstracting biomedical knowledge into gene sets provides a unifying strategy to mine, index, and reuse information at the data level. For many published biomedical research studies, such gene-set knowledge does not fit well within the confines of a standard research article, and as a result, the discovered gene sets are reported in the supplementary materials of these publications.

Rummagene is a new biomedical research search engine that provides access to hundreds of thousands of such mammalian gene sets. Updated weekly, the Rummagene indexing bot and search engine make it possible to search a massive collection of published biomedical datasets that are currently buried and inaccessible. In addition, the collection of gene sets mined by Rummagene is a valuable dataset in its own right. By examining gene-gene co-occurrence in Rummagene-mined gene sets, gene function can be predicted, and novel gene modules can be discovered. Rummagene, which is freely and openly available at https://rummagene.com, can help researchers form novel hypotheses and explain the collective function of discovered gene sets, adding another dimension to standard gene set enrichment analysis.
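
To make the idea of gene-gene co-occurrence concrete, here is a minimal Python sketch. It is an illustration only, not Rummagene's actual method or API: the function names (`cooccurrence_counts`, `top_partners`) and the toy gene sets are assumptions invented for this example. It simply counts how many gene sets contain each pair of genes, then lists the genes that most often appear alongside a gene of interest.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(gene_sets):
    """Count, for every unordered gene pair, how many gene sets contain both genes.

    `gene_sets` maps a set label to a list of gene symbols (hypothetical input format).
    """
    pair_counts = Counter()
    for genes in gene_sets.values():
        # Deduplicate and sort so each pair is stored in one canonical order.
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            pair_counts[(gene_a, gene_b)] += 1
    return pair_counts


def top_partners(gene, pair_counts, n=3):
    """Return up to `n` genes that co-occur most often with `gene`."""
    partners = Counter()
    for (gene_a, gene_b), count in pair_counts.items():
        if gene_a == gene:
            partners[gene_b] = count
        elif gene_b == gene:
            partners[gene_a] = count
    return partners.most_common(n)


# Toy data standing in for gene sets mined from supplementary tables.
toy_gene_sets = {
    "table_1": ["STAT3", "JAK2", "IL6"],
    "table_2": ["STAT3", "JAK2", "SOCS3"],
    "table_3": ["TP53", "MDM2", "STAT3"],
}

counts = cooccurrence_counts(toy_gene_sets)
print(top_partners("STAT3", counts, n=1))
```

In this toy input, JAK2 shares two gene sets with STAT3 and every other gene shares one, so JAK2 ranks first. A real analysis over hundreds of thousands of gene sets would likely need to normalize raw counts, for example by how often each gene appears overall, before treating co-occurrence as evidence of shared function.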
