Computable sugars: some computational resources in glycoscience

Glycoscience is sweet science (PhotoDisc/Getty Images)

As glycoscience advances, labs will increasingly want to ask questions about glycosylation sites on a protein or the structure of a sugar, says Raja Mazumder, a bioinformatician at George Washington University. They might ask, for example: Are there glycosyltransferases that are expressed in the liver but not in the heart? Or which ones are overexpressed by a factor of three in more than two cancers? Such questions require infrastructure building, he says, because right now there is no mechanism to allow such queries. But he and others are building such capabilities: Mazumder and William York at the University of Georgia are starting to build a glycoscience informatics portal.
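To make the two example questions concrete, here is a minimal Python sketch of how they could be phrased as queries once the data are in one place. Everything in it is an assumption for illustration: the gene labels (GT_A, GT_B, GT_C), the tissue and cancer names, the numbers and the function names are invented and do not come from the planned portal or any real dataset.

```python
from collections import defaultdict

# Invented toy data: which placeholder glycosyltransferases are expressed where.
EXPRESSED_IN = {
    "liver": {"GT_A", "GT_B", "GT_C"},
    "heart": {"GT_B"},
}

# Invented toy data: cancer type -> {gene: fold change versus matched normal tissue}.
FOLD_CHANGE = {
    "cancer_1": {"GT_A": 3.5, "GT_B": 1.2, "GT_C": 4.0},
    "cancer_2": {"GT_A": 3.1, "GT_B": 0.9, "GT_C": 1.1},
    "cancer_3": {"GT_A": 5.0, "GT_B": 3.2, "GT_C": 3.3},
}


def expressed_in_but_not(present, absent):
    """Genes expressed in one tissue and not in another (a set difference)."""
    return EXPRESSED_IN[present] - EXPRESSED_IN[absent]


def overexpressed_in_more_than(min_fold, min_cancers):
    """Genes with fold change >= min_fold in more than min_cancers cancer types."""
    counts = defaultdict(int)
    for per_gene in FOLD_CHANGE.values():
        for gene, fold in per_gene.items():
            if fold >= min_fold:
                counts[gene] += 1
    return {gene for gene, n in counts.items() if n > min_cancers}


liver_not_heart = expressed_in_but_not("liver", "heart")
broadly_overexpressed = overexpressed_in_more_than(3.0, 2)
print(sorted(liver_not_heart), sorted(broadly_overexpressed))
```

The queries themselves are simple. The hard part, and the reason infrastructure is needed, is getting expression, tissue and disease data into a shared, consistently labeled form so that such set operations are possible at all.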

Mazumder wants to leverage existing ontologies from the developer community to build systems that can be queried on a large scale. For example, he is working with Cathy Wu at Georgetown University, who is developing the Protein Ontology. Such ontologies are collected, for example, by the non-profit OBO Foundry. To allow flexible querying, the computational resources will draw on different ontologies: ones that relate to glycans, genes, proteins, tissues, diseases and more.

Ontologies are part of the team’s effort to build application programming interfaces (APIs) that expose the data in a given database to incoming queries. Given how complex sugars are, the informatics framework has to be well organized for both human and machine-based querying, says Mazumder.

When using the resource, a researcher will receive results that also document the search process itself, such as the version of the queried database. “You need to be able to tell where you got that information from,” says Mazumder. Tracking data provenance matters especially in an age when databases continuously integrate information emerging in the literature.

For the Food and Drug Administration, Mazumder is developing computational standards for high-throughput sequencing, which he also wants to apply to glycoscience. His ‘biocompute object’ captures the computational workflow a lab used to generate its results: the software used, the databases queried and their versions, and identifiers of data inputs and outputs. These biocompute objects are intended to help regulatory scientists interpret submitted work. They can also help scientists generally see if, for example, the version of software they used worked as it should, says Mazumder.
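As a rough illustration of the kind of record described here, the following Python sketch bundles software, database versions, and input and output identifiers into one serializable object. It is not the actual biocompute object format: the class names, field names, tool names, versions and identifiers are all hypothetical and chosen only to mirror the four items listed in the text.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class VersionedResource:
    """A piece of software or a database, pinned to the version that was used."""
    name: str
    version: str


@dataclass
class WorkflowRecord:
    """Hypothetical stand-in for a workflow record; not the real schema."""
    software: List[VersionedResource]
    databases: List[VersionedResource]
    input_ids: List[str]
    output_ids: List[str]

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses inside the lists.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# All names, versions and identifiers below are invented placeholders.
record = WorkflowRecord(
    software=[VersionedResource("example-site-caller", "1.4.2")],
    databases=[VersionedResource("example-glycan-db", "2016-03")],
    input_ids=["input:sample-001"],
    output_ids=["output:sites-001"],
)

serialized = record.to_json()
restored = json.loads(serialized)
print(serialized)
```

Because the record is plain JSON, it can travel alongside the results it describes, and a reader can check which database version a finding came from without having to ask the lab.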

Too often labs use computational tools without benchmarking them, says Mazumder. “It would be unthinkable for a wet-lab scientist to not have a positive and negative control,” he says. In informatics, developers benchmark their software but users often do not have these habits. “They don’t even know: if I don’t find anything, is it because my software did not run well or not?”
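The wet-lab habit of running controls carries over directly to software. The Python sketch below shows the idea with a toy tool: a finder for the canonical N-glycosylation sequon (N-X-S/T, where X is any residue except proline). The finder merely stands in for whatever tool a lab actually uses, and the control sequences and function names are made up for illustration.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 0-based positions of the Asn in each N-X-S/T sequon."""
    return [match.start() for match in SEQUON.finditer(sequence.upper())]


def controls_pass():
    """Run one positive and one negative control before trusting real results."""
    positive = find_sequons("AANGTAA")  # contains N-G-T, so a hit is expected
    negative = find_sequons("AANPTAA")  # proline at X, so no hit is expected
    return positive == [2] and negative == []


query_hits = find_sequons("MKLAAQWERTY")  # made-up sequence with no sequon
if controls_pass():
    print("Controls passed; hits on query:", query_hits)
else:
    print("Controls failed; an empty result cannot be interpreted.")
```

With the controls in place, an empty result on the query can be read as "nothing found" and not as "the software did not run properly".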

As labs move to big data analysis in genomics and also, eventually, in glycoscience, benchmarking becomes ever more important, says Mazumder. In his view, biocompute objects will help glycobiology researchers communicate with one another about their results, such as where on a protein they found a sugar with a given structure. More generally, they will give glycoscientists a better way to connect the available sugar resources as they pursue their questions of interest.

Here are some resources that glycoscientists can tap into:

Category	Resource	Description
General resources and funding information
	Transforming Glycoscience: A Roadmap for the Future	Report by the National Research Council of the National Academies
	NIH Common Fund program in glycoscience	Funding opportunities from the NIH Common Fund program in glycoscience
	A Roadmap for Glycoscience in Europe by BBSRC, EGSF, European Science Foundation	Glycoscience roadmap for Europe
	GlycoNet	Resources related to glycoscience research in Canada, based at the University of Alberta where the Alberta Glycomics Centre is located
	National Center for Functional Glycomics	A Glycomics-related Biomedical Technology Resource Center based at Beth Israel Deaconess Medical Center, Harvard Medical School with resources on, for example, microarrays and microarray services, protocols, training and databases
Databases and portals
	CAZy	Carbohydrate-Active Enzymes, a database of enzyme families that degrade, modify or create glycosidic bonds
	Consortium for Functional Glycomics	Resources and glycoscience data. Part of the National Center for Functional Glycomics.
	ExPASy	Software tools and databases to simulate, predict and visualize glycans, glycoproteins and glycan-binding proteins
	Glycan Library	A list of lipid-linked sequence-defined glycan probes
	Glyco3D	A portal for structural glycoscience
	GlycoBase 3.2	A database of N- and O-linked glycan structures with HPLC, UPLC, exoglycosidase sequencing and mass spectrometry data
	GlycoPattern	Portal for glycan array experimental results from the Consortium for Functional Glycomics
	Glycosciences.de	Collection of databases and tools in glycoscience
	GlyTouCan	Repository for glycan structures based in Japan
	MatrixDB	A database of experimental data of interactions by proteoglycans, polysaccharides and extracellular matrix proteins
	Repository of Glyco-enzyme expression constructs	University of Georgia Complex Carbohydrate Research Center repository for glyco-enzyme constructs
	SugarBind	A database of carbohydrate sequences to which bacteria, toxins and viruses adhere
	UniCarbKB	A resource curated by scientists in five countries. It includes GlycoSuiteDB, a database of glycan structures; EUROCarbDB, an experimental and structural database; and UniCarb-DB, a mass spec database of glycan structures
Software tools
	CASPER	Web-based tool to calculate NMR chemical shifts of oligo- and polysaccharides
	Glycan Builder	An online tool at ExPASy for predicting possible oligosaccharide structures on proteins
	GlycoMiner/GlycoPattern	Software tools to automatically identify mass spectra of N-glycopeptides
	GlyMAP	An online resource for mapping glyco-active enzymes
	NetOGlyc	Software tool for predicting O-glycosylation sites on proteins
	SweetUnityMol	Molecular visualization software

Sources: NIH, R. Mazumder, George Washington University; New England Biolabs, Thermo Fisher Scientific, Nature Research
