Pyrfume: A Window to the World's Olfactory Data

Published in Neuroscience and Statistics

Introducing Pyrfume: A Window to the World’s Olfactory Data

Around 1665, Isaac Newton introduced the first comprehensive theory of color vision, using a color wheel to illustrate his ideas. Then, in 1802, during his Bakerian Lecture to the Royal Society of London, Thomas Young introduced the trichromatic theory of color vision, which was later supported by Hermann von Helmholtz. Yet, more than 200 years later, olfactory research still lacks comparable theories that link the chemical properties of molecules to neural responses or perception. Our understanding of olfaction appears to be centuries behind that of other sensory systems.

More recently, data-driven approaches have achieved remarkable success in visual coding and scene analysis, driven by widely available and accessible image datasets such as MNIST and ImageNet. These datasets have been crucial in establishing the foundation for newly developed algorithms and coding principles. To promote and accelerate inquiry into olfactory science, and to stimulate data-driven approaches, we developed the Pyrfume repository. Pyrfume, a fusion of "Python" and "perfume" that reflects the convergence of coding and olfactory research, is an integrated data archive housing over 40 odorant-linked datasets. The archive aims to resolve issues of interoperability, coverage, and accessibility.

Building Pyrfume: From Concept to Code

The field of olfactory science has generated a wealth of high-quality datasets, encompassing human psychophysics, perception, and imaging, as well as rodent psychophysics, behavior, and physiology. However, coordinating structured queries across these datasets can be challenging due to the time and effort required to integrate and work with their varying formats. Pyrfume hosts a wide range of datasets that have been curated and standardized to facilitate comparative analysis.

The archive is built on two key principles. First, most olfactory experiments can be indexed by a unique identifier for an olfactory stimulus, whether that stimulus is a molecule, a substance, or a mixture. Second, any olfactory experiment can be universally described as a machine-readable pairing of these stimulus IDs, the task performed with the corresponding stimuli, and the observed individuals and their behaviors. Typically, raw datasets require a combination of wrangling, cleaning, and formatting to generate the standardized processed files. Each folder within Pyrfume contains a Python script, titled main.py, that outlines the data processing workflow. This script enables Pyrfume users to view the exact steps used to process the raw data or to reproduce and modify the processing pipeline as needed.
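The two principles above can be illustrated with a minimal sketch. It uses only the Python standard library and does not call the Pyrfume package; the column names, stimulus IDs, subjects, and ratings are all hypothetical and chosen for illustration, and real archive files may be organized differently.

```python
import csv
import io

# Hypothetical stimulus table: one row per stimulus, keyed by a unique ID.
# IDs, names, and columns are illustrative, not taken from the archive.
STIMULI_CSV = """stimulus_id,name,kind
S1,vanillin,molecule
S2,limonene,molecule
S3,vanillin + limonene,mixture
"""

# Hypothetical observation table: stimulus ID, observed individual,
# task performed, and the recorded behavior (made-up ratings).
BEHAVIOR_CSV = """stimulus_id,subject_id,task,rating
S1,P01,pleasantness,7
S1,P02,pleasantness,6
S2,P01,pleasantness,5
S3,P02,pleasantness,4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def index_by_stimulus(rows):
    """Map each stimulus ID to its descriptive record."""
    return {row["stimulus_id"]: row for row in rows}


def join_observations(stimuli, observations):
    """Attach stimulus metadata to every observation via the shared ID."""
    joined = []
    for obs in observations:
        stim = stimuli[obs["stimulus_id"]]
        joined.append({
            **obs,
            "name": stim["name"],
            "kind": stim["kind"],
            "rating": float(obs["rating"]),  # convert text to a number
        })
    return joined


stimuli = index_by_stimulus(read_rows(STIMULI_CSV))
records = join_observations(stimuli, read_rows(BEHAVIOR_CSV))
for rec in records:
    print(rec["stimulus_id"], rec["name"], rec["subject_id"],
          rec["task"], rec["rating"])
```

Because every observation carries a stimulus ID, the same join works regardless of whether the stimulus is a single molecule or a mixture.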

  

Pyrfume in Application

Pyrfume provides a means for cross-modal analyses and meta-analyses. Many of the datasets in the archive share common odorants, allowing for comparisons across different metrics, such as behavioral and neural measures, or between species, such as humans and mice. To offer an additional method of evaluation, we developed a tool to compare the perceptual qualities of odors across various studies, which can be found here. Pyrfume also offers packages in different programming languages, including Python and R. For additional details and sample use cases, please refer to the manuscript and supplemental materials.
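As a rough sketch of how shared odorants enable such comparisons, the example below intersects the stimulus IDs of two datasets and correlates their measures over the overlap. The dataset names, IDs, and values are hypothetical, and the Pearson correlation is simply one reasonable choice of comparison, not necessarily the method used in the manuscript.

```python
from math import sqrt

# Hypothetical per-stimulus summaries from two studies, keyed by a shared
# stimulus ID. All values are made up for illustration.
human_pleasantness = {"S1": 7.0, "S2": 5.0, "S3": 4.0, "S4": 6.0}
mouse_response = {"S2": 0.4, "S3": 0.2, "S4": 0.5, "S5": 0.9}


def shared_stimuli(a, b):
    """Return the sorted stimulus IDs present in both datasets."""
    return sorted(a.keys() & b.keys())


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Restrict both datasets to the odorants they have in common.
common = shared_stimuli(human_pleasantness, mouse_response)
xs = [human_pleasantness[s] for s in common]
ys = [mouse_response[s] for s in common]

r = pearson(xs, ys)
print(f"{len(common)} shared stimuli, Pearson r = {r:.3f}")
```

With only a few shared stimuli, as here, such a correlation is not statistically meaningful; the point is the pattern of aligning datasets on a common stimulus ID before comparing them.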

Where to Find Pyrfume

All datasets are available to download here, and they can also be accessed directly on GitHub here.
