Introducing Pyrfume: A Window to the World’s Olfactory Data
Around 1665, Isaac Newton introduced the first comprehensive theory of color vision, using a color wheel to illustrate his ideas. Then, in 1802, during his Bakerian Lecture to the Royal Society of London, Thomas Young introduced the trichromatic theory of color vision, which was later supported by Hermann von Helmholtz. Yet more than 200 years later, olfactory research still lacks comparable theories that link the chemical properties of molecules to neural responses or perception. Our understanding of olfaction appears to be centuries behind that of other sensory systems.
More recently, data-driven approaches to visual coding and scene analysis have had remarkable success, fueled by widely available and accessible image datasets such as MNIST and ImageNet. These datasets have been crucial in establishing the foundation for newly developed algorithms and coding principles. To promote and accelerate inquiry into olfactory science and to stimulate data-driven approaches, we developed the Pyrfume repository. Pyrfume, a fusion of "Python" and "perfume" that reflects the convergence of coding and olfactory research, is an integrated data archive housing over 40 odorant-linked datasets. This archive aims to resolve issues surrounding interoperability, coverage, and accessibility.
Building Pyrfume: From Concept to Code
The field of olfactory science has generated a wealth of high-quality datasets, encompassing human psychophysics, perception, and imaging, as well as rodent psychophysics, behavior, and physiology. However, coordinating structured queries across these datasets can be challenging due to the time and effort required to integrate and work with their varying formats. Pyrfume hosts a wide range of datasets that have been curated and standardized to facilitate comparative analysis.
The archive is built on two key principles. First, most olfactory experiments can be indexed by a unique identifier for each olfactory stimulus, whether that stimulus is a molecule, a substance, or a mixture. Second, any olfactory experiment can be described as a machine-readable pairing of these stimulus IDs, the task performed with the corresponding stimuli, and the observed individuals and their behaviors. Raw datasets typically require a combination of wrangling, cleaning, and formatting to generate the standardized processed files. Each folder within Pyrfume contains a Python script, titled main.py, that outlines the data processing workflow. This script lets Pyrfume users view the exact steps used to process the raw data, or reproduce and modify the processing pipeline as needed.
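The two principles above can be illustrated with a minimal Python sketch. This is not Pyrfume's actual file layout or API: the class names, field names, identifiers, and values are all hypothetical placeholders, chosen only to show stimuli indexed by a unique ID and observations that pair those IDs with a task, a subject, and a measured value.

```python
from dataclasses import dataclass

# NOTE: every name and value here is a made-up placeholder for illustration,
# not a record or schema taken from the Pyrfume archive.


@dataclass(frozen=True)
class Stimulus:
    stimulus_id: str   # unique key shared by every table
    kind: str          # "molecule", "substance", or "mixture"
    components: tuple  # identifiers of the constituent molecules


@dataclass(frozen=True)
class Observation:
    stimulus_id: str   # refers back to a Stimulus
    subject: str       # the observed individual
    task: str          # what was done with the stimulus
    value: float       # the recorded behavior or response


stimuli = {
    "stim_A": Stimulus("stim_A", "molecule", ("mol_1",)),
    "stim_B": Stimulus("stim_B", "mixture", ("mol_1", "mol_2")),
}

observations = [
    Observation("stim_A", "subject_1", "intensity_rating", 0.4),
    Observation("stim_A", "subject_2", "intensity_rating", 0.6),
    Observation("stim_B", "subject_1", "intensity_rating", 0.9),
]


def observations_for(stimulus_id, observations):
    """Return every observation recorded for one stimulus ID."""
    return [o for o in observations if o.stimulus_id == stimulus_id]


def unknown_stimulus_ids(stimuli, observations):
    """Return observation stimulus IDs that are missing from the stimulus table."""
    return sorted({o.stimulus_id for o in observations} - set(stimuli))
```

Because every observation carries a stimulus ID, a check such as `unknown_stimulus_ids(stimuli, observations)` can confirm that a processed dataset is internally consistent before it is compared with others.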
Pyrfume in Application
Pyrfume provides a means for cross-modal analyses and meta-analyses. Many of the datasets in the archive share common odorants, allowing for comparisons across different metrics, such as behavioral and neural measures, or between species, such as humans and mice. To offer an additional method of evaluation, we developed a tool to compare the perceptual qualities of odors across various studies, which can be found here. Pyrfume also offers packages in different programming languages, including Python and R. For additional details and sample use cases, please refer to the manuscript and supplemental materials.
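As a rough sketch of how shared odorants enable such comparisons, the Python snippet below pairs measurements from two studies on the stimulus IDs they have in common. It does not use the Pyrfume packages; the dictionaries, identifiers, and numbers are invented placeholders that stand in for two standardized datasets keyed by the same stimulus IDs.

```python
# Hypothetical per-odorant measurements from two studies.
# The keys stand in for shared stimulus identifiers; the values are made up.
human_ratings = {"odorant_1": 0.8, "odorant_2": 0.3, "odorant_3": 0.5}
mouse_responses = {"odorant_2": 1.2, "odorant_3": 2.0, "odorant_4": 0.7}


def shared_stimuli(a, b):
    """Return the stimulus IDs present in both datasets, in sorted order."""
    return sorted(set(a) & set(b))


def paired_values(a, b):
    """Pair the two measurements for every stimulus the datasets share."""
    return [(sid, a[sid], b[sid]) for sid in shared_stimuli(a, b)]


pairs = paired_values(human_ratings, mouse_responses)
for stimulus_id, human_value, mouse_value in pairs:
    print(stimulus_id, human_value, mouse_value)
```

Only the odorants measured in both studies survive the pairing, so any downstream comparison (a correlation, for instance) is computed over matched stimuli.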
Where to Find Pyrfume
All datasets are available to download here, and they can also be accessed directly on GitHub here.