Single cell data and machine learning

Can you teach a machine how to identify a cell? Yes, yes you can.
Published in Protocols & Methods

Single cell tracking is the future of microscopy. In bacteriology, cell tracking can be difficult because bacteria are small and move quickly. While it seems easy enough to identify a cell by eye, how do you explain to a machine which parts of an image make up a cell? Many have tried, with varying degrees of success, to write algorithms that can identify cells, but these methods can be time-consuming, tedious and/or heavily dependent on experimental conditions. Advances in machine learning have recently posed a better question: “Can you teach a machine how to identify a cell?” This question has been central to the development of many cell-tracking programs, including CellProfiler and SuperSegger. We have also taken this approach in developing our own cell-tracking software.

Our methodology was first published in Butzin et al. 2016, which used cell tracking to examine the single-cell dynamics of a synthetic oscillator in E. coli. A classifier was trained with the Trainable Weka Segmentation tool in Fiji to create a mask that identified cells, and custom Python scripts then tracked those cells. The Python scripts compare the locations of the cells in each subsequent image to determine whether they are the same cell, which enables us to track the cells over time. This brings us to the summer of 2017, when my PI, Nicholas Butzin, and I met Marta Dies Miracle at a conference in New Jersey. Marta was trying to find a better way to track E. coli; Nicholas had recently published a paper that tracked E. coli; and I had recently been handed a file with all of the software we had to track cells.
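To make the idea of comparing cell locations between images concrete, here is a minimal sketch of frame-to-frame linking. It is not the code from our pipeline: it assumes each cell has already been reduced to an (x, y) centroid taken from the segmentation mask, and it links cells by nearest centroid within a cutoff distance. The function names `match_cells` and `track`, the `max_distance` parameter and its default value are all hypothetical, chosen only for illustration.

```python
import math


def match_cells(prev_centroids, curr_centroids, max_distance=10.0):
    """Link cells between two consecutive frames by nearest centroid.

    prev_centroids: dict mapping cell ID -> (x, y) in the earlier frame.
    curr_centroids: list of (x, y) centroids found in the later frame.
    Returns a dict mapping each index in curr_centroids to a cell ID,
    or to None when no earlier cell lies within max_distance
    (an assumed cutoff, in pixels).
    """
    # Collect every pair that is close enough to plausibly be the same cell.
    candidates = []
    for cell_id, (px, py) in prev_centroids.items():
        for idx, (cx, cy) in enumerate(curr_centroids):
            dist = math.hypot(cx - px, cy - py)
            if dist <= max_distance:
                candidates.append((dist, cell_id, idx))

    # Greedily accept the closest pairs first, using each cell at most once.
    candidates.sort()
    assignments = {idx: None for idx in range(len(curr_centroids))}
    used_ids = set()
    for dist, cell_id, idx in candidates:
        if cell_id in used_ids or assignments[idx] is not None:
            continue
        assignments[idx] = cell_id
        used_ids.add(cell_id)
    return assignments


def track(frames, max_distance=10.0):
    """Assign persistent integer IDs to centroids across a list of frames.

    frames: list of frames, each a list of (x, y) centroids.
    Returns one {cell_id: (x, y)} dict per frame.
    """
    tracks = []
    next_id = 0
    prev = {}
    for centroids in frames:
        links = match_cells(prev, centroids, max_distance)
        current = {}
        for idx, point in enumerate(centroids):
            cell_id = links[idx]
            if cell_id is None:  # unmatched, so treat it as a new cell
                cell_id = next_id
                next_id += 1
            current[cell_id] = point
        tracks.append(current)
        prev = current
    return tracks
```

Calling `track` on a list of per-frame centroid lists returns one dictionary per frame, so a cell that keeps the same ID across dictionaries has been followed over time. A greedy nearest-neighbor match like this is only a starting point: it has no notion of cell division, and fast-moving or crowded cells would likely need a stricter matching rule.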

Soon after, I began adapting the code to work with data that Marta shared with us from the Buceta lab, while streamlining the pipeline and making it easier to use. Along the way, I have added several new features. The software now records cell division to keep track of cell lineages, and the cells can be visualized in each image (or video) by outlining the area identified by Weka and/or numbering each cell. Since writing the book chapter, we have continued to develop the pipeline. We have updated it to work with Python 3, as Python 2 will soon lose support. The pipeline can now analyze up to 9 XY regions and measure 9 different channels (one of which is used for tracking). The updated version of the pipeline can be found on GitHub ( 
