Analysing high-resolution microscopy images

By Roy Wollman, PhD, Postdoctoral Fellow, Meyer lab, Department of Chemical and Systems Biology, Stanford Medical School, Stanford University
Wednesday, 02 December, 2009


Automated microscopy enables the acquisition of high-resolution images at rates of one image per second and total numbers in the millions. These high rates pose a new challenge in cell biology: how to analyse the vast amounts of resulting data in a reasonable time frame?

My colleagues and I confronted this challenge when we analysed cells of the Drosophila fruit fly during mitosis. Our goal was to understand the genes involved in normal and abnormal cell division (Figure 1). Since cancer is, in many respects, uncontrolled cell division, chemotherapies often target cells during mitosis. Identifying new genes involved in mitosis could open the way to new targets for chemotherapeutic drugs.

 
Figure 1: A gallery of 96 cells during division. The cells, which were identified computationally, were used for both visual inspection and quantitative scoring.

Our project, a collaboration between the Vale lab at UC San Francisco and the Scholey lab at UC Davis, involved performing an image-based, genome-wide RNAi screen of Drosophila cells. Traditional tools used by cell biologists, which rely heavily on visual inspection, do not scale well and cannot be used for large-scale projects. High-throughput microscopy was crucial because of its high rate of acquisition and the low percentage of dividing cells in the population. Because we screen only the roughly 1% of cells that are dividing, we need to take about 100 times more images to observe the same number of cells of interest.

We used MATLAB, Image Processing Toolbox and Statistics Toolbox to develop routines that could process these images straight from the microscope, keeping up with the rate of acquisition in real time when running on four CPUs.

High-resolution microscopy

High-resolution microscopy is the standard ‘readout’ for functional assays in cell biology. It provides visual information on subcellular organelles, specialised organs of the cell that are only a few microns in size. In fluorescence microscopy, the sample is stained with markers of a few (usually not more than five) key molecules. These markers reveal information on the properties of the organelles, including their shape, geometry, intensity and relation to other structures.

In a typical cell biology experiment, a perturbation (a manipulation of the regular state of the cell) is performed and the properties of perturbed and non-perturbed cells are compared. The results help biologists understand what the perturbation affected mechanistically and, therefore, learn more about how cells function. Technological developments in functional genomics enable these perturbations to be performed on a scale never before seen, generating data at unprecedented rates.

RNA interference

RNA interference (RNAi) is a methodology that piggybacks on the cell’s own machinery and can reduce the expression of any gene in the genome by more than 90%. Labs around the world have developed libraries of RNAi probes targeting every gene in the genome of multiple model organisms. These new tools let researchers test on a genome-wide scale which genes are involved in a particular biological process of interest.

Image-based RNAi screens combine systematic perturbation with automated microscopy that performs rapid acquisition (rates of ~1 image per second) of high-resolution images. A typical RNAi library contains ~25,000 probes. The acquisition of 10 images per probe with four different stains results in 1,000,000 images that need to be analysed.
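As a rough check on these figures, the arithmetic can be written out in a few lines. The sketch below is in Python purely for illustration (the study itself used MATLAB). It assumes that each stain is acquired as a separate image and that the quoted rate of about one image per second is sustained continuously, neither of which is stated explicitly above.

```python
# Back-of-the-envelope workload estimate using the figures quoted in the text.
PROBES = 25_000            # approximate size of a typical RNAi library
IMAGES_PER_PROBE = 10      # fields of view acquired per probe
STAINS = 4                 # assumption: one separate image per stain
SECONDS_PER_IMAGE = 1.0    # assumption: ~1 image per second, sustained

total_images = PROBES * IMAGES_PER_PROBE * STAINS
acquisition_days = total_images * SECONDS_PER_IMAGE / 86_400  # seconds in a day

print(f"{total_images:,} images, about {acquisition_days:.1f} days of acquisition")
# prints: 1,000,000 images, about 11.6 days of acquisition
```

Under these assumptions the screen represents more than 11 days of uninterrupted imaging, which is why the analysis had to keep pace with the microscope rather than run afterwards.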

Analysing the images

The images were transferred straight from the microscope into a database, which was constantly monitored by a custom bash daemon that initiated our MATLAB-based analysis procedure. The image analysis involved five steps:

  • Segmenting the images to identify all the cells in an image
  • Classifying the cells as mitotic (in the middle of division) or interphase (non-dividing)
  • Creating galleries of cells during division to enable rapid visual inspection
  • Performing quantitative measurements on these cells
  • Using bootstrap statistics to identify statistically significant hits (perturbations that caused phenotypic changes greater than expected by chance)
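
To make the first of these steps concrete, here is a minimal sketch of one common approach to segmentation: intensity thresholding followed by connected-component labelling. It is written in Python for illustration and is not the authors' MATLAB routine. The function name label_cells, the toy image and the threshold value are all hypothetical.

```python
from collections import deque

def label_cells(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`.

    Returns (labels, count): `labels` has the same shape as `image`,
    with 0 for background and 1..count for each separate object.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already belong to an object.
            if image[r][c] <= threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over the four direct neighbours.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

# Toy 4x6 "image" containing three bright objects on a dark background.
toy_image = [
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 7],
    [0, 0, 0, 0, 7, 7],
    [8, 0, 0, 0, 0, 0],
]
labels, n_cells = label_cells(toy_image, threshold=5)
areas = [sum(row.count(k) for row in labels) for k in range(1, n_cells + 1)]
print(n_cells, areas)  # prints: 3 [4, 3, 1]
```

Per-object measurements such as the areas computed here are the kind of feature that a later classification or scoring step could consume. Real microscopy images would also need steps such as background correction and the splitting of touching cells, which this sketch omits.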

Image Processing Toolbox provides implementations of all the basic image manipulation tools (Figure 2). These tools were the building blocks of our analysis procedure.


Figure 2: Further examples of galleries of cells during division. The galleries show intermediate steps in the segmentation analysis, created using MATLAB image manipulation tools.

MATLAB algorithms are easy to use and very well documented, allowing scientists who are not expert programmers to tweak parameters. By tuning and combining these existing algorithms we were able to focus on the problem at hand and not get bogged down in programming and implementation details.

We also made extensive use of Netlab, a library of artificial neural networks, one of the many open-source toolboxes available from the large MATLAB user community. With its well-developed external interface and APIs, MATLAB enabled us to combine Netlab with other tools and languages to create a well-integrated analysis pipeline.

Meeting the challenges of real-time screening

In the middle of our screening procedure, the images started to look different. We suspected that a change in cell behaviour, probably due to a change in the medium used to feed the cells, had caused one of the markers that we were using to lose specificity. As a result, our classification procedure stopped working properly and we had to adapt it rapidly while the images were streaming. This required retraining the neural network classifier that we were using on the new sets of images coming from the microscope. We used MATLAB visualisation tools to quickly identify what had gone wrong and adapt our analysis procedure accordingly.

The nature of scientific discovery is that you don’t always know in advance what exactly you are looking for. Toward the end of the screening process, the cell biologist who visually inspected thousands of galleries suspected that we were missing a potentially important phenotype. We used MATLAB to develop additional assays in a matter of days. We included them in the analysis pipeline without stopping the acquisition process and added these assays to the screen. The new assays were responsible for the identification of many of the novel results that came from this study.

One of the biggest challenges in high-throughput, image-based screens is the high number of false-positive hits. MATLAB and Statistics Toolbox enabled us to adapt the statistical analysis using a computationally intensive re-sampling methodology and to rigorously assign a p-value for each test using nonparametric tools. The advantage of nonparametric tools is that they estimate the ‘wild type’ null distribution for each plate separately, which tends to reduce the number of false positives. MATLAB matrix-handling capabilities enabled us to implement the re-sampling algorithms with only a few lines of code.
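
The re-sampling idea can be sketched as follows. This is an illustrative Python example, not the authors' MATLAB code. The function name bootstrap_p_value, the control measurements and the choice of a one-sided test on the mean are assumptions made for the example. The null distribution is built only from wild-type (control) cells on the same plate, so plate-to-plate variation does not inflate the apparent effect of a probe.

```python
import random

def bootstrap_p_value(control_values, observed_mean, sample_size,
                      n_resamples=10_000, seed=0):
    """One-sided bootstrap p-value for a well's mean measurement.

    Repeatedly draws `sample_size` control cells (with replacement) from the
    same plate and counts how often their mean is at least `observed_mean`.
    The +1 terms keep the estimate strictly above zero.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        resample = rng.choices(control_values, k=sample_size)
        if sum(resample) / sample_size >= observed_mean:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_resamples + 1)

# Hypothetical per-cell measurements from wild-type wells on one plate.
plate_controls = [9.8, 10.1, 10.4, 9.6, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]

# A well whose 5 cells average 12.0 exceeds every possible control resample,
# so it receives the smallest p-value this number of resamples can give.
p_hit = bootstrap_p_value(plate_controls, observed_mean=12.0, sample_size=5)

# A well whose cells average 10.05 is well within the control range.
p_typical = bootstrap_p_value(plate_controls, observed_mean=10.05, sample_size=5)

print(f"strong phenotype: p = {p_hit:.4f}; typical well: p = {p_typical:.2f}")
```

Because no assumption is made about the shape of the control distribution, the same routine could be reused for any quantitative measurement. In a genome-wide screen the resulting p-values would still need a correction for the number of probes tested.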

Research results

The importance of high-content, image-based screens in cell biology is only growing, and I have no doubt that researchers will continue to rely on MATLAB for image analysis in cell biology. Many other image-based screens either use MATLAB tools directly or develop libraries that are suitable for specific research in cell biology - for example, CellProfiler.

Overall, our project was successful. We identified 204 genes that are required for mitosis, many of them novel or unexpected. Many were further verified as important in other organisms, including humans. Follow-up research on these genes, looking into the details of their mechanism of function and how their depletion affects cell division, continues in labs around the world.

For more information:

Paper: ‘Genes required for mitotic spindle assembly in Drosophila S2 cells.’ Science (2007) April 316 (5823):417-21

Paper: ‘High throughput microscopy from raw images to discoveries.’ Journal of Cell Science (2007) Nov. 1 (Pt. 2) 3715-3722

Database containing all the information gathered in the screen

CellProfiler library

Netlab - open-source package for artificial neural networks for MATLAB

Products Used:

MATLAB, Statistics Toolbox and Image Processing Toolbox available from The MathWorks www.mathworks.com.au.


